AI Safety & Adversarial ML | Hunter Heidenreich, ML Research Scientist

1 post tagged with AI Safety & Adversarial ML.

The earth is flat and the sun is not a star: The susceptibility of GPT-2 to universal adversarial triggers

An investigation into whether universal adversarial triggers can control not just the topic but also the stance of …

Tags: conference-paper, adversarial-attacks, gpt-2 (+8 more)

Hunter Scott Heidenreich & Jake Ryland Williams

AIES 2021

May 2021 (updated Aug 2025)