AI Safety

GPT-2 Susceptibility to Universal Adversarial Triggers

An investigation into whether universal adversarial triggers can control not just the topic but also the stance of …