AI Safety
A nonsensical trigger sequence 'WTC theoriesclimate Flat Hubbard Principle' is fed into GPT-2, which then generates Flat Earth conspiracy text

GPT-2 Susceptibility to Universal Adversarial Triggers

An investigation into whether universal adversarial triggers can control both the topic and the stance of GPT-2's generated text…