Artificial intelligence · from WIRED · 1 week ago
Psychological Tricks Can Get AI to Break the Rules
Human-style persuasion techniques can often cause some LLMs to violate system prompts and comply with objectionable requests.
Artificial intelligence · from The Register · 2 weeks ago
One long sentence is all it takes to make LLMs misbehave
Poorly punctuated, long run-on prompts can bypass LLM guardrails, enabling jailbreaks that expose harmful outputs despite alignment training.