The Latest
AI Agents Reveal Vulnerable Secrets in Language Models’ Safety Buffers
Language models’ “safety buffers” are failing spectacularly as AI agents eagerly spill sensitive secrets through basic roleplaying tricks. Browsing agents prove especially vulnerable. Your confidential data isn’t so confidential anymore.