SneakyPrompt: Revealing the vulnerabilities of text-to-image AI
Published:
In the rapidly evolving field of artificial intelligence (AI), understanding and improving AI security is increasingly crucial. Yuchen Yang, a third-year doctoral student advised by Yinzhi Cao, used an automated attack framework to reveal vulnerabilities in text-to-image generative models such as DALL·E 3 and Stable Diffusion. The paper, “SneakyPrompt: Evaluating Robustness of Text-to-image Generative Models’ Safety Filters,” formerly titled “SneakyPrompt: Jailbreaking Text-to-image Generative Models,” will be presented at the 45th Institute of Electrical and Electronics Engineers (IEEE) Symposium on Security and Privacy.
Originally published in The Johns Hopkins News-Letter, 2023.
