I am a tenured faculty member (equivalent to full professor) at CISPA Helmholtz Center for Information Security. I sometimes also chime in at iDRAMA Lab for the memes.

Research Areas

  • Trustworthy Machine Learning, with a focus on LLMs (Safety, Privacy, and Security)
  • Misinformation, Hate Speech, and Memes
  • Social Network Analysis

I’m always looking for motivated students and postdocs to join my group. If you are interested, please write me an email (zhang@cispa.de).

Awards

  • Best paper finalist at CSAW Europe 2023
  • Best paper award honorable mention at CCS 2022
  • Busy Beaver teaching award nomination for seminar “Privacy of Machine Learning” at Saarland University (Winter 2022)
  • Busy Beaver teaching award nomination for advanced lecture “Machine Learning Privacy” at Saarland University (Summer 2022)
  • Busy Beaver teaching award for seminar “Privacy of Machine Learning” at Saarland University (Winter 2021)
  • Distinguished reviewer award at TrustML Workshop 2020 (co-located with ICLR 2020)
  • Distinguished paper award at NDSS 2019
  • Best paper award at ARES 2014

What’s New

  • [5/2024] We released SecurityNet, a large-scale dataset containing more than 1000 models for evaluating attacks and defenses in the field of trustworthy machine learning!
  • [4/2024] I’ll join the PC of NDSS 2025!
  • [4/2024] One paper titled ““Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models” got accepted at CCS 2024!
  • [3/2024] One paper titled “Games and Beyond: Analyzing the Bullet Chats of Esports Livestreaming” got accepted at ICWSM 2024!
  • [3/2024] One paper titled “Composite Backdoor Attacks Against Large Language Models” got accepted at NAACL Findings 2024!
  • [2/2024] We established TrustAIRLab, a GitHub organization hosting many (upgraded versions of) the codebases published by my lab; please take a look!
  • [2/2024] One paper titled “Prompt Stealing Attacks Against Text-to-Image Generation Models” got accepted at USENIX Security 2024!
  • [12/2023] I’ll join the PC of CCS 2024!
  • [10/2023] Our paper “Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots” is a best paper finalist at CSAW Europe 2023!
  • [10/2023] I’ll be the publicity chair of CCS 2024! #haha
  • [10/2023] Mingjie Li, Zeyuan Chen, and Qingqing Dong joined the team!
  • [10/2023] Zheng Li has successfully passed his Ph.D. defense! Congratulations, Dr. Li!
  • [9/2023] Our research on Jailbreak Prompts got covered by New Scientist!
  • [9/2023] One paper titled “SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models” got accepted at USENIX Security 2024!
  • [9/2023] One paper titled “Quantifying Privacy Risks of Prompts in Visual Prompt Learning” got accepted at USENIX Security 2024!
  • [8/2023] Xinlei He has successfully passed his Ph.D. defense! Congratulations, Dr. He!
  • [7/2023] One paper titled “You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content” got accepted at Oakland 2024!
  • [7/2023] One paper titled “Test-Time Poisoning Attacks Against Test-Time Adaptation Models” got accepted at Oakland 2024!
  • [7/2023] I was invited to give keynote speeches at ISC 2023 and ACISP 2024!
  • [5/2023] One paper titled “DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models” got accepted at CCS 2023!
  • [5/2023] One paper titled “Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models” got accepted at CCS 2023!
  • [5/2023] One paper titled “NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models” got accepted at ACL 2023!
  • [4/2023] One paper titled “Generated Graph Detection” got accepted at ICML 2023!
  • [4/2023] One paper titled “Data Poisoning Attacks Against Multimodal Encoders” got accepted at ICML 2023!
  • [4/2023] One paper titled “Two-in-One: A Model Hijacking Attack Against Text Generation Models” got accepted at USENIX Security 2023!
  • [4/2023] We released a new technical report on the trustworthiness of ChatGPT, titled “In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT”!
  • [3/2023] We released MGTBench, a benchmark for current methods of detecting machine-generated text (e.g., text produced by ChatGPT).
  • [3/2023] One paper titled “FACE-AUDITOR: Data Auditing in Facial Recognition Systems” got accepted at USENIX Security 2023!
  • [3/2023] I joined the editorial board of ACM TOPS!
  • [3/2023] We released MLHospital, a Python package for evaluating the security and privacy risks of machine learning models. MLHospital is under continual development, and we welcome contributors!
  • [3/2023] I successfully passed my tenure-track evaluation and became a tenured faculty member at CISPA!