My publication list can also be found at DBLP and Google Scholar; however, those listings may not be up to date.
2025
White-box Membership Inference Attacks against Diffusion Models
Yan Pang, Tianhao Wang, Xuhui Kang, Mengdi Huai, Yang Zhang; PoPETS 2025
A Comprehensive Study of Privacy Risks in Curriculum Learning
Joann Qiongna Chen, Xinlei He, Zheng Li, Yang Zhang, Zhou Li; PoPETS 2025
Understanding Data Importance in Machine Learning Attacks: Does Valuable Data Pose Greater Harm?
Rui Wen, Michael Backes, Yang Zhang; NDSS 2025
2024
The Death and Life of Great Prompts: Analyzing the Evolution of LLM Prompts from the Structural Perspective
Yihan Ma, Xinyue Shen, Yixin Wu, Boyang Zhang, Michael Backes, Yang Zhang; EMNLP 2024
ModScan: Measuring Stereotypical Bias in Large Vision-Language Models from Vision and Language Modalities
Yukun Jiang, Zheng Li, Xinyue Shen, Yugeng Liu, Michael Backes, Yang Zhang; EMNLP 2024
Reconstruct Your Previous Conversations! Comprehensively Investigating Privacy Leakage Risks in Conversations with GPT Models
Junjie Chu, Zeyang Sha, Michael Backes, Yang Zhang; EMNLP 2024
Membership Inference Attacks Against In-Context Learning
Rui Wen, Zheng Li, Michael Backes, Yang Zhang; CCS 2024
Image-Perfect Imperfections: Safety, Bias, and Authenticity in the Shadow of Text-To-Image Model Evolution
Yixin Wu, Yun Shen, Michael Backes, Yang Zhang; CCS 2024
BadMerging: Backdoor Attacks Against Model Merging
Jinghuai Zhang, Jianfeng Chi, Zheng Li, Kunlin Cai, Yang Zhang, Yuan Tian; CCS 2024
ZeroFake: Zero-Shot Detection of Fake Images Generated and Edited by Text-to-Image Generation Models
Zeyang Sha, Yicong Tan, Mingjie Li, Michael Backes, Yang Zhang; CCS 2024
SeqMIA: Sequential-Metric Based Membership Inference Attack
Hao Li, Zheng Li, Siyuan Wu, Chengrui Hu, Yutong Ye, Min Zhang, Dengguo Feng, Yang Zhang; CCS 2024
MGTBench: Benchmarking Machine-Generated Text Detection
Xinlei He, Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang; CCS 2024
"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang; CCS 2024
Media Coverage: New Scientist, Deutschlandfunk Nova
Instruction Backdoor Attacks Against Customized LLMs
Rui Zhang, Hongwei Li, Rui Wen, Wenbo Jiang, Yuan Zhang, Michael Backes, Yun Shen, Yang Zhang; USENIX Security 2024
Prompt Stealing Attacks Against Text-to-Image Generation Models
Xinyue Shen, Yiting Qu, Michael Backes, Yang Zhang; USENIX Security 2024
SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models
Boyang Zhang, Zheng Li, Ziqing Yang, Xinlei He, Michael Backes, Mario Fritz, Yang Zhang; USENIX Security 2024
Quantifying Privacy Risks of Prompts in Visual Prompt Learning
Yixin Wu, Rui Wen, Michael Backes, Pascal Berrang, Mathias Humbert, Yun Shen, Yang Zhang; USENIX Security 2024
Link Stealing Attacks Against Inductive Graph Neural Networks
Yixin Wu, Xinlei He, Pascal Berrang, Mathias Humbert, Michael Backes, Neil Zhenqiang Gong, Yang Zhang; PoPETS 2024
Composite Backdoor Attacks Against Large Language Models
Hai Huang, Zhengyu Zhao, Michael Backes, Yun Shen, Yang Zhang; NAACL Findings 2024
Games and Beyond: Analyzing the Bullet Chats of Esports Livestreaming
Yukun Jiang, Xinyue Shen, Rui Wen, Zeyang Sha, Junjie Chu, Yugeng Liu, Michael Backes, Yang Zhang; ICWSM 2024
FAKEPCD: Fake Point Cloud Detection via Source Attribution
Yiting Qu, Zhikun Zhang, Yun Shen, Michael Backes, Yang Zhang; ASIACCS 2024
You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content
Xinlei He, Savvas Zannettou, Yun Shen, Yang Zhang; S&P 2024
Test-Time Poisoning Attacks Against Test-Time Adaptation Models
Tianshuo Cong, Xinlei He, Yun Shen, Yang Zhang; S&P 2024
Generated Distributions Are All You Need for Membership Inference Attacks Against Generative Models
Minxing Zhang, Ning Yu, Rui Wen, Michael Backes, Yang Zhang; WACV 2024
VERITRAIN: Validating MLaaS Training Efforts via Anomaly Detection
Xiaokuan Zhang, Yang Zhang, Yinqian Zhang; IEEE Transactions on Dependable and Secure Computing
Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification
Boyang Zhang, Yicong Tan, Yun Shen, Ahmed Salem, Michael Backes, Savvas Zannettou, Yang Zhang
ICLGuard: Controlling In-Context Learning Behavior for Applicability Authorization
Wai Man Si, Michael Backes, Yang Zhang
SOS! Soft Prompt Attack Against Open-Source Large Language Models
Ziqing Yang, Michael Backes, Yang Zhang, Ahmed Salem
Voice Jailbreak Attacks Against GPT-4o
Xinyue Shen, Yixin Wu, Michael Backes, Yang Zhang
UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images
Yiting Qu, Xinyue Shen, Yixin Wu, Michael Backes, Savvas Zannettou, Yang Zhang
VGMShield: Mitigating Misuse of Video Generative Models
Yan Pang, Yang Zhang, Tianhao Wang
Prompt Stealing Attacks Against Large Language Models
Zeyang Sha, Yang Zhang
Comprehensive Assessment of Jailbreak Attacks Against LLMs
Junjie Chu, Yugeng Liu, Ziqing Yang, Xinyue Shen, Michael Backes, Yang Zhang
2023
DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models
Zeyang Sha, Zheng Li, Ning Yu, Yang Zhang; CCS 2023
Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models
Yiting Qu, Xinyue Shen, Xinlei He, Michael Backes, Savvas Zannettou, Yang Zhang; CCS 2023
Differentially Private Resource Allocation
Joann Qiongna Chen, Tianhao Wang, Zhikun Zhang, Yang Zhang, Somesh Jha, Zhou Li; ACSAC 2023
A Plot is Worth a Thousand Words: Model Information Stealing Attacks via Scientific Plots
Boyang Zhang, Xinlei He, Yun Shen, Tianhao Wang, Yang Zhang; USENIX Security 2023
Two-in-One: A Model Hijacking Attack Against Text Generation Models
Wai Man Si, Michael Backes, Yang Zhang, Ahmed Salem; USENIX Security 2023
UnGANable: Defending Against GAN-based Face Manipulation
Zheng Li, Ning Yu, Ahmed Salem, Michael Backes, Mario Fritz, Yang Zhang; USENIX Security 2023
Media Coverage: Mimikama, it-sicherheit.de, SOLARIFY, elektroniknet.de, Digitale Schweiz, innovations report
FACE-AUDITOR: Data Auditing in Facial Recognition Systems
Min Chen, Zhikun Zhang, Michael Backes, Tianhao Wang, Yang Zhang; USENIX Security 2023
PrivTrace: Differentially Private Trajectory Synthesis by Adaptive Markov Model
Haiming Wang, Zhikun Zhang, Tianhao Wang, Shibo He, Michael Backes, Jiming Chen, Yang Zhang; USENIX Security 2023
Generated Graph Detection
Yihan Ma, Zhikun Zhang, Ning Yu, Xinlei He, Michael Backes, Yun Shen, Yang Zhang; ICML 2023
Data Poisoning Attacks Against Multimodal Encoders
Ziqing Yang, Xinlei He, Zheng Li, Michael Backes, Mathias Humbert, Pascal Berrang, Yang Zhang; ICML 2023
NOTABLE: Transferable Backdoor Attacks Against Prompt-based NLP Models
Kai Mei, Zheng Li, Zhenting Wang, Yang Zhang, Shiqing Ma; ACL 2023
Can't Steal? Cont-Steal! Contrastive Stealing Attacks Against Image Encoders
Zeyang Sha, Xinlei He, Ning Yu, Michael Backes, Yang Zhang; CVPR 2023
On the Evolution of (Hateful) Memes by Means of Multimodal Contrastive Learning
Yiting Qu, Xinlei He, Shannon Pierson, Michael Backes, Yang Zhang, Savvas Zannettou; S&P 2023
Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning?
Rui Wen, Zhengyu Zhao, Zhuoran Liu, Michael Backes, Tianhao Wang, Yang Zhang; ICLR 2023 (spotlight)
Backdoor Attacks Against Dataset Distillation
Yugeng Liu, Zheng Li, Michael Backes, Yun Shen, Yang Zhang; NDSS 2023
Pseudo Label-Guided Model Inversion Attack via Conditional Generative Adversarial Network
Xiaojian Yuan, Kejiang Chen, Jie Zhang, Weiming Zhang, Nenghai Yu, Yang Zhang; AAAI 2023
Comprehensive Assessment of Toxicity in ChatGPT
Boyang Zhang, Xinyue Shen, Wai Man Si, Zeyang Sha, Zeyuan Chen, Ahmed Salem, Yun Shen, Michael Backes, Yang Zhang
On the Proactive Generation of Unsafe Images From Text-To-Image Models Using Benign Prompts
Yixin Wu, Ning Yu, Michael Backes, Yun Shen, Yang Zhang
Last One Standing: A Comparative Analysis of Security and Privacy of Soft Prompt Tuning, LoRA, and In-Context Learning
Rui Wen, Tianhao Wang, Michael Backes, Yang Zhang, Ahmed Salem
Prompt Backdoors in Visual Prompt Learning
Hai Huang, Zhengyu Zhao, Michael Backes, Yun Shen, Yang Zhang
Robustness Over Time: Understanding Adversarial Examples' Effectiveness on Longitudinal Versions of Large Language Models
Yugeng Liu, Tianshuo Cong, Zhengyu Zhao, Michael Backes, Yun Shen, Yang Zhang
Mondrian: Prompt Abstraction Attack Against Large Language Models for Cheaper API Pricing
Wai Man Si, Michael Backes, Yang Zhang
Watermarking Diffusion Model
Yugeng Liu, Zheng Li, Michael Backes, Yun Shen, Yang Zhang
In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT
Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang
Generative Watermarking Against Unauthorized Subject-Driven Image Synthesis
Yihan Ma, Zhengyu Zhao, Xinlei He, Zheng Li, Michael Backes, Yang Zhang
2022
Amplifying Membership Exposure via Data Poisoning
Yufei Chen, Chao Shen, Yun Shen, Cong Wang, Yang Zhang; NeurIPS 2022
Why So Toxic?: Measuring and Triggering Toxic Behavior in Open-Domain Chatbots
Wai Man Si, Michael Backes, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini, Savvas Zannettou, Yang Zhang; CCS 2022
Media Coverage: Fast Company
Best Paper Award Honorable Mention
On the Privacy Risks of Cell-Based NAS Architectures
Hai Huang, Zhikun Zhang, Yun Shen, Michael Backes, Qi Li, Yang Zhang; CCS 2022
Membership Inference Attacks by Exploiting Loss Trajectory
Yiyong Liu, Zhengyu Zhao, Michael Backes, Yang Zhang; CCS 2022
Auditing Membership Leakages of Multi-Exit Networks
Zheng Li, Yiyong Liu, Xinlei He, Ning Yu, Michael Backes, Yang Zhang; CCS 2022
Graph Unlearning
Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert, Yang Zhang; CCS 2022
SSLGuard: A Watermarking Scheme for Self-supervised Learning Pre-trained Encoders
Tianshuo Cong, Xinlei He, Yang Zhang; CCS 2022
Finding MNEMON: Reviving Memories of Node Embeddings
Yun Shen, Yufei Han, Zhikun Zhang, Min Chen, Ting Yu, Michael Backes, Yang Zhang, Gianluca Stringhini; CCS 2022
Semi-Leak: Membership Inference Attacks Against Semi-supervised Learning
Xinlei He, Hongbin Liu, Neil Zhenqiang Gong, Yang Zhang; ECCV 2022
Teacher Model Fingerprinting Attacks Against Transfer Learning
Yufei Chen, Chao Shen, Cong Wang, Yang Zhang; USENIX Security 2022
ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models
Yugeng Liu, Rui Wen, Xinlei He, Ahmed Salem, Zhikun Zhang, Michael Backes, Emiliano De Cristofaro, Mario Fritz, Yang Zhang; USENIX Security 2022
Inference Attacks Against Graph Neural Networks
Zhikun Zhang, Min Chen, Michael Backes, Yun Shen, Yang Zhang; USENIX Security 2022
On Xing Tian and the Perseverance of Anti-China Sentiment Online
Xinyue Shen, Xinlei He, Michael Backes, Jeremy Blackburn, Savvas Zannettou, Yang Zhang; ICWSM 2022
Model Stealing Attacks Against Inductive Graph Neural Networks
Yun Shen, Xinlei He, Yufei Han, Yang Zhang; S&P 2022
Get a Model! Model Hijacking Attack Against Machine Learning Models
Ahmed Salem, Michael Backes, Yang Zhang; NDSS 2022
Property Inference Attacks Against GANs
Junhao Zhou, Yufei Chen, Chao Shen, Yang Zhang; NDSS 2022
Dynamic Backdoor Attacks Against Machine Learning Models
Ahmed Salem, Rui Wen, Michael Backes, Shiqing Ma, Yang Zhang; EuroS&P 2022
FairSR: Fairness-aware Sequential Recommendation through Multi-Task Learning with Preference Graph Embeddings
Cheng-Te Li, Cheng Hsu, Yang Zhang; ACM Transactions on Intelligent Systems and Technology
Fine-Tuning Is All You Need to Mitigate Backdoor Attacks
Zeyang Sha, Xinlei He, Pascal Berrang, Mathias Humbert, Yang Zhang
Membership Inference Attacks Against Text-to-image Generation Models
Yixin Wu, Ning Yu, Zheng Li, Michael Backes, Yang Zhang
Backdoor Attacks in the Supply Chain of Masked Image Modeling
Xinyue Shen, Xinlei He, Zheng Li, Yun Shen, Michael Backes, Yang Zhang
Membership-Doctor: Comprehensive Assessment of Membership Inference Against Machine Learning Models
Xinlei He, Zheng Li, Weilin Xu, Cory Cornelius, Yang Zhang
2021
Quantifying and Mitigating Privacy Risks of Contrastive Learning
Xinlei He, Yang Zhang; CCS 2021
When Machine Unlearning Jeopardizes Privacy
Min Chen, Zhikun Zhang, Tianhao Wang, Michael Backes, Mathias Humbert, Yang Zhang; CCS 2021
Membership Inference Attacks Against Recommender Systems
Minxing Zhang, Zhaochun Ren, Zihan Wang, Pengjie Ren, Zhumin Chen, Pengfei Hu, Yang Zhang; CCS 2021
Membership Leakage in Label-Only Exposures
Zheng Li, Yang Zhang; CCS 2021
BadNL: Backdoor Attacks Against NLP Models with Semantic-preserving Improvements
Xiaoyi Chen, Ahmed Salem, Michael Backes, Shiqing Ma, Qingni Shen, Zhonghai Wu, Yang Zhang; ACSAC 2021
Stealing Links from Graph Neural Networks
Xinlei He, Jinyuan Jia, Michael Backes, Neil Zhenqiang Gong, Yang Zhang; USENIX Security 2021
PrivSyn: Differentially Private Data Synthesis
Zhikun Zhang, Tianhao Wang, Jean Honorio, Ninghui Li, Michael Backes, Shibo He, Jiming Chen, Yang Zhang; USENIX Security 2021
“Go eat a bat, Chang!”: On the Emergence of Sinophobic Behavior on Web Communities in the Face of COVID-19
Fatemeh Tahmasbi, Leonard Schild, Chen Ling, Jeremy Blackburn, Gianluca Stringhini, Yang Zhang, Savvas Zannettou; WWW 2021
Media Coverage: The Washington Post