News | Xiang Zheng

May 05, 2026	Our Defense-to-Attack jailbreak study on VLMs is accepted by Pattern Recognition.
May 05, 2026	Two papers accepted to ICML’26: Just Ask (curious code agents revealing system prompts in frontier LLMs) and STARE (step-wise temporal red-teaming of multi-modal toxicity).
Mar 31, 2026	We release System-Prompt-Open, an open database of system prompts extracted from frontier LLMs. Check out the project website and GitHub repo.
Mar 28, 2026	Our survey on embodied AI safety, covering 400+ papers, is now available. It spans risks, attacks, and defenses across perception, cognition, planning, interaction, and agentic systems.
Mar 09, 2026	Joined HKAI-Sci as a Research Assistant Professor.
Mar 04, 2026	Our OpenRedRL benchmark for RL-based red teaming is published in Frontiers of Computer Science.
Feb 20, 2026	Our work on red teaming text-to-image generators is accepted by CVPR’26.
Jan 29, 2026	We release JustAsk, a framework in which curious code agents reveal system prompts in frontier LLMs.
Jan 15, 2026	Our survey on large model and agent safety is published in Foundations and Trends® in Privacy and Security.
Jan 23, 2025	Our work on reinforced defense for VLMs is accepted by ICLR’25.
Dec 14, 2024	Our work on RL-based auditing for LLMs is accepted by AAAI’25.
Apr 17, 2024	Our work on intrinsic motivation for RL is accepted by IJCAI’24.
Mar 22, 2024	Our work on adversarial policy learning in RL is accepted by DSN’24.