News

Jun 16, 2026 Giving an invited talk at the 2026 CityUDG: “Hands-on with Claude Code: Frontier Code Agents for Research and Browser/Computer Use”.
Jun 05, 2026 Invited to speak at the 2026 Tencent Cloud AI Industry Applications Summit (China National Convention Center, Beijing) on “SciencePal 2.0: Your Self-Evolving Science Agent Team”.
Jun 03, 2026 Invited as a reviewer for ACM Computing Surveys (CSUR).
May 23, 2026 Invited as a reviewer for IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).
May 14, 2026 Received the ICML 2026 Gold Reviewer Award (top 25% of reviewers).
May 05, 2026 Our Defense-to-Attack jailbreak study on VLMs is accepted by Pattern Recognition.
May 01, 2026 Two papers accepted to ICML’26: Just Ask (curious code agents revealing system prompts in frontier LLMs) and STARE (step-wise temporal red-teaming of multi-modal toxicity).
Mar 31, 2026 We release System-Prompt-Open, an open database of system prompts extracted from frontier LLMs. Check out the project website and GitHub repo.
Mar 28, 2026 Our survey on embodied AI safety, which reviews 400+ papers, is now available on arXiv. It covers risks, attacks, and defenses across perception, cognition, planning, interaction, and agentic systems.
Mar 09, 2026 Joined HKAI-Sci as a Research Assistant Professor.
Mar 04, 2026 Our OpenRedRL benchmark for RL-based red teaming is published in Frontiers of Computer Science.
Feb 20, 2026 Our work on red teaming text-to-image generators is accepted by CVPR’26.
Jan 29, 2026 Released JustAsk, a framework where curious code agents reveal system prompts in frontier LLMs.
Jan 15, 2026 Our survey on large model and agent safety is published in Foundations and Trends® in Privacy and Security.
Jan 23, 2025 Our work on reinforced defense for VLMs is accepted by ICLR’25.
Dec 14, 2024 Our work on RL-based auditing for LLMs is accepted by AAAI’25.
Apr 17, 2024 Our work on intrinsic motivation for RL is accepted by IJCAI’24.
Mar 22, 2024 Our work on adversarial policy learning in RL is accepted by DSN’24.