Safety in Embodied AI:
A Survey of Risks, Attacks, and Defenses

Paper PDF · CC BY-NC-SA 4.0 · 480+ Papers · Maintained
1 Fudan University    2 Shanghai Innovation Institute    3 City University of Hong Kong    4 Jilin University    5 Singapore Management University    6 Deakin University    7 Tongji University    8 UIUC    9 UC Berkeley    10 Nanyang Technological University    11 Chinese Academy of Sciences    12 The University of Melbourne    13 Johns Hopkins University
480+ Papers Surveyed
5 Taxonomy Layers
18 Subcategories
38 Authors
13 Institutions

Abstract

Embodied Artificial Intelligence (Embodied AI) integrates perception, cognition, planning, and interaction into agents that operate in open-world, safety-critical environments. As these systems gain autonomy and enter domains such as transportation, healthcare, and industrial or assistive robotics, ensuring their safety becomes both technically challenging and socially indispensable. Unlike digital-only AI systems, embodied agents must act under uncertain sensing, incomplete knowledge, and dynamic human–robot interactions, where failures can directly lead to physical harm.

This survey provides a comprehensive and structured review of safety research in embodied AI, examining attacks and defenses across the full embodied pipeline, from perception and cognition to planning and interaction. We introduce a multi-level taxonomy that unifies fragmented lines of work and connects embodied-specific safety findings with broader advances in vision, language, and multimodal foundation models. Our review synthesizes insights from over 480 papers spanning adversarial, backdoor, jailbreak, and hardware-level attacks; attack detection, safe training and inference; and risk-aware human–agent interaction.

This analysis reveals several overlooked challenges, including the fragility of multimodal perception fusion, the instability of planning under jailbreak attacks, and the trustworthiness of human–agent interaction in open-ended scenarios. By organizing the field into a coherent framework and identifying critical research gaps, this survey provides a roadmap for building embodied agents that are not only capable and autonomous but also safe, robust, and reliable in real-world deployment.

Overview

Capability vs. Risk Duality

Figure 1: Capability vs. risk duality in embodied AI systems. As capabilities expand outward from perception to agentic systems, the attack surface grows correspondingly — vulnerabilities at inner layers cascade to outer layers.

Survey Structure

Figure 2: Illustration of safety threats and attack surfaces across capability layers of embodied AI systems.

Overview of Attack and Defense Methods

Figure 3: Overview of representative attack and defense methods across perception, cognition, planning, action & interaction, and agentic system layers. The width of the strips is proportional to the number of reviewed works.

Survey Scope

We review 480+ papers across five capability layers of embodied AI, covering adversarial, backdoor, jailbreak, and hardware-level attacks alongside detection, safe training, and risk-aware interaction defenses.

| Layer | Topics Covered | Papers |
|---|---|---|
| Perception | Visual · Auditory · Spatial · Motion · Cross-Modal Perception | 192 |
| Cognition | Instruction Understanding · World Model · Reasoning | 38 |
| Planning | Task Planning · Trajectory Planning · Multi-Agent Planning | 59 |
| Action and Interaction | Robot Control · Human-Agent Interaction · Multi-Agent Collaboration | 105 |
| Agentic System | Tool Use and Skill · Memory · Self-Evolving · Cascading Risks | 87 |
| **Total (unique papers in taxonomy)** | | 481 |

Surveyed Papers

Perception (192 papers)
- Visual Perception (56)
- Auditory Perception (21)
- Spatial Perception (59)
- Motion Perception (48)
- Cross-Modal Perception (8)

Cognition (38 papers)
- Instruction Understanding (15)
- World Model (13)
- Reasoning (10)

Planning (59 papers)
- Task Planning (22)
- Trajectory Planning (24)
- Multi-Agent Planning (13)

Action and Interaction (105 papers)
- Robot Control (90)
- Human-Agent Interaction (12)
- Multi-Agent Collaboration (3)

Agentic System (87 papers)
- Tool Use and Skill (17)
- Memory (17)
- Self-Evolving (16)
- Cascading Risks (37)

Contribute

This survey is a living document. We welcome the community to help keep it current and comprehensive.

Submit a Missing Paper: Found a relevant paper we haven't covered? Submit it with the venue, year, link, and a brief note on which layer it belongs to. We review submissions regularly.

Suggest a Taxonomy Change: Think a topic is missing from our taxonomy, or a sub-category should be reorganized? Open a discussion; we're actively improving the framework.
Review process: Submitted papers are reviewed against our inclusion criteria (must involve safety in an embodied pipeline layer) and added in batches. Accepted contributions are credited in the repository.

Citation

If you find this survey useful in your research, please cite:

@article{li2026safety,
  title   = {Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses},
  author  = {Li, Xiao and Zheng, Xiang and Gao, Yifeng and Xia, Xinyu and Wang, Yixu and Wang, Xin and Sun, Ye and Zhao, Yunhan and Wen, Ming and Li, Jiayu and Chen, Zixing and Gong, Xun and Liu, Yi and Li, Yige and Wu, Yutao and Wang, Cong and Sun, Jun and Cao, Yixin and Chen, Zhineng and Chen, Jingjing and Gui, Tao and Zhang, Qi and Wu, Zuxuan and Qiu, Xipeng and Huang, Xuanjing and Zhang, Tiehua and Wei, Zhipeng and Wang, Kun and Li, Xinfeng and Huang, Hanxun and Erfani, Sarah and Bailey, James and Wang, Jianping and Xiao, Chaowei and He, Ran and Li, Bo and Ma, Xingjun and Jiang, Yu-Gang},
  journal = {arXiv preprint arXiv:2605.02900},
  year    = {2026},
  url     = {https://arxiv.org/abs/2605.02900}
}