Backdoor detection and protection system for embodied agent based on behavior trajectory watermarking

By embedding controllable perturbation watermarks into the behavioral trajectory of embodied intelligent agents and combining trajectory reconstruction with anomaly detection, the problem of detecting backdoor attacks in embodied intelligent agents is solved, achieving efficient and real-time security protection and improving the security and reliability of embodied intelligent agents in complex environments.

CN122241664A · Pending · Publication Date: 2026-06-19 · HUAZHONG UNIV OF SCI & TECH


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: HUAZHONG UNIV OF SCI & TECH
Filing Date: 2026-03-10
Publication Date: 2026-06-19

Application Information

IPC: G06F21/16; G06F21/57; G06F18/213; G06F18/25; G06F18/22; G06F18/241; G06F18/243; G06F18/2433; G06N3/008; G06F123/02

AI Tagging

Application Domain

Biological models; Platform integrity maintenance

AI Technical Summary

Technical Problem

Existing technologies struggle to effectively detect and protect against backdoor attacks in embodied intelligent agents, especially in their highly dynamic and complex behavioral trajectories. Traditional methods lack effective watermark embedding and detection mechanisms, making it difficult to identify and block backdoor behaviors in a timely manner.

Method used

A backdoor detection and protection system for embodied intelligent agents based on behavioral trajectory watermarking is constructed. By generating and embedding controllable perturbation watermarks, combined with trajectory reconstruction and abnormal distribution detection, the system can verify the integrity and security status of the embodied intelligent agent's behavior. The system includes watermark generation, injection, trajectory acquisition, feature extraction and detection modules, and supports online real-time monitoring.

Benefits of technology

It significantly improves the accuracy and timeliness of backdoor detection for embodied intelligent agents, is suitable for open and untrusted real-world environments, provides a secure and reliable deployment solution for embodied intelligent agents, and has high versatility and cross-task migration capabilities.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention belongs to the field of artificial intelligence security technology, and specifically relates to a backdoor detection and protection system for embodied intelligent agents based on behavioral trajectory watermarking. The system includes a watermark generation module, a watermark injection module, a trajectory acquisition module, a feature extraction module, a watermark detection module, and a response and protection module. By constructing a unified trajectory latent space, achieving watermark embedding with controllable perturbations, and combining this with trajectory reconstruction and anomaly distribution detection mechanisms, the system verifies the integrity and security status of the intelligent agent's behavior. This invention can provide robust behavioral watermark identification for embodied intelligent agents without affecting normal execution performance, thereby enhancing resistance to threats such as model contamination, backdoor triggering, and policy replacement.

Description

Technical Field

[0001] This invention belongs to the field of artificial intelligence security technology, and in particular relates to a backdoor anomaly detection and protection system for embodied intelligent agents based on behavioral trajectory watermarking.

Background Technology

[0002] Embodied agents, through multimodal perception (vision, hearing, touch, etc.), cognitive reasoning, and autonomous decision-making, form a unified intelligent form, capable of adapting to various environments and performing complex tasks. They demonstrate strong potential in complex tasks such as robot manipulation, mobile navigation, and human-robot collaboration. These agents typically rely on deep reinforcement learning or large-scale multimodal foundational models (such as RT-2, PaLM-E, etc.) for autonomous decision-making, enabling them to perceive, understand, and perform tasks in real physical environments just like humans.

[0003] As the deployment scale of embodied agents expands and their application scenarios become more open, the security threats they face are becoming increasingly severe. Among these, the risk of backdoor attacks has risen significantly: attackers can implant hidden backdoor mechanisms within the agent through various means such as poisoning training data, replacing model parameters, or polluting the system during online runtime. This allows the agent to perform unexpected or even harmful actions under specific triggering conditions, seriously threatening system security. Embodied agents typically integrate multiple modalities such as vision, language, and action. Attackers can design triggers for any modality, and even leverage cross-modal associations to achieve more covert attacks.

[0004] However, traditional security methods are difficult to apply effectively in embodied agent scenarios. On the one hand, existing model watermarking techniques and parameter-level detection methods often rely on static or labelable model structures, while the control strategies of embodied agents are highly dynamic, and their parameters are difficult to label or track stably. On the other hand, the behavioral sequences of agents are usually high-dimensional, highly temporally correlated, and must perform tasks in real physical environments, resulting in a high degree of uncontrollability in input and output data, further weakening the applicability of traditional methods. More importantly, current research on detecting backdoors in agents still mainly focuses on model parameters, input features, or single action output levels, neglecting the core dimension of behavioral trajectories. As a comprehensive manifestation of the interaction between the agent and the environment, behavioral trajectories have not yet established an effective watermark embedding mechanism and security analysis framework, making it difficult to identify and block backdoor behaviors in a timely manner.

[0005] Therefore, there is an urgent need for a general, behavior-level technical system that can embed verifiable watermark signals in behavioral trajectories and efficiently detect potential backdoor behaviors, thereby ensuring the safe and reliable operation of embodied intelligent agents in open and complex environments.

Summary of the Invention

[0006] The technical problem to be solved by this invention is to provide a backdoor detection and protection system for embodied intelligent agents based on behavioral trajectory watermarking. By constructing a unified trajectory latent space, realizing watermark embedding in a controllable perturbation form, and combining it with trajectory reconstruction and abnormal distribution detection mechanisms, the system can verify the integrity and security status of the embodied intelligent agent's behavior. It can provide robust behavioral watermark identification for the embodied intelligent agent without affecting normal execution performance, thereby enhancing its resistance to threats such as model contamination, backdoor triggering, and policy replacement.

[0007] To address the aforementioned technical problems, embodiments of the present invention provide a backdoor detection and protection system for embodied intelligent agents based on behavioral trajectory watermarking, the system comprising:

[0008] The watermark generation module is used to generate structured watermark elements, which include a specific action sequence for short-term action perturbation, a feature vector for latent spatial signature, and a time window or environmental state that specifies the watermark activation condition.

[0009] The watermark injection module is used to embed the watermark elements generated by the watermark generation module into the strategy of the embodied intelligent agent in a sparse and low-interference manner.

[0010] The trajectory acquisition module is used to acquire the trajectory data sequence of the embodied intelligent agent during operation. The trajectory data includes the real-time state sequence, action sequence and timestamp information of the embodied intelligent agent.

[0011] The feature extraction module is used to segment the collected trajectory data sequence.

[0012] The watermark detection module is used to analyze watermark features and abnormal behavior in trajectory data.

[0013] The watermark injection module embeds the watermark elements into the policy of the embodied intelligent agent in a sparse and low-interference manner, and the injection methods include:

[0014] Short-term action perturbation watermarking, at sparse time points that meet the triggering conditions, replaces the original action with a preset watermark action sequence with a certain probability through a behavior wrapper, so as to achieve explicit and controllable watermark embedding.

[0015] Latent space signature watermarking, during policy training, guides the model to generate a watermarked representation of a specific input in the latent variable space by adding a regularization term to the loss function.

[0016] The condition-triggered injection mechanism binds the activation of the watermark to a specific environmental state, and the watermark behavior is only displayed under preset conditions.
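
As an illustrative, non-limiting sketch of the latent space signature watermarking described above, the regularized loss could be written as follows. The text only states that a regularization term is added to the loss function; the mean squared error penalty, the function name `signature_regularized_loss`, and the parameter `weight` are assumptions made for this example. A real training setup would compute the same expression inside an automatic differentiation framework rather than on plain floats.

```python
def signature_regularized_loss(task_loss, latent, signature, weight):
    """Task loss plus a weighted penalty for drifting from the signature.

    Assumption: the penalty is the mean squared error between the latent
    vector produced for a trigger state and the watermark signature vector.
    """
    penalty = sum((z - s) ** 2 for z, s in zip(latent, signature)) / len(signature)
    return task_loss + weight * penalty
```

With this form, a latent vector that already matches the signature adds nothing to the task loss, so the weight only matters for inputs whose representation has drifted.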

[0017] The feature extraction module segments the collected trajectory data sequence and processes each trajectory segment to transform it into a low-dimensional, structured feature vector. Specifically, this includes the following steps:

[0018] Segmentation: The continuous trajectory is segmented according to the preset window length and sliding step size to obtain the trajectory segment to be processed;

[0019] Action statistical feature extraction: Calculate the mean, variance, kurtosis, and skewness of the action sequence within the window, calculate the action entropy, and calculate the dynamic time warping distance between it and the baseline action template to obtain the action statistical feature vector;

[0020] State transition feature extraction: Calculate the path curvature, extract velocity statistical features, and detect interaction events to obtain the state transition feature vector;

[0021] Temporal coding feature extraction: The state-action sequence is encoded by a temporal encoder, and the output vector at the [CLS] token is taken as the temporal embedding to obtain the temporal embedding vector;

[0022] The above action statistical feature vector, state transition feature vector, and temporal embedding vector are concatenated to obtain preliminary fused features. Then, the preliminary fused features are subjected to layer normalization to obtain multimodal feature vectors.
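
The segmentation, statistical feature, and normalization steps above can be sketched as follows. This is a minimal, non-limiting example that assumes a one-dimensional action sequence and uses only the Python standard library. The function names and the `extra` parameter (a stand-in for the state transition and temporal embedding vectors) are hypothetical, and the normalization shown is layer normalization without learned scale and shift parameters.

```python
import math


def sliding_windows(sequence, window, step):
    """Split a sequence into overlapping segments of fixed length."""
    return [sequence[i:i + window]
            for i in range(0, len(sequence) - window + 1, step)]


def moment_features(values):
    """Mean, variance, skewness and excess kurtosis (population form)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0.0:
        return [mean, 0.0, 0.0, 0.0]
    std = math.sqrt(var)
    skew = sum(((v - mean) / std) ** 3 for v in values) / n
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n - 3.0
    return [mean, var, skew, kurt]


def layer_norm(vector, eps=1e-5):
    """Normalise a vector to zero mean and (almost) unit variance."""
    mean = sum(vector) / len(vector)
    var = sum((v - mean) ** 2 for v in vector) / len(vector)
    return [(v - mean) / math.sqrt(var + eps) for v in vector]


def segment_features(actions, window, step, extra=()):
    """Return one normalised feature vector per trajectory window.

    `extra` stands in for the state transition and temporal embedding
    vectors, which are simply concatenated onto the action statistics.
    """
    return [layer_norm(moment_features(segment) + list(extra))
            for segment in sliding_windows(actions, window, step)]
```

For example, `sliding_windows([0, 1, 2, 3, 4, 5], 4, 2)` yields the two segments `[0, 1, 2, 3]` and `[2, 3, 4, 5]`, each of which is then reduced to a four-element normalized vector.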

[0023] The watermark detection module includes:

[0024] The statistical testing module is used to calculate the difference between the baseline trajectory distribution and the trajectory to be detected, and to assess the similarity between the two in order to identify abnormal motion distributions or trajectory patterns.

[0025] A machine learning classifier, based on feature vectors obtained from a feature extractor, trains a binary classification model to distinguish between normal trajectories and triggered trajectories;

[0026] The watermark signature similarity calculation module is used to calculate the similarity between the feature vector of a trajectory suspected of containing a watermark and a known watermark signature in order to confirm whether a watermark exists.

[0027] The watermark detection module analyzes watermark features and abnormal behaviors in trajectory data, including the following steps:

[0028] Calculate the KL divergence between the current trajectory feature distribution and the baseline normal trajectory distribution. If it exceeds a threshold, it is marked as a statistical anomaly. Train a binary classification model using historically collected normal trajectories and known backdoor-triggered trajectory samples. Input a multimodal feature vector and output a confidence score, which represents the probability of belonging to an abnormal trajectory. Calculate the cosine similarity between the multimodal feature vector and the pre-stored watermark signature to determine if a valid watermark has been detected. Integrate the KL divergence, confidence score, and cosine similarity using a weighted summation method to generate the final detection score, which determines whether a backdoor trigger or watermark anomaly has been detected.

[0029] The system also includes a response and protection module, which is used to immediately interrupt the current strategy execution process, activate the security rollback strategy, switch the embodied agent to a restricted operation mode, record the current task context and abnormal trajectory data, and freeze the online update permissions of the suspicious strategy model when the watermark detection module detects watermark abnormality or backdoor trigger signal.

[0030] The beneficial effects of the above technical solution of the present invention are as follows:

[0031] This invention overcomes the limitations of traditional backdoor detection and watermarking technologies in embodied agent scenarios. Compared to existing methods, it does not require modification of the agent's underlying architecture and possesses high versatility and cross-task transfer capabilities. By embedding controllable perturbation watermarks into behavioral trajectories in the latent space and utilizing trajectory offset to achieve sensitive detection of backdoor behaviors, it significantly improves detection accuracy and timeliness. At the same time, the system supports online real-time monitoring and is suitable for open, untrusted real-world operating environments, providing a new technical path and practical solution for the secure and reliable deployment of embodied agents.

Description of the Drawings

[0032] Figure 1 is a schematic diagram of the overall architecture of the backdoor detection and protection system for embodied intelligent agents based on behavioral trajectory watermarking in this invention;

[0033] Figure 2 is a schematic diagram of the structure of the watermark generation module and the watermark injection module in this invention;

[0034] Figure 3 is a schematic diagram of the trajectory acquisition module, feature extraction module, and watermark detection module in this invention;

[0035] Figure 4 is a schematic diagram of the response and protection module in this invention.

Detailed Implementation

[0036] To make the technical problems, technical solutions and advantages of the present invention clearer, a detailed description will be given below in conjunction with the accompanying drawings and specific embodiments.

[0037] As shown in Figure 1, embodiments of the present invention provide a backdoor detection and protection system for embodied intelligent agents based on behavioral trajectory watermarking. The system includes a watermark generation module, a watermark injection module, a trajectory acquisition module, a feature extraction module, a watermark detection module, and a response and protection module. Each module is integrated into the local computing unit of the embodied intelligent agent. By constructing a unified trajectory latent space, realizing watermark embedding in a controllable perturbation form, and cooperating with trajectory reconstruction and abnormal distribution detection mechanisms, the system can verify the integrity and security status of the embodied intelligent agent's behavior.

[0038] The watermark generation module generates a set of structured watermark elements from a key, using a pseudo-random number generator (PRNG). The key is generated, stored, and rotated by the system's built-in key manager, which supports periodic updates. The watermark elements include a specific action sequence for short-term action perturbation, a feature vector for the latent space signature, and a time window or environmental state specifying the watermark activation condition. This ensures that the perturbed action sequence, the latent space signature feature vector, and the specified time window or environmental state carry verifiable watermark features while maintaining task performance. All of this watermark content can be reproduced exactly as long as the key is unchanged, but is highly unpredictable to external observers.

[0039] Watermark generation includes the following steps:

[0040] Short-term action sequence generation: A pseudo-random number generator (PRNG) seeded with the key generates a set of action sequences that conform to the robot's kinematic constraints. Each action is a joint angle or end-effector pose vector, and the safety of each sequence is verified through collision detection and dynamic simulation.

[0041] Latent space signature generation: A pre-trained encoder network encodes a set of preset trigger states to obtain latent vectors, and the final watermark signature vector is then obtained through orthogonalization. Alternatively, the orthogonalization step can be replaced by normalization, random projection, or hash mapping.

[0042] Activation condition generation: A sparse set of activation time points or environmental conditions is generated using a pseudo-random number generator (PRNG) and stored as a watermark activation rule table.
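
The key-driven generation steps above can be illustrated with the following non-limiting sketch. It assumes that an independent PRNG stream is derived for each watermark element by hashing the key with a label (HMAC-SHA256), and that actions are joint angle vectors drawn within per-joint limits. The function names, labels, and parameters are hypothetical. Python's `random.Random` is a Mersenne Twister and is not cryptographically secure, so a deployed system would substitute a keyed cryptographic generator. Collision detection and dynamic simulation are omitted.

```python
import hashlib
import hmac
import random


def keyed_rng(key: bytes, label: str) -> random.Random:
    """Derive an independent, reproducible PRNG stream from the key."""
    digest = hmac.new(key, label.encode("utf-8"), hashlib.sha256).digest()
    return random.Random(int.from_bytes(digest, "big"))


def generate_action_sequence(key, length, joint_limits):
    """Draw joint-angle vectors that stay inside the given joint limits."""
    rng = keyed_rng(key, "actions")
    return [[rng.uniform(low, high) for low, high in joint_limits]
            for _ in range(length)]


def generate_activation_steps(key, horizon, count):
    """Pick a sparse, sorted set of activation time steps below `horizon`."""
    rng = keyed_rng(key, "activation")
    return sorted(rng.sample(range(horizon), count))
```

Calling either function twice with the same key returns identical output, which mirrors the requirement that the watermark content be reproducible while the key is unchanged.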

[0043] The watermark injection module is used to embed the watermark elements generated by the watermark generation module into the policy of the embodied agent in a sparse and low-interference manner, supporting implementation during the training or deployment phase. Specifically, it is embedded into the embodied agent through three complementary injection mechanisms:

[0044] Short-term action perturbation watermarking: At sparse time points where the triggering conditions are met, a behavior wrapper replaces the original action with a preset watermark action sequence with a certain probability. Although these actions deviate slightly, they remain within the task's acceptable range, thus achieving explicit but controllable watermark embedding. Specifically, in the agent's decision-making loop, the system continuously checks whether the current time step belongs to the set of activation time points or whether the environmental conditions are met. If so, the behavior wrapper is invoked with a preset probability to replace the original action output with the corresponding subsequence of the watermark action sequence, or to fuse the original action with that subsequence by weighting. After replacement, dynamic smoothing is required to avoid abrupt changes.

[0045] Latent space signature watermarking: During policy training, a regularization term is added to the loss function to guide the model to generate a watermarked representation of a specific input in the latent variable space. This implicit watermark is difficult to observe directly and requires detection through trajectory encoding and similarity calculation, which makes it highly robust. Specifically, during the policy network training phase, a signature regularization term, scaled by adjustable weight coefficients, is added to the original loss function. Gradient descent optimization then guides the policy network to produce, for the trigger states, a representation in the latent variable space that is consistent with the watermark signature.

[0046] Conditional injection mechanism: Watermark activation is tied to specific environmental states, with watermark behavior only appearing under preset conditions such as "honeypot scenarios." This enhances concealment and provides a controllable verification method for subsequent auditing. Specifically, a "honeypot environment detector" is deployed in the system to continuously monitor the environment. The corresponding watermark behavior is activated and its trajectory recorded only when a perfect match with preset honeypot conditions is detected; otherwise, the system remains silent.
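
A minimal, non-limiting sketch of the behavior wrapper used by the short-term action perturbation and condition-triggered mechanisms is given below. The class name `BehaviorWrapper`, its parameters, and the linear blending rule are assumptions for illustration. The `condition_met` flag stands in for the output of the honeypot environment detector, and the dynamic smoothing step mentioned in the text is omitted.

```python
import random


class BehaviorWrapper:
    """Wraps a policy and occasionally swaps in watermark actions."""

    def __init__(self, policy, watermark_actions, activation_steps,
                 probability, blend=1.0, rng=None):
        self.policy = policy
        self.watermark_actions = watermark_actions
        self.activation_steps = set(activation_steps)
        self.probability = probability
        self.blend = blend  # 1.0 replaces the action, below 1.0 fuses by weight
        self.rng = rng or random.Random()
        self.cursor = 0  # index of the next watermark action to emit

    def act(self, step, state, condition_met=False):
        """Return (action, watermarked) for the current time step."""
        action = self.policy(state)
        triggered = step in self.activation_steps or condition_met
        if not triggered or self.rng.random() >= self.probability:
            return action, False
        mark = self.watermark_actions[self.cursor % len(self.watermark_actions)]
        self.cursor += 1
        fused = [(1.0 - self.blend) * a + self.blend * m
                 for a, m in zip(action, mark)]
        return fused, True
```

Returning the `watermarked` flag alongside the action lets the caller log exactly which steps carried a watermark, which is what a later audit would compare against the activation rule table.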

[0047] As shown in Figure 2, in this embodiment, the watermark generation module and the watermark injection module embed verifiable, forgery-resistant, and difficult-to-detect digital watermarks into the agent's policy without affecting the agent's task performance. The two modules work together to generate and inject the watermark, and the entire process is controlled by the key, which ensures the reproducibility and security of the watermark.

[0048] The trajectory acquisition module is used to acquire the trajectory data sequence of the embodied intelligent agent during operation. The trajectory data includes the real-time state sequence, action sequence, and timestamp information of the embodied intelligent agent, including:

[0049] Multi-sensor synchronous acquisition: A precision time protocol is used to align the data streams of the body sensors (IMU, joint encoders, torque sensors), the environmental sensors (RGB-D camera, LiDAR), and the task status.

[0050] Data encapsulation and buffering: The data at each time step is encapsulated into a structure containing the timestamp, joint angles, end-effector pose, visual data, and task objective, and a circular buffer stores the most recent steps of trajectory data in real time.
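
The encapsulation and circular buffering described above can be sketched, in a non-limiting way, with a dataclass and a bounded `collections.deque`. The field names, the class names `TrajectoryStep` and `TrajectoryBuffer`, and the capacity used in any call are illustrative assumptions; sensor synchronization is outside the scope of this example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TrajectoryStep:
    timestamp: float
    joint_angles: list
    end_effector_pose: list
    task_goal: str = ""
    visual: object = None  # placeholder for image or point-cloud data


class TrajectoryBuffer:
    """Fixed-capacity ring buffer that keeps only the most recent steps."""

    def __init__(self, capacity):
        self._steps = deque(maxlen=capacity)

    def append(self, step):
        self._steps.append(step)  # the oldest step is dropped when full

    def latest(self, count):
        """Return up to `count` most recent steps, oldest first."""
        if count <= 0:
            return []
        return list(self._steps)[-count:]

    def __len__(self):
        return len(self._steps)
```

Because the deque is bounded, memory use stays constant however long the agent runs, and the feature extraction module can read its sliding windows directly from `latest`.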

[0051] The feature extraction module is used to segment the acquired trajectory data sequence. During the feature extraction stage, the module transforms the trajectory data sequence acquired by the trajectory acquisition module into a low-dimensional vector representation, including an action statistics module and a temporal feature module. It employs techniques such as RNN, Transformer, and Temporal CNN, and can also use autoencoders or contrastive learning encoders to enhance expressive power. The feature extraction process emphasizes segmenting the trajectory. Through a sliding window mechanism, it calculates the action distribution vector, state transition vector (such as path curvature, velocity statistics, etc.), and embedding vector output by the encoder for each trajectory segment. The features are normalized and aggregated to form a comprehensive feature vector, laying the foundation for subsequent analysis. This includes the following steps:

[0052] Real-time segmentation: Based on a preset window length and sliding step size, the continuous trajectory in the buffer maintained by the trajectory acquisition module is divided into sliding segments to obtain the trajectory segments to be processed. Preferably, when an abrupt action change, a state transition, or a task phase change is detected, a new trajectory segment is started early.

[0053] Action statistical feature extraction: For the action sequence within a window, calculate statistical measures such as the mean, variance, kurtosis, and skewness, calculate the action entropy, and calculate the dynamic time warping (DTW) distance between the sequence and the baseline action template.

[0054] State transition feature extraction: Calculate the path curvature, extract velocity statistical features (mean, peak value, rate of change), and detect interaction events (e.g., identifying grasping and placing based on a contact force threshold).

[0055] Temporal coding feature extraction: A pre-trained temporal encoder encodes the state-action sequence, and the output vector at the [CLS] token is used as the temporal embedding.

[0056] Feature fusion and normalization: The action statistical feature vector, the state transition feature vector, and the temporal embedding vector are concatenated to obtain the preliminary fused features, and layer normalization (LayerNorm) is applied to them to obtain the final multimodal feature vector. If the feature dimension is still too high, it can be reduced by methods such as principal component analysis, autoencoders, or linear projection.
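
Three of the per-window quantities named above (action entropy, DTW distance to a baseline template, and path curvature) can be sketched as follows. This is a non-limiting example under stated assumptions: actions are scalars binned into a fixed-range histogram, DTW uses an absolute-difference cost without a warping window, and curvature is the Menger curvature of three consecutive 2-D path points. All function names and default values are hypothetical.

```python
import math
from collections import Counter


def action_entropy(actions, bins=8, low=-1.0, high=1.0):
    """Shannon entropy (bits) of a histogram of scalar actions."""
    width = (high - low) / bins
    counts = Counter(
        min(max(int((a - low) / width), 0), bins - 1) for a in actions)
    total = len(actions)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def dtw_distance(series, template):
    """Classic O(n*m) dynamic time warping with absolute-difference cost."""
    n, m = len(series), len(template)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(series[i - 1] - template[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def path_curvature(p0, p1, p2):
    """Menger curvature of three consecutive 2-D points (0 if collinear)."""
    a = math.dist(p0, p1)
    b = math.dist(p1, p2)
    c = math.dist(p0, p2)
    if a == 0.0 or b == 0.0 or c == 0.0:
        return 0.0
    twice_area = abs((p1[0] - p0[0]) * (p2[1] - p0[1])
                     - (p1[1] - p0[1]) * (p2[0] - p0[0]))
    return 2.0 * twice_area / (a * b * c)
```

DTW is a natural fit here because an embedded watermark subsequence may be slightly stretched or delayed by smoothing, and warping tolerates that where a pointwise distance would not.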

[0057] In this embodiment, the "multimodal data" collected by the trajectory acquisition module covers all dimensions of interaction information between the embodied intelligent agent and the environment, mainly including:

[0058] Visual modalities: RGB image, depth map, infrared image, object detection box, semantic segmentation mask.

[0059] Body perception modalities: joint encoder feedback, inertial measurement unit (IMU) data, six-dimensional force / torque sensor readings, battery voltage and temperature.

[0060] Task and context modalities: task objectives described in natural language, global and local waypoint sequences, safe zones defined by electronic fences, and physical attributes of interactive objects (such as weight, material, and ID).

[0061] Behavioral output modalities: continuous or discrete sequences of action commands, trajectory points output by the motion planner, and gripping force commands from the end effector.

[0062] Timing and event modality: timestamps of all actions and states, and trigger records of system abnormal events (such as collisions, exceeding limits, and communication interruptions).

[0063] One or more modal combinations can be selected based on the hardware configuration. These multimodal data are fused and recorded on the basis of timestamp synchronization to form a complete description of the behavioral trajectory, providing a rich information source for subsequent feature extraction and watermark detection.

[0064] The watermark detection module analyzes watermark features and abnormal behaviors in trajectory data, calculates the similarity index between the current trajectory and the trajectory expected to contain watermark elements, and determines whether there are missing watermarks or abnormal deviations. The watermark detection module employs a multimodal method for detection, specifically including:

[0065] The statistical testing module is used to calculate the difference between the baseline trajectory distribution and the trajectory to be detected, such as the KL divergence, Wasserstein distance, DTW distance, or KS test value, to assess the similarity between the two and identify anomalous motion distributions or trajectory patterns. Specifically, it calculates the KL divergence between the current trajectory feature distribution $P$ and the baseline normal trajectory distribution $Q$: $D_{KL}(P \| Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}$. If $D_{KL}(P \| Q)$ exceeds a preset threshold, the trajectory is marked as a statistical anomaly. The KL divergence can also be replaced with the Wasserstein distance, the KS test statistic, the maximum mean discrepancy (MMD), etc.

[0066] A machine learning classifier, based on the feature vectors obtained from the feature extractor, trains a binary classification model to distinguish between normal trajectories and triggered trajectories. When sufficient data is lacking, a one-class anomaly detection method, such as a One-Class SVM or an autoencoder, is used. Specifically, a binary classification model (such as an SVM, a random forest, or a lightweight neural network) is trained using historically collected samples of normal trajectories and known backdoor-triggered trajectories. During online inference, the model takes the multimodal feature vector as input and outputs a confidence score, which represents the probability that the trajectory is abnormal.

[0067] The watermark signature similarity calculation module is used to calculate the similarity between the feature vector of a trajectory suspected of containing a watermark and a known watermark signature, in order to confirm the presence of a watermark. Specifically, it calculates the cosine similarity between the current feature vector $f$ and the pre-stored watermark signature $w$: $\cos(f, w) = \frac{f \cdot w}{\|f\| \, \|w\|}$. If the similarity exceeds a preset threshold, it is determined that a valid watermark has been detected.
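
The statistical test and the signature similarity check above can be sketched as follows. This non-limiting example assumes that the feature distributions are estimated as smoothed histograms over a scalar feature, so that the discrete KL divergence is always finite. The function names, bin settings, and smoothing constant are hypothetical.

```python
import math


def histogram(values, bins, low, high, smoothing=1e-6):
    """Normalised histogram with additive smoothing so no bin is zero."""
    width = (high - low) / bins
    counts = [smoothing] * bins
    for v in values:
        index = min(max(int((v - low) / width), 0), bins - 1)
        counts[index] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL divergence of p from q, in nats, for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

The smoothing matters in practice: a backdoor-triggered trajectory often puts mass in bins the baseline never visited, and without smoothing the divergence would be undefined exactly in the cases the detector cares about.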

[0068] The detection results from the statistical testing module, the machine learning classifier, and the watermark signature similarity calculation module are integrated. Specifically, the statistical test score, the machine learning classifier score, and the latent vector similarity score are fused by weighted summation, with adjustable weights, to generate the final detection score. A total score threshold is set; if the final detection score exceeds it, the system determines that a backdoor trigger or a watermark anomaly has been detected.
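
One possible, non-limiting form of the weighted summation above is sketched below. The text does not specify how the three outputs are scaled before summation, so this example makes two explicit assumptions: the KL divergence is clipped at `kl_cap` and rescaled to the range 0 to 1, and the cosine similarity is inverted so that a missing watermark raises the score. The weights, the cap, and the threshold are placeholder values.

```python
def fuse_scores(kl, anomaly_prob, watermark_sim, weights=(0.4, 0.4, 0.2),
                kl_cap=5.0):
    """Weighted sum of three scores, each mapped to [0, 1], higher is worse.

    Assumptions: the KL divergence is clipped at `kl_cap` and rescaled, and
    the cosine similarity (in [-1, 1]) is inverted so that a missing
    watermark increases the final score.
    """
    stat_score = min(max(kl, 0.0), kl_cap) / kl_cap
    watermark_score = (1.0 - watermark_sim) / 2.0
    w_stat, w_cls, w_wm = weights
    return w_stat * stat_score + w_cls * anomaly_prob + w_wm * watermark_score


def is_anomalous(kl, anomaly_prob, watermark_sim, threshold=0.5, **kwargs):
    """Compare the fused detection score against the total score threshold."""
    return fuse_scores(kl, anomaly_prob, watermark_sim, **kwargs) > threshold
```

Putting all three terms on a common scale before weighting keeps the unbounded KL divergence from dominating the bounded classifier and similarity scores.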

[0069] As shown in Figure 3, in this embodiment, the feature extraction module processes each trajectory window in parallel through the action statistics feature module and the temporal encoder, and the resulting feature vectors are concatenated in the fusion module. The fused features are then analyzed jointly by the statistical testing module, the machine learning classifier, and the watermark signature similarity calculation module in the watermark detection module, which outputs the final detection score used to identify watermarks or backdoor-triggered behavior.

[0070] As shown in Figure 4, when the watermark detection module detects a watermark anomaly or a backdoor trigger signal, the response and protection module immediately interrupts the current strategy execution process, activates the security rollback strategy, switches the embodied intelligent agent to the restricted operation mode, records the current task context and abnormal trajectory data, and freezes the online update permissions of the suspicious strategy model.

[0071] Specifically, the entire response process follows a clear state machine logic: the system starts from the normal execution state, enters the suspicious state after the detector identifies the suspicious trajectory, and if the anomaly persists or the confidence level exceeds the threshold, it is upgraded to the intervention state, triggering a series of proactive protection actions; after the risk is eliminated, the system enters the recovery state and finally returns to normal operation.
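
The state machine logic described above can be sketched, in a non-limiting way, as follows. The four states follow the text; the two thresholds, the `patience` counter used to model a persistent anomaly, the `risk_cleared` flag, and the return from the suspicious state to normal when the score drops are assumptions added for this example.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    SUSPICIOUS = "suspicious"
    INTERVENTION = "intervention"
    RECOVERY = "recovery"


class ResponseStateMachine:
    def __init__(self, suspicious_threshold=0.5, intervention_threshold=0.8,
                 patience=3):
        self.state = State.NORMAL
        self.suspicious_threshold = suspicious_threshold
        self.intervention_threshold = intervention_threshold
        self.patience = patience  # suspicious steps tolerated before escalating
        self.suspicious_steps = 0

    def update(self, score, risk_cleared=False):
        """Advance one step given the latest detection score."""
        if self.state is State.NORMAL:
            if score > self.suspicious_threshold:
                self.state = State.SUSPICIOUS
                self.suspicious_steps = 1
        elif self.state is State.SUSPICIOUS:
            if score > self.intervention_threshold:
                self.state = State.INTERVENTION
            elif score > self.suspicious_threshold:
                self.suspicious_steps += 1
                if self.suspicious_steps >= self.patience:
                    self.state = State.INTERVENTION
            else:
                self.state = State.NORMAL  # assumed: anomaly did not persist
                self.suspicious_steps = 0
        elif self.state is State.INTERVENTION:
            if risk_cleared:
                self.state = State.RECOVERY
        elif self.state is State.RECOVERY:
            self.state = State.NORMAL
            self.suspicious_steps = 0
        return self.state
```

In this sketch the intervention state is left only on an explicit `risk_cleared` signal, never on a low score alone, so a backdoor that goes quiet after triggering cannot talk the system back into normal operation.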

[0072] At the immediate response level, the system will immediately switch to a preset security policy. While this policy may not be optimal for the task, it is highly conservative and reversible, ensuring that system stability is prioritized in uncertain environments. It can also perform operations such as safe shutdown, obstacle avoidance, or return to the origin to prevent physical damage. Simultaneously, the system automatically records and reports complete anomaly trajectories, status snapshots, detection evidence, and timestamps for analysis by security platforms or operations personnel.
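The recorded evidence could, for example, be packaged as a single serializable record. The following Python sketch is one possible shape for such a record; the class name `IncidentReport` and its field names are hypothetical and chosen only to mirror the items listed above (anomaly trajectory, status snapshot, detection evidence, and timestamp).

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class IncidentReport:
    # Hypothetical field names mirroring the items listed in the description.
    trajectory: List[Dict[str, Any]]   # anomalous state/action samples
    state_snapshot: Dict[str, Any]     # agent status when the alarm fired
    evidence: Dict[str, float]         # detector scores behind the decision
    timestamp: float = field(default_factory=time.time)

    def to_json(self) -> str:
        """Serialize the report for a security platform or operations staff."""
        return json.dumps(asdict(self), sort_keys=True)


if __name__ == "__main__":
    report = IncidentReport(
        trajectory=[{"t": 0, "action": [0.1, -0.2]}],
        state_snapshot={"mode": "restricted"},
        evidence={"final_score": 0.8},
    )
    print(report.to_json())
```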

[0073] In the subsequent processing phase, the system provides multiple model-level repair methods. If a trusted historical version exists, it directly rolls back to the verified model snapshot; if no suitable snapshot exists, trusted data is used to fine-tune the current strategy online or perform knowledge distillation to cover or dilute potential backdoor behaviors. With white-box access privileges, differential analysis can also be performed on the model to locate, remove, or repair tampered parameter regions. Furthermore, to enhance long-term security, the system supports dynamically updating watermark keys and detector configurations, effectively preventing attackers from bypassing subsequent rounds of detection through reverse engineering or imitation.

[0074] In summary, this invention achieves covert watermark injection without affecting task performance through a watermark generation module and a watermark injection module, constructing a verifiable behavioral identity identifier; by fusing trajectory data of the embodied intelligent agent through a trajectory acquisition module and a feature extraction module, it effectively suppresses the interference of environmental noise on watermark extraction; the watermark detection module can complete anomaly detection relying only on externally observable trajectories, making it suitable for black-box deployment scenarios; the response and protection module realizes closed-loop control from detection to response, ensuring physical security while providing a complete chain of evidence for subsequent auditing.

[0075] The above description represents the preferred embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principles of the present invention, and these improvements and modifications should also be considered within the scope of protection of the present invention.

Claims

1. An embodied intelligent agent backdoor detection and protection system based on behavior trajectory watermarking, characterized in that the system includes: a watermark generation module, used to generate structured watermark elements, which include a specific action sequence for short-term action perturbation, a feature vector for the latent space signature, and a time window or environmental state that specifies the watermark activation condition; a watermark injection module, used to embed the watermark elements generated by the watermark generation module into the policy of the embodied intelligent agent in a sparse and low-interference manner; a trajectory acquisition module, used to acquire the trajectory data sequence of the embodied intelligent agent during operation, the trajectory data including the real-time state sequence, action sequence, and timestamp information of the embodied intelligent agent; a feature extraction module, used to segment the collected trajectory data sequence and process each trajectory segment into a low-dimensional, structured feature vector; and a watermark detection module, used to analyze watermark features and abnormal behavior in the trajectory data.

2. The embodied intelligent agent backdoor detection and protection system based on behavior trajectory watermarking according to claim 1, wherein the watermark injection module embeds the watermark elements into the policy of the embodied intelligent agent in a sparse and low-interference manner, and the watermark injection methods include: short-term action perturbation watermarking, in which, at sparse time points that meet the triggering conditions, a behavior wrapper replaces the original action with a preset watermark action sequence with a certain probability, so as to achieve explicit and controllable watermark embedding; latent space signature watermarking, in which, during policy training, a regularization term added to the loss function guides the model to generate a watermarked representation of a specific input in the latent variable space; and a condition-triggered injection mechanism, which binds the activation of the watermark to a preset environment state, and only exhibits the watermark behavior and records the trajectory when the preset environment and preset conditions match.

3. The embodied intelligent agent backdoor detection and protection system based on behavior trajectory watermarking according to claim 1, wherein the feature extraction module segments the collected trajectory data sequence and processes each trajectory segment to convert it into a low-dimensional, structured feature vector, specifically including the following steps: Segmentation: The continuous trajectory is segmented according to the preset window length and sliding step size to obtain the trajectory segment to be processed; Action statistical feature extraction: Calculate the mean, variance, kurtosis, and skewness of the action sequence within the window, calculate the action entropy, and calculate the dynamic time warping distance between it and the baseline action template to obtain the action statistical feature vector; State transition feature extraction: Calculate path curvature, extract velocity statistical features, and detect the path curvature of interactive events to obtain the state transition feature vector; Temporal coding feature extraction: The state-action sequence is encoded by a temporal encoder, and the labeled output vector is taken as the temporal embedding to obtain the temporal embedding vector; The above action statistical feature vector, state transition feature vector, and temporal embedding vector are concatenated to obtain preliminary fused features. Then, the preliminary fused features are subjected to layer normalization to obtain multimodal feature vectors.

4. The embodied intelligent agent backdoor detection and protection system based on behavior trajectory watermarking according to claim 1, wherein the watermark detection module comprises: The statistical testing module is used to calculate the difference between the baseline trajectory distribution and the trajectory to be detected, and to assess the similarity between the two in order to identify abnormal motion distributions or trajectory patterns. A machine learning classifier, based on feature vectors obtained from a feature extractor, trains a binary classification model to distinguish between normal trajectories and triggered trajectories; The watermark signature similarity calculation module is used to calculate the similarity between the feature vector of a trajectory suspected of containing a watermark and a known watermark signature in order to confirm whether a watermark exists.

5. The embodied intelligent agent backdoor detection and protection system based on behavior trajectory watermarking according to claim 1, wherein the watermark detection module analyzes watermark features and abnormal behaviors in the trajectory data through the following steps: calculating the KL divergence between the current trajectory feature distribution and the baseline normal trajectory distribution, and marking the trajectory as a statistical anomaly if the divergence exceeds the threshold; training a binary classification model using historically collected normal trajectories and known backdoor-triggered trajectory samples, where the input is the multimodal feature vector and the output is a confidence score representing the probability that the trajectory is abnormal; calculating the cosine similarity between the multimodal feature vector and the pre-stored watermark signature to determine whether a valid watermark has been detected; and using a weighted summation method to integrate the KL divergence, the confidence score, and the cosine similarity into a final detection score, which is used to determine whether a backdoor trigger or watermark anomaly is detected.

6. The embodied intelligent agent backdoor detection and protection system based on behavior trajectory watermarking according to claim 1, wherein the system also includes a response and protection module, which, when the watermark detection module detects a watermark anomaly or a backdoor trigger signal, is used to immediately interrupt the current policy execution process, activate the security rollback policy, switch the embodied intelligent agent to a restricted operation mode, record the current task context and abnormal trajectory data, and freeze the online update permissions of the suspicious policy model.