Robot control model training method, motion control method, device and equipment

By using a fusion architecture of event camera and spiking neural network model, student models are trained to match high frame rate perception and joint control cycles. This solves the problems of high energy consumption and performance degradation of traditional sensors under complex terrain and extreme lighting, and achieves low-energy, high-efficiency parkour skill transfer and terrain adaptability.

CN122185192A · Pending · Publication Date: 2026-06-12 · BEIJING HUMANOID ROBOTICS INNOVATION CENTER CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING HUMANOID ROBOTICS INNOVATION CENTER CO LTD
Filing Date
2026-03-23
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

When existing quadruped robots are used for motion control in complex terrain and extreme lighting conditions, traditional sensors have low frame rates and high power consumption, making it difficult to meet the deployment requirements of complex outdoor environments.

Method used

A fusion architecture combining an event camera and a spiking neural network model is adopted. A student model is trained on brightness change event data and proprioceptive information acquired in a preset simulation environment. A loss value is calculated against the prediction results of the teacher model and used to optimize the student model, so that high frame rate perception matches the joint control cycle, energy consumption is reduced, and environmental adaptability is improved.

Benefits of technology

Significantly reduces energy consumption, maintains terrain perception accuracy, improves the success rate of parkour skills under different lighting conditions, and ensures parkour performance in complex terrain.

Generated by Eureka AI based on patent content.

Abstract

The application provides a robot control model training method, a motion control method, a device and equipment, and relates to the technical field of robot control. The method comprises the following steps: a simulation robot deployed with a teacher model is used to move in a preset simulation environment, first brightness change event data, first proprioceptive information and a prediction result output by the teacher model are obtained, the prediction result comprises a first predicted joint action and a first predicted yaw angle instruction; a student model to be trained is used to output a second predicted joint action and a second predicted yaw angle instruction according to the first brightness change event data and the first proprioceptive information; a first loss value is calculated according to the first predicted joint action and the second predicted joint action, the first predicted yaw angle instruction and the second predicted yaw angle instruction; and the student model is trained according to the first loss value to obtain a robot control model. The application can reduce the energy consumption of the robot while improving the environmental adaptability and motion agility.

Description

Technical Field

[0001] This application relates to the field of robot control technology, and more specifically, to a robot control model training method, motion control method, device, and equipment.

Background Technology

[0002] The quadruped robot parkour mission requires agile motion control under complex terrain (steps, ditches, obstacles) and extreme lighting conditions (strong light, weak light).

[0003] Existing technologies mainly rely on traditional sensors such as depth cameras and LiDAR, combined with artificial neural networks (ANNs) for perception and control. While this can accomplish basic parkour movements, traditional sensors suffer from low frame rates (mismatched with joint control cycles) and sensitivity to lighting (performance degradation under extreme conditions). In addition, ANN models have high computational demands and energy consumption, which limits the robot's endurance and makes it difficult to meet the deployment requirements of complex outdoor environments.

Summary of the Invention

[0004] The purpose of this application is to address the shortcomings of the prior art by providing a robot control model training method, motion control method, device, and equipment, so as to reduce robot energy consumption while improving environmental adaptability and motion agility.

[0005] To achieve the above objectives, the technical solutions adopted in the embodiments of this application are as follows: In a first aspect, embodiments of this application provide a method for training a robot control model based on an event camera, the method comprising: moving a simulated robot on which a pre-trained teacher model is deployed in a preset simulation environment, and acquiring first brightness change event data of the preset simulation environment collected by an event camera simulation model, first proprioceptive information of the simulated robot, and a prediction result output by the teacher model, wherein the prediction result is output by the teacher model based on terrain scan point information, the first proprioceptive information, and a target yaw angle, and includes a first predicted joint action and a first predicted yaw angle command; using a student model to be trained to output a second predicted joint action and a second predicted yaw angle command based on the first brightness change event data and the first proprioceptive information; calculating a first loss value based on the first predicted joint action and the second predicted joint action, and on the first predicted yaw angle command and the second predicted yaw angle command; and training the student model based on the first loss value to obtain the robot control model.

[0006] Optionally, the teacher model includes an artificial neural network encoder and an artificial neural network actuator, and the process by which the teacher model outputs the prediction result includes: using the artificial neural network encoder to generate a terrain depth latent feature and the first predicted yaw angle command based on the terrain scan point information; and using the artificial neural network actuator to generate the first predicted joint action based on the terrain depth latent feature, the first proprioceptive information, and the target yaw angle.

[0007] Optionally, the student model includes a spiking neural network encoder and a spiking neural network actuator. Using the student model to be trained to output the second predicted joint action and the second predicted yaw angle command based on the first brightness change event data and the first proprioceptive information includes: using the spiking neural network encoder to generate a brightness change depth latent feature and the second predicted yaw angle command based on the first brightness change event data; and using the spiking neural network actuator to output the second predicted joint action based on the brightness change depth latent feature, the second predicted yaw angle command, and the first proprioceptive information.

[0008] Optionally, acquiring the first brightness change event data of the preset simulation environment collected by the event camera simulation model includes: acquiring the pixel brightness change of each pixel across multiple consecutive depth image frames of the preset simulation environment captured by the event camera simulation model; and generating the first brightness change event data based on the pixel brightness change of each pixel and a preset contrast threshold. The first brightness change event data includes: pixel coordinates, brightness change timestamp, and brightness change polarity.
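The event generation step described above can be illustrated with a minimal sketch. It assumes a per-pixel reference value that is updated only when an event fires, and it applies the threshold to raw depth values. The function name `generate_events`, the tuple layout, and the threshold value are illustrative assumptions and are not taken from the application.

```python
import numpy as np

def generate_events(frames, timestamps, contrast_threshold=0.1):
    """Emit (x, y, t, polarity) events from consecutive depth frames.

    Hypothetical helper: each pixel keeps a reference value and fires an
    event when the change relative to it reaches the contrast threshold.
    """
    frames = np.asarray(frames, dtype=np.float64)
    reference = frames[0].copy()  # value at which each pixel last fired
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        diff = frame - reference
        fired = np.abs(diff) >= contrast_threshold
        ys, xs = np.nonzero(fired)
        for x, y in zip(xs, ys):
            polarity = 1 if diff[y, x] > 0 else -1
            events.append((int(x), int(y), float(t), polarity))
        # Only pixels that fired update their reference value.
        reference[fired] = frame[fired]
    return events

# Three 2x2 depth frames: one pixel brightens, then another darkens.
frames = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[1.0, 1.5], [1.0, 1.0]],
    [[0.7, 1.5], [1.0, 1.0]],
]
events = generate_events(frames, timestamps=[0.0, 0.01, 0.02],
                         contrast_threshold=0.2)
print(events)
```

Each event carries the three fields named in the text: pixel coordinates, timestamp, and polarity. Pixels whose value does not change by at least the threshold produce no output, which is why the event density follows scene dynamics.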

[0009] Optionally, training the student model based on the first loss value to obtain the robot control model includes: The student model is trained based on the first loss value to obtain a preliminary distillation model; A simulated robot equipped with the preliminary distillation model moves in the preset simulation environment to acquire second brightness change event data and second proprioceptive information. Using the teacher model, based on the terrain scan point information, the second proprioceptive information, and the target yaw angle, the third predicted joint action and the third predicted yaw angle command of the simulation robot are output; Using the aforementioned preliminary distillation model, based on the second brightness change event data and the second proprioceptive information, the fourth predicted joint action and the fourth predicted yaw angle command of the simulated robot are output; The second loss value is calculated based on the third predicted joint action and the fourth predicted joint action, the third predicted yaw angle command and the fourth predicted yaw angle command; The robot control model is obtained by training the preliminary distillation model based on the second loss value.

[0010] Optionally, the neuron model of the spiking neural network encoder and the spiking neural network actuator is an integrate-and-fire neuron model.
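As a rough illustration of an integrate-and-fire neuron, the sketch below accumulates input into a membrane potential and emits a spike when a threshold is reached. The non-leaky dynamics, the hard reset to zero, and the threshold of 1.0 are assumptions made for the example; the application does not specify these details.

```python
def integrate_and_fire(inputs, threshold=1.0, v_reset=0.0):
    """Simulate one non-leaky integrate-and-fire neuron over discrete steps."""
    potential = 0.0
    spikes = []
    for current in inputs:
        potential += current          # integrate the input current
        if potential >= threshold:    # fire once the threshold is reached
            spikes.append(1)
            potential = v_reset       # hard reset (an assumption)
        else:
            spikes.append(0)
    return spikes

spikes = integrate_and_fire([0.5, 0.5, 0.25, 0.25, 1.0])
print(spikes)  # [0, 1, 0, 0, 1]
```

Because the neuron outputs only binary spikes, downstream layers can replace multiply-accumulate operations with accumulations, which is the usual argument for the lower energy consumption of spiking networks.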

[0011] Secondly, embodiments of this application also provide a motion control method for a quadruped robot, the method comprising: acquiring an environmental event stream through an event camera deployed on the quadruped robot; using a robot control model to generate control commands based on the environmental event stream, wherein the robot control model is trained using the robot control model training method described in any of the first aspects; and controlling the quadruped robot to move according to the control commands.

[0012] Thirdly, embodiments of this application also provide a robot control model training device based on an event camera, the device comprising: The teacher model prediction module is used to move a simulated robot with a pre-trained teacher model deployed in a preset simulation environment, and to acquire the first brightness change event data of the preset simulation environment collected by the event camera simulation model, the first proprioceptive information of the simulated robot, and the prediction result output by the teacher model. The prediction result is output by the teacher model based on the terrain scan point information, the first proprioceptive information, and the target yaw angle. The prediction result includes: the first predicted joint action and the first predicted yaw angle command. The student model prediction module is used to output a second predicted joint action and a second predicted yaw angle command based on the first brightness change event data and the first proprioceptive information using the student model to be trained. The loss value calculation module is used to calculate a first loss value based on the first predicted joint action and the second predicted joint action, the first predicted yaw angle command and the second predicted yaw angle command; The model training module is used to train the student model based on the first loss value to obtain the robot control model.

[0013] Optionally, the teacher model includes an artificial neural network encoder and an artificial neural network actuator. The teacher model prediction module is specifically used to use the artificial neural network encoder to generate a terrain depth latent feature and the first predicted yaw angle command based on the terrain scan point information; and to use the artificial neural network actuator to generate the first predicted joint action based on the terrain depth latent feature, the first proprioceptive information, and the target yaw angle.

[0014] Optionally, the student model includes a spiking neural network encoder and a spiking neural network actuator. The student model prediction module is specifically used to use the spiking neural network encoder to generate a brightness change depth latent feature and a second predicted yaw angle command based on the first brightness change event data; and to use the spiking neural network actuator to output the second predicted joint action based on the brightness change depth latent feature, the second predicted yaw angle command, and the first proprioceptive information.

[0015] Optionally, the device further includes: The data acquisition module is used to acquire the pixel brightness change of each pixel in a series of depth image frames captured by the event camera simulation model in the preset simulation environment; and to generate first brightness change event data based on the pixel brightness change of each pixel and a preset contrast threshold. The first brightness change event data includes: pixel coordinates, brightness change timestamp, and brightness change polarity.

[0016] Optionally, the model training module is specifically used to train the student model based on the first loss value to obtain a preliminary distillation model; The data acquisition module is also used to move a simulation robot with the preliminary distillation model deployed in the preset simulation environment to acquire second brightness change event data and second proprioceptive information. The teacher model prediction module is also used to use the teacher model to output the third predicted joint action and the third predicted yaw angle command of the simulation robot based on the terrain scan point information, the second proprioceptive information and the target yaw angle. The student model prediction module is also used to use the preliminary distillation model to output the fourth predicted joint action and the fourth predicted yaw angle command of the simulation robot based on the second brightness change event data and the second proprioceptive information. The loss value calculation module is used to calculate the second loss value based on the third predicted joint action and the fourth predicted joint action, the third predicted yaw angle command and the fourth predicted yaw angle command; The model training module is further configured to train the preliminary distillation model based on the second loss value to obtain the robot control model.

[0017] Optionally, the neuron model of the spiking neural network encoder and the spiking neural network actuator is an integrate-and-fire neuron model.

[0018] Fourthly, embodiments of this application also provide a motion control device for a quadruped robot, the device comprising: The event stream acquisition module is used to acquire environmental event streams through an event camera deployed on the quadruped robot; The instruction generation module is used to generate control instructions based on the environmental event flow using a robot control model, wherein the robot control model is trained using the robot control model training method described in any of the first aspects. The motion control module is used to control the quadruped robot to move according to the control commands.

[0019] Fifthly, embodiments of this application also provide an electronic device, including: a processor, a storage medium, and a bus, wherein the storage medium stores program instructions executable by the processor, and when the electronic device is running, the processor communicates with the storage medium via the bus, and the processor executes the program instructions to perform the steps of the robot control model training method as described in any of the first aspects, or the steps of the motion control method for a quadruped robot as described in the second aspect.

[0020] Sixthly, embodiments of this application also provide a computer-readable storage medium storing a computer program, which, when executed by a processor, performs the steps of the robot control model training method as described in any of the first aspects, or the steps of the motion control method for a quadruped robot as described in the second aspect.

[0021] The beneficial effects of this application are: The robot control model training method, motion control method, device and equipment provided in this application adopt a fusion architecture of an event camera and a spiking neural network student model. Its energy consumption is significantly lower than that of an artificial neural network model. It supports kilohertz-level high frame rate perception that matches the joint control cycle, and it maintains terrain perception accuracy under different lighting conditions. Through distillation learning, parkour skills are transferred efficiently from the artificial neural network model to the spiking neural network model, ensuring the success rate of parkour in complex terrain.

Attached Figure Description

[0022] To more clearly illustrate the technical solutions of the embodiments of this application, the accompanying drawings used in the embodiments will be briefly introduced below. It should be understood that the following drawings only show some embodiments of this application and should not be regarded as a limitation of the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.

[0023] Figure 1 is a first flowchart of the robot control model training method based on an event camera provided in an embodiment of this application;
Figure 2 is an architecture diagram of the robot control model training system provided in an embodiment of this application;
Figure 3 is a second flowchart of the robot control model training method based on an event camera provided in an embodiment of this application;
Figure 4 is a third flowchart of the robot control model training method based on an event camera provided in an embodiment of this application;
Figure 5 is a fourth flowchart of the robot control model training method based on an event camera provided in an embodiment of this application;
Figure 6 is a fifth flowchart of the robot control model training method based on an event camera provided in an embodiment of this application;
Figure 7 is a schematic flowchart of the motion control method for a quadruped robot provided in an embodiment of this application;
Figure 8 is a schematic diagram of the structure of a robot control model training device based on an event camera provided in an embodiment of this application;
Figure 9 is a schematic diagram of the motion control device for a quadruped robot provided in an embodiment of this application;
Figure 10 is a schematic diagram of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0024] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are some embodiments of this application, but not all embodiments.

[0025] Therefore, the following detailed description of the embodiments of this application provided in the accompanying drawings is not intended to limit the scope of the claimed application, but merely to illustrate selected embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of this application without inventive effort are within the scope of protection of this application.

[0026] Furthermore, the terms "first," "second," etc., used in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of this application described herein can be implemented in orders other than those illustrated or described herein. Additionally, the terms "comprising" and "having," and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.

[0027] It should be noted that, where there is no conflict, the features in the embodiments of this application can be combined with each other.

[0028] Figure 1 is a first flowchart of the robot control model training method based on an event camera provided in an embodiment of this application. As shown in Figure 1, the method may include: S101. Move a simulated robot on which a pre-trained teacher model is deployed in a preset simulation environment, and acquire the first brightness change event data of the preset simulation environment collected by the event camera simulation model, the first proprioceptive information of the simulated robot, and the prediction result output by the teacher model. The prediction result is output by the teacher model based on the terrain scan point information, the first proprioceptive information, and the target yaw angle. The prediction result includes: the first predicted joint action and the first predicted yaw angle command.

[0029] In this embodiment, a simulated quadruped robot is deployed in a high-fidelity simulation environment built on a simulation platform, such as the IsaacGym high-performance robot simulation environment. The simulated robot is equipped with a virtual LiDAR and a terrain scanning module to acquire terrain scan point information in the form of a three-dimensional point cloud in real time. It is also equipped with a virtual inertial measurement unit and joint encoders that continuously output the first proprioceptive information of the simulated robot during motion. This first proprioceptive information may include: six-degree-of-freedom body attitude angles, joint angles, joint angular velocities, etc.

[0030] The simulation environment can include dynamic lighting changes, unstructured terrain, and target heading constraints. Unstructured terrain can include, for example, randomly distributed step heights, ravine widths, sloping ramps, and discrete obstacles.

[0031] In this simulation environment, a reinforcement learning-based approach is adopted, using terrain scan points and target yaw angle as privileged information. The teacher model is trained in multiple environments including steps, ditches, obstacles, and parkour terrain, and outputs joint movements and yaw angle commands. The reward function design of reinforcement learning focuses on parkour completion, movement stability, and terrain adaptability, guiding the teacher model to learn the ability to traverse complex terrain.

[0032] A teacher model, fully trained through discrete reinforcement learning, is deployed on the simulated robot. This model is an end-to-end artificial neural network (ANN). The input layer of the teacher model receives three types of data: a terrain scan point cloud processed by voxel meshing, normalized first proprioceptive information, and a user-specified target yaw angle, which is the angular deviation between the desired orientation and the global north direction. The output layer of the teacher model directly regresses to generate two control variables: a first predicted joint action and a first predicted yaw angle command. The first predicted joint action can include the target angles of each joint, and the first predicted yaw angle command is the yaw increment that the simulated robot needs to adjust in the next control cycle.

[0033] While the simulated robot on which the teacher model is deployed runs in the simulation environment and the teacher model outputs the first predicted joint action and the first predicted yaw angle command based on the terrain scan point cloud, the first proprioceptive information, and the target yaw angle, the event camera simulation model deployed on the simulated robot is invoked synchronously. It performs pixel-level brightness change modeling on each depth image frame of the simulation environment to generate the first brightness change event data. The brightness change events recorded in this data are asynchronous, simulating the asynchronous output of a real event camera, and the event density adapts to the dynamics of the scene. All data, including the first brightness change event data, the first proprioceptive information, and the first predicted joint action and first predicted yaw angle command output by the teacher model, are aligned to the same timestamps and stored in a circular buffer for use in subsequent steps. The sampling frequency of the event camera simulation model is 10 Hz.
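A timestamp-aligned circular buffer of this kind could be sketched as follows. The `Sample` record, its field names, and the buffer capacity are hypothetical and chosen only for illustration; the application does not describe the buffer layout.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Sample:
    """One timestamp-aligned training record (hypothetical layout)."""
    timestamp: float
    events: list                 # brightness change events at this step
    proprio: list                # proprioceptive information
    teacher_joint_action: list   # first predicted joint action
    teacher_yaw_command: float   # first predicted yaw angle command

class CircularBuffer:
    """Fixed-capacity buffer that overwrites the oldest sample when full."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)

    def push(self, sample):
        self._items.append(sample)

    def __len__(self):
        return len(self._items)

    def oldest(self):
        return self._items[0]

    def latest(self):
        return self._items[-1]

buffer = CircularBuffer(capacity=3)
for step in range(5):
    buffer.push(Sample(timestamp=float(step), events=[], proprio=[0.0],
                       teacher_joint_action=[0.0], teacher_yaw_command=0.0))
print(len(buffer), buffer.oldest().timestamp, buffer.latest().timestamp)
```

Storing the teacher outputs in the same record as the student inputs keeps each distillation target paired with exactly the observation it was computed for.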

[0034] In some embodiments, multiple simulation environments can be run in parallel, each covering different terrains and lighting conditions. A simulated robot is deployed in each of the simulation environments, and the teacher model is trained jointly on the data from all of the simulated robots. After training, the simulated robots on which the teacher model is deployed run in the multiple simulation environments to collect multiple sets of data. Each set of data includes: first brightness change event data, first proprioceptive information, a first predicted joint action, and a first predicted yaw angle command.

[0035] It should be noted that during the training of the teacher model, the terrain scan point information of the simulation environment is treated as privileged information. The global terrain scan points of the simulation environment can be directly input into the teacher model, without needing to collect terrain scan point information within the detection range frame by frame based on the robot's movement within the simulation environment. After the teacher model is trained and deployed on the robot, the terrain scan point information used by the teacher model to output the first predicted joint action and the first predicted yaw angle command can be information collected frame by frame within the detection range during the robot's movement.

[0036] S102. Using the student model to be trained, based on the first brightness change event data and the first proprioceptive information, output the second predicted joint action and the second predicted yaw angle command.

[0037] In this embodiment, after data acquisition is completed in S101, the cached first brightness change event data and the synchronously acquired first proprioceptive information are input together into a student model to be trained in an initial state. The student model is a spiking neural network (SNN) model, and its input interface supports joint encoding of asynchronous event streams and dense sensing vectors.

[0038] After extracting features from the first brightness change event data and the first proprioceptive information, the two types of features are concatenated in the temporal dimension and then fed into the multilayer perceptron backbone network of the student model. The output layer of the student model directly regresses to generate two control variables, namely the second predicted joint action and the second predicted yaw angle command.

[0039] In some embodiments, the number of spike time steps of the SNN is 4, which can match the joint control cycle of the robot.

[0040] S103. Calculate the first loss value based on the first predicted joint action and the second predicted joint action, the first predicted yaw angle command and the second predicted yaw angle command.

[0041] In this embodiment, the first loss value is calculated from the action loss between the first predicted joint action and the second predicted joint action, and the yaw angle loss between the first predicted yaw angle command and the second predicted yaw angle command.

[0042] In some embodiments, if multiple simulated robots in multiple simulation scenarios are used, the mean squared error between the first predicted joint actions and the second predicted joint actions of the multiple simulated robots can be used as the action loss, and the mean squared error between the first predicted yaw angle commands and the second predicted yaw angle commands of the multiple simulated robots can be used as the yaw angle loss.

[0043] For example, the action loss and the yaw angle loss can be expressed as:

[0044]

[0045] Where n is the number of joints of the simulated robot, and m is the number of simulated robots.

[0046] Furthermore, the first loss value can be calculated as the weighted sum of the action loss and the yaw angle loss.
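One possible reading of this loss is sketched below. It assumes that both terms are mean squared errors, averaged over m simulated robots and n joints for the action loss and over m robots for the yaw angle loss. The function name and the weights are illustrative; the application's own formula is not reproduced here.

```python
import numpy as np

def distillation_loss(teacher_actions, student_actions,
                      teacher_yaw, student_yaw,
                      action_weight=1.0, yaw_weight=1.0):
    """Weighted sum of an action loss and a yaw angle loss (both assumed MSE).

    teacher_actions, student_actions: shape (m, n), m robots and n joints.
    teacher_yaw, student_yaw: shape (m,), one yaw command per robot.
    """
    teacher_actions = np.asarray(teacher_actions, dtype=np.float64)
    student_actions = np.asarray(student_actions, dtype=np.float64)
    teacher_yaw = np.asarray(teacher_yaw, dtype=np.float64)
    student_yaw = np.asarray(student_yaw, dtype=np.float64)

    action_loss = np.mean((teacher_actions - student_actions) ** 2)
    yaw_loss = np.mean((teacher_yaw - student_yaw) ** 2)
    return float(action_weight * action_loss + yaw_weight * yaw_loss)

# Two robots (m = 2), two joints each (n = 2).
loss = distillation_loss(
    teacher_actions=[[0.0, 0.0], [1.0, 1.0]],
    student_actions=[[0.0, 1.0], [1.0, 1.0]],
    teacher_yaw=[0.0, 0.5],
    student_yaw=[0.0, 0.0],
    action_weight=1.0,
    yaw_weight=2.0,
)
print(loss)  # 0.25 (action) + 2.0 * 0.125 (yaw) = 0.5
```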

[0047] S104. Train the student model based on the first loss value to obtain the robot control model.

[0048] In this embodiment, the parameters of the student model are optimized based on the first loss value, and after multiple rounds of training, the parameters of the student model converge or the number of training rounds reaches a preset number, thus obtaining the robot control model.
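The two stopping conditions (convergence or a preset number of rounds) can be illustrated with a toy loop. The sketch substitutes a linear student trained by plain gradient descent for the spiking network, so it shows only the control flow. The learning rate, tolerance, and all names are assumptions.

```python
import numpy as np

def train_student(inputs, teacher_outputs, lr=0.1, max_rounds=500, tol=1e-8):
    """Fit a linear stand-in student to teacher outputs.

    Stops when the loss change falls below `tol` (convergence) or when
    `max_rounds` is reached (preset number of training rounds).
    """
    weights = np.zeros((inputs.shape[1], teacher_outputs.shape[1]))
    previous_loss = np.inf
    loss = np.inf
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        error = inputs @ weights - teacher_outputs
        loss = float(np.mean(error ** 2))
        if abs(previous_loss - loss) < tol:
            break
        gradient = 2.0 * inputs.T @ error / error.size
        weights -= lr * gradient
        previous_loss = loss
    return weights, loss, rounds

rng = np.random.default_rng(0)
observations = rng.standard_normal((64, 3))
true_weights = np.array([[1.0, -1.0], [0.5, 0.0], [0.0, 2.0]])
teacher_targets = observations @ true_weights
weights, final_loss, rounds = train_student(observations, teacher_targets)
print(rounds, final_loss)
```

Training an actual spiking network would additionally require a way to differentiate through the spike function, for example a surrogate gradient, which this sketch does not cover.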

[0049] The robot control model training method based on an event camera provided in the above embodiments adopts a fusion architecture of an event camera and a spiking neural network student model. Its energy consumption is significantly lower than that of an artificial neural network model. It supports kilohertz-level high frame rate perception that matches the joint control cycle, maintains terrain perception accuracy under different lighting conditions, and, through distillation learning, transfers parkour skills efficiently from the artificial neural network model to the spiking neural network model, ensuring the success rate of parkour in complex terrain.

[0050] In one possible implementation, Figure 2 is an architecture diagram of the robot control model training system provided in an embodiment of this application. As shown in Figure 2, the teacher model may include an artificial neural network encoder (ANN Encoder) and an artificial neural network actuator (ANN Actor).

[0051] Figure 3 is a second flowchart of the robot control model training method based on an event camera provided in an embodiment of this application. As shown in Figure 3, the process by which the teacher model outputs the prediction result in S101 may include: S201. An artificial neural network encoder is used to generate a terrain depth latent feature and the first predicted yaw angle command based on the terrain scan point information.

[0052] In this embodiment, as shown in Figure 2, the terrain scan point information is input into the artificial neural network encoder, which extracts features from the terrain scan point information to generate the terrain depth latent feature.

[0053] The terrain depth latent features generated by the artificial neural network encoder are further compressed and abstracted by the first fully connected layer Linear, and then mapped to the yaw angle representation space by the activation function and the second fully connected layer Linear, outputting the first predicted yaw angle command.

[0054] S202. An artificial neural network actuator is used to generate the first predicted joint action based on the terrain depth latent feature, the first proprioceptive information and the target yaw angle.

[0055] In this embodiment, the terrain depth latent feature, the first proprioceptive information and the target yaw angle are fused, and the fused features are input into the artificial neural network actuator. The artificial neural network actuator outputs the first predicted joint action based on the fused features.
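The data flow of S201 and S202 can be sketched with randomly initialized weights. All dimensions (latent size 32, proprioceptive size 48, 12 joints), the ReLU activation, the use of concatenation as the fusion step, and the single linear layer standing in for the actuator are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, PROPRIO_DIM, NUM_JOINTS = 32, 48, 12  # illustrative sizes

def make_linear(in_dim, out_dim):
    """Return (weight, bias) for a fully connected layer."""
    return rng.standard_normal((in_dim, out_dim)) * 0.1, np.zeros(out_dim)

w1, b1 = make_linear(LATENT_DIM, 16)   # first fully connected layer
w2, b2 = make_linear(16, 1)            # second fully connected layer
wa, ba = make_linear(LATENT_DIM + PROPRIO_DIM + 1, NUM_JOINTS)

def yaw_head(latent):
    """Linear -> activation -> Linear, mapping the latent feature to a yaw command."""
    hidden = np.maximum(latent @ w1 + b1, 0.0)  # ReLU is an assumption
    return (hidden @ w2 + b2)[0]

def actuator(latent, proprio, target_yaw):
    """Concatenate the three inputs and map them to joint targets."""
    fused = np.concatenate([latent, proprio, [target_yaw]])
    return fused @ wa + ba  # single layer standing in for the actuator

latent = rng.standard_normal(LATENT_DIM)    # would come from the encoder
proprio = rng.standard_normal(PROPRIO_DIM)
yaw_command = yaw_head(latent)
joint_action = actuator(latent, proprio, target_yaw=0.3)
print(float(yaw_command), joint_action.shape)
```

Note that the yaw command is predicted from the terrain latent feature alone, while the joint action also depends on the proprioceptive information and the target yaw angle.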

[0056] In one possible implementation, as shown in Figure 2, the student model may include a spiking neural network encoder (SNN Encoder) and a spiking neural network actuator (SNN Actor). Figure 4 is a third flowchart of the robot control model training method based on an event camera provided in an embodiment of this application. As shown in Figure 4, the process in S102 by which the student model to be trained outputs the second predicted joint action and the second predicted yaw angle command based on the first brightness change event data and the first proprioceptive information may include: S301. A spiking neural network encoder is used to generate a brightness change depth latent feature and the second predicted yaw angle command based on the first brightness change event data.

[0057] In this embodiment, as shown in Figure 2, the first brightness change event data is input into the spiking neural network encoder. The spiking neural network encoder converts the asynchronous event stream into a spike tensor on a regular spatiotemporal grid, which serves as the brightness change depth latent feature. The spiking residual network in the spiking neural network encoder extracts high-level spatiotemporal features from the event stream, and spiking neurons then integrate and accumulate these features over time. Finally, the high-level spatiotemporal features are mapped to the yaw angle through the fully connected spiking output layer, and the second predicted yaw angle command is output.
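One way to turn an asynchronous event stream into a tensor on a regular spatiotemporal grid is to bin events by time and polarity. The sketch below is only an illustration of that idea; the function name `events_to_spike_tensor`, the two-channel polarity layout, and the binary (non-counting) encoding are assumptions, not details stated in this application.

```python
import numpy as np

def events_to_spike_tensor(events, height, width, num_bins, t_start, t_end):
    """Bin (x, y, t, p) events into a binary tensor of shape
    (num_bins, 2, height, width). Channel 0 holds positive-polarity
    events and channel 1 holds negative-polarity events."""
    tensor = np.zeros((num_bins, 2, height, width), dtype=np.float32)
    if len(events) == 0:
        return tensor
    ev = np.asarray(events, dtype=np.float64)
    x = ev[:, 0].astype(int)
    y = ev[:, 1].astype(int)
    t = ev[:, 2]
    p = ev[:, 3]
    # Map each timestamp to a time-bin index in [0, num_bins - 1].
    bins = np.floor((t - t_start) / (t_end - t_start) * num_bins).astype(int)
    bins = np.clip(bins, 0, num_bins - 1)
    chan = np.where(p > 0, 0, 1)
    tensor[bins, chan, y, x] = 1.0  # mark a spike at each occupied grid cell
    return tensor
```

With `num_bins` set to the number of spike time steps, each time slice of the tensor can be fed to the spiking network at one step.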

[0058] S302. A spiking neural network actuator is used to output a second predicted joint action based on the brightness change depth latent feature, the second predicted yaw angle command and the first proprioceptive information.

[0059] In this embodiment, the brightness change depth latent feature, the second predicted yaw angle command and the first proprioceptive information are fused, and the fused features are input to the spiking neural network actuator. The spiking neural network actuator outputs the second predicted joint action based on the fused features.

[0060] In some embodiments, the spiking residual network uses spiking ResNet-18 and features are fused using a GRU module. The spiking neural network actuator uses a 3-layer spiking MLP with dimensions of 512, 256, and 128 respectively.

[0061] In one possible implementation, Figure 5 is a fourth flowchart of the robot control model training method based on an event camera provided in an embodiment of this application. As shown in Figure 5, the process of acquiring the first brightness change event data of the preset simulation environment collected by the event camera simulation model in step S101 may include: S401. Obtain the pixel brightness change amount of each pixel across multiple consecutive depth image frames of the preset simulation environment captured by the event camera simulation model.

[0062] S402. Generate first brightness change event data based on the pixel brightness change amount of each pixel and the preset contrast threshold. The first brightness change event data includes: pixel coordinates, brightness change timestamp, and brightness change polarity.

[0063] In this embodiment, in a preset simulation environment, based on the assumption of brightness constancy and Taylor expansion, the depth map of a single frame is analyzed, the brightness gradient at each pixel position is calculated, and, combined with the motion state of the simulated robot, the motion speed of each pixel on the image plane is calculated. The brightness gradient and the motion speed are multiplied to obtain the pixel brightness change rate, and the rate is integrated over time to obtain the pixel brightness change amount.
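The relation described in this paragraph can be written out as follows. This is a sketch under the standard brightness constancy convention. The symbols $L$ (pixel value of the depth map), $(u, v)$ (image-plane velocity of the pixel), and $\Delta t$ (integration interval) are notation introduced here, and the sign convention is an assumption, since the text only states that the gradient and the motion speed are multiplied.

```latex
% Brightness constancy: a moving point keeps its value
L(x + u\,\Delta t,\; y + v\,\Delta t,\; t + \Delta t) = L(x, y, t)

% A first-order Taylor expansion gives the per-pixel change rate
\frac{\partial L}{\partial t} \approx
  -\left( \frac{\partial L}{\partial x}\,u + \frac{\partial L}{\partial y}\,v \right)

% Integrating the rate over the interval gives the change amount
\Delta L(x, y) = \int_{t}^{t + \Delta t} \frac{\partial L}{\partial \tau}\,\mathrm{d}\tau
  \approx -\left( \frac{\partial L}{\partial x}\,u + \frac{\partial L}{\partial y}\,v \right)\Delta t
```

The quantity $\Delta L$ is what S402 compares against the preset contrast threshold.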

[0064] A fixed contrast threshold is preset. When the pixel brightness change amount and the contrast threshold satisfy the preset relationship, the pixel is considered to have undergone a reportable change and an event is generated. This event is a quadruple (x,y,t,p), where (x,y) are the pixel coordinates, t is the timestamp accurate to the microsecond level, and p is the polarity: +1 indicates that the brightness increases (becomes brighter), and -1 indicates that the brightness decreases (becomes darker).

[0065] The events of all pixels are arranged in timestamp order, and an asynchronous event stream is generated as the first brightness change event data.
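Steps S401 and S402 can be illustrated with a simplified frame-difference simulator. The sketch below makes several assumptions that are not stated in this application: the "preset relationship" is taken to be an absolute change of at least the threshold, each pixel keeps a reference level that is reset when it fires, and at most one event per pixel is emitted per frame. The function name `generate_events` is hypothetical.

```python
import numpy as np

def generate_events(frames, timestamps, threshold):
    """Emit (x, y, t, p) events whenever a pixel's value differs from its
    reference level by at least `threshold`. Timestamps are taken from
    `timestamps` (for example, in microseconds)."""
    frames = np.asarray(frames, dtype=np.float64)
    reference = frames[0].copy()            # per-pixel level at the last event
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        diff = frame - reference            # pixel brightness change amount
        ys, xs = np.nonzero(np.abs(diff) >= threshold)
        for y, x in zip(ys, xs):
            polarity = 1 if diff[y, x] > 0 else -1
            events.append((int(x), int(y), float(t), polarity))
            reference[y, x] = frame[y, x]   # reset the reference after firing
    events.sort(key=lambda e: e[2])         # arrange events in timestamp order
    return events
```

A real event camera reports changes between frames with microsecond timing, so a closer simulator would interpolate timestamps inside each frame interval; that refinement is omitted here.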

[0066] The robot control model training method based on event cameras provided in the above embodiments can generate event data that closely resembles real-world scenes without relying on real event camera hardware, and supports efficient training in parkour scenarios.

[0067] In one possible implementation, Figure 6 is a fifth flowchart of the robot control model training method based on an event camera provided in an embodiment of this application. As shown in Figure 6, the process of training the student model based on the first loss value to obtain the robot control model in S104 above may include: S501. Train the student model based on the first loss value to obtain the preliminary distillation model.

[0068] S502. A simulated robot on which the preliminary distillation model is deployed moves in the preset simulation environment, and second brightness change event data and second proprioceptive information are acquired.

[0069] S503. Using the teacher model, output the third predicted joint action and the third predicted yaw angle command of the simulated robot based on the terrain scan point information, the second proprioceptive information, and the target yaw angle.

[0070] S504. Using the preliminary distillation model, output the fourth predicted joint action and the fourth predicted yaw angle command of the simulated robot based on the second brightness change event data and the second proprioceptive information.

[0071] S505. Calculate the second loss value based on the third predicted joint action and the fourth predicted joint action, the third predicted yaw angle command and the fourth predicted yaw angle command.

[0072] S506. Train the preliminary distillation model based on the second loss value to obtain the robot control model.

[0073] In this embodiment, the training of the student model is divided into two stages: a warm-up stage and an environment interaction optimization stage. S101-S104 above are the warm-up stage, in which the teacher model collects data in the simulation environment and performs preliminary training on the student model based on the first loss value, so that the student model initially imitates the decision output of the teacher model.

[0074] In the environment interaction optimization stage, the preliminary distillation model trained in the warm-up stage is deployed in the simulated robot. The preliminary distillation model controls the simulated robot to collect data in the simulation environment. The collected data includes second brightness change event data and second proprioceptive information.

[0075] The specific execution methods of S503-S506 are the same as those of S101-S104, and will not be repeated here.
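The two-stage procedure (S101 to S104, then S501 to S506) can be sketched with a toy example. Everything below is a stand-in chosen for brevity: the linear teacher and student, the toy dynamics in `rollout`, the noisy observation in `observe`, and the step counts and learning rate are all hypothetical and do not represent the networks or simulator of this application. What the sketch preserves is the structure: in the warm-up stage the teacher drives the robot, in the interaction stage the student drives it, and in both stages the teacher's outputs are the training targets.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACT_DIM = 6, 3                            # hypothetical sizes

W_TEACHER = rng.normal(size=(STATE_DIM, ACT_DIM))    # frozen stand-in teacher

class LinearStudent:
    """Toy stand-in for the student model."""
    def __init__(self):
        self.w = np.zeros((STATE_DIM, ACT_DIM))

    def act(self, obs):
        return obs @ self.w

def teacher_act(state):
    return state @ W_TEACHER                         # teacher sees privileged state

def observe(state):
    # Stand-in for event data plus proprioception: a noisy view of the state.
    return state + 0.01 * rng.normal(size=state.shape)

def rollout(policy, steps=64):
    """Drive the toy dynamics with `policy` and return the visited states."""
    state, visited = rng.normal(size=STATE_DIM), []
    for _ in range(steps):
        visited.append(state)
        action = policy(state)
        state = 0.5 * state + 0.1 * np.tanh(action).mean() + rng.normal(size=STATE_DIM)
    return np.stack(visited)

def distill_step(student, states, lr=0.3):
    """One gradient step on the MSE between student and teacher outputs.
    Returns the loss measured before the update."""
    obs = observe(states)
    error = student.act(obs) - teacher_act(states)
    student.w -= lr * 2.0 * obs.T @ error / error.size
    return float(np.mean(error ** 2))

student = LinearStudent()
# Warm-up stage: the teacher controls the robot while the student imitates.
warmup_losses = [distill_step(student, rollout(teacher_act)) for _ in range(50)]
# Interaction stage: the student controls the robot; the teacher still labels.
interaction_losses = [
    distill_step(student, rollout(lambda s: student.act(observe(s))))
    for _ in range(50)
]
```

The point of the second stage is that the training data comes from states the student itself reaches, so errors that only appear under the student's own control are also corrected.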

[0076] The robot control model training method based on event cameras provided in the above embodiments continuously optimizes distillation loss by placing the student model in the same simulation environment as the teacher model, ensuring that the student model's parkour performance in complex terrain is not inferior to that of the teacher model. This embodiment eliminates the need to train the student model from scratch, rapidly transferring parkour skills through distillation, thus reducing training complexity and computational consumption.

[0077] In one possible implementation, the neuron models of the spiking neural network encoder and the spiking neural network actuator adopt an integrate-and-fire (IF) neuron model.

[0078] In this embodiment, the neurons in the spiking residual network (spiking ResNet-18) of the spiking neural network encoder and in the 3-layer spiking MLP of the spiking neural network actuator all adopt the integrate-and-fire (IF) neuron model. The membrane potential accumulates with the input signal, a spike is generated when the potential exceeds the threshold, and the potential is reset to a resting potential below the threshold after the spike.
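The behavior of a single IF neuron can be illustrated with a short simulation. This is a minimal sketch; the threshold of 1.0, the reset potential of 0.0, and the function name `if_neuron` are assumptions chosen for the example.

```python
def if_neuron(inputs, threshold=1.0, v_reset=0.0):
    """Simulate one integrate-and-fire neuron over a sequence of inputs.
    Returns a list with 1 at time steps where a spike is emitted, else 0."""
    v = v_reset
    spikes = []
    for current in inputs:
        v += current              # integrate the input; an IF neuron has no leak
        if v > threshold:         # fire when the potential exceeds the threshold
            spikes.append(1)
            v = v_reset           # reset to the resting potential after the spike
        else:
            spikes.append(0)
    return spikes

print(if_neuron([0.6, 0.6, 0.3, 0.6, 0.6]))  # prints [0, 1, 0, 0, 1]
```

Because the output is zero at most time steps, downstream layers only need to accumulate weights for the inputs that actually spiked, which is the source of the sparse computation described in this application.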

[0079] The robot control model training method based on event cameras provided in the above embodiments uses a spiking neural network model composed of spiking neurons to perform sparse transmission and computation at each layer, reducing redundant computation and achieving low-energy inference.

[0080] Based on the event camera-based robot control model training method provided in the above embodiments, this application also provides a motion control method for a quadruped robot. Figure 7 is a flowchart of the motion control method for a quadruped robot provided in an embodiment of this application. As shown in Figure 7, the method may include: S601. Collect an environmental event stream through an event camera deployed on a quadruped robot.

[0081] S602. Use a robot control model to generate control commands based on the environmental event stream.

[0082] S603. Control the quadruped robot to move according to the control command.

[0083] In this embodiment, the robot control model trained using the above-mentioned event camera-based robot control model training method is deployed in a quadruped robot, which moves in a preset task execution environment.

[0084] The environmental event stream of the preset task execution environment is acquired through the event camera deployed on the quadruped robot. Based on this stream, the robot control model outputs the joint movement commands and yaw angle commands that the quadruped robot needs to execute in the preset task execution environment. The controller then controls the quadruped robot to execute these commands and complete the parkour action.
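One control cycle of S601 to S603 can be sketched as a function that takes the camera, the model, and the robot interface as callables. The interfaces below are hypothetical: `control_step`, the stub functions, and the 12-joint command size are illustrative only and do not correspond to any real driver or to the trained model.

```python
from typing import Callable, List, Sequence, Tuple

Event = Tuple[int, int, float, int]  # (x, y, t, p), as in the event quadruple

def control_step(
    read_events: Callable[[], Sequence[Event]],
    policy: Callable[[Sequence[Event]], Tuple[List[float], float]],
    send_command: Callable[[List[float], float], None],
) -> Tuple[List[float], float]:
    events = read_events()                  # S601: collect the event stream
    joint_cmd, yaw_cmd = policy(events)     # S602: generate control commands
    send_command(joint_cmd, yaw_cmd)        # S603: drive the robot
    return joint_cmd, yaw_cmd

# Stub components, for illustration only.
sent = []

def fake_camera():
    return [(3, 4, 10.0, 1), (5, 6, 12.0, -1)]

def fake_policy(events):
    net_polarity = sum(p for _, _, _, p in events)
    return [0.0] * 12, 0.1 * net_polarity   # 12 joints is a hypothetical count

def fake_robot(joint_cmd, yaw_cmd):
    sent.append((joint_cmd, yaw_cmd))

control_step(fake_camera, fake_policy, fake_robot)
```

Passing the three components as callables keeps the loop independent of a specific camera driver or robot SDK, so the same function can run against the simulator or the physical robot.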

[0085] In one possible implementation, the training and control system of this application may include: a perception module, a computing module, a storage module, and an execution module.

[0086] The perception module includes an event camera and a proprioception sensor. The event camera is used to capture dynamic pixel brightness changes and output asynchronous event streams. The proprioception sensor is used to collect motion states such as robot joint positions and speeds. The perception module supports 10 Hz event image sampling and is adapted to kilohertz-level temporal resolution.

[0087] The computing module is equipped with a GPU and integrates an event data processing module, an SNN inference module, and a distillation learning module. It supports training in 32 parallel simulation environments and realizes end-to-end perception-control inference.

[0088] The storage module includes high-speed storage units for caching simulation environment data (terrain model, lighting parameters), ANN / SNN model parameters, event stream data, and training logs (loss curves, parkour success rate).

[0089] The execution module includes the quadruped robot body (including multi-degree-of-freedom joints and drive system), receives joint motion commands and yaw angle commands output by SNN, and executes parkour actions. The energy consumption of the joint motors is comparable to that of the ANN solution, but the environmental adaptability is better.

[0090] The experimental equipment and parameter settings used in training the robot control model are as follows: The hardware configuration includes a simulation platform, computing devices, sensor configuration, and robot model. The simulation platform can be, for example, the IsaacGym high-performance robot simulation environment, which supports 32 parallel robot simulation environments. The computing devices include GPUs, and the training time is 30 hours. The sensor configuration includes a simulated event camera (outputting an event stream of position, timestamp, and polarity) and a proprioceptive sensor (collecting joint motion states). The robot model is a quadruped robot with a multi-degree-of-freedom joint structure, adapted for parkour motion execution.

[0091] Key parameters may include, for example: Training parameters: learning rate 0.001; the SNN adopts the IF neuron model; spike time steps = 4; the distillation loss is the action MSE loss plus the yaw angle MSE loss. Network parameters: the visual backbone is spiking ResNet-18; feature fusion uses a GRU module; the actor network is a 3-layer spiking MLP. Simulation parameters: event image sampling frequency 10 Hz; terrain scenes include steps, ditches, obstacles, and parkour terrain; lighting conditions cover strong light, low light, and normal light.
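The distillation loss named above, the action MSE loss plus the yaw angle MSE loss, can be written as a short function. This is a minimal sketch; equal weighting of the two terms follows the wording of the parameter list, and the function name `distillation_loss` is hypothetical.

```python
import numpy as np

def distillation_loss(student_action, teacher_action, student_yaw, teacher_yaw):
    """Sum of the joint-action MSE and the yaw-angle-command MSE."""
    action_err = np.asarray(student_action, dtype=float) - np.asarray(teacher_action, dtype=float)
    yaw_err = np.asarray(student_yaw, dtype=float) - np.asarray(teacher_yaw, dtype=float)
    action_mse = np.mean(action_err ** 2)
    yaw_mse = np.mean(yaw_err ** 2)
    return float(action_mse + yaw_mse)

print(distillation_loss([1.0, 2.0], [1.0, 0.0], [0.5], [0.0]))  # prints 2.25
```

The same function applies to both the first loss value (S103) and the second loss value (S505); only the source of the predictions changes.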

[0092] The testing steps are as follows: First, train the ANN teacher model: A multi-terrain simulation environment was built in IsaacGym. Privileged information such as terrain scan points and target yaw angle was input. Reinforcement learning was used to optimize the ANN strategy until the parkour performance was stable. The action output, yaw angle command and parkour success rate of the ANN under different terrain and lighting conditions were recorded.

[0093] Then, the SNN student model is distilled: Warm-up phase: Fix the ANN parameters, train the SNN to minimize the output error with the ANN, and iterate until the loss converges; Environment interaction phase: Place the SNN in the same simulation environment and continuously optimize the distillation loss by combining environmental feedback to ensure the transfer of parkour skills; Fine-tuning of control parameters: Adjust parameters such as the pulse threshold and leakage factor of the SNN to optimize the smoothness of joint movements and terrain adaptability.

[0094] Finally, performance testing was conducted. Among them, the energy consumption test statistically analyzes the computational load (FLOPs / SOPs) and theoretical energy consumption of SNN and ANN, and calculates the energy saving ratio; the parkour performance test tests the parkour success rate and action response speed of SNN under four types of terrain and different lighting conditions; the robustness test verifies the perception stability and control continuity under extreme lighting and terrain change scenarios.
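A theoretical energy comparison of this kind is commonly computed by multiplying operation counts by a per-operation energy. The sketch below assumes per-operation energies of 4.6 pJ for a multiply-accumulate and 0.9 pJ for an accumulate, values often quoted for 32-bit floating point operations in a 45 nm process; this application does not state which values it uses, so both constants and the function names are assumptions.

```python
E_MAC_PJ = 4.6   # assumed energy per multiply-accumulate (ANN), in picojoules
E_AC_PJ = 0.9    # assumed energy per accumulate (SNN synaptic operation), in picojoules

def theoretical_energy_pj(ann_flops, snn_sops):
    """Return (ANN energy, SNN energy) in picojoules from operation counts."""
    return ann_flops * E_MAC_PJ, snn_sops * E_AC_PJ

def energy_saving_ratio(ann_energy, snn_energy):
    """Fraction of the ANN energy that the SNN saves."""
    return 1.0 - snn_energy / ann_energy
```

For example, an SNN that consumes 11.7% of the ANN energy gives `energy_saving_ratio(100.0, 11.7)`, which is approximately 0.883, that is, an 88.3% reduction.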

[0095] Performance testing yielded the following results: In simulation verification, the SNN solution consumed only 11.7% of the energy of the ANN, the actor module consumed 69.44% less energy, and the computational efficiency ratio (SNN:ANN) was as low as 0.29:1. The parkour success rates were 60% for steps, 45% for ditches, 71% for obstacles, and 29% for parkour terrain, demonstrating superior adaptability to various terrain types compared to the traditional ANN solution. In terms of lighting adaptability, terrain perception showed no significant degradation in either strong or low light environments, and the parkour success rate was improved by more than 30% compared to the traditional depth camera solution. Regarding dynamic response, the combination of the event camera and SNN reduced the robot's response latency to terrain changes by 50%, matching the joint control cycle.

[0096] Through comparative verification, this application shows that compared with the ANN solution, its energy consumption is reduced by 88.3%, the parkour success rate is comparable, and the light adaptability and dynamic response are better. Compared with the traditional sensor solution, the parkour success rate under extreme light is significantly improved, and there are no control failure problems caused by overexposure or noise.

[0097] Based on the above method embodiments, this application also provides a robot control model training device based on an event camera. Figure 8 is a schematic structural diagram of the robot control model training device based on an event camera provided in an embodiment of this application. As shown in Figure 8, the device may include: The teacher model prediction module 701 is used to move a simulated robot on which a pre-trained teacher model is deployed in a preset simulation environment, and to acquire the first brightness change event data of the preset simulation environment collected by the event camera simulation model, the first proprioceptive information of the simulated robot, and the prediction results output by the teacher model. The prediction results are output by the teacher model based on the terrain scan point information, the first proprioceptive information and the target yaw angle. The prediction results include: the first predicted joint action and the first predicted yaw angle command. The student model prediction module 702 is used to output a second predicted joint action and a second predicted yaw angle command based on the first brightness change event data and the first proprioceptive information using the student model to be trained. The loss value calculation module 703 is used to calculate the first loss value based on the first predicted joint action and the second predicted joint action, the first predicted yaw angle command and the second predicted yaw angle command. The model training module 704 is used to train the student model based on the first loss value to obtain the robot control model.

[0098] Optionally, the teacher model includes an artificial neural network encoder and an artificial neural network actuator. The teacher model prediction module 701 is specifically used to generate terrain depth latent features and a first predicted yaw angle command based on terrain scan point information using the artificial neural network encoder; and to generate a first predicted joint action based on the terrain depth latent features, the first proprioceptive information and the target yaw angle using the artificial neural network actuator.

[0099] Optionally, the student model includes a spiking neural network encoder and a spiking neural network actuator. The student model prediction module 702 is specifically used to generate a brightness change depth latent feature and a second predicted yaw angle command based on the first brightness change event data using the spiking neural network encoder; and to output a second predicted joint action based on the brightness change depth latent feature, the second predicted yaw angle command, and the first proprioceptive information using the spiking neural network actuator.

[0100] Optionally, the device may further include: The data acquisition module is used to acquire the pixel brightness change of each pixel in multiple consecutive depth image frames captured by the event camera simulation model in a preset simulation environment; based on the pixel brightness change of each pixel and a preset contrast threshold, it generates first brightness change event data, which includes: pixel coordinates, brightness change timestamp, and brightness change polarity.

[0101] Optionally, the model training module 704 is specifically used to train the student model based on the first loss value to obtain a preliminary distillation model; The data acquisition module is also used to move a simulated robot with a preliminary distillation model deployed in a preset simulation environment to acquire second brightness change event data and second proprioceptive information. The teacher model prediction module 701 is also used to use the teacher model to output the third predicted joint action and the third predicted yaw angle command of the simulation robot based on the terrain scan point information, the second proprioceptive information and the target yaw angle. The student model prediction module 702 is also used to use a preliminary distillation model to output the fourth predicted joint action and the fourth predicted yaw angle command of the simulated robot based on the second brightness change event data and the second proprioceptive information. The loss value calculation module 703 is used to calculate the second loss value based on the third predicted joint action and the fourth predicted joint action, the third predicted yaw angle command and the fourth predicted yaw angle command; The model training module 704 is also used to train the preliminary distillation model based on the second loss value to obtain the robot control model.

[0102] Optionally, the neuron models of the spiking neural network encoder and the spiking neural network actuator adopt the integrate-and-fire (IF) neuron model.

[0103] Based on the above method embodiments, this application also provides a motion control device for a quadruped robot. Figure 9 is a schematic structural diagram of the motion control device for a quadruped robot provided in an embodiment of this application. As shown in Figure 9, the device may include: The event stream acquisition module 801 is used to acquire an environmental event stream through an event camera deployed on a quadruped robot; The instruction generation module 802 is used to generate control commands based on the environmental event stream using a robot control model, wherein the robot control model is trained using the robot control model training method described in any implementation of the first aspect; The motion control module 803 is used to control the quadruped robot to move according to the control commands.

[0104] The above-described device is used to execute the method provided in the foregoing embodiments, and its implementation principle and technical effect are similar, so they will not be described again here.

[0105] These modules can be one or more integrated circuits configured to implement the above methods, such as one or more Application Specific Integrated Circuits (ASICs), one or more microprocessors, or one or more Field Programmable Gate Arrays (FPGAs). Alternatively, when a module is implemented as program code scheduled by a processing element, the processing element can be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor capable of calling program code. Furthermore, these modules can be integrated together as a system-on-a-chip (SOC).

[0106] Figure 10 is a schematic diagram of the electronic device provided in an embodiment of this application. As shown in Figure 10, the electronic device 900 may include a processor 901, a storage medium 902, and a bus. The storage medium 902 stores program instructions executable by the processor 901. When the electronic device 900 is running, the processor 901 communicates with the storage medium 902 via the bus, and the processor 901 executes the program instructions to perform the above-described method embodiments. The specific implementation and technical effects are similar and will not be described in detail here.

[0107] Optionally, this application also provides a computer-readable storage medium storing a computer program, which is executed by a processor to perform the above-described method embodiments.

[0108] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of units is only a logical functional division, and in actual implementation there may be other division methods: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.

[0109] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0110] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or in a combination of hardware and software functional units.

[0111] The integrated units implemented as software functional units described above can be stored in a computer-readable storage medium. These software functional units, stored in a storage medium, include several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor to execute some steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0112] The above are merely specific embodiments of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

Claims

1. A method for training a robot control model based on an event camera, characterized in that, The method includes: A simulated robot on which a pre-trained teacher model is deployed moves in a preset simulation environment, and first brightness change event data of the preset simulation environment collected by an event camera simulation model, first proprioceptive information of the simulated robot, and prediction results output by the teacher model are acquired. The prediction results are output by the teacher model based on terrain scan point information, the first proprioceptive information, and the target yaw angle. The prediction results include: a first predicted joint action and a first predicted yaw angle command. Using the student model to be trained, a second predicted joint action and a second predicted yaw angle command are output based on the first brightness change event data and the first proprioceptive information; A first loss value is calculated based on the first predicted joint action and the second predicted joint action, the first predicted yaw angle command and the second predicted yaw angle command; The student model is trained based on the first loss value to obtain the robot control model.

2. The method as described in claim 1, characterized in that, The teacher model includes an artificial neural network encoder and an artificial neural network actuator. The process by which the teacher model outputs prediction results includes: The artificial neural network encoder is used to generate terrain depth latent features and the first predicted yaw angle command based on the terrain scan point information; The artificial neural network actuator is used to generate the first predicted joint action based on the terrain depth latent features, the first proprioceptive information, and the target yaw angle.

3. The method as described in claim 1, characterized in that, The student model includes a spiking neural network encoder and a spiking neural network actuator. The student model outputting the second predicted joint action and the second predicted yaw angle command based on the first brightness change event data and the first proprioceptive information includes: Using the spiking neural network encoder, a brightness change depth latent feature and the second predicted yaw angle command are generated based on the first brightness change event data; Using the spiking neural network actuator, the second predicted joint action is output based on the brightness change depth latent feature, the second predicted yaw angle command, and the first proprioceptive information.

4. The method as described in claim 1, characterized in that, The acquisition of the first brightness change event data of the preset simulation environment collected by the event camera simulation model includes: The event camera simulation model captures the pixel brightness change of each pixel in multiple consecutive depth image frames of the preset simulation environment. Based on the pixel brightness change of each pixel and a preset contrast threshold, first brightness change event data is generated. The first brightness change event data includes: pixel coordinates, brightness change timestamp, and brightness change polarity.

5. The method as described in claim 1, characterized in that, The step of training the student model based on the first loss value to obtain the robot control model includes: The student model is trained based on the first loss value to obtain a preliminary distillation model; A simulated robot equipped with the preliminary distillation model moves in the preset simulation environment to acquire second brightness change event data and second proprioceptive information. Using the teacher model, based on the terrain scan point information, the second proprioceptive information, and the target yaw angle, the third predicted joint action and the third predicted yaw angle command of the simulation robot are output; Using the aforementioned preliminary distillation model, based on the second brightness change event data and the second proprioceptive information, the fourth predicted joint action and the fourth predicted yaw angle command of the simulated robot are output; The second loss value is calculated based on the third predicted joint action and the fourth predicted joint action, the third predicted yaw angle command and the fourth predicted yaw angle command; The robot control model is obtained by training the preliminary distillation model based on the second loss value.

6. The method as described in claim 3, characterized in that, The neuron models of the spiking neural network encoder and the spiking neural network actuator adopt the integrate-and-fire (IF) neuron model.

7. A motion control method for a quadruped robot, characterized in that, The method includes: An environmental event stream is acquired by an event camera deployed on the quadruped robot; A robot control model is used to generate control commands based on the environmental event stream, wherein the robot control model is trained using the robot control model training method as described in any one of claims 1 to 6; The quadruped robot is controlled to move according to the control commands.

8. A robot control model training device based on an event camera, characterized in that, The device includes: The teacher model prediction module is used to move a simulated robot with a pre-trained teacher model deployed in a preset simulation environment, and to acquire the first brightness change event data of the preset simulation environment collected by the event camera simulation model, the first proprioceptive information of the simulated robot, and the prediction result output by the teacher model. The prediction result is output by the teacher model based on the terrain scan point information, the first proprioceptive information, and the target yaw angle. The prediction result includes: the first predicted joint action and the first predicted yaw angle command. The student model prediction module is used to output a second predicted joint action and a second predicted yaw angle command based on the first brightness change event data and the first proprioceptive information using the student model to be trained. The loss value calculation module is used to calculate a first loss value based on the first predicted joint action and the second predicted joint action, the first predicted yaw angle command and the second predicted yaw angle command; The model training module is used to train the student model based on the first loss value to obtain the robot control model.

9. A motion control device for a quadruped robot, characterized in that, The device includes: The event stream acquisition module is used to acquire an environmental event stream through an event camera deployed on the quadruped robot; The instruction generation module is used to generate control commands based on the environmental event stream using a robot control model, wherein the robot control model is trained using the robot control model training method as described in any one of claims 1 to 6; The motion control module is used to control the quadruped robot to move according to the control commands.

10. An electronic device, characterized in that, The device includes: a processor, a storage medium, and a bus. The storage medium stores program instructions executable by the processor. When the electronic device is running, the processor communicates with the storage medium via the bus. The processor executes the program instructions to perform the steps of the robot control model training method as described in any one of claims 1 to 6, or the steps of the motion control method for a quadruped robot as described in claim 7.