Motion control method and device, robot and storage medium
By discretizing the robot's motion trajectory into multiple trajectory points and optimizing the state variables, and by utilizing cost functions and constraints, the collision impact problem at the moment of landing in highly dynamic motion is solved, thereby improving the stability of the robot's motion and the accuracy of trajectory tracking.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING XIAOMI ROBOT TECH CO LTD
- Filing Date
- 2023-10-31
- Publication Date
- 2026-06-30
AI Technical Summary
When a robot performs highly dynamic movements that include being airborne, the impact of a collision upon landing affects the stability of the movement, resulting in large trajectory tracking errors, poor stability, or even the inability to complete the movement.
The trajectory of the target motion is discretized into multiple trajectory points, the state variables of the trajectory points are optimized, and the robot is planned and controlled to execute the target motion using a pre-built cost function and constraints. This includes using centroid dynamics and full kinematics models, adding relaxation variables for linearization, and optimizing the state variables of the trajectory points.
It enables real-time trajectory planning in highly dynamic motion, mitigates the impact of collisions upon landing, and improves the stability of robot motion and trajectory tracking accuracy.
Figure CN119910636B_ABST
Abstract
Description
Technical Field
[0001] This disclosure relates to the field of robotics, specifically to a motion control method, device, robot, and storage medium.
Background Technology
[0002] In recent years, robotics technology has continuously developed, becoming increasingly intelligent and automated, with improvements in the richness, stability, and flexibility of robot movements. Robots can replace users in performing specific tasks in production and daily life, thus bringing convenience. However, in related technologies, when robots perform movements involving airborne states, the impact of collisions upon landing can often affect the robot's motion stability.
Summary of the Invention
[0003] To overcome the problems existing in the related technologies, this disclosure provides a motion control method, device, robot, and storage medium to solve the defects in the related technologies.
[0004] According to a first aspect of the present disclosure, a motion control method is provided, the method comprising:
[0005] Based on the action sequence of the target motion, the trajectory of the target motion is discretized into multiple trajectory points, wherein the action sequence includes multiple action stages, and there is at least one contact point with a different contact state between different action stages, and the trajectory points have state variables;
[0006] The state variables of the trajectory points among the plurality of trajectory points are optimized to obtain the state variable optimization results of the trajectory points, wherein the optimization direction of the state variables includes: making the trajectory of the target motion conform to the action sequence;
[0007] Based on the optimization results of the state variables of the trajectory points, the robot is controlled to perform the target motion.
[0008] In one embodiment of this disclosure, optimizing the state variables of the trajectory points among the plurality of trajectory points to obtain the optimized state variable results of the trajectory points includes:
[0009] Using a pre-constructed cost function, the state variables of the trajectory points among the plurality of trajectory points are optimized to obtain the optimized state variables of the trajectory points, wherein the cost function is related to at least one of the following:
[0010] The error between the position vector of at least one key trajectory point among the plurality of trajectory points and the preset reference position;
[0011] The contact force of at least one trajectory point among the plurality of trajectory points;
[0012] The joint angular velocity of at least one trajectory point among the plurality of trajectory points.
[0013] In one embodiment of this disclosure, optimizing the state variables of the trajectory points among the plurality of trajectory points to obtain the optimized state variable results of the trajectory points includes:
[0014] The state variables of the trajectory points among the plurality of trajectory points are optimized under the constraint of the first constraint condition to obtain the state variable optimization result of the trajectory points, wherein the first constraint condition is related to the action sequence.
[0015] In one embodiment of this disclosure, the first constraint includes at least one of the following:
[0016] The state variables of each trajectory point conform to the dynamic model obtained from centroidal dynamics and full kinematics;
[0017] The state variables of adjacent trajectory points satisfy integral continuity;
[0018] The state variables of each trajectory point satisfy the relevant limit conditions;
[0019] A contact point that is in contact with the ground has no motion relative to the ground, the support force between that contact point and the ground is not less than 0, and the friction force between that contact point and the ground does not exceed the maximum static friction;
[0020] The support force between the ground and a contact point that is not in contact with the ground is 0, and the velocity of that contact point meets the velocity requirements of the preset reference trajectory.
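The two contact conditions listed above can be checked for a single contact point as in the following minimal sketch. The function name `contact_constraints_ok`, the friction coefficient `mu`, the tolerance, and the choice of the z axis as the ground normal are illustrative assumptions and are not taken from the disclosure. The maximum static friction is modeled here with the usual Coulomb cone (tangential force no larger than `mu` times the normal force), and the velocity requirement for a swinging contact point is left out.

```python
import math


def contact_constraints_ok(in_contact, force, velocity, mu=0.6, tol=1e-9):
    """Check the contact conditions for one contact point.

    force and velocity are (x, y, z) tuples in a frame whose z axis is the
    ground normal (assumption). mu is a hypothetical friction coefficient.
    """
    fx, fy, fz = force
    if in_contact:
        no_slip = all(abs(c) <= tol for c in velocity)   # no motion relative to the ground
        pushing = fz >= -tol                              # support force is not less than 0
        in_cone = math.hypot(fx, fy) <= mu * fz + tol     # friction within the static limit
        return no_slip and pushing and in_cone
    # A contact point that is off the ground receives no force from it.
    return all(abs(c) <= tol for c in force)
```

In an optimizer these conditions would appear as equality and inequality constraints rather than as a boolean check; the check form is only meant to make the conditions concrete.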
[0021] In one embodiment of this disclosure, the step of optimizing the state variables of the trajectory points among the plurality of trajectory points using the cost function under the constraint of the first constraint condition to obtain the optimized state variable result of the trajectory points includes:
[0022] Under the constraint of the linearization result of the first constraint, the state variables of the trajectory points among the multiple trajectory points are optimized using the quadratic approximation result of the cost function to obtain the optimized state variable result of the trajectory points.
[0023] In one embodiment of this disclosure, the method further includes:
[0024] Slack variables are added to the first constraint to linearize it.
[0025] In one embodiment of this disclosure, the state variables include at least one of the following: momentum vector, position vector, contact point vector, and joint velocity vector, wherein the momentum vector includes linear momentum and angular momentum, the position vector includes a floating base position and a joint position, the floating base position includes a coordinate position and an attitude angle in the form of a unit quaternion, the contact point vector includes the external force acting on the contact point, and the joint velocity vector includes the velocity of the joint.
[0026] In one embodiment of this disclosure, controlling the robot to perform the target motion based on the optimization result of the state variables of the trajectory points includes:
[0027] Under the second constraint, at least one slack variable is tracked to obtain the optimization result of the optimization variable, wherein at least one condition of the second constraint is related to the at least one slack variable, at least one condition of the second constraint is related to the optimization variable, and at least one condition of the second constraint is related to the optimization result of the state variable;
[0028] Based on the optimization results of the optimization variables, the robot is controlled to perform the target motion.
[0029] In one embodiment of this disclosure, the optimization variables include an acceleration vector, a contact point vector, and a relaxation vector, wherein the acceleration vector includes the acceleration of the joint, the contact point vector includes the external force acting on the contact point, and the relaxation vector includes the at least one relaxation variable.
[0030] In one embodiment of this disclosure, the second constraint includes at least one of the following:
[0031] Consistency of floating base dynamics;
[0032] The point of contact with the ground has no relative movement to the ground;
[0033] The friction force between the contact point and the ground is no greater than the maximum static friction;
[0034] The contact point that is not in contact with the ground tracks the position and velocity in the optimization result of the state variables under the relaxation effect of the first relaxation variable;
[0035] The robot's floating base tracks the floating base pose in the optimization result of the state variables under the relaxation effect of the second relaxation variable;
[0036] The robot's arm joints track the joint position and velocity in the state variable optimization result under the relaxation effect of the third relaxation variable;
[0037] The robot's contact point tracks the contact point vector in the state variable optimization result under the relaxation effect of the fourth relaxation variable.
[0038] According to a second aspect of the present disclosure, a motion control device is provided, the device comprising:
[0039] A discretization module, used to discretize the trajectory of the target motion into multiple trajectory points according to the action sequence of the target motion, wherein the action sequence includes multiple action stages, and there is at least one contact point with a different contact state between different action stages, and the trajectory points have state variables;
[0040] An optimization module is used to optimize the state variables of the trajectory points among the plurality of trajectory points to obtain the optimization results of the state variables of the trajectory points, wherein the optimization direction of the state variables includes: making the trajectory of the target motion conform to the action sequence;
[0041] The control module is used to control the robot to perform the target motion based on the optimization results of the state variables of the trajectory points.
[0042] In one embodiment of this disclosure, the optimization module is used to:
[0043] Using a pre-constructed cost function, the state variables of the trajectory points among the plurality of trajectory points are optimized to obtain the optimized state variables of the trajectory points, wherein the cost function is related to at least one of the following:
[0044] The error between the position vector of at least one key trajectory point among the plurality of trajectory points and the preset reference position;
[0045] The contact force of at least one trajectory point among the plurality of trajectory points;
[0046] The joint angular velocity of at least one trajectory point among the plurality of trajectory points.
[0047] In one embodiment of this disclosure, the optimization module is used to:
[0048] The state variables of the trajectory points among the plurality of trajectory points are optimized under the constraint of the first constraint condition to obtain the state variable optimization result of the trajectory points, wherein the first constraint condition is related to the action sequence.
[0049] In one embodiment of this disclosure, the first constraint includes at least one of the following:
[0050] The state variables of each trajectory point conform to the dynamic model obtained from centroidal dynamics and full kinematics;
[0051] The state variables of adjacent trajectory points satisfy integral continuity;
[0052] The state variables of each trajectory point satisfy the relevant limit conditions;
[0053] A contact point that is in contact with the ground has no motion relative to the ground, the support force between that contact point and the ground is not less than 0, and the friction force between that contact point and the ground does not exceed the maximum static friction;
[0054] The support force between the ground and a contact point that is not in contact with the ground is 0, and the velocity of that contact point meets the velocity requirements of the preset reference trajectory.
[0055] In one embodiment of this disclosure, the optimization module is used to optimize the state variables of the trajectory points among the plurality of trajectory points using the cost function under the constraint of the first constraint condition, and when obtaining the optimization result of the state variables of the trajectory points, it is used to:
[0056] Under the constraint of the linearization result of the first constraint, the state variables of the trajectory points among the multiple trajectory points are optimized using the quadratic approximation result of the cost function to obtain the optimized state variable result of the trajectory points.
[0057] In one embodiment of this disclosure, the apparatus further includes a linearization module for:
[0058] Slack variables are added to the first constraint to linearize it.
[0059] In one embodiment of this disclosure, the state variables include at least one of the following: momentum vector, position vector, contact point vector, and joint velocity vector, wherein the momentum vector includes linear momentum and angular momentum, the position vector includes a floating base position and a joint position, the floating base position includes a coordinate position and an attitude angle in the form of a unit quaternion, the contact point vector includes the external force acting on the contact point, and the joint velocity vector includes the velocity of the joint.
[0060] In one embodiment of this disclosure, the control module is used to:
[0061] Under the second constraint, at least one slack variable is tracked to obtain the optimization result of the optimization variable, wherein at least one condition of the second constraint is related to the at least one slack variable, at least one condition of the second constraint is related to the optimization variable, and at least one condition of the second constraint is related to the optimization result of the state variable;
[0062] Based on the optimization results of the optimization variables, the robot is controlled to perform the target motion.
[0063] In one embodiment of this disclosure, the optimization variables include an acceleration vector, a contact point vector, and a relaxation vector, wherein the acceleration vector includes the acceleration of the joint, the contact point vector includes the external force acting on the contact point, and the relaxation vector includes the at least one relaxation variable.
[0064] In one embodiment of this disclosure, the second constraint includes at least one of the following:
[0065] Consistency of floating base dynamics;
[0066] The point of contact with the ground has no relative movement to the ground;
[0067] The friction force between the contact point and the ground is no greater than the maximum static friction;
[0068] The contact point that is not in contact with the ground tracks the position and velocity in the optimization result of the state variables under the relaxation effect of the first relaxation variable;
[0069] The robot's floating base tracks the floating base pose in the optimization result of the state variables under the relaxation effect of the second relaxation variable;
[0070] The robot's arm joints track the joint position and velocity in the state variable optimization result under the relaxation effect of the third relaxation variable;
[0071] The robot's contact point tracks the contact point vector in the state variable optimization result under the relaxation effect of the fourth relaxation variable.
[0072] According to a third aspect of the present disclosure, a robot is provided, the robot including a memory and a processor, the memory being used to store computer instructions executable on the processor, and the processor being used to implement the motion control method described in the first aspect when executing the computer instructions.
[0073] According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, implements the method described in the first aspect.
[0074] The technical solutions provided by the embodiments of this disclosure may include the following beneficial effects:
[0075] The motion control method provided in this disclosure discretizes the trajectory of a target motion into multiple trajectory points based on the action sequence of the target motion, optimizes the state variables of the trajectory points to obtain the optimized state variable results, and then controls the robot to execute the target motion based on the optimized state variable results of each trajectory point. Since the action sequence includes multiple action stages, and there is at least one contact point with a different contact state between different action stages, and the optimization direction of the state variables includes making the trajectory of the target motion conform to the action sequence, this method can plan the trajectory of the target motion in real time and control the robot to execute the target motion based on the planning results. Even if the target motion is a highly dynamic motion involving an airborne state, this method can still plan the trajectory of the highly dynamic motion in real time and control the robot to stably complete the highly dynamic motion based on the planning results, mitigating the impact of collisions at the moment of landing on the robot's motion stability.
Description of the Drawings
[0076] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention.
[0077] Figure 1 is a schematic diagram of the structure of a bipedal robot according to an exemplary embodiment of this disclosure;
[0078] Figure 2 is a flowchart of a motion control method according to an exemplary embodiment of this disclosure;
[0079] Figure 3 is a schematic diagram of the contact points of a robot according to an exemplary embodiment of this disclosure;
[0080] Figure 4 is a schematic diagram of an action sequence according to an exemplary embodiment of this disclosure;
[0081] Figure 5 is a schematic diagram of control frames according to an exemplary embodiment of this disclosure;
[0082] Figure 6 is a schematic diagram of the structure of a motion control device according to an exemplary embodiment of this disclosure;
[0083] Figure 7 is a structural block diagram of a robot according to an exemplary embodiment of this disclosure.
Detailed Implementation
[0084] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this disclosure as detailed in the appended claims.
[0085] The terminology used in this disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The singular forms "a," "an," and "the" as used in this disclosure and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
[0086] It should be understood that although the terms first, second, third, etc., may be used in this disclosure to describe various information, such information should not be limited to these terms. These terms are used only to distinguish information of the same type from one another. For example, without departing from the scope of this disclosure, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when," "upon," or "in response to determining."
[0087] In recent years, robotics technology has continuously developed, becoming increasingly intelligent and automated, with improvements in the richness, stability, and flexibility of its movements. Robots can replace users in performing specific tasks in their production and daily lives, thus bringing convenience to users. However, in related technologies, when robots perform highly dynamic movements involving airborne states (i.e., movements that include airborne states), the stability of their motion is often affected by the impact of collisions upon landing.
[0088] For example, a robot can perform a highly dynamic motion by tracking a reference trajectory. However, the reference trajectory is usually planned offline and configured in the robot in advance. Such a trajectory cannot adapt to the collisions and impacts that occur during highly dynamic motion, so trajectory tracking errors arise easily. This poor robustness leads to large errors and poor stability when the robot performs highly dynamic motion, and the robot may even fail to complete the motion or be put in danger.
[0089] Based on this, at least one embodiment of this disclosure provides a motion control method that can be applied to robots such as bipedal robots (humanoid robots) and quadrupedal robots (robot dogs). Figure 1 illustrates the degree-of-freedom structure of a bipedal robot, which includes two upper limbs and two lower limbs. The shoulders of the upper limbs have three degrees of freedom in the pitch, roll, and yaw directions; the elbows have one degree of freedom in the pitch direction; and the wrists may have two or three degrees of freedom. The hips of the lower limbs have three degrees of freedom in the pitch, roll, and yaw directions; the knees have one degree of freedom in the pitch direction; and the ankles have two degrees of freedom in the pitch and roll directions. Roll, pitch, and yaw denote rotation about the X, Y, and Z axes, respectively. The lower limbs end in two flat feet. By changing the contact state between the flat feet and the external environment, various dynamic behaviors can be achieved, such as walking, running, jumping, somersaulting, and stable operation. Each flat foot (and hence the robot) can have multiple contact points with the ground, for example at least two contact points: the toe and the heel.
[0090] For example, this method can be applied to the planning and control of highly dynamic motion in robots, where highly dynamic motion refers to motion that includes a flight phase in which the robot is fully airborne. For instance, this method can be a real-time planning and control approach for highly dynamic motion based on a combination of model predictive control (MPC) and whole-body control (WBC). The robot can use an MPC module and a WBC module to implement this method. Specifically, the MPC module plans the trajectory of the highly dynamic motion in real time, and the WBC module controls the robot to execute the highly dynamic motion based on the real-time planning results from the MPC module. These modules can run at their own frequencies in different threads; generally, the control frequency of the WBC module is not lower than the operating frequency of the MPC module.
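The relationship between the two module rates can be illustrated with a minimal single-threaded simulation. This is a sketch under stated assumptions: the function name `run_two_rate_loop` and the 500 Hz and 50 Hz rates are hypothetical, the MPC solve and the WBC step are replaced by stand-ins, and a real system would run the two loops in separate threads as described above.

```python
def run_two_rate_loop(duration_s, wbc_hz=500, mpc_hz=50):
    """Simulate a fast WBC loop that always tracks the latest plan of a slower MPC loop."""
    if wbc_hz < mpc_hz or wbc_hz % mpc_hz != 0:
        raise ValueError("this sketch expects wbc_hz to be an integer multiple of mpc_hz")
    ticks_per_plan = wbc_hz // mpc_hz
    latest_plan = None
    log = []
    for tick in range(int(round(duration_s * wbc_hz))):
        if tick % ticks_per_plan == 0:
            # Stand-in for one MPC solve: the plan is just its sequence number.
            latest_plan = tick // ticks_per_plan
        # Stand-in for one WBC step that tracks the most recent plan.
        log.append((tick, latest_plan))
    return log
```

With the default rates, ten consecutive WBC steps reuse the same plan before the next MPC result replaces it.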
[0091] This disclosure first establishes a dynamic model of the robot system, and then constructs a complete process of the method based on this dynamic model. Therefore, before introducing the specific process of the method, the aforementioned dynamic model and its construction process will be described in detail.
[0092] The legged robot can be modeled as a multi-rigid-body system with a floating base B, on which legs and arms are connected; the motion of the entire system can be described relative to a fixed inertial frame I (i.e., the world coordinate system); the system's position vector q and velocity vector v are respectively:
[0093] $$q = [q_b,\ q_j]^T$$
[0094] $$v = [v_b,\ v_j]^T = [{}_I v_{IB},\ {}_B\omega_{IB},\ v_j]^T \in \mathbb{R}^{6+n_j}$$
[0095] In the above formulas, $q_b = [r_{IB},\ q_{IB}]^T \in \mathbb{R}^7$ is the pose of the floating base in coordinate system I; $r_{IB}$ is the position of the floating base relative to the origin of the inertial coordinate system; the attitude is expressed by the unit quaternion $q_{IB} = [q_w, q_x, q_y, q_z]^T$, which avoids gimbal lock and improves applicability; $q_j$ denotes the joint angles; $v_b$ denotes the velocity of the floating base B; ${}_I v_{IB}$ is the linear velocity of the floating base relative to the inertial frame I, expressed in coordinate system I; ${}_B\omega_{IB}$ is the angular velocity of the floating base relative to the inertial frame I, expressed in coordinate system B; $v_j$ denotes the joint angular velocities; and $n_j$ is the number of robot joints.
[0096] Therefore, the full-model multi-rigid-body dynamics equations of the legged robot system with contact (hereinafter referred to as the first dynamics model) can be written as:
[0097] $$M(q)\,\dot{v} + b(q, v) = S^T \tau + J_c^T f_c$$
[0098] where $M$ and $b$ are the inertia matrix and the nonlinear terms (the Coriolis force, centrifugal force, and gravity expressed in the generalized coordinate space), respectively; $\dot{v}$ is the acceleration vector, containing the acceleration of the robot's floating base and the acceleration of each joint; $\tau$ is the torque vector, containing the torque of each joint of the robot; $S$ is the selection matrix of the actuated joints; $J_c$ is the velocity Jacobian matrix at the contact points; and $f_c$ is the contact point vector, which contains the external forces acting on each contact point of the robot.
[0099] Therefore, the floating base dynamics model of the legged robot system (hereinafter referred to as the second dynamics model) can be:
[0100] $$S_b\,\bigl(M(q)\,\dot{v} + b(q, v)\bigr) = S_b\,J_c^T f_c$$
[0101] In the above formula, $S_b$ is the floating base selection matrix.
[0102] Furthermore, the center-of-mass dynamics of the legged robot system are combined with the full kinematic model.
[0103] To improve computational efficiency, the full-model multi-rigid-body dynamics of the robot is not considered in the MPC module. Instead, we adopt a model that combines centroidal dynamics and full kinematics, referred to as kino-dynamics. If we have sufficient control over the joints, we can control the robot to generate arbitrary external forces to affect the momentum of the robot's center of mass, thereby controlling the overall motion of the robot.
[0104] Center-of-mass dynamics refers to the Newton-Euler equations applied to the center of mass (CoM) of a robot:
[0105] $$\dot{h} = \begin{bmatrix} \dot{h}_{lin} \\ \dot{h}_{ang} \end{bmatrix} = \begin{bmatrix} \sum_i f_{c_i} + m g \\ \sum_i \left( r_{c_i} \times f_{c_i} + \tau_{c_i} \right) \end{bmatrix}$$
[0106] In the above formula, $h = [h_{lin},\ h_{ang}]^T \in \mathbb{R}^6$ is the centroidal momentum expressed in the centroidal coordinate system G, whose origin is located at the center of mass and whose axes are aligned with those of the inertial coordinate system I; $m$ is the total mass of the robot and $g$ is the gravitational acceleration vector; $r_{c_i}$ is the position of contact point i relative to the center of mass; and $f_{c_i}$ and $\tau_{c_i}$ are the contact force and contact torque at contact point i, respectively. This disclosure uses the four vertices of each foot as contact points to describe the contact between a legged robot and the external environment. Therefore, in the case of a humanoid robot, this centroidal dynamics model includes contact forces corresponding to eight contact points, but no contact torque.
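The Newton-Euler relation described above (the rate of linear momentum equals the sum of the contact forces plus gravity, and the rate of angular momentum equals the sum of the moments of the contact forces about the center of mass) can be evaluated as in the following minimal sketch. The function names, the gravity value, and the data layout are illustrative assumptions. Contact torques are omitted, matching the point-contact case mentioned above.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def centroidal_momentum_rate(mass, contacts, gravity=(0.0, 0.0, -9.81)):
    """Return (h_lin_dot, h_ang_dot) for point contacts.

    contacts is an iterable of (r, f) pairs, where r is the contact position
    relative to the center of mass and f is the contact force (3D tuples).
    """
    lin = [mass * g for g in gravity]      # gravity term m * g
    ang = [0.0, 0.0, 0.0]
    for r, f in contacts:
        moment = cross(r, f)               # r x f about the center of mass
        for axis in range(3):
            lin[axis] += f[axis]
            ang[axis] += moment[axis]
    return tuple(lin), tuple(ang)
```

For a robot standing still on two symmetric contacts that together carry its weight, both rates come out as zero up to rounding, which is the expected static equilibrium.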
[0107] Using the centroidal momentum matrix (CMM) to connect the centroidal dynamics with the whole-body kinematics, we obtain the following equation:
[0108]
[0109] Based on the algebraic properties of quaternions, we can obtain:
[0110]
[0111] In the above formula,
[0112] Assume the state and control inputs of the robot system are as follows:
[0113] $$x = [h,\ q]^T \in \mathbb{R}^{13+n_j}, \qquad u = [f_c,\ v_j]^T \in \mathbb{R}^{3 n_c + n_j}$$
[0114] In the above formula, $v_j$ represents the joint angular velocity, and $n_c$ is the number of contact points with the external environment.
[0115] Based on the above equations for the centroidal dynamics and the full kinematics, together with the system's state and control inputs, we can obtain the following equation representing the robot's continuous dynamics model, kino-dynamics (hereinafter referred to as the third dynamics model):
[0116]
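For orientation, a commonly used form of such a kino-dynamic model is sketched below. It is written under assumptions and is not a reproduction of the disclosure's equations: $A(q) = [A_b(q)\ A_j(q)]$ is an assumed name for the CMM and its base and joint blocks, the quaternion is taken as scalar-first with $\otimes$ the quaternion product, and the exact notation of the disclosure may differ.

```latex
\begin{aligned}
h &= A(q)\,v
   = \begin{bmatrix} A_b(q) & A_j(q) \end{bmatrix}
     \begin{bmatrix} v_b \\ v_j \end{bmatrix}
   \quad\Longrightarrow\quad
   v_b = A_b^{-1}(q)\,\bigl(h - A_j(q)\,v_j\bigr), \\[4pt]
\dot{q}_{IB} &= \tfrac{1}{2}\, q_{IB} \otimes
   \begin{bmatrix} 0 \\ {}_B\omega_{IB} \end{bmatrix}, \\[4pt]
\dot{x} &= \begin{bmatrix} \dot{h} \\ \dot{q} \end{bmatrix} = f(x, u),
   \qquad
   x = \begin{bmatrix} h \\ q \end{bmatrix}, \quad
   u = \begin{bmatrix} f_c \\ v_j \end{bmatrix}.
\end{aligned}
```

In this form, $\dot{h}$ follows from the Newton-Euler equations driven by the contact forces in $u$, the base position and quaternion rates follow from the linear and angular parts of $v_b$, and the joint position rate is simply $v_j$.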
[0117] Referring to Figure 2, which illustrates the flow of the method, the method includes steps S201 to S203. Steps S201 to S202 can be executed by the MPC module to plan the trajectory of the highly dynamic motion online in real time; step S203 can be executed by the WBC module so that the robot tracks the planned trajectory and completes the highly dynamic motion.
[0118] In step S201, the trajectory of the target motion is discretized into multiple trajectory points according to the action sequence of the target motion. The action sequence includes multiple action stages, and there is at least one contact point with a different contact state between different action stages. The trajectory points have state variables.
[0119] The action sequence of the target motion can be selected and constructed in advance. The target motion can be a highly dynamic motion, in which the robot's contact state with the ground often differs between stages. The robot's contact state with the ground comprises the contact state of each of its contact points with the ground. For example, a time-varying array can be used to represent the contact state of each contact point at different times; this array can be called the contact state array, denoted c. Here c_i = c[i] = 1 indicates that the contact point numbered i is in contact with the external environment (its contact state with the ground is contact), and c_i = c[i] = 0 indicates that the contact point numbered i is not in contact with the external environment (its contact state with the ground is no contact). The dimension of c is the total number of contact points of the robot; referring to Figure 3, for a bipedal robot with 4 contact points on each foot, the array dimension is 8.
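The contact state array for the eight-contact biped can be built as in the following minimal sketch. The contact point names, their ordering, and the helper `contact_state` are hypothetical and chosen only for illustration.

```python
FEET = ("left", "right")
CORNERS = ("toe_inner", "toe_outer", "heel_inner", "heel_outer")
# Hypothetical ordering: four corners of the left foot, then four of the right foot.
CONTACT_NAMES = [f"{foot}_{corner}" for foot in FEET for corner in CORNERS]


def contact_state(in_contact):
    """Return the contact state array c, with c[i] = 1 if contact point i touches the ground."""
    unknown = set(in_contact) - set(CONTACT_NAMES)
    if unknown:
        raise ValueError(f"unknown contact points: {sorted(unknown)}")
    return [1 if name in in_contact else 0 for name in CONTACT_NAMES]
```

For example, passing only the toe corners of both feet yields an array with ones at the toe positions and zeros at the heel positions, which corresponds to a toe contact stance.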
[0120] Based on the above analysis, an event in which the contact state between at least one contact point of the robot and the ground changes can be identified as a switching event, and the motion between adjacent switching events can be defined as an action phase. Each action phase is described by its own dynamic equation, so different action phases are described by different dynamic equations. Taking the bipedal robot shown in Figure 1 performing a backflip as an example (i.e., the target motion is a backflip), the action sequence shown in Figure 4 can be constructed in the manner described above. It includes four action phases (full foot contact, toe contact, no contact, and full foot contact) and three switching events: heel lift between full foot contact and toe contact, toe lift between toe contact and no contact, and landing between no contact and full foot contact.
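The definition of switching events and action phases can be made concrete with a minimal sketch that scans a sampled contact schedule. The function name `split_into_phases` and the reduced two-entry contact state (toe, heel) used in the usage note are assumptions for illustration; the disclosure uses eight contact points.

```python
def split_into_phases(contact_schedule):
    """Group consecutive samples with identical contact states into action phases.

    contact_schedule is a list of contact state tuples, one per time sample.
    Returns (phases, switches): phases is a list of (start, end_exclusive, state)
    and switches lists the sample indices at which a switching event occurs.
    """
    phases = []
    switches = []
    start = 0
    n = len(contact_schedule)
    for k in range(1, n + 1):
        # A phase ends at the end of the schedule or when any contact state changes.
        if k == n or contact_schedule[k] != contact_schedule[start]:
            phases.append((start, k, tuple(contact_schedule[start])))
            if k < n:
                switches.append(k)
            start = k
    return phases, switches
```

For a backflip-like schedule such as `[(1, 1)] * 3 + [(1, 0)] * 2 + [(0, 0)] * 4 + [(1, 1)] * 2`, the function returns four phases and three switching events, mirroring the heel lift, toe lift, and landing events described above.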
[0121] It is understandable that the robot can be controlled at a certain frequency, meaning this method can be executed at a certain frequency. The moment at which this method executes control is called a control frame. Referring to Figure 5, t0, t1, t2, ... are control frames, and the time interval between adjacent control frames is δt, i.e., the control frequency is...
[0122] Continuing to refer to Figure 5, when this step is performed in each control frame, a trajectory over a future time horizon T_horizon can be constructed and discretized according to a preset time step Δt, yielding multiple trajectory points spaced Δt apart. The states and control inputs corresponding to these trajectory points are the target variables of trajectory planning in this method, hereinafter referred to as state variables. For example, state variables may include at least one of the following: momentum vector, position vector, contact point vector, and joint velocity vector. The momentum vector includes linear momentum and angular momentum; the position vector includes the floating base position and the position of (each) joint; the floating base position includes a coordinate position and an attitude angle in the form of a unit quaternion; the contact point vector includes the external force acting on (each) contact point; and the joint velocity vector includes the velocity of (each) joint. For example, the state variable X is shown in the following equation:
[0123] $X = \{h[k],\ q[k],\ f_c[k],\ v_j[k]\}, \quad \text{for all } k = 1, \dots, N$
[0124] In the above formula, $k$ indexes one of the $N$ trajectory points (knots) obtained after discretizing the motion trajectory; $i$ indexes one of the $n_c$ contact points; $c$ is an $n_c \times N$ two-dimensional matrix in which each column corresponds to one trajectory point and contains the contact point vector at that trajectory point; $h[k]$ is the momentum vector of the k-th trajectory point, $q[k]$ is the position vector of the k-th trajectory point, $f_c[k]$ is the contact point vector of the k-th trajectory point, and $v_j[k]$ is the joint velocity vector of the k-th trajectory point, which contains the angular velocity of each joint of the robot.
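The discretization described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the class name `KnotState`, the function `discretize_horizon`, the quaternion ordering (w, x, y, z), and the choice of three entries per contact point in `f_c` are all hypothetical and are not specified by the disclosure.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class KnotState:
    h: np.ndarray    # momentum vector: 3 linear + 3 angular entries
    q: np.ndarray    # floating base position (3), unit quaternion (4), joint positions
    f_c: np.ndarray  # contact point vector, assumed 3 entries per contact point
    v_j: np.ndarray  # joint velocity vector


def discretize_horizon(t_now, t_horizon, dt, n_joints, n_contacts):
    """Return knot times and zero-initialized state variables for N knots."""
    n_knots = int(round(t_horizon / dt))
    # Knots are spaced dt apart, starting one step after the current control frame.
    times = t_now + dt * np.arange(1, n_knots + 1)
    knots = []
    for _ in range(n_knots):
        q = np.zeros(7 + n_joints)
        q[3] = 1.0  # identity attitude, assuming (w, x, y, z) ordering
        knots.append(
            KnotState(
                h=np.zeros(6),
                q=q,
                f_c=np.zeros(3 * n_contacts),
                v_j=np.zeros(n_joints),
            )
        )
    return times, knots
```

For example, `discretize_horizon(0.0, 1.0, 0.05, n_joints=12, n_contacts=4)` produces N = 20 knots, each holding one set of state variables to be optimized.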
[0125] It can be understood that the action sequence can also include the time, state x, and control input u of key trajectory points in the aforementioned trajectory, for example, the height of the floating base at the highest point of the takeoff phase in a somersault.
[0126] In step S102, the state variables of the trajectory points among the plurality of trajectory points are optimized to obtain the state variable optimization results of the trajectory points. The optimization direction of the state variables includes: making the trajectory of the target motion conform to the action sequence.
[0127] For example, the state variables of the trajectory points among the plurality of trajectory points are optimized using a pre-constructed cost function to obtain the state variable optimization result of the trajectory points, wherein the cost function is related to at least one of the following: the error between the position vector of at least one key trajectory point among the plurality of trajectory points and a preset reference position; the contact force of at least one trajectory point among the plurality of trajectory points; and the joint angular velocity of at least one trajectory point among the plurality of trajectory points.
[0128] As another example, the state variables of the trajectory points among the plurality of trajectory points are optimized under the constraint of the first constraint condition to obtain the state variable optimization result of the trajectory points, wherein the first constraint condition is related to the action sequence.
[0129] For example, under the constraint of the first constraint, the state variables of the trajectory points among the plurality of trajectory points are optimized using a pre-constructed cost function to obtain the state variable optimization result of the trajectory points. The first constraint is related to the action sequence, and the cost function is related to at least one of the following: the error between the position vector of at least one key trajectory point among the plurality of trajectory points and a preset reference position; the contact force of at least one trajectory point among the plurality of trajectory points; and the joint angular velocity of at least one trajectory point among the plurality of trajectory points.
[0130] The following examples illustrate the cost function and the first constraint mentioned in the three examples above.
[0131] For example, the cost function can be expressed as follows:
[0132]
[0133] In the above formula, $k^{\#}$ is a key trajectory point and $K^{\#}$ is the set of key trajectory points, which can contain all key trajectory points in the trajectory of the target motion (e.g., the trajectory within the time period $T_{horizon}$ after the current moment); $q[k^{\#}]$ is the position vector at $k^{\#}$, and $q_{ref}[k^{\#}]$ is the reference position at $k^{\#}$; $W_1$ and $W_2$ are weight matrices.
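Because the formula itself is not reproduced here, the following Python sketch shows only one plausible shape for such a cost function. It assumes that $W_1$ weights the position error at the key trajectory points and that $W_2$ weights the contact forces and joint velocities at every trajectory point; that split, and the function name `trajectory_cost`, are assumptions of this example rather than the formula of the disclosure.

```python
import numpy as np


def trajectory_cost(q, q_ref, key_idx, f_c, v_j, W1, W2):
    """Quadratic cost over a discretized trajectory (illustrative form only).

    q, q_ref : arrays of shape (N, n_q), position vectors and references
    key_idx  : indices of the key trajectory points
    f_c, v_j : arrays of shape (N, n_f) and (N, n_v)
    W1, W2   : weight matrices of shape (n_q, n_q) and (n_f + n_v, n_f + n_v)
    """
    cost = 0.0
    # Tracking term: position error at the key trajectory points only.
    for k in key_idx:
        err = q[k] - q_ref[k]
        cost += err @ W1 @ err
    # Regularization term: contact forces and joint velocities at every point.
    for k in range(len(q)):
        u = np.concatenate([f_c[k], v_j[k]])
        cost += u @ W2 @ u
    return float(cost)
```

Penalizing the position error only at key trajectory points leaves the optimizer free to shape the rest of the trajectory, while the regularization term discourages large contact forces and joint angular velocities.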
[0134] For example, the first constraint includes at least one of the following:
[0135] The first condition is that the state variables of each trajectory point conform to the dynamic model (i.e., the third dynamic model) obtained from the center-of-mass dynamics and the full kinematics.
[0136] The second condition is that the state variables of adjacent trajectory points satisfy integral continuity, that is:
[0137]
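The integral continuity condition can be illustrated with a simple defect function. The Python sketch below is an assumption-laden example: it uses explicit Euler integration and only the linear part of the center-of-mass dynamics, whereas the disclosure does not state which integration scheme or which terms are used. The name `continuity_defect` is hypothetical.

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])


def continuity_defect(h_k, r_k, h_next, r_next, contact_forces, mass, dt):
    """Mismatch between knot k+1 and the state integrated forward from knot k.

    h_k, h_next    : linear momentum at knots k and k+1, shape (3,)
    r_k, r_next    : center-of-mass position at knots k and k+1, shape (3,)
    contact_forces : array of shape (n_c, 3), one force per contact point
    """
    # Rate of change of linear momentum: gravity plus the sum of contact forces.
    h_pred = h_k + dt * (mass * GRAVITY + np.sum(contact_forces, axis=0))
    # Center-of-mass velocity is linear momentum divided by mass.
    r_pred = r_k + dt * h_k / mass
    # The optimizer drives this vector to zero for every pair of adjacent knots.
    return np.concatenate([h_next - h_pred, r_next - r_pred])
```

Requiring the defect to be zero between every pair of adjacent trajectory points ties the independently optimized knots into a single physically consistent trajectory.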
[0138] The third condition is that the state variables of each trajectory point satisfy the relevant limit conditions, namely at least one of the following:
[0139]
[0140]
[0141] In the above formulas, $q_{\min}$ represents the lower limit of the joint position and $q_{\max}$ represents the upper limit of the joint position; the remaining bound represents the upper limit of the joint velocity.
[0142] The fourth condition is that a contact point in contact with the ground has no relative motion with respect to the ground, the supporting force between the contact point and the ground is not less than 0, and the frictional force between the contact point and the ground is not greater than the maximum static friction, that is:
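[0143] $v_c[k] = 0, \quad \text{if } c_i[k] = 1$
[0144] $f_{c,z}[k] \ge 0, \quad \text{if } c_i[k] = 1$
[0145] $|f_{c,x}[k]| \le \mu f_{c,z}[k], \quad |f_{c,y}[k]| \le \mu f_{c,z}[k], \quad \text{if } c_i[k] = 1$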
[0146] In the above formulas, $v_c[k]$ is the velocity vector of the contact point at the k-th trajectory point; $c_i[k]$ represents the contact state of the i-th contact point at the k-th trajectory point, where 1 indicates contact and 0 indicates no contact; $f_{c,z}[k]$ represents the supporting force between the i-th contact point and the ground at the k-th trajectory point; $f_{c,x}[k]$ and $f_{c,y}[k]$ represent the frictional force between the i-th contact point and the ground along the X-axis and Y-axis at the k-th trajectory point; and μ is the maximum static friction coefficient.
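The fourth condition can be checked numerically for a single trajectory point as in the following Python sketch. The function name `contact_constraints_ok`, the array layout, and the numerical tolerance are assumptions made for this example; in the optimization itself these conditions act as constraints rather than as an after-the-fact check.

```python
import numpy as np


def contact_constraints_ok(contact, v_c, f_c, mu, tol=1e-9):
    """Check the stance-contact conditions at one trajectory point.

    contact : shape (n_c,), 1 = contact, 0 = no contact
    v_c     : shape (n_c, 3), contact point velocities
    f_c     : shape (n_c, 3), contact forces (x, y friction; z support)
    mu      : maximum static friction coefficient
    """
    for c_i, v, f in zip(contact, v_c, f_c):
        if c_i != 1:
            continue  # these conditions apply only to points in contact
        no_relative_motion = bool(np.all(np.abs(v) <= tol))
        support_non_negative = f[2] >= 0.0
        within_friction = (
            abs(f[0]) <= mu * f[2] + tol and abs(f[1]) <= mu * f[2] + tol
        )
        if not (no_relative_motion and support_non_negative and within_friction):
            return False
    return True
```

Bounding the X and Y friction components separately, as written in the text, describes a friction pyramid rather than a circular friction cone, which keeps the constraint linear in the contact force.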
[0147] The fifth condition is that, for a contact point that is not in contact with the ground, the supporting force between the contact point and the ground is 0 and its velocity meets the velocity requirements of the preset reference trajectory, that is:
[0148] $f_{c,z}[k] = 0, \quad \text{if } c_i[k] = 0$
[0149]
[0150] In the above formula, $v_c[k]$ represents the velocity of the i-th contact point at the k-th trajectory point; the formula also uses the reference velocity and the reference position of the i-th contact point at the k-th trajectory point; and $r_c(t)$ represents the actual position of the i-th contact point.
[0151] The above example constructs the constraints and / or cost function for state variable optimization. Furthermore, when optimizing the state variables using the constraints and / or cost function, the first constraint can be linearized (e.g., a nonlinear term in the first constraint above can be replaced by its linearized approximation), and a second-order approximation can be performed on the cost function. Then, under the constraint of the linearization result of the first constraint, the second-order approximation result of the cost function is used to optimize the state variables of the trajectory points among the multiple trajectory points, so as to obtain the state variable optimization result of the trajectory points.
[0152] Linearizing the constraints and / or approximating the cost function can simplify the optimization process of the state variables, reducing it to a sequential quadratic programming (SQP) problem consisting of quadratic programming (QP) problems at each trajectory point, thereby improving the efficiency of online real-time trajectory planning.
[0153] Optionally, slack variables are added to the first constraint to linearize it. For example, the values of the slack variables are very large in the initial QP problem; after solving the first QP problem, the values of the slack variables are gradually reduced, the constraints are linearized again, and a new QP problem is constructed and solved; this process is repeated until the convergence accuracy is met, and the state variable optimization result is obtained.
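The iterate, relinearize, and tighten loop described above can be sketched on a toy problem. The Python example below is one possible reading, not the formulation of the disclosure: the slack is taken to be the residual of the linearized constraint and is penalized with a weight `rho` that grows each iteration, so the slack may be large at first and is gradually forced toward zero. The toy cost, the toy constraint $x_1 = 0.1 x_0^2$, the function name `solve_relaxed_sqp`, and the penalty schedule are all assumptions.

```python
import numpy as np


def solve_relaxed_sqp(x_init, x_ref, n_iter=25):
    """Minimize ||x - x_ref||^2 subject to x[1] = 0.1 * x[0]**2.

    Each iteration solves a small QP in which the linearized constraint
    g + J dx = s is relaxed by a slack s with penalty 0.5 * rho * s**2.
    """
    x = np.asarray(x_init, dtype=float)
    x_ref = np.asarray(x_ref, dtype=float)
    rho = 1.0  # small penalty: the slack is allowed to be large at first
    for _ in range(n_iter):
        g = x[1] - 0.1 * x[0] ** 2           # constraint value at current x
        J = np.array([[-0.2 * x[0], 1.0]])   # constraint Jacobian (linearization)
        grad = 2.0 * (x - x_ref)             # gradient of the cost
        H = 2.0 * np.eye(2)                  # Hessian of the quadratic cost
        # Eliminating s = g + J dx gives a linear system for the QP step dx.
        lhs = H + rho * (J.T @ J)
        rhs = -(grad + rho * J.ravel() * g)
        x = x + np.linalg.solve(lhs, rhs)
        rho = min(rho * 10.0, 1e6)           # tighten: shrink the allowed slack
    return x
```

Starting with a loose relaxation keeps the first QP problems feasible even when the initial guess violates the nonlinear constraint, and the repeated relinearization moves the iterate onto the constraint as the relaxation is tightened.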
[0154] Steps S201 and S202 above complete the trajectory planning for the highly dynamic motion. The resulting state variable optimization results for the trajectory points on the trajectory of the target motion (within the future time period $T_{horizon}$) represent the states and control inputs of the trajectory points. Combining the details of the above two steps, it can be seen that the MPC module can construct the trajectory planning of highly dynamic motion as an optimal control problem (OCP). It uses direct multiple-shooting to discretize the original problem, adds cost functions and constraints at each discretized trajectory point, performs a second-order approximation on the optimization objective (e.g., the cost function), and performs a linear approximation on the constraints. Finally, the original problem is transformed into a series of QP problems (i.e., SQP) that progressively approximate the original nonlinear problem. Solving these QP problems sequentially yields the solution to the original problem, which is the trajectory of the highly dynamic motion.
[0155] It should be understood that after the MPC module completes step S102 to obtain the state variable optimization results (i.e., the motion trajectory planned online) of each trajectory point (after trajectory discretization), it can send the state variable optimization results of each trajectory point (after trajectory discretization) to the WBC module so that the WBC module controls the robot to move according to step S203.
[0156] In step S203, the robot is controlled to perform the target motion based on the optimization results of the state variables of each of the plurality of trajectory points.
[0157] For example, step S203 can be performed in each control frame as follows: First, at least one slack variable is tracked under the second constraint to obtain the optimization result of the optimization variable; then, based on the optimization result of the optimization variable, the robot is controlled to perform the target motion.
[0158] The optimization variables include an acceleration vector, a contact point vector, and a relaxation vector. The acceleration vector includes the acceleration of each joint, the contact point vector includes the contact state between each contact point and the ground, and the relaxation vector includes at least one relaxation variable.
[0159] Wherein, at least one of the second constraints is related to the at least one slack variable, at least one of the second constraints is related to the optimization variable, and at least one of the second constraints is related to the optimization result of the state variable. For example, the second constraint includes at least one of the following:
[0160] The first item: the consistency of the dynamics of the floating base, that is:
[0161]
[0162] The parameters in the above equation were explained in detail when introducing the second dynamic model, and will not be repeated here.
[0163] The second condition: The point of contact with the ground has no relative motion with the ground, that is:
[0164]
[0165] In the above formula, the newly introduced symbol is the acceleration vector. The other parameters have been described in detail in the previous text and will not be repeated here.
[0166] The third condition: The frictional force between the point of contact with the ground and the ground does not exceed the maximum static friction force, that is:
[0167] $\lVert f_{c,x} \rVert \le \mu f_{c,z}, \quad \lVert f_{c,y} \rVert \le \mu f_{c,z}, \quad \text{if } c_i = 1$
[0168] The parameters in the above formula have been explained in detail above, and will not be repeated here.
[0169] The fourth item: The contact point that is not in contact with the ground tracks the position and velocity in the optimization result of the state variables under the relaxation effect of the first relaxation variable, that is:
[0170]
[0171] In the above formula, the reference position is the contact point position in the state variable optimization result, $r_c$ is the actual position of the contact point, the reference velocity is the contact point velocity in the state variable optimization result, and the remaining velocity term is the actual velocity of the contact point; $k_p$, $k_d$, and $w_1$ are the proportional coefficient, the derivative coefficient, and the first relaxation variable, respectively. Other parameters have been introduced in the previous text and will not be repeated here.
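Since the tracking formula is not reproduced here, the following Python sketch shows a commonly used proportional-derivative form that is consistent with the symbols described above. It is an assumption, not the formula of the disclosure: the task-space acceleration of the contact point is compared with a PD target built from the planned position and velocity, and the mismatch is absorbed by the relaxation variable. The function name `swing_task_slack` and its argument names are hypothetical.

```python
import numpy as np


def swing_task_slack(J_c, dJ_c_qd, qdd, r_ref, r_act, v_ref, v_act, k_p, k_d):
    """Relaxation needed for a non-contact point to follow the planned motion.

    J_c     : contact point Jacobian, shape (3, n)
    dJ_c_qd : Jacobian time derivative times generalized velocity, shape (3,)
    qdd     : generalized acceleration, shape (n,)
    """
    # PD target acceleration from the planned (reference) position and velocity.
    a_des = k_p * (r_ref - r_act) + k_d * (v_ref - v_act)
    # Acceleration of the contact point produced by the generalized acceleration.
    a_act = J_c @ qdd + dJ_c_qd
    # The relaxation variable absorbs whatever part of the target is not met.
    return a_act - a_des
```

When the returned vector is zero the tracking task is met exactly; a nonzero value is what the QP problem would penalize through its weighting of the relaxation variables.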
[0172] Fifthly: The robot's floating base, under the relaxation effect of the second relaxation variable, tracks the floating base pose in the optimization result of the state variables, that is:
[0173]
[0174] In the above formula, $J_b$ is the floating base Jacobian matrix, the reference position is the floating base position in the state variable optimization result, $q_b$ is the actual position of the floating base, the reference velocity is the floating base velocity in the state variable optimization result, the remaining velocity term is the actual velocity of the floating base, and $w_2$ is the second relaxation variable.
[0175] Item 6: Under the relaxation effect of the third relaxation variable, the robot's arm joints track the joint position and velocity in the state variable optimization result, that is:
[0176]
[0177] In the above formula, $S_{arm}$ is the selection matrix for the arm joints, the reference position is the arm joint position in the state variable optimization result, $q_{arm}$ is the actual position of the arm joints, the reference velocity is the arm joint velocity in the state variable optimization result, the remaining velocity term is the actual velocity of the arm joints, and $w_3$ is the third relaxation variable.
[0178] Item 7: The robot's contact point tracks the contact point vector in the state variable optimization result under the relaxation effect of the fourth relaxation variable, that is:
[0179]
[0180] In the above formula, the reference force is the external force acting on the contact point in the state variable optimization result, $f_c$ represents the actual external force acting on the contact point, and $w_4$ is the fourth relaxation variable.
[0181] For example, construct a QP problem represented by the following formula to optimize the optimization variables:
[0182]
[0183] In the above formula, Q is a positive definite matrix, and the optimization variables of QP are generalized acceleration, external force, and relaxation variables. The goal is to minimize the slack variable w of some equality constraints based on the weighting coefficients.
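The idea of minimizing weighted relaxation variables can be illustrated with a small equality-constrained QP problem. The Python sketch below is a simplified assumption: it handles only equality constraints and solves the KKT system directly with NumPy, whereas the QP problem described above also involves inequality constraints (such as the friction limits) that would normally be passed to a dedicated QP solver. The function name `solve_slack_qp` and the small regularization `eps` are hypothetical.

```python
import numpy as np


def solve_slack_qp(A_hard, b_hard, A_soft, b_soft, Q, eps=1e-6):
    """Minimize 0.5 * w^T Q w (plus a tiny regularization on a) subject to

        A_hard @ a     = b_hard   (must hold exactly)
        A_soft @ a - w = b_soft   (tracking tasks relaxed by slack w)
    """
    n = A_hard.shape[1]
    m = A_soft.shape[0]
    # Quadratic cost over the stacked variable z = [a, w].
    P = np.zeros((n + m, n + m))
    P[:n, :n] = eps * np.eye(n)
    P[n:, n:] = Q
    # Stacked equality constraints E @ z = d.
    E = np.block([
        [A_hard, np.zeros((A_hard.shape[0], m))],
        [A_soft, -np.eye(m)],
    ])
    d = np.concatenate([b_hard, b_soft])
    n_eq = E.shape[0]
    # KKT system of the equality-constrained QP problem.
    kkt = np.block([[P, E.T], [E, np.zeros((n_eq, n_eq))]])
    rhs = np.concatenate([np.zeros(n + m), d])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n:n + m]
```

A larger entry in `Q` makes the corresponding task harder to violate, so when tasks conflict with the hard constraints the slack concentrates on the tasks with smaller weights.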
[0184] This step yields... Then, the torque command at the joint end can be obtained through inverse dynamics calculation using the following formula:
[0185]
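As a sketch of the inverse dynamics step, the following Python function uses the standard floating-base rigid-body relation in which the generalized force equals the mass matrix times the generalized acceleration plus the bias forces minus the contact forces mapped through the contact Jacobian. Whether this matches the formula of the disclosure term by term is an assumption, and the function name `joint_torques` and its arguments are hypothetical.

```python
import numpy as np


def joint_torques(M, h, J_c, qdd, f_c, n_base=6):
    """Joint torque command from generalized acceleration and contact forces.

    M      : mass matrix, shape (n, n)
    h      : bias forces (Coriolis, centrifugal, gravity), shape (n,)
    J_c    : stacked contact Jacobian, shape (n_f, n)
    qdd    : generalized acceleration from the QP solution, shape (n,)
    f_c    : stacked contact forces from the QP solution, shape (n_f,)
    n_base : number of unactuated floating base coordinates
    """
    generalized_force = M @ qdd + h - J_c.T @ f_c
    # The floating base is unactuated, so only the joint rows become commands.
    return generalized_force[n_base:]
```

The floating base rows of the same expression are not commanded; in a consistent solution they are balanced by the contact forces, which is what the floating base dynamics consistency condition enforces.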
[0186] The WBC module sends the torque command to the joint-end driver, thereby driving the joint to perform the movement and enabling the robot to complete the target movement.
[0187] The motion control method provided in this disclosure discretizes the trajectory of a target motion into multiple trajectory points based on the action sequence of the target motion, optimizes the state variables of the trajectory points to obtain the optimized state variable results, and then controls the robot to execute the target motion based on the optimized state variable results of each trajectory point. Since the action sequence includes multiple action stages, and there is at least one contact point with a different contact state between different action stages, and the optimization direction of the state variables includes making the trajectory of the target motion conform to the action sequence, this method can plan the trajectory of the target motion in real time and control the robot to execute the target motion based on the planning results. Even if the target motion is a highly dynamic motion involving an airborne state, this method can still plan the trajectory of the highly dynamic motion in real time and control the robot to stably complete the highly dynamic motion based on the planning results, mitigating the impact of collisions at the moment of landing on the robot's motion stability.
[0188] This disclosure proposes a high-dynamic real-time motion planning and control method for humanoid / legged robots, based on a combination of model predictive control (MPC) and whole-body control (WBC). Specifically, the real-time planning controller proposed in this patent consists of an online MPC module and a WBC module. The two modules run at their respective frequencies on different threads, with the control frequency of the WBC module generally not lower than the running frequency of the MPC module. In the construction of the optimization problem in the MPC module, the system dynamics adopt a kino-dynamics model (the third dynamic model mentioned above) that combines the robot's center-of-mass dynamics and full kinematics. The complex original nonlinear optimization problem is transformed into a sequential quadratic programming (SQP) problem using a multiple-shooting method and order-reduction approximation, thereby reducing computation time and enabling online solution. The optimal solution obtained is sent to the WBC module as the desired trajectory. In the construction of the optimization problem in the WBC module, the system dynamics adopt full rigid-body dynamics with contact (i.e., the first dynamic model mentioned above). The WBC module tracks the desired trajectory provided by the MPC module while satisfying various physical constraints, and constructs the control problem as a quadratic programming (QP) problem. The solution of the QP problem serves as the control command at the joint end, and the joint end executes the control command to complete the highly dynamic action.
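The relationship between the two module rates can be illustrated with a deterministic, single-threaded simulation. This Python sketch is a simplification: the disclosure runs the two modules on different threads, and the rates of 1000 Hz and 100 Hz, as well as the function name `run_dual_rate`, are assumptions chosen only for the example.

```python
def run_dual_rate(duration_s, wbc_hz=1000, mpc_hz=100):
    """Simulate a fast WBC loop that consumes the latest plan from a slower MPC.

    Returns a list of (wbc_tick, plan_id) pairs, where plan_id identifies
    the MPC solution that the WBC tick was tracking.
    """
    # The WBC rate is not lower than the MPC rate; an integer ratio keeps
    # this simulation simple.
    assert wbc_hz >= mpc_hz and wbc_hz % mpc_hz == 0
    ticks_per_plan = wbc_hz // mpc_hz
    plan_id = -1
    log = []
    for tick in range(int(round(duration_s * wbc_hz))):
        if tick % ticks_per_plan == 0:
            plan_id += 1  # stand-in for one MPC solve publishing a new trajectory
        log.append((tick, plan_id))  # stand-in for one WBC solve tracking that plan
    return log
```

Between two MPC updates the WBC loop keeps tracking the most recent desired trajectory, which is why its control frequency can be higher than the planning frequency.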
[0189] According to a second aspect of the embodiments of this disclosure, a motion control device is provided. Referring to Figure 6, the device includes:
[0190] Discretization module 601 is used to discretize the trajectory of the target motion into multiple trajectory points according to the action sequence of the target motion, wherein the action sequence includes multiple action stages, and there is at least one contact point with a different contact state between different action stages, and the trajectory points have state variables.
[0191] The optimization module 602 is used to optimize the state variables of the trajectory points among the plurality of trajectory points to obtain the optimization result of the state variables of the trajectory points, wherein the optimization direction of the state variables includes: making the trajectory of the target motion conform to the action sequence;
[0192] The control module 603 is used to control the robot to perform the target motion based on the optimization results of the state variables of the trajectory points.
[0193] In one embodiment of this disclosure, the optimization module is used to:
[0194] Using a pre-constructed cost function, the state variables of the trajectory points among the plurality of trajectory points are optimized to obtain the optimized state variables of the trajectory points, wherein the cost function is related to at least one of the following:
[0195] The error between the position vector of at least one key trajectory point among the plurality of trajectory points and the preset reference position;
[0196] The contact force of at least one trajectory point among the plurality of trajectory points;
[0197] The joint angular velocity of at least one of the plurality of trajectory points.
[0198] In one embodiment of this disclosure, the optimization module is used to:
[0199] The state variables of the trajectory points among the plurality of trajectory points are optimized under the constraint of the first constraint condition to obtain the state variable optimization result of the trajectory points, wherein the first constraint condition is related to the action sequence.
[0200] In one embodiment of this disclosure, the first constraint includes at least one of the following:
[0201] The state variables of each trajectory point conform to the dynamic model obtained from the center of mass dynamics and the full kinematics;
[0202] The state variables of adjacent trajectory points satisfy integral continuity;
[0203] The state variables of each trajectory point satisfy the relevant limit conditions;
[0204] The contact point with the ground has no relative motion with the ground, the supporting force between the contact point and the ground is not less than 0, and the frictional force between the contact point and the ground is not greater than the maximum static friction;
[0205] The support force between the contact point that is not in contact with the ground and the ground is 0, and the speed meets the speed requirements of the preset reference trajectory.
[0206] In one embodiment of this disclosure, the optimization module is used to optimize the state variables of the trajectory points among the plurality of trajectory points using the cost function under the constraint of the first constraint condition, and when obtaining the optimization result of the state variables of the trajectory points, it is used to:
[0207] Under the constraint of the linearization result of the first constraint, the state variables of the trajectory points among the multiple trajectory points are optimized using the quadratic approximation result of the cost function to obtain the optimized state variable result of the trajectory points.
[0208] In one embodiment of this disclosure, the apparatus further includes a linearization module for:
[0209] Slack variables are added to the first constraint to linearize it.
[0210] In one embodiment of this disclosure, the state variables include at least one of the following: momentum vector, position vector, contact point vector, and joint velocity vector, wherein the momentum vector includes linear momentum and angular momentum, the position vector includes a floating base position and a joint position, the floating base position includes a coordinate position and an attitude angle in the form of a unit quaternion, the contact point vector includes the external force acting on the contact point, and the joint velocity vector includes the velocity of the joint.
[0211] In one embodiment of this disclosure, the control module is used to:
[0212] Under the second constraint, at least one slack variable is tracked to obtain the optimization result of the optimization variable, wherein at least one condition of the second constraint is related to the at least one slack variable, at least one condition of the second constraint is related to the optimization variable, and at least one condition of the second constraint is related to the optimization result of the state variable;
[0213] Based on the optimization results of the optimization variables, the robot is controlled to perform the target motion.
[0214] In one embodiment of this disclosure, the optimization variables include an acceleration vector, a contact point vector, and a relaxation vector, wherein the acceleration vector includes the acceleration of the joint, the contact point vector includes the external force acting on the contact point, and the relaxation vector includes the at least one relaxation variable.
[0215] In one embodiment of this disclosure, the second constraint includes at least one of the following:
[0216] Consistency of floating base dynamics;
[0217] The point of contact with the ground has no relative movement to the ground;
[0218] The frictional force between the contact point and the ground is no greater than the maximum static friction;
[0219] The contact point that is not in contact with the ground tracks the position and velocity in the optimization result of the state variables under the relaxation effect of the first relaxation variable;
[0220] The robot's floating base tracks the floating base pose in the optimization result of the state variables under the relaxation effect of the second relaxation variable;
[0221] The robot's arm joints track the joint position and velocity in the state variable optimization result under the relaxation effect of the third relaxation variable;
[0222] The robot's contact point tracks the contact point vector in the state variable optimization result under the relaxation effect of the fourth relaxation variable.
[0223] In one embodiment of this disclosure, a robot is provided. Referring to Figure 7, which illustrates the structure of the robot, the robot includes a memory and a processor. The memory stores computer instructions that can be executed on the processor, and the processor executes the computer instructions based on the motion control method of any of the foregoing embodiments.
[0224] In one embodiment of this disclosure, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, implements the motion control method of any of the foregoing embodiments.
[0225] Other embodiments of this disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the disclosure herein. This disclosure is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this disclosure are indicated by the following claims.
[0226] It should be understood that this disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this disclosure is limited only by the appended claims.
Claims
1. A motion control method, characterized in that, The method includes: Based on the action sequence of the target motion, the trajectory of the target motion is discretized into multiple trajectory points. The action sequence includes multiple action stages, and the contact state between at least one contact point of the robot and the ground is different between different action stages. The trajectory points have state variables, and the state variables include at least one of the following: momentum vector, position vector, contact point vector, and joint velocity vector. The state variables of the trajectory points among the plurality of trajectory points are optimized to obtain the state variable optimization results of the trajectory points, wherein the optimization direction of the state variables includes: making the trajectory of the target motion conform to the action sequence; Based on the optimization results of the state variables of the trajectory points, the robot is controlled to execute the target motion.
2. The motion control method according to claim 1, characterized in that, The optimization of the state variables of the trajectory points among the plurality of trajectory points to obtain the optimization result of the state variables of the trajectory points includes: Using a pre-constructed cost function, the state variables of the trajectory points among the plurality of trajectory points are optimized to obtain the optimized state variables of the trajectory points, wherein the cost function is related to at least one of the following: The error between the position vector of at least one key trajectory point among the plurality of trajectory points and the preset reference position; The contact force of at least one trajectory point among the plurality of trajectory points; The joint angular velocity of at least one trajectory point among the plurality of trajectory points.
3. The motion control method according to claim 1 or 2, characterized in that, The optimization of the state variables of the trajectory points among the plurality of trajectory points to obtain the optimization result of the state variables of the trajectory points includes: The state variables of the trajectory points among the plurality of trajectory points are optimized under the constraint of the first constraint condition to obtain the state variable optimization result of the trajectory points, wherein the first constraint condition is related to the action sequence; The first constraint includes at least one of the following: The state variables of each trajectory point conform to the dynamic model obtained from the center of mass dynamics and the whole kinematics; The state variables of adjacent trajectory points satisfy integral continuity; The state variables of each trajectory point satisfy the relevant limit conditions; The contact point with the ground has no relative motion with the ground, the supporting force between the contact point and the ground is not less than 0, and the frictional force between the contact point and the ground is not greater than the maximum static sliding friction. The support force between the contact point that is not in contact with the ground and the ground is 0, and the speed meets the speed requirements of the preset reference trajectory.
4. The motion control method according to claim 3, characterized in that, Under the constraint of the first constraint, the state variables of the trajectory points among the plurality of trajectory points are optimized using a cost function to obtain the optimization results of the state variables of the trajectory points, including: Under the constraint of the linearization result of the first constraint, the state variables of the trajectory points among the multiple trajectory points are optimized using the quadratic approximation result of the cost function to obtain the optimized state variable result of the trajectory points.
5. The motion control method according to claim 4, characterized in that, The method further includes: Slack variables are added to the first constraint to linearize it.
6. The motion control method according to claim 1, characterized in that, The momentum vector includes linear momentum and angular momentum; the position vector includes the floating base position and the joint position; the floating base position includes the coordinate position and the attitude angle in the form of a unit quaternion; the contact point vector includes the external force on the contact point; and the joint velocity vector includes the velocity of the joint.
7. The motion control method according to claim 1, characterized in that, The step of controlling the robot to execute the target motion based on the optimization result of the state variables of the trajectory points includes: Under the second constraint, at least one slack variable is tracked to obtain the optimization result of the optimization variable, wherein at least one condition of the second constraint is related to the at least one slack variable, at least one condition of the second constraint is related to the optimization variable, and at least one condition of the second constraint is related to the optimization result of the state variable; Based on the optimization results of the optimization variables, the robot is controlled to perform the target motion; The optimization variables include an acceleration vector, a contact point vector, and a relaxation vector, wherein the acceleration vector includes the acceleration of the joint, the contact point vector includes the external force acting on the contact point, and the relaxation vector includes at least one relaxation variable. The second constraint includes at least one of the following: Consistency of floating base dynamics; The point of contact with the ground has no relative movement to the ground; The frictional force between the contact point and the ground is no greater than the maximum static sliding friction. 
The contact point that is not in contact with the ground tracks the position and velocity in the optimization result of the state variables under the relaxation effect of the first relaxation variable; The robot's floating base tracks the floating base pose in the optimization result of the state variables under the relaxation effect of the second relaxation variable; The robot's arm joints track the joint position and velocity in the state variable optimization result under the relaxation effect of the third relaxation variable; The robot's contact point tracks the contact point vector in the state variable optimization result under the relaxation effect of the fourth relaxation variable.
8. A motion control device, characterized in that, The device includes: The discrete module is used to discretize the trajectory of the target motion into multiple trajectory points according to the action sequence of the target motion. The action sequence includes multiple action stages, and the contact state of at least one contact point of the robot is different between different action stages. The trajectory points have state variables, and the state variables include at least one of the following: momentum vector, position vector, contact point vector, and joint velocity vector. An optimization module is used to optimize the state variables of the trajectory points among the plurality of trajectory points to obtain the optimization results of the state variables of the trajectory points, wherein the optimization direction of the state variables includes: making the trajectory of the target motion conform to the action sequence; The control module is used to control the robot to perform the target motion based on the optimization results of the state variables of the trajectory points.
9. A robot, characterized in that, The robot includes a memory and a processor, the memory being used to store computer instructions that can be executed on the processor, and the processor being used to implement the method of any one of claims 1 to 7 when executing the computer instructions.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that, When the program is executed by the processor, it implements the method of any one of claims 1 to 7.