Method, system, and storage medium for dexterous hand motion control based on multi-modal tasks

By analyzing the image information of the target object and combining it with tactile sensors, the dexterous hand adopts a multimodal motion control method, which solves the problem of poor adaptability and single mode in the existing technology, and achieves efficient adaptation and stable operation to multiple types of task targets.

CN122033993B · Active · Publication Date: 2026-06-26 · WUTONG SENSATION CONTROL (BEIJING) TECH CO LTD +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
WUTONG SENSATION CONTROL (BEIJING) TECH CO LTD
Filing Date
2026-04-14
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing dexterous hand control methods rely on preset programs or vision-based single motion patterns, which cannot effectively adapt to diverse and complex tasks, resulting in a high failure rate and limiting their application scope and the overall performance of robots.

Method used

By analyzing the image information of the target object, identifying its shape type, the location of its maximum and minimum rotation radius, and combining visual perception with multimodal motion control, the system operates in three modes: holding, grasping, and gripping. It uses palm and fingertip tactile sensors to adjust the motion in real time to adapt to different task requirements.

Benefits of technology

It improves the adaptability and success rate of dexterous hands to various task objectives, ensures stability and reliability in complex environments, and expands the application boundaries of dexterous hands.

✦ Generated by Eureka AI based on patent content.


Abstract

The application provides a dexterous hand motion control method, system, and storage medium based on multimodal tasks, and belongs to the technical field of robotics. The method comprises the following steps: acquiring image information of a target object in an operation task; identifying shape and size features of the target object by analyzing the image information; when the shape type of the target object is irregular or the maximum rotation radius of the target object is greater than the maximum gripping radius of the dexterous hand, performing the operation task in a holding mode; when the shape type of the target object is regular, the maximum rotation radius of the target object is less than or equal to the maximum gripping radius of the dexterous hand, and the position of the minimum rotation radius is at the top of the target object, performing the operation task in a grasping mode; and when the shape type of the target object is regular, the maximum rotation radius of the target object is less than or equal to the maximum gripping radius of the dexterous hand, and the position of the minimum rotation radius is in the middle of the target object, performing the operation task in a gripping mode. The application improves the adaptability of the dexterous hand to multiple types of task targets.

Description

Technical Field

[0001] This application relates to the field of robotics technology, and in particular to a method, system, and storage medium for controlling the movement of a dexterous hand based on multimodal tasks.

Background Technology

[0002] With the rapid development of artificial intelligence and robotics, embodied intelligent robots are being used more and more widely in industries, services, and healthcare. Especially in high-precision operations such as minimally invasive surgery, surgical robots rely on their end effectors—dexterous hands—to perform delicate operations on tissues and instruments. Their control performance directly determines the safety, efficiency, and success rate of the surgery.

[0003] However, existing dexterous hand control methods mostly rely on preset programs or vision-based single action patterns, such as determining the grasping point through image recognition and then executing a grasping action along a fixed trajectory. Although some systems have introduced haptic feedback, it is usually only used for simple force control or anti-slip detection, failing to achieve dynamic recognition and adaptive adjustment for different task modes. This results in insufficient adaptability when facing diverse and complex tasks, a high failure rate, and limits the application scope of dexterous hands and the overall performance of robots.

[0004] Therefore, there is an urgent need to propose a dexterous hand control method that can adapt to multiple types of task objectives.

Summary of the Invention

[0005] The purpose of this application is to provide a method, system, and storage medium for dexterous hand motion control based on multimodal tasks, in order to solve the above-mentioned problems.

[0006] To achieve the above objectives, in a first aspect, this application proposes a dexterous hand motion control method based on multimodal tasks, the method comprising:

[0007] Receive the operation task and acquire the image information of the target object in the operation task;

[0008] The shape and size features of the target object are identified by analyzing the image information. The shape and size features include the shape type, the location of the maximum rotation radius and the minimum rotation radius.

[0009] When the target object has an irregular shape or a maximum rotation radius greater than the maximum gripping radius of the dexterous hand, the dexterous hand is controlled to perform the operation task in a holding mode.

[0010] When the target object has a regular shape, its maximum rotation radius is less than or equal to the maximum gripping radius of the dexterous hand, and the minimum rotation radius is located at the top of the target object, the dexterous hand is controlled to perform the operation task in a grasping mode.

[0011] When the target object has a regular shape, its maximum rotation radius is less than or equal to the maximum gripping radius of the dexterous hand, and the minimum rotation radius is located in the middle of the target object, the dexterous hand is controlled to perform the operation task in a gripping mode.

[0012] In some embodiments, the palm of the dexterous hand is equipped with a palm tactile sensor, and controlling the dexterous hand to perform the operation task in a holding mode includes:

[0013] The target object is supported on the palm of the dexterous hand, and the dexterous hand is controlled to move the target object to the target position in the operation task.

[0014] During movement, based on the real-time pressure distribution data collected by the palm tactile sensor, it is determined whether the target object is in an unbalanced state;

[0015] If the target object is in an unbalanced state, the target object is restored to a balanced state by adjusting the movement speed and / or palm posture of the dexterous hand.

[0016] In some embodiments, if the target object is in an unbalanced state, restoring the target object to a balanced state by adjusting the movement speed and / or hand posture of the dexterous hand includes:

[0017] If the target object is out of balance, control the dexterous hand to reduce its movement speed;

[0018] Based on the real-time pressure distribution data collected by the palm tactile sensor, it is determined whether the target object has returned to a balanced state;

[0019] If not, the tilt direction of the target object is determined based on real-time pressure distribution data;

[0020] Control the dexterous hand to tilt in the opposite direction of the tilting direction, so that the target object is restored to a balanced state.

[0021] In some embodiments, the dexterous hand includes multiple fingers, each fingertip having a fingertip tactile sensor, and controlling the dexterous hand to perform the operation task in a grasping mode includes:

[0022] Control the dexterous hand to move to the preset grasping position with the multiple fingers spread out;

[0023] Control each finger of the dexterous hand to perform a tightening action. When the fingertip tactile sensor detects that any finger is in contact with the target object, the tightening action of the contacting finger is stopped until all fingers are in contact with the target object.

[0024] When it is determined that all of the fingers are in contact with the target object, the dexterous hand is controlled to grasp the target object and move it to the target position in the operation task.
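
The finger-closing steps above can be sketched as a simple loop. This is only an illustration under stated assumptions: `in_contact` and `step_finger` are hypothetical callbacks standing in for the fingertip tactile sensor reading and the finger actuator command, and `max_steps` is an added safety limit that the application does not mention.

```python
def close_until_contact(in_contact, step_finger, n_fingers, max_steps=1000):
    """Tighten each finger until its fingertip sensor reports contact.

    in_contact(i)  -> bool : hypothetical fingertip tactile sensor query
    step_finger(i) -> None : hypothetical command, tightens finger i slightly
    Returns True once every finger is in contact, False if max_steps runs out.
    """
    stopped = [False] * n_fingers
    for _ in range(max_steps):
        # A finger that has touched the object stops tightening and stays stopped.
        for i in range(n_fingers):
            if not stopped[i] and in_contact(i):
                stopped[i] = True
        if all(stopped):
            return True
        # Only the fingers that have not yet made contact keep closing.
        for i in range(n_fingers):
            if not stopped[i]:
                step_finger(i)
    return False
```

Each finger stops independently, so fingers that reach the object early do not keep squeezing while the others are still closing.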

[0025] In some implementations, controlling the dexterous hand to grasp the target object and move it towards the target position in the operation task when it is determined that all of the plurality of fingers are in contact with the target object includes:

[0026] When it is determined that all the fingers are in contact with the target object, a grasping torque is applied to the fingers, and it is determined, based on the real-time pressure data collected by the fingertip tactile sensors, whether the resultant normal force vector of the fingers is zero.

[0027] When the resultant normal force vector of the multiple fingers is not zero, the grasping torque of each finger is adjusted according to the real-time pressure data collected by the fingertip tactile sensor so that the resultant normal force vector of the multiple fingers is zero.

[0028] When the resultant normal force vector of the multiple fingers is zero, the dexterous hand is controlled to grasp the target object and move it to the target position in the operation task.
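
The force-balance condition above can be illustrated with a small numerical sketch. It is not the application's control law: it assumes each finger's contact normal is a known unit vector, adjusts the normal force magnitudes directly rather than the grasping torques, and uses a plain gradient step on the squared magnitude of the resultant. The names `gain`, `f_min`, and `tol` are illustrative parameters.

```python
import math


def resultant(normals, forces):
    """Vector sum of the per-finger normal forces (unit normal * magnitude)."""
    return tuple(sum(f * n[k] for n, f in zip(normals, forces)) for k in range(3))


def balance_forces(normals, forces, gain=0.2, f_min=0.5, tol=1e-3, max_iter=500):
    """Adjust force magnitudes until the resultant normal force is near zero.

    Each finger's force changes by -gain * (resultant . normal), which is a
    gradient step on |resultant|^2 / 2. f_min keeps every finger in contact.
    """
    forces = list(forces)
    for _ in range(max_iter):
        r = resultant(normals, forces)
        if math.sqrt(sum(c * c for c in r)) <= tol:
            break
        forces = [
            max(f_min, f - gain * sum(rc * nc for rc, nc in zip(r, n)))
            for n, f in zip(normals, forces)
        ]
    return forces
```

For three fingers spaced 120 degrees apart with initial forces of 2 N, 1 N, and 1 N, the sketch converges toward equal forces on all three fingers, which is the configuration where the normal forces cancel.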

[0029] In some embodiments, the palm of the dexterous hand is equipped with a palm tactile sensor, and the fingertip and finger pad of each finger are equipped with a fingertip tactile sensor and a finger pad tactile sensor, respectively. Controlling the dexterous hand to perform the operation task in a gripping mode includes:

[0030] Control the dexterous hand to move to the preset grip position with all fingers spread out;

[0031] Control each finger of the dexterous hand to perform tightening actions, and monitor the tactile sensor data collected by the palm tactile sensor, fingertip tactile sensor and finger pad tactile sensor in real time;

[0032] When the tactile sensing data is in a stable state, the gripping state is determined based on whether the tip of the thumb of the dexterous hand is in contact with the tips of other fingers. The gripping state includes a semi-enclosed state and a fully enclosed state.

[0033] Based on the grip state, perform the corresponding stability verification action;

[0034] After passing stability verification, the dexterous hand is controlled to grasp the target object and move it to the target position in the operation task.

[0035] In some implementations, the gripping state is a semi-enclosed state, and the step of performing a corresponding stability verification action based on the gripping state includes:

[0036] Control the dexterous hand to grasp the target object and raise it to a preset safe height;

[0037] During the lifting process, the tactile sensor data collected by the palm tactile sensor, fingertip tactile sensor and finger pad tactile sensor are monitored in real time to determine whether the target object has slipped.

[0038] If no slippage occurs, the stability verification is considered successful.

[0039] If slippage occurs, the stability verification is deemed unsuccessful, and re-planning of the action mode is triggered.

[0040] In some implementations, the gripping state is a fully enclosed state, and the step of performing a corresponding stability verification action based on the gripping state includes:

[0041] Obtain the maximum diameter of the target object projected downwards, and the internal gripping diameter of the dexterous hand;

[0042] If the maximum diameter is greater than the internal gripping diameter, the stability verification is deemed successful.

[0043] If the maximum diameter is less than or equal to the internal gripping diameter, the stability verification is deemed to have failed, and re-planning of the action mode is triggered.
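
The two stability checks described above reduce to simple predicates, sketched below. One point is an assumption rather than something stated in the text: the sketch treats "thumb tip touching another fingertip" as the fully enclosed state and the absence of that contact as the semi-enclosed state. All function and parameter names are hypothetical.

```python
def grip_state(thumb_tip_touches_other_tip: bool) -> str:
    # Assumption: thumb-to-fingertip contact means the fingers close a full ring.
    return "fully_enclosed" if thumb_tip_touches_other_tip else "semi_enclosed"


def verify_semi_enclosed(slip_detected_during_lift: bool) -> bool:
    """Semi-enclosed case: lift to the safe height, pass if no slippage was seen."""
    return not slip_detected_during_lift


def verify_fully_enclosed(max_projected_diameter: float,
                          inner_grip_diameter: float) -> bool:
    """Fully enclosed case: pass only if the object's largest downward-projected
    diameter is strictly greater than the hand's internal gripping diameter."""
    return max_projected_diameter > inner_grip_diameter
```

A `False` result from either check corresponds to the failure branch, where the method triggers a new choice of mode instead of moving the object.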

[0044] In a second aspect, to achieve the above objectives, this application also proposes a dexterous hand motion control system based on multimodal tasks, comprising: one or more processors; and a memory for storing one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors execute the dexterous hand motion control method based on multimodal tasks as described above.

[0045] In a third aspect, to achieve the above objectives, this application also proposes a computer storage medium storing executable instructions, which, when executed by a processor, cause the processor to perform the dexterous hand motion control method based on multimodal tasks as described above.

[0046] Compared with the prior art, the beneficial effects of this application include:

[0047] Firstly, this application analyzes the image information of the target object to automatically identify its shape type, the location of its maximum and minimum rotation radii, and other shape and size features, thereby determining the optimal action mode (holding, grasping, gripping) for performing the operation task. This improves the adaptability of the dexterous hand to various types of task targets and overcomes the shortcomings of existing technologies, such as single mode and poor adaptability.

[0048] Secondly, by combining the shape and size characteristics of the target object with the physical properties of the action mode, this application can significantly improve the success rate and reliability of complex tasks. For example, for irregular or excessively large target objects (with a maximum rotation radius greater than the maximum gripping radius of the dexterous hand), selecting the holding mode avoids failure or damage to the object caused by forced gripping; for regular objects, the grasping mode or the gripping mode is further selected based on the position of the minimum rotation radius (top or middle), ensuring that the force is applied to the most stable and easiest-to-operate position of the object.

Description of the Drawings

[0049] To more clearly illustrate the technical solutions of the embodiments of this application, the accompanying drawings used in the embodiments will be briefly described below. It should be understood that the following drawings only show some embodiments of this application and should not be regarded as a limitation on the scope of this application.

[0050] Figure 1 is a flowchart of a dexterous hand motion control method based on multimodal tasks in one embodiment;

[0051] Figure 2 is a schematic diagram of the dexterous hand operating in the holding mode in one embodiment;

[0052] Figure 3 is a schematic diagram of the dexterous hand operating in the grasping mode in one embodiment;

[0053] Figure 4 is a schematic diagram of the semi-enclosed state of the gripping mode in one embodiment;

[0054] Figure 5 is a schematic diagram of tactile sensing of a target object in a balanced state in the holding mode in one embodiment;

[0055] Figure 6 is a schematic diagram of tactile sensing of a target object in an unbalanced state in the holding mode in one embodiment;

[0056] Figure 7 is a schematic diagram of the five-finger distribution in the grasping mode in one embodiment;

[0057] Figure 8 is a force diagram for the case where the resultant normal force vector of the multiple fingers in the grasping mode is not zero in one embodiment;

[0058] Figure 9 is a force diagram for the case where the resultant normal force vector of the multiple fingers in the grasping mode is zero in one embodiment;

[0059] Figure 10 is a flowchart of a dexterous hand motion control method based on multimodal tasks in another embodiment;

[0060] Figure 11 is a schematic diagram of the fully enclosed state of the gripping mode in one embodiment;

[0061] Figure 12 is a projection diagram of the gripping mode in one embodiment.

Detailed Implementation

[0062] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0063] All terms used in this application (including technical and scientific terms) have the meanings commonly understood by those skilled in the art, unless otherwise defined. It should be noted that the terms used herein should be interpreted in a manner consistent with the context of this specification, and not in an idealized or overly rigid way.

[0064] For example, the terms "first," "second," etc., used in this application may be used herein to describe various elements, but these elements are not limited by these terms. These terms are only used to distinguish the first element from another element. For instance, without departing from the scope of this application, the first element may be referred to as the second element, and similarly, the second element may be referred to as the first element. Both the first element and the second element are elements, but they are not the same element.

[0065] For example, the terms "comprising" or "including" used in this application indicate the presence of features, steps, operations and / or components, but do not exclude the presence or addition of one or more other features, steps, operations or components.

[0066] As mentioned earlier, current dexterous hand control methods mostly rely on preset programs or vision-based single action modes, such as determining the grasping point through image recognition and then executing a grasping action along a fixed trajectory. Although some systems have introduced haptic feedback, it is usually only used for simple force control or anti-slip detection, failing to achieve dynamic recognition and adaptive adjustment for different task modes. This results in insufficient adaptability to diverse and complex tasks, a high failure rate, and limits the application scope of dexterous hands and the overall performance of robots. Therefore, there is an urgent need to propose a dexterous hand control method that can adapt to multiple types of task objectives. To this end, this application proposes a dexterous hand motion control method, system, and storage medium based on multimodal tasks. By deeply integrating visual perception with multimodal motion control, the adaptability of the dexterous hand to multiple types of task objectives is improved, overcoming the shortcomings of existing technologies in terms of single mode and poor adaptability.

[0067] As shown in Figure 1, this application provides a dexterous hand motion control method based on multimodal tasks, the method including the following steps:

[0068] Step S10: Receive the operation task and acquire the image information of the target object in the operation task.

[0069] In this embodiment, the operation task refers to a high-level instruction issued by the upper-level system or the user, such as "grab the water cup on the table" or "move the box to point A". Image information refers to images or point cloud data containing target objects obtained by capturing the task scene through visual sensors (such as RGB-D cameras, binocular cameras, etc.) installed on the robot's head or wrist, which can be used to perceive three-dimensional geometric information.

[0070] Step S20: Identify the shape and size features of the target object by analyzing the image information.

[0071] In this embodiment, the shape and size features include shape type, maximum rotation radius, and the location of minimum rotation radius. Shape type includes "regular" and "irregular." Regular objects refer to objects with standard geometric shapes (such as cubes, cylinders, and spheres). Irregular objects refer to objects with complex shapes and no uniform geometric description (such as toys and tools). Maximum rotation radius refers to the maximum distance from the outer surface of the target object to the rotation axis when it is rotated around its own axis. The location of minimum rotation radius refers to the position of the minimum distance from the outer surface of the target object to the rotation axis when it is rotated around its own axis, i.e., the "thinnest" part.

[0072] In some implementations, the image information is denoised and segmented to separate the background information from a 3D geometric model of the target object. The shape type is determined either by classifying the 3D geometric model with a pre-trained convolutional neural network, or by calculating the degree of fit between the 3D geometric model and preset standard geometric models. When the target object's shape is irregular, the dexterous hand is controlled to perform the operation in a holding mode. When the target object's shape is regular, the positions of the target object's maximum and minimum rotation radii can be calculated using the minimum oriented bounding box (OBB) algorithm or principal component analysis (PCA).
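
As a rough illustration of how the rotation radius features might be extracted from a segmented point cloud, the sketch below slices the object by height and records the largest radial distance in each slice. It deliberately skips the OBB or PCA step by assuming the rotation axis is vertical (the z axis) and passes through the centroid of the points. The slice count and the "bottom" / "middle" / "top" thirds are illustrative choices, not values from the application.

```python
import math


def radius_profile(points, n_slices=9):
    """Largest radial distance from the assumed axis in each height slice.

    points: iterable of (x, y, z). Returns a list of length n_slices; a slice
    that contains no points holds None.
    """
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    z_min = min(p[2] for p in points)
    z_max = max(p[2] for p in points)
    height = (z_max - z_min) or 1.0  # avoid dividing by zero for flat input
    radii = [None] * n_slices
    for x, y, z in points:
        i = min(int((z - z_min) / height * n_slices), n_slices - 1)
        r = math.hypot(x - cx, y - cy)
        radii[i] = r if radii[i] is None else max(radii[i], r)
    return radii


def rotation_radius_features(points, n_slices=9):
    """Return (maximum rotation radius, location label of the thinnest slice)."""
    radii = radius_profile(points, n_slices)
    occupied = [(r, i) for i, r in enumerate(radii) if r is not None]
    max_radius = max(r for r, _ in occupied)
    _, thinnest = min(occupied)  # smallest radius; ties go to the lowest slice
    rel = (thinnest + 0.5) / n_slices  # relative height of that slice's centre
    if rel < 1 / 3:
        location = "bottom"
    elif rel < 2 / 3:
        location = "middle"
    else:
        location = "top"
    return max_radius, location
```

For an object with a narrow waist at mid-height, such as the bowling pin used as an example in this description, the sketch reports the thinnest slice as "middle".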

[0073] Step S30: When the target object has an irregular shape or a maximum rotation radius greater than the maximum gripping radius of the dexterous hand, control the dexterous hand to perform the operation task in a holding mode.

[0074] In this embodiment, the maximum gripping radius of the dexterous hand refers to the radius of the largest circular object that the dexterous hand can effectively surround and control through its fingertips and palm when performing a grasping or gripping action. For target objects whose maximum rotation radius is greater than the maximum gripping radius of the dexterous hand, the size of the target object exceeds the gripping ability of the dexterous hand, so the holding mode is the only feasible approach. This expands the capability boundary of the dexterous hand.

[0075] A schematic diagram of the holding mode is shown in Figure 2. In Figure 2, the palm of the dexterous hand is open and facing upward, supporting a cube on its flat surface. It is worth noting that in the holding mode, the fingers of the dexterous hand can be flattened so that they also serve as a support or barrier. For irregularly shaped objects, using the holding mode avoids the complexity of finding grasping points and controlling force.

[0076] Step S40: When the target object has a regular shape, its maximum rotation radius is less than or equal to the maximum gripping radius of the dexterous hand, and the minimum rotation radius is located at the top of the target object, control the dexterous hand to perform the operation task in a grasping mode.

[0077] In this embodiment, the grasping mode refers to the dexterous hand moving directly above the target object and using the fingertips of multiple fingers to grasp the object from above. A schematic diagram of the grasping mode is shown in Figure 3, in which the dexterous hand, palm down and fingers closed, is grasping a cylinder.

[0078] Taking a stemmed glass on a table as an example, it is a regular object with a maximum rotation radius (the radius of the glass body) smaller than the maximum gripping radius of the dexterous hand, and its minimum rotation radius is located at the top (the rim of the glass). Therefore, the system will control the dexterous hand to grasp the rim of the glass with the fingertips of multiple fingers, rather than gripping the body of the glass, which is more stable and safer.

[0079] Step S50: When the target object has a regular shape, its maximum rotation radius is less than or equal to the maximum gripping radius of the dexterous hand, and the minimum rotation radius is located in the middle of the target object, control the dexterous hand to perform the operation task in a gripping mode.

[0080] In this embodiment, the gripping mode refers to the dexterous hand moving to the side of the target object and using multiple fingers and the palm to grip the target object. A schematic diagram of the gripping mode is shown in Figure 4, in which the palm of the dexterous hand faces the side of the target object and the fingers are tightened to grip the object. For targets suitable for side gripping, the large contact area between the palm and fingers provides greater friction and more stable support.

[0081] Taking a bowling pin as an example, it is a regular object with a maximum rotation radius smaller than the maximum grip radius of a dexterous hand, and its minimum rotation radius is located in the middle (neck). Therefore, the system will control the dexterous hand to grip its neck from the side to achieve the most stable grip effect.
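
The selection rule in steps S30 to S50 can be written as a small decision function. This is a minimal sketch: the `ShapeFeatures` container and the string labels "top" and "middle" are hypothetical names, and raising an error for any other location is an assumption, since the description only covers those two cases.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    HOLDING = "holding"    # object rests on the open, upward-facing palm
    GRASPING = "grasping"  # fingertips close on the object from above
    GRIPPING = "gripping"  # palm and fingers wrap the object from the side


@dataclass
class ShapeFeatures:
    is_regular: bool             # shape type: regular or irregular
    max_rotation_radius: float   # same length unit as max_grip_radius
    min_radius_location: str     # "top" or "middle" (hypothetical labels)


def select_mode(features: ShapeFeatures, max_grip_radius: float) -> Mode:
    # Step S30: irregular shape, or too large for the hand to surround.
    if not features.is_regular or features.max_rotation_radius > max_grip_radius:
        return Mode.HOLDING
    # Step S40: regular, fits the hand, thinnest part at the top.
    if features.min_radius_location == "top":
        return Mode.GRASPING
    # Step S50: regular, fits the hand, thinnest part in the middle.
    if features.min_radius_location == "middle":
        return Mode.GRIPPING
    # Other locations are not covered by the description.
    raise ValueError(f"unsupported location: {features.min_radius_location!r}")
```

The size check comes before the location check, so an oversized regular object is always routed to the holding mode regardless of where its thinnest part lies.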

[0082] In the dexterous hand motion control method based on multimodal tasks proposed in this application, first, the image information of the target object is analyzed to automatically identify its shape type, the location of its maximum and minimum rotation radii, and other shape and size features, thereby determining the optimal action mode (holding, grasping, or gripping) for performing the operation task. This improves the adaptability of the dexterous hand to multiple types of task targets and overcomes the shortcomings of existing technologies, such as single mode and poor adaptability.

[0083] Secondly, by combining the shape and size characteristics of the target object with the physical properties of the action mode, this application can significantly improve the success rate and reliability of complex tasks. For example, for irregular or excessively large target objects (with a maximum rotation radius greater than the maximum gripping radius of the dexterous hand), selecting the holding mode avoids failure or damage to the object caused by forced gripping; for regular objects, the grasping mode or the gripping mode is further selected based on the position of the minimum rotation radius (top or middle), ensuring that the force is applied to the most stable and easiest-to-operate position of the object.

[0084] In one embodiment, the palm of the dexterous hand is equipped with a palm tactile sensor, and step S30, controlling the dexterous hand to perform the operation task in a holding mode, includes:

[0085] Step S31: Place the target object on the palm of the dexterous hand and control the dexterous hand to move the target object to the target position in the operation task.

[0086] In some implementations, after determining that the task is to be performed in the holding mode, the system controls the dexterous hand to move near the target object, keeping the palm facing up and the fingers extended or slightly bent. After the target object is placed into the palm of the dexterous hand by an auxiliary device (such as a conveyor belt or another robot), the system controls the dexterous hand to carry the target object toward the target position in the operation task.

[0087] Step S32: During the movement, based on the real-time pressure distribution data collected by the palm tactile sensor, determine whether the target object is in an unbalanced state.

[0088] In this embodiment, the palm tactile sensor refers to an array of sensing units arranged on the surface of the palm of a dexterous hand, capable of measuring pressure distribution at different locations. Each sensing unit can decompose the multidimensional force it receives into a normal force (perpendicular to the contact surface) and a tangential force (parallel to the contact surface), thereby outputting a three-dimensional force vector. The palm tactile sensor can be a sensor based on principles such as piezoresistive, capacitive, or optical principles; this embodiment does not limit this. Real-time pressure distribution data refers to a data matrix composed of the three-dimensional force vectors measured by each sensing unit on the palm surface at a certain moment. An unbalanced state refers to an unstable state in which the target object on the palm tends to slide, tilt, or flip.

[0089] As shown in Figure 5, the dark dashed box represents the force-bearing area where the dexterous hand supports the target object. Each sensing unit within this area is drawn as a gray dot of the same shade, indicating that the target object is in a balanced state. When the sensing units within the dark dashed box appear as in Figure 6, as dots whose gray shade darkens gradually from right to left, the force exerted by the target object on the left side of the palm is greater than the force on the right side. At this time, the target object is in an unbalanced state, tilting to the left.

[0090] In some implementations, a preset offset threshold can be used to determine whether the target object is in an unbalanced state. For example, the pressure center offset can be determined based on real-time pressure distribution data and balance pressure data at the time of placement. When the pressure center offset is greater than a preset offset threshold (such as 10% of the width of a palm), the target object is determined to be in an unbalanced state.
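
One way to compute the pressure center offset is as a force-weighted centroid over the sensing grid, compared against the centroid recorded at placement. The sketch below makes several assumptions that the text does not state: only the normal force component of each sensing unit is used, units sit on a regular grid with spacing `pitch`, and the offset magnitude is the Euclidean distance between the two centroids.

```python
import math


def center_of_pressure(normal_forces, pitch=1.0):
    """Force-weighted centroid (x, y) of a 2D grid of normal-force readings.

    normal_forces: list of rows; columns run along x, rows along y.
    pitch: assumed spacing between neighbouring sensing units.
    """
    total = sum(sum(row) for row in normal_forces)
    if total <= 0:
        raise ValueError("no load detected on the palm")
    x = sum(f * col for row in normal_forces for col, f in enumerate(row)) * pitch / total
    y = sum(f * r for r, row in enumerate(normal_forces) for f in row) * pitch / total
    return x, y


def pressure_center_offset(current, reference, pitch=1.0):
    """Offset (dx, dy) of the current pressure centre from the placement-time one."""
    cx, cy = center_of_pressure(current, pitch)
    rx, ry = center_of_pressure(reference, pitch)
    return cx - rx, cy - ry


def is_unbalanced(current, reference, palm_width, pitch=1.0, ratio=0.10):
    """True when the offset exceeds ratio * palm_width (10% in the example above)."""
    dx, dy = pressure_center_offset(current, reference, pitch)
    return math.hypot(dx, dy) > ratio * palm_width
```

With a 3 x 3 grid, a uniform reference reading, and a current reading whose left column carries three times the force of the other columns, the pressure center shifts left by 0.4 grid units, which exceeds 10% of a palm width of 3 units.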

[0091] Step S33: If the target object is in an unbalanced state, the target object is restored to a balanced state by adjusting the movement speed and / or palm posture of the dexterous hand.

[0092] In this embodiment, the movement speed refers to the linear and angular velocity of the dexterous hand (driven by the robotic arm) in space. The hand posture refers to the orientation of the dexterous hand's palm in space, which can be described by roll angle, pitch angle, and yaw angle.

[0093] In some implementations, a PID (Proportional-Integral-Derivative) controller or a model predictive control (MPC) algorithm can be used to calculate the velocity change and / or attitude adjustment based on the pressure center offset determined by the real-time pressure distribution data compared to the equilibrium pressure data at the placement moment. The movement speed and / or hand posture of the dexterous hand can then be adjusted according to the velocity change and / or attitude adjustment to restore the target object to its equilibrium state.
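
A minimal PID sketch for the attitude adjustment is given below. It assumes the x offset of the pressure center is corrected through the roll angle and the y offset through the pitch angle, and that a positive offset calls for a negative angle correction. Both conventions, the gains, and the output limit are illustrative assumptions that depend on the actual hand and arm.

```python
class PID:
    """Textbook discrete PID controller with a symmetric output limit."""

    def __init__(self, kp, ki, kd, limit):
        self.kp, self.ki, self.kd, self.limit = kp, ki, kd, limit
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        # No derivative term on the first sample, since there is no history yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        output = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(-self.limit, min(self.limit, output))


def palm_tilt_command(offset, roll_pid, pitch_pid, dt):
    """Map a pressure-centre offset (dx, dy) to (roll, pitch) corrections.

    The minus signs encode the assumed convention that the palm tilts
    against the direction in which the pressure centre has drifted.
    """
    dx, dy = offset
    return -roll_pid.update(dx, dt), -pitch_pid.update(dy, dt)
```

In use, `palm_tilt_command` would be called once per control cycle with the latest offset, and the returned corrections added to the commanded palm orientation.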

[0094] In some embodiments, step S33 may further include the following. If the target object is in an unbalanced state, the dexterous hand is first controlled to reduce its movement speed. Reducing the movement speed immediately reduces the inertial force acting on the target object and prevents it from continuing to slide or tilt. Based on the real-time pressure distribution data collected by the palm tactile sensor, it is then determined whether the target object has returned to a balanced state. Specifically, it can be determined whether the pressure center offset between the real-time pressure distribution data collected after reducing the movement speed and the balanced pressure data is less than or equal to the preset offset threshold. If yes, the target object is determined to have returned to a balanced state, and the dexterous hand is controlled to carry the target object toward the target position in the operation task at the reduced movement speed. If no, the tilt direction of the target object is determined based on the real-time pressure distribution data, and the dexterous hand is controlled to tilt in the direction opposite to the tilt direction, so that the target object returns to a balanced state. Here, the tilt direction refers to the direction in which the center of gravity of the unbalanced target object deviates from the center of stability, for example "tilting forward", "tilting backward", "tilting to the left", "tilting to the right", or a combination thereof.
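
The two-stage response (slow down first, tilt only if that is not enough) can be sketched as follows. The callbacks `read_offset`, `set_speed`, and `tilt_palm` are hypothetical stand-ins for the sensor and actuator interfaces, `slow_factor` is an illustrative parameter, and the axis convention (+x to the right, +y forward) is an assumption.

```python
import math


def tilt_direction(dx, dy, deadband=1e-6):
    """Name the tilt direction from the pressure-centre offset (dx, dy)."""
    parts = []
    if dy > deadband:
        parts.append("forward")
    elif dy < -deadband:
        parts.append("backward")
    if dx > deadband:
        parts.append("right")
    elif dx < -deadband:
        parts.append("left")
    return "-".join(parts) or "none"


def rebalance(read_offset, set_speed, tilt_palm, speed, threshold, slow_factor=0.5):
    """Run one pass of the slow-down-then-tilt strategy.

    read_offset() -> (dx, dy); set_speed(v) and tilt_palm(x, y) issue commands.
    Returns "balanced", "slowed", or "tilted" to report how far it had to go.
    """
    dx, dy = read_offset()
    if math.hypot(dx, dy) <= threshold:
        return "balanced"
    # Stage 1: reduce the movement speed to cut the inertial disturbance.
    set_speed(speed * slow_factor)
    dx, dy = read_offset()
    if math.hypot(dx, dy) <= threshold:
        return "slowed"
    # Stage 2: tilt the palm opposite to the measured tilt direction.
    tilt_palm(-dx, -dy)
    return "tilted"
```

The palm orientation is left untouched whenever slowing down alone brings the offset back under the threshold.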

[0095] This implementation first attempts to eliminate the disturbance source (inertial force) by reducing the moving speed, and then determines whether balance has been restored. This avoids making drastic attitude adjustments that could cause oscillations in the event of slight imbalance. If deceleration solves the problem, the system restores stability with minimal intervention, demonstrating optimized control efficiency. Reverse tilt adjustment is only initiated when deceleration fails. This hierarchical, priority-based response strategy makes the entire control process both fast and smooth, reducing unnecessary mechanical wear and energy consumption, while further ensuring operational safety, making it particularly suitable for handling fragile or valuable items.
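The decision taken after the speed has been reduced can be sketched as below. The sketch assumes a 2-D palm pressure grid in which rows grow toward the wrist and columns grow to the right; this layout, the function names, and the returned action labels are illustrative assumptions.

```python
import math


def pressure_center(grid):
    """Return the pressure-weighted centroid (row, col) of a 2-D pressure map."""
    total = sum(sum(row) for row in grid)
    if total <= 0:
        raise ValueError("no contact pressure on the palm sensor")
    row_c = sum(i * p for i, row in enumerate(grid) for p in row) / total
    col_c = sum(j * p for row in grid for j, p in enumerate(row)) / total
    return row_c, col_c


def tilt_labels(d_row, d_col, eps=1e-9):
    """Name the tilt direction of the object from the pressure-center drift."""
    # Assumed layout: rows grow toward the wrist, columns grow to the right.
    labels = []
    if d_row < -eps:
        labels.append("forward")
    elif d_row > eps:
        labels.append("backward")
    if d_col < -eps:
        labels.append("left")
    elif d_col > eps:
        labels.append("right")
    return labels


def next_action(balanced, live_after_slowdown, offset_threshold):
    """Decide what to do once the hand has already reduced its speed."""
    ref_r, ref_c = pressure_center(balanced)
    cur_r, cur_c = pressure_center(live_after_slowdown)
    d_row, d_col = cur_r - ref_r, cur_c - ref_c
    if math.hypot(d_row, d_col) <= offset_threshold:
        # Deceleration alone restored balance: keep moving at the reduced speed.
        return {"action": "continue_at_reduced_speed"}
    # Otherwise tilt the hand opposite to the direction the object is tilting.
    return {
        "action": "tilt_hand",
        "object_tilt": tilt_labels(d_row, d_col),
        "hand_tilt_vector": (-d_row, -d_col),
    }


balanced = [[0, 0, 0], [0, 4, 0], [0, 0, 0]]
live = [[0, 0, 0], [0, 2, 2], [0, 0, 0]]
recovered = next_action(balanced, live, offset_threshold=1.0)
still_tilted = next_action(balanced, live, offset_threshold=0.2)
```

Keeping the threshold check ahead of the tilt command mirrors the priority order described above: the attitude is only changed when deceleration alone was not enough.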

[0096] The dexterous hand motion control method based on multimodal tasks proposed in this application has the following advantages. Firstly, during lifting and transport tasks, existing robots often fail to detect or respond effectively when the target object becomes unbalanced, whether because of external interference (such as uneven ground or emergency obstacle avoidance) or because of changes in the object's own state (such as internal liquid sloshing), and the task fails as a result. By monitoring the real-time pressure distribution data collected by the palm tactile sensor, this application enables the system to sense the stability of the target object in real time, much as a human hand does, and to adjust itself autonomously when the object becomes unbalanced.

[0097] Secondly, this application does not rely on perfect prior knowledge. Regardless of whether the imbalance is caused by unknown ground tilt, unexpected wind resistance, or minor errors in the object model, as long as the physical effect is manifested as a change in pressure distribution detected by the palm tactile sensor, this application can detect and respond to it. This enables dexterous hands to work stably in real-world environments full of uncertainty, significantly expanding their application boundaries, such as reliably handling unknown objects in complex scenarios like logistics sorting and disaster relief.

[0098] In one embodiment, the dexterous hand includes multiple fingers, each fingertip equipped with a fingertip tactile sensor. Step S40, controlling the dexterous hand to perform the operation task in a grasping mode, includes:

[0099] Step S41: Control the dexterous hand to move to the preset grasping position with the multiple fingers spread out.

[0100] In this embodiment, the preset grasping position refers to a grasping starting point calculated by the system based on the position of the target object before the grasping operation is performed. The grasping starting point can be located at a safe distance directly above the target object to ensure that the outstretched fingers of the dexterous hand will not collide with the object when descending.

[0101] Step S42: Control each finger of the dexterous hand to perform a tightening action. When the fingertip tactile sensor detects that any finger is in contact with the target object, the tightening action of the contacting finger is stopped until all fingers are in contact with the target object.

[0102] In this embodiment, the tightening action refers to moving the fingers from an open state to a clenched fist state by driving the finger joint motor.

[0103] When the real-time pressure data collected by the fingertip tactile sensor of a finger exceeds a preset contact threshold, the system determines that the finger has made contact with the target object. The system stops the tightening action of the contacting finger and controls the other fingers that are not yet in contact to continue the tightening action. This process continues until all fingers are in contact with the target object. The dexterous hand shown in Figure 7 has five fingers (the thumb, index finger, middle finger, ring finger, and little finger), all of which are in contact with the target object.
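The asynchronous tightening of step S42 can be sketched with a small simulated hand. The `SimulatedHand` class, its pressure model, and the per-step angle increment are assumptions made only for illustration; a real hand would replace `read_pressure` and `step_finger` with its own sensor and motor interfaces.

```python
class SimulatedHand:
    """Toy hand: each finger meets the object at a fixed closing angle (degrees)."""

    def __init__(self, contact_angles):
        self.contact_angles = contact_angles
        self.angles = {finger: 0 for finger in contact_angles}

    def read_pressure(self, finger):
        # Pressure rises only after the finger has passed its contact angle.
        return max(0.0, self.angles[finger] - self.contact_angles[finger])

    def step_finger(self, finger, delta=1):
        self.angles[finger] += delta


def close_until_contact(hand, fingers, contact_threshold, max_steps=1000):
    """Tighten every finger, stopping each one individually once it touches."""
    stopped = set()
    for _ in range(max_steps):
        for finger in fingers:
            if finger in stopped:
                continue
            if hand.read_pressure(finger) > contact_threshold:
                stopped.add(finger)       # this finger stops tightening
            else:
                hand.step_finger(finger)  # the others keep tightening
        if len(stopped) == len(fingers):
            return True
    return False  # not all fingers reached the object within max_steps


contact_angles = {"thumb": 30, "index": 45, "middle": 50, "ring": 48, "little": 40}
hand = SimulatedHand(contact_angles)
all_in_contact = close_until_contact(hand, list(contact_angles), contact_threshold=0.5)
```

Each finger ends one step past its own contact angle, which is what lets the hand conform to the contour of the object instead of closing to a fixed posture.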

[0104] Step S43: When it is determined that all of the multiple fingers are in contact with the target object, control the dexterous hand to grasp the target object and move it to the target position in the operation task.

[0105] In this embodiment, once it is determined that all fingers are in contact with the target object, the fingers together form a mechanical enclosure around the target object. The dexterous hand is then controlled to grasp the target object and move it to the target position in the operation task.

[0106] In some implementations, before movement begins, all fingers can be controlled to apply a small initial grasping torque synchronously, creating a pre-tightening force on the target object to ensure sufficient gripping force.

[0107] In some implementations, step S43 includes:

[0108] When it is determined that all the fingers are in contact with the target object, the fingers are controlled to apply a grasping torque, and whether the resultant normal force vector of the fingers is zero is determined based on the real-time pressure data collected by the fingertip tactile sensors.

[0109] In this embodiment, the grasping torque refers to the torque command generated by driving the finger joint motor to produce a grasping force. The resultant normal force vector refers to the total vector obtained by vector summing the normal forces of all fingertips (the force of each fingertip perpendicular to the contact surface of the target object and pointing inwards into the object).

[0110] As shown in Figure 8 and Figure 9, the normal force corresponding to the thumb is F5', the normal force corresponding to the index finger is F4', the normal force corresponding to the middle finger is F3', the normal force corresponding to the ring finger is F2', and the normal force corresponding to the little finger is F1'. The resultant normal force vector is F = F1' + F2' + F3' + F4' + F5'.

[0111] When the resultant normal force vector of the plurality of fingers is not zero, as shown in Figure 8, the gripping torque of each finger is adjusted based on the real-time pressure data collected by the fingertip tactile sensors so that the resultant normal force vector of the multiple fingers becomes zero. Specifically, a force control algorithm (such as a force distribution algorithm based on the Jacobian matrix) can be used to calculate a new set of gripping torques that counteracts the non-zero resultant normal force vector. For example, the direction of the resultant normal force vector shown in Figure 8 is perpendicular to the contact surface between the thumb and the target object. By reducing the gripping torque of the thumb, the resultant normal force vector of all fingers can be made zero.

[0112] When the resultant normal force vector of the plurality of fingers is zero, as shown in Figure 9, the dexterous hand is controlled to grasp the target object and move it to the target position in the operation task.
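The balancing of the resultant normal force vector can be illustrated as follows. Note that this sketch does not implement a Jacobian-based force distribution algorithm; it substitutes a simple gradient step on the squared norm of the resultant, works on fingertip force magnitudes rather than joint torques, and uses hypothetical names, gains, and a 2-D contact layout.

```python
import math


def resultant(normals, magnitudes):
    """Vector sum of the fingertip normal forces (unit normal times magnitude)."""
    dims = len(normals[0])
    return tuple(
        sum(m * n[d] for n, m in zip(normals, magnitudes)) for d in range(dims)
    )


def balance_forces(normals, magnitudes, gain=0.1, tol=1e-6,
                   min_force=0.05, max_iter=200):
    """Adjust force magnitudes until the resultant is (approximately) zero."""
    mags = list(magnitudes)
    for _ in range(max_iter):
        r = resultant(normals, mags)
        if math.sqrt(sum(c * c for c in r)) <= tol:
            break
        for i, n in enumerate(normals):
            # Fingers pushing along the resultant ease off; opposing ones push more.
            along = sum(nc * rc for nc, rc in zip(n, r))
            mags[i] = max(min_force, mags[i] - gain * along)
    return mags


# Thumb pushes in +x with too much force; two fingers oppose it in -x.
normals = [(1.0, 0.0), (-1.0, 0.0), (-1.0, 0.0)]
initial = [3.0, 1.0, 1.0]
before = resultant(normals, initial)
balanced_mags = balance_forces(normals, initial)
after = resultant(normals, balanced_mags)
```

In this example the thumb force is reduced and the opposing forces are raised until they cancel, which corresponds to the thumb case described for Figure 8. Converting the balanced fingertip forces into joint torque commands would require the kinematics of the hand and is left out.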

[0113] In the dexterous hand motion control method based on multimodal tasks proposed in this application, firstly, the dexterous hand passively adapts to the specific contour of the target object by asynchronously tightening and stopping each finger upon contact. Each finger naturally stops at its respective contact point, forming a wrapping posture that highly conforms to the shape of the target object, fundamentally eliminating the destructive lateral force caused by sequential contact. This makes the grasping process extremely robust, achieving a stable and reliable initial wrapping regardless of whether the target object is a sphere, cone, or other shape, laying a perfect geometric foundation for subsequent firm grasping.

[0114] Secondly, existing technologies typically employ a fixed torque when grasping objects, leading to slippage due to insufficient force or damage to the object due to excessive force. Alternatively, they may only control the total grasping force, failing to address the internal torque problem caused by asymmetry at the contact point. This application, by first adapting the shape, then applying the grasping torque and calculating it in real time to zero out the normal resultant force vector, highly simulates the instinctive behavior of humans performing delicate grasping tasks, achieving proactive force balance optimization.

[0115] In one embodiment, the palm of the dexterous hand is equipped with a palm tactile sensor, and the fingertip and finger pad of each finger are respectively equipped with a fingertip tactile sensor and a finger pad tactile sensor, as shown in Figure 10. Step S50, controlling the dexterous hand to perform the operation task in a gripping mode, includes:

[0116] Step S51: Control the dexterous hand to move to the preset gripping position with all fingers spread out.

[0117] In this embodiment, the preset grip position refers to a grip preparation position on the side of the target object. This grip preparation position ensures that when the dexterous hand is in an open state, its movement trajectory allows the fingers and palm to effectively surround or contact the target object from the side.

[0118] In some implementations, the predicted center of gravity height of the target object can be determined based on its shape and size characteristics; and a preset gripping position can be determined based on the predicted center of gravity height.

[0119] Step S52: Control each finger of the dexterous hand to perform a tightening action, and monitor the tactile sensing data collected by the palm tactile sensor, fingertip tactile sensor and finger pad tactile sensor in real time.

[0120] In this embodiment, the finger pad tactile sensor refers to a tactile sensor integrated on the finger pad (the finger segment connected to the fingertip) to detect the contact force between the finger pad and the target object.

[0121] By using palm tactile sensors, fingertip tactile sensors, and finger pad tactile sensors, it is possible to achieve fused perception of multiple parts (palm, tips, and pads). Compared with single-part perception, it can obtain richer tactile sensor data, thereby accurately judging the grip state.

[0122] Step S53: When the tactile sensing data is in a stable state, the gripping state is determined based on whether the tip of the thumb of the dexterous hand is in contact with the tips of other fingers.

[0123] In this embodiment, a stable state means that the fluctuation of tactile sensing data within a preset time period is less than a preset fluctuation threshold. That is, the tactile sensing data collected by each tactile sensor no longer changes significantly, which indicates that the tightening action has been completed and the system has reached a temporary static equilibrium.
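One way to test for this stable state is a sliding window over the sensor readings, as sketched below. The fluctuation is taken here to be the peak-to-peak range of each sensor channel within the window; that choice, the window length, and the `StabilityMonitor` name are assumptions, since the text only requires the fluctuation to stay below a preset threshold.

```python
from collections import deque


class StabilityMonitor:
    """Report a stable state once every channel stops changing within a window."""

    def __init__(self, window, fluctuation_threshold):
        self.samples = deque(maxlen=window)
        self.threshold = fluctuation_threshold

    def update(self, reading):
        """Add one reading (one value per sensor) and return True if stable."""
        self.samples.append(tuple(reading))
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough history yet
        for channel in zip(*self.samples):
            # Peak-to-peak range of this sensor over the window.
            if max(channel) - min(channel) >= self.threshold:
                return False
        return True


monitor = StabilityMonitor(window=3, fluctuation_threshold=0.1)
stream = [(0.0, 0.0), (0.5, 0.4), (1.0, 0.9), (1.0, 0.9), (1.02, 0.9), (1.01, 0.91)]
flags = [monitor.update(reading) for reading in stream]
```

The flags stay false while the fingers are still tightening and turn true once the readings settle, at which point the gripping state can be evaluated.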

[0124] The gripping state can be either a semi-enclosed state or a fully enclosed state. The semi-enclosed state, as shown in Figure 4, means that the tip of the thumb is not in contact with the tip of any other finger. This indicates that the target object is not completely surrounded by the dexterous hand, part of the target object's surface is not covered, and the grip is mainly maintained by the friction between the palm and fingers and the target object's surface.

[0125] The fully enclosed state, as shown in Figure 11, means that the tip of the thumb contacts the tip of at least one other finger. This indicates that the dexterous hand has formed a closed loop, completely enclosing the target object. The grip at this point relies on the geometric constraints created by this mechanical enclosure, rather than being dominated by friction.

[0126] In some implementations, it can be determined whether the thumb is in contact with the fingertips of any other fingers based on tactile sensing data from each fingertip tactile sensor and / or based on the kinematic parameters of the dexterous hand; if so, it is determined to be a fully enclosed state; if not, it is determined to be a semi-enclosed state.

[0127] Step S54: Perform the corresponding stability verification action according to the grip state.

[0128] In this embodiment, the stability verification action refers to a series of exploratory actions and judgment logic actively executed to test whether the current grip state is stable enough for subsequent movement operations. Specifically, when the grip state is semi-enclosed, a lifting test is performed; when the grip state is fully enclosed, a diameter comparison test is performed.
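The classification of the gripping state and the choice of verification action can be sketched as below. The sketch uses only the kinematic route, assuming fingertip positions in metres from forward kinematics and a hypothetical `touch_distance` tolerance; a tactile-based check could be used instead or in addition.

```python
import math
from enum import Enum


class GripState(Enum):
    SEMI_ENCLOSED = "semi-enclosed"
    FULLY_ENCLOSED = "fully enclosed"


def classify_grip(tip_positions, touch_distance):
    """Fully enclosed if the thumb tip touches the tip of any other finger."""
    thumb = tip_positions["thumb"]
    for name, position in tip_positions.items():
        if name != "thumb" and math.dist(thumb, position) <= touch_distance:
            return GripState.FULLY_ENCLOSED
    return GripState.SEMI_ENCLOSED


def verification_for(state):
    """Select the stability verification action for the given gripping state."""
    if state is GripState.FULLY_ENCLOSED:
        return "diameter_comparison_test"
    return "lifting_test"


closed_loop = {"thumb": (0.0, 0.0, 0.0), "index": (0.005, 0.0, 0.0), "middle": (0.03, 0.0, 0.0)}
open_grip = {"thumb": (0.0, 0.0, 0.0), "index": (0.05, 0.0, 0.0), "middle": (0.06, 0.0, 0.0)}
closed_state = classify_grip(closed_loop, touch_distance=0.01)
open_state = classify_grip(open_grip, touch_distance=0.01)
```

Separating classification from the choice of test keeps the two stability principles, friction and geometric closure, explicit in the control flow.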

[0129] As an optional embodiment where the gripping state is a semi-enclosed state, step S54 includes:

[0130] The dexterous hand is controlled to grip the target object and lift it to a preset safe height. The preset safe height is a relatively small, pre-set lifting distance (e.g., 2-5 cm). It must be small enough that the target object will not be seriously damaged or cause danger even if it slips, yet large enough to reproduce the inertial force at the start of movement, so that static friction can be overcome and any potential slippage is triggered. The preset safe height can also be set dynamically based on the estimated weight and surface material of the target object, and is inversely proportional to both the estimated weight and the smoothness. During the lifting process, the tactile sensing data collected in real time by the palm tactile sensor, fingertip tactile sensor, and finger pad tactile sensor is monitored to determine whether the target object has slipped. Slippage can be identified by a significant decrease or zeroing of sensor readings, or by a shift in the pressure center computed from the tactile sensing data. If no slippage occurs, the stability verification is considered successful; if slippage occurs, the stability verification is considered unsuccessful, and a reprogramming of the operating mode is triggered.
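The slip check performed during the lifting test can be sketched as follows. For brevity the tactile sensors are assumed to lie along one line, so that the sensor index serves as a 1-D position; the `max_drop` and `max_shift` thresholds and the function names are hypothetical.

```python
def pressure_center_1d(reading):
    """Pressure-weighted mean sensor index of one reading."""
    return sum(i * p for i, p in enumerate(reading)) / sum(reading)


def slip_detected(baseline, samples, max_drop=0.5, max_shift=0.5):
    """Return True if any sample taken during the lift indicates slippage."""
    base_total = sum(baseline)
    base_center = pressure_center_1d(baseline)
    for sample in samples:
        total = sum(sample)
        # Criterion 1: readings collapse (large decrease or zeroing).
        if total <= (1.0 - max_drop) * base_total:
            return True
        # Criterion 2: the pressure center moves along the hand.
        if abs(pressure_center_1d(sample) - base_center) > max_shift:
            return True
    return False


baseline = [1.0, 2.0, 1.0]  # reading taken just before lifting
steady = slip_detected(baseline, [[1.0, 2.0, 1.0], [1.1, 1.9, 1.0]])
dropped = slip_detected(baseline, [[1.0, 2.0, 1.0], [0.2, 0.5, 0.1]])
shifted = slip_detected(baseline, [[0.0, 1.0, 3.0]])
```

The drop criterion is evaluated first, which also avoids dividing by zero when all readings vanish.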

[0131] In some implementations, if the stability verification fails, the working mode is reprogrammed to the support mode.

[0132] In some implementations, while in a gripping state, a dexterous hand can be controlled to perform a very small and high-speed reciprocating vibration (e.g., amplitude 0.5 mm, frequency 10 Hz). The force response amplitude, phase, and vibration decay characteristics of the target object to this reciprocating vibration are measured using a tactile sensor to determine the object's stiffness characteristics (high stiffness / low stiffness) and structural characteristics (solid dense structure, high-damping structure, and elastic porous structure, etc.). A large force response amplitude, in phase with the vibration displacement, indicates high stiffness, corresponding to hard characteristics. A small force response amplitude, with phase lag, indicates low stiffness, corresponding to soft characteristics. Slow vibration decay corresponds to a solid dense structure (e.g., a steel column). Fast vibration decay corresponds to a high-damping or non-uniform internal structure. Examples include shell-like fluid structures (e.g., eggs), where the outer shell is rigid but the internal liquid causes strong damping; and elastic porous structures (e.g., sponges), where the material itself is viscoelastic.

[0133] When the system detects a target object exhibiting both high stiffness (hardness) and high damping (rapid vibration decay), it determines the object to be a hard but fragile shell structure (such as an egg or thin-walled ceramic). Such a structure cannot withstand the additional gripping torque required to increase friction, posing a risk of crushing. Therefore, the system triggers a reprogramming of the operating mode to a support mode to ensure operational safety.

[0134] When the system does not detect a target object that simultaneously possesses the characteristics of high stiffness (hardness) and high damping (rapid vibration decay), it triggers a reprogramming of the working mode to increase the grip torque of the tightening action. This increases friction by increasing the normal force, thereby retrying and verifying the stability of the grip mode.
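The decision logic of the two paragraphs above can be sketched as below. The threshold values and the `ProbeThresholds` and `replan_after_probe` names are assumptions; the text does not specify how amplitude, phase lag, and decay are quantified, so the inputs are treated as already-extracted scalar features.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeThresholds:
    min_stiff_amplitude: float   # force response amplitude counted as "large"
    max_stiff_phase_lag: float   # phase lag (rad) still counted as "in phase"
    min_fast_decay: float        # decay rate counted as "fast"


def replan_after_probe(amplitude, phase_lag, decay_rate, thresholds):
    """Choose the re-planning action from the vibration probe response."""
    high_stiffness = (
        amplitude >= thresholds.min_stiff_amplitude
        and phase_lag <= thresholds.max_stiff_phase_lag
    )
    high_damping = decay_rate >= thresholds.min_fast_decay
    if high_stiffness and high_damping:
        # Hard but fragile shell (e.g., an egg): do not squeeze harder.
        return "switch_to_support_mode"
    # Otherwise retry the grip with a higher gripping torque.
    return "retry_grip_with_higher_torque"


limits = ProbeThresholds(min_stiff_amplitude=1.0, max_stiff_phase_lag=0.2, min_fast_decay=5.0)
egg_like = replan_after_probe(amplitude=2.0, phase_lag=0.05, decay_rate=8.0, thresholds=limits)
steel_like = replan_after_probe(amplitude=2.0, phase_lag=0.05, decay_rate=0.5, thresholds=limits)
sponge_like = replan_after_probe(amplitude=0.2, phase_lag=0.6, decay_rate=8.0, thresholds=limits)
```

Only the combination of high stiffness and high damping leads to the support mode; every other combination results in a retry with increased torque.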

[0135] As an optional embodiment where the gripping state is a fully enclosed state, step S54 includes: obtaining the maximum diameter of the downward projection of the target object and the internal grip diameter of the dexterous hand. The maximum diameter of the downward projection refers to the longest chord in the projection pattern obtained by projecting the target object onto a horizontal plane along the direction of gravity (vertically downwards), and represents the maximum width of the target object. The internal grip diameter of the dexterous hand refers to the diameter of the inscribed circle of the closed loop formed by the thumb, other fingers, and palm in the current grip posture. If the maximum diameter is greater than the internal grip diameter, the stability verification is considered successful; if the maximum diameter is less than or equal to the internal grip diameter, the stability verification is considered unsuccessful, and a redesign of the working mode to a grasping mode is triggered.

[0136] As shown in Figure 12, the target object is a stemmed glass. The actual placement diameter of the stemmed glass is smaller than the internal grip diameter of the dexterous hand, but the maximum diameter of the downward projection of the glass body is larger than the internal grip diameter of the dexterous hand. Under the influence of gravity, the stemmed glass achieves a stable grip by mechanically interlocking with the dexterous hand through this maximum diameter profile and the fully enclosed constraint formed by the thumb and other fingers.
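The diameter comparison test can be sketched as follows, assuming the target object is available as sampled 3-D surface points with the z axis along gravity and that the internal grip diameter is supplied by the hand's kinematics. The function names and the sample dimensions of the stemmed glass are illustrative.

```python
import math
from itertools import combinations


def max_projected_diameter(points):
    """Longest chord of the object's projection onto the horizontal plane."""
    projected = [(x, y) for x, y, _z in points]  # drop z: project along gravity
    return max(math.dist(a, b) for a, b in combinations(projected, 2))


def enclosure_verified(points, inner_grip_diameter):
    """Pass only if the object is wider than the opening enclosed by the hand."""
    return max_projected_diameter(points) > inner_grip_diameter


# Coarse samples of a stemmed glass: a wide bowl rim above a thin stem (metres).
glass = [
    (0.04, 0.0, 0.15), (-0.04, 0.0, 0.15), (0.0, 0.04, 0.15), (0.0, -0.04, 0.15),
    (0.005, 0.0, 0.05), (-0.005, 0.0, 0.05),
]
widest = max_projected_diameter(glass)
held_by_stem = enclosure_verified(glass, inner_grip_diameter=0.03)
empty_grip = enclosure_verified(glass, inner_grip_diameter=0.10)
```

The pairwise search is quadratic in the number of points, which is acceptable for a coarse sample; a convex hull could be used first for dense point clouds.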

[0137] Step S55: After passing the stability verification, control the dexterous hand to hold the target object and move it to the target position in the operation task.

[0138] In this embodiment, when the corresponding stability verification action returns a "verification passed" signal, the system controls the dexterous hand to hold the target object and move it to the target position in the operation task.

[0139] During movement, the tactile sensing data collected by the palm tactile sensor, fingertip tactile sensor, and finger pad tactile sensor can be continuously monitored in real time. If any abnormality is detected in the tactile sensing data (such as a sudden decrease in pressure, indicating that the target object may be slipping), an emergency recovery action is triggered. The emergency recovery action can involve controlling another dexterous hand to move under the target object in a supporting posture, thereby preventing the target object from continuing to slip.

[0140] In the dexterous hand motion control method based on multimodal tasks proposed in this application, firstly, this application distinguishes the gripping state into a semi-enclosed state and a fully enclosed state by whether the tip of the thumb contacts the tips of other fingers, thus endowing the dexterous hand with the ability to recognize and distinguish the physical essence of gripping. This is not a simple morphological classification, but rather a distinction between two fundamental principles of gripping stability: the former mainly relies on surface friction, while the latter relies on geometric closure.

[0141] Secondly, this application proposes accurate and effective stability verification actions for different gripping states, forming a complete "diagnosis-treatment" closed loop. For the semi-enclosed state relying on friction, this application actively simulates the inertial disturbance at the start of movement by performing a low-risk micro-lifting action, thereby directly verifying the safety margin of the current gripping friction in practice. For the fully enclosed state relying on shape closure, this application pre-judges the risk of the object falling in the direction of gravity from a geometrical perspective by comparing the maximum diameter of the target object's downward projection with the internal gripping diameter of the dexterous hand. It can accurately identify unstable situations that appear to be successfully gripped (fingertip contact forming a closed loop) but actually have the risk of "empty grip" (excessive internal space), ensuring the effectiveness of shape closure.

[0142] In one embodiment, a computer storage medium is provided that stores executable instructions, which, when executed by a processor, cause the processor to perform the steps in the above-described method embodiments. The embodiments of this application will not be described in detail here.

[0143] In one embodiment, a dexterous hand motion control system based on multimodal tasks is also provided, including one or more processors; a memory storing one or more programs, wherein when one or more programs are executed by one or more processors, the one or more processors perform the steps in the above method embodiments, which will not be elaborated further in this application.

[0144] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features therein. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of this application.

[0145] Furthermore, those skilled in the art will understand that although some embodiments herein include certain features included in other embodiments but not others, combinations of features from different embodiments are intended to be within the scope of this application and form different embodiments. For example, any of the embodiments or implementations claimed above can be used in any combination. The information disclosed in this background section is intended only to enhance the understanding of the general background of this application and should not be construed as an admission or in any way implying that such information constitutes prior art known to those skilled in the art.

Claims

1. A dexterous hand motion control method based on multimodal tasks, characterized in that the method comprises: receiving an operation task and acquiring image information of a target object in the operation task; identifying shape and size features of the target object by analyzing the image information, wherein the shape and size features include the shape type, the location of the maximum rotation radius and the minimum rotation radius; when the shape type of the target object is irregular or the maximum rotation radius is greater than the maximum gripping radius of the dexterous hand, supporting the target object on the palm of the dexterous hand and controlling the dexterous hand to move the target object to the target position in the operation task, wherein the palm of the dexterous hand is equipped with a palm tactile sensor; during the movement, determining, based on real-time pressure distribution data collected by the palm tactile sensor, whether the target object is in an unbalanced state; if the target object is in an unbalanced state, adjusting the movement speed and / or palm posture of the dexterous hand to restore the target object to a balanced state; when the shape type of the target object is regular, the maximum rotation radius is less than or equal to the maximum gripping radius of the dexterous hand, and the minimum rotation radius is located at the top of the target object, controlling the dexterous hand to perform the operation task in a grasping mode; and when the shape type of the target object is regular, the maximum rotation radius is less than or equal to the maximum gripping radius of the dexterous hand, and the minimum rotation radius is located in the middle of the target object, controlling the dexterous hand to perform the operation task in a grip mode.

2. The dexterous hand motion control method based on multimodal tasks according to claim 1, characterized in that, if the target object is in an unbalanced state, restoring the target object to a balanced state by adjusting the movement speed and / or hand posture of the dexterous hand comprises: if the target object is in an unbalanced state, controlling the dexterous hand to reduce its movement speed; determining, based on the real-time pressure distribution data collected by the palm tactile sensor, whether the target object has returned to a balanced state; if not, determining the tilt direction of the target object based on the real-time pressure distribution data; and controlling the dexterous hand to tilt in the direction opposite to the tilt direction, so that the target object is restored to a balanced state.

3. The dexterous hand motion control method based on multimodal tasks according to claim 1, characterized in that the dexterous hand includes multiple fingers, each fingertip being equipped with a fingertip tactile sensor, and controlling the dexterous hand to perform the operation task in a grasping mode comprises: controlling the dexterous hand to move to a preset grasping position with the multiple fingers spread out; controlling each finger of the dexterous hand to perform a tightening action, and, when the fingertip tactile sensor detects that any finger is in contact with the target object, stopping the tightening action of the contacting finger, until all fingers are in contact with the target object; and when it is determined that all of the fingers are in contact with the target object, controlling the dexterous hand to grasp the target object and move it to the target position in the operation task.

4. The dexterous hand motion control method based on multimodal tasks according to claim 3, characterized in that, when it is determined that all the fingers are in contact with the target object, controlling the dexterous hand to grasp the target object and move it to the target position in the operation task comprises: when it is determined that all the fingers are in contact with the target object, controlling the fingers to apply a grasping torque, and determining, based on the real-time pressure data collected by the fingertip tactile sensors, whether the resultant normal force vector of the multiple fingers is zero; when the resultant normal force vector of the multiple fingers is not zero, adjusting the grasping torque of each finger according to the real-time pressure data collected by the fingertip tactile sensors so that the resultant normal force vector of the multiple fingers is zero; and when the resultant normal force vector of the multiple fingers is zero, controlling the dexterous hand to grasp the target object and move it to the target position in the operation task.

5. The dexterous hand motion control method based on multimodal tasks according to claim 1, characterized in that the palm of the dexterous hand is equipped with a palm tactile sensor, and the fingertip and finger pad of each finger are equipped with a fingertip tactile sensor and a finger pad tactile sensor, respectively, and controlling the dexterous hand to perform the operation task in a grip mode comprises: controlling the dexterous hand to move to a preset grip position with all fingers spread out; controlling each finger of the dexterous hand to perform a tightening action, and monitoring in real time the tactile sensing data collected by the palm tactile sensor, fingertip tactile sensor and finger pad tactile sensor; when the tactile sensing data is in a stable state, determining the gripping state based on whether the tip of the thumb of the dexterous hand is in contact with the tips of other fingers, wherein the gripping state includes a semi-enclosed state and a fully enclosed state; performing the corresponding stability verification action based on the gripping state; and after the stability verification is passed, controlling the dexterous hand to grasp the target object and move it to the target position in the operation task.

6. The dexterous hand motion control method based on multimodal tasks according to claim 5, characterized in that the gripping state is a semi-enclosed state, and performing the corresponding stability verification action according to the gripping state comprises: controlling the dexterous hand to grasp the target object and raise it to a preset safe height; during the lifting process, monitoring in real time the tactile sensing data collected by the palm tactile sensor, fingertip tactile sensor and finger pad tactile sensor to determine whether the target object has slipped; if no slippage occurs, deeming the stability verification successful; and if slippage occurs, deeming the stability verification unsuccessful and triggering a redesign of the working mode.

7. The dexterous hand motion control method based on multimodal tasks according to claim 5, characterized in that the gripping state is a fully enclosed state, and performing the corresponding stability verification action based on the gripping state comprises: obtaining the maximum diameter of the downward projection of the target object and the internal gripping diameter of the dexterous hand; if the maximum diameter is greater than the internal gripping diameter, deeming the stability verification successful; and if the maximum diameter is less than or equal to the internal gripping diameter, deeming the stability verification unsuccessful and triggering a redesign of the working mode.

8. A dexterous hand motion control system based on multimodal tasks, characterized in that it comprises: one or more processors; and a memory for storing one or more programs, wherein, when the one or more programs are executed by the one or more processors, the one or more processors perform the dexterous hand motion control method based on multimodal tasks as described in any one of claims 1 to 7.

9. A computer storage medium, characterized in that the storage medium stores executable instructions which, when executed by a processor, cause the processor to perform the dexterous hand motion control method based on multimodal tasks as described in any one of claims 1 to 7.