A robot single instruction grasp method and system based on visual perception

By using visual recognition and dynamic correction of the friction coefficient, combined with mechanical contact networks and interactive dynamics models, the robot grasping method is optimized, solving the problems of stable grasping of target objects of different materials and disturbance of nearby objects, and achieving a more stable and accurate grasping effect.

CN121870779B · Active · Publication Date: 2026-06-16 · HANGZHOU YUDAO ARTIFICIAL INTELLIGENCE TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: HANGZHOU YUDAO ARTIFICIAL INTELLIGENCE TECHNOLOGY CO LTD
Filing Date: 2026-03-17
Publication Date: 2026-06-16

AI Technical Summary

Technical Problem

Existing robotic grasping methods struggle to stably grasp target objects of different materials and are prone to disturbing nearby objects during the grasping process. In particular, under special conditions such as when the target object is wet, the friction coefficient correction is inaccurate and the collision avoidance detection is insufficient.

Method used

By visually identifying the surface material of the target object and the robotic arm, the friction coefficient is dynamically corrected, and a mechanical contact network and interactive dynamics model are constructed to predict disturbance risks, adjust the gripping force and trajectory, and optimize the gripping strategy using machine learning.

Benefits of technology

It improves the stability of target object grasping and the anti-collision capability of the robotic arm, reduces the disturbance to nearby objects, and adapts to different surface conditions.

✦ Generated by Eureka AI based on patent content.

Figure CN121870779B_ABST

Abstract

The application discloses a single-instruction robot grasping method and system based on visual perception. The method comprises the following steps: acquiring image information of a target object and the objects adjacent to it using a visual sensor, and identifying the feature types and position information of the target object and the adjacent objects; identifying, from the image information, the feature states of the manipulator contact surface and of the grasping surface of the target object, adaptively correcting the friction coefficient of the grasping action based on the two feature states, and adjusting the grasping force in the corresponding direction according to the corrected friction coefficient; identifying the contact states between each target object and its adjacent objects based on the image information, and constructing an interactive dynamics model according to the contact states and the motion features of the adjacent objects; and pre-constructing a plurality of simulated grasping actions and trajectories, judging the disturbance risk that each grasping action and trajectory poses to the adjacent objects according to the interactive dynamics model, and adjusting the grasping action and trajectory of the manipulator using machine learning according to the disturbance risks.

Description

Technical Field

[0001] This invention relates to the field of robotics, and in particular to a single-command grasping method and system for robots based on visual perception.

Background Technology

[0002] Currently, robotic grasping methods rely primarily on vision-based pose localization and collision avoidance detection. These existing technologies suffer from the following technical problems. Because the material, shape, and size of the grasped target vary, ensuring a stable grasp is a critical challenge. First, different materials have different coefficients of friction, so the coefficient must be adjusted accordingly; this matters especially when the target or the robotic arm is in a special condition, for example wetted with water, where the coefficient of friction differs from its normal value and requires a dedicated correction. Second, current collision avoidance detection typically treats only the robotic arm or the robot itself as the object to be protected. In practice, checking only the arm or the robot after the target object has been grasped can overlook the coupled motion between the grasped target object and its neighbors, resulting in excessive disturbance to nearby objects during grasping.

Summary of the Invention

[0003] One objective of this invention is to provide a single-command grasping method and system for robots based on visual perception. The method and system provide friction coefficients for different material pairs. The robot identifies the surface material of a target object from visual images and retrieves the corresponding reference friction coefficient from its stored material information. The invention also identifies the adhesion state of the target object and the robotic arm from the visual images and provides an adhesion correction factor that adjusts the friction coefficient according to that state. The invention can therefore supply a dynamically varying friction coefficient as a parameter for the gripping force of the robotic arm, which significantly improves the stability of grasping the target object and lets the grasping process adapt to different surface states of the target object.

[0004] Another objective of this invention is to provide a single-command grasping method and system for robots based on visual perception. The method and system construct a mechanical contact network from the positional relationship between the current target object and its adjacent objects, as identified by visual image recognition. Based on the mechanical contact network and the motion state of the adjacent objects as actually observed by vision, a mechanical transfer equation is constructed. The mechanical transfer equation is then used, taking the domino effect into account, to determine whether the current grasping action of the robotic arm strongly disturbs a neighboring object or triggers a domino disturbance among multiple adjacent objects.

[0005] Another objective of this invention is to provide a single-command grasping method and system for robots based on visual perception. The method and system use a visual perception module to identify objects, including those with spherical and elastic surfaces, and apply a micro-slip correction to the estimate of the maximum static friction force. The invention can therefore estimate friction for contacts involving different curved surfaces, and thus provides an accurate and stable force on the robotic finger in the direction perpendicular to the contact surface, greatly improving the stability of the grasp.

[0006] To achieve at least one of the above-mentioned objectives, the present invention further provides a single-command grasping method for robots based on visual perception, the method comprising:

[0007] The system uses a visual sensor to acquire image information of the target object and its neighboring objects, and identifies the feature types and location information of the target object and its neighboring objects.

[0008] Based on image information, the characteristic states of the contact surface of the robotic arm and the characteristic states of the grasping surface of the target object are identified. The friction coefficient of the grasping action is adaptively corrected based on the two characteristic states, and the grasping force in the corresponding direction is adjusted according to the corrected friction coefficient.

[0009] Based on the image information, the contact state between each target object and its neighboring objects is identified and detected, and an interactive dynamics model is constructed based on the contact state and the motion characteristics of the neighboring objects.

[0010] Multiple simulated grasping actions and trajectories are pre-constructed. The disturbance risk of each grasping action and trajectory to adjacent objects of the target object is judged based on the interaction dynamics model. Machine learning is used to adjust the grasping actions and trajectories of the robotic arm based on the disturbance risk.

[0011] According to a preferred embodiment of the present invention, a target recognition model is used to identify the surface feature state of the target object and the surface feature state of the contact portion of the robot arm. The surface feature state of the target object includes the material feature state of the surface and the feature state of the adhesive on the surface. The surface feature state of the contact portion of the robot arm includes the material feature state of the contact portion surface and the feature state of the adhesive on the contact surface. A pre-configured material reference friction coefficient for the corresponding material pair is obtained, and an adhesive correction factor is obtained based on the identified feature state of the adhesive. The adhesive correction factor is used to correct the material reference friction coefficient. The corrected material reference friction coefficient is combined with the maximum static friction force to estimate and predict the normal force of the corresponding contact surface, which is used to adaptively control the force and direction of the contact portion of the robot arm.

[0012] According to another preferred embodiment of the present invention, the method for adaptively correcting the friction coefficient includes: performing finite element analysis on the contact surface between the contact part of the manipulator and the surface of the target object to obtain a pressure distribution function of the non-uniform pressure contact surface; based on the pressure distribution function and the corresponding local friction coefficient, performing integral processing of the contact area part of the finite element analysis to obtain an area correction factor for the overall contact surface; the area correction factor makes the complete contact surface of the non-uniform pressure have a uniformly corrected friction coefficient.

[0013] According to another preferred embodiment of the present invention, the method for adaptively correcting the friction coefficient includes: when the surface of the target object is identified as spherical, calculating a sliding correction factor under micro-slip; calculating the spherical contact under micro-slip using a modified Cattaneo-Mindlin model; calculating, based on the sliding correction factor, the maximum static friction force under the normal force of the corresponding contact surface when the target object is a sphere; and controlling the magnitude of the normal force of the corresponding contact surface based on the maximum static friction force and the mass of the target object.

[0014] According to another preferred embodiment of the present invention, the interactive dynamics model construction method includes: acquiring the contact states between multiple adjacent objects of the target object, wherein the contact states include point contact, line contact, and surface contact; constructing corresponding contact pairs for the point contact, line contact, and surface contact; constructing a contact network with the contact pairs as edges and each adjacent object and the corresponding target object as vertices; calculating the relative displacement and relative velocity of each grasping action and trajectory relative to the adjacent objects of the target object at the contact point, contact surface, or contact line; and constructing a mechanical transfer equation, wherein the mechanical transfer equation is as follows:

[0015] $$F_{ij} = K_{ij}\,\Delta x_{ij} + D_{ij}\,\Delta v_{ij}$$

[0016] Where the subscripts i and j denote the mechanical interaction between object i and object j, $K_{ij}$ is the contact stiffness matrix of object i with respect to object j, $D_{ij}$ is the damping matrix of object i with respect to object j, $\Delta x_{ij}$ is the relative displacement at the contact position, $\Delta v_{ij}$ is the relative velocity at the contact position, and $F_{ij}$ is the force transmitted by object i to object j.

[0017] According to another preferred embodiment of the present invention, the interactive dynamics model method includes: after identifying the material characteristics of the objects adjacent to the target object using a target recognition algorithm, estimating the mass of each adjacent object in combination with the object volume to obtain the mass M of the single system, and further calculating the disturbance propagation matrix according to the following formula:

[0018] $$P_{ij} = \frac{\partial a_j}{\partial F_i}$$

[0019] Where $P_{ij}$ is the degree of influence of object i on the acceleration of object j, $a_j$ is the acceleration of object j, and $F_i$ is the force acting on object i.

[0020] According to another preferred embodiment of the present invention, the disturbance risk to the objects adjacent to the target object includes a sliding risk and a tipping risk. Assessing the sliding risk requires obtaining the force across each contact surface from the mechanical transfer equation, calculating the normal pressure and tangential force on the contact surface, determining the friction coefficient of each contact surface material, calculating the maximum static friction force according to the normal pressure, calculating the ratio of the tangential force to the maximum static friction force, and judging the sliding risk of the current adjacent object according to that ratio.

[0021] Assessing the tipping risk requires obtaining the external tipping moment acting on each adjacent object and the gravity-restoring moment of that object. The external tipping moment is the dot product of the total moment of the external forces on the contact surface of the adjacent object with the unit vector along the boundary line. The gravity-restoring moment is the product of the estimated weight of the adjacent object (its estimated mass times the gravitational acceleration) and the perpendicular distance from its center of mass to the boundary line. The moment ratio of each adjacent object is calculated from the external tipping moment and the gravity-restoring moment, and the tipping risk of each adjacent object is determined from the moment ratio.

[0022] According to another preferred embodiment of the present invention, the force vector of the manipulator's own disturbance to adjacent objects is obtained, and the force vector of the current adjacent object on another adjacent object is calculated based on the estimated mass of each adjacent object, the force vector of the disturbance to the adjacent object, and the ground friction factor. This force vector is used to calculate the corresponding disturbance propagation matrix. The parameter set of each grasping action of the manipulator, the parameter set of grasping trajectory, and the disturbance propagation matrix are input as feature parameters into the machine learning model. The squared difference between the actual motion state results of adjacent objects detected by the image detection module and the prediction results of the machine learning model is used as the loss function to train the machine learning model. The grasping action and grasping trajectory of the manipulator with the least disturbance are selected as the output grasping strategy.

[0023] To achieve at least one of the above-mentioned objectives, the present invention further provides a vision-based robot single-command grasping system, wherein the system executes the vision-based robot single-command grasping method described above.

[0024] The present invention further provides a computer-readable storage medium storing a computer program, which is executed by a processor to implement the above-described single-instruction grasping method for robots based on visual perception.

Attached Figure Description

[0025] Figure 1 is a flowchart of the single-command grasping method for robots based on visual perception according to the present invention.

Detailed Implementation

[0026] The following description is intended to disclose the present invention and enable those skilled in the art to implement it. The preferred embodiments described below are merely examples, and other obvious variations will occur to those skilled in the art. The basic principles of the invention defined in the following description can be applied to other embodiments, modifications, improvements, equivalents, and other technical solutions that do not depart from the spirit and scope of the invention.

[0027] It is understood that the term "a" should be understood as "at least one" or "one or more," that is, in one embodiment, the number of an element can be one, while in another embodiment, the number of the element can be multiple, and the term "a" should not be understood as a limitation on the number.

[0028] With reference to Figure 1, this invention discloses a single-command grasping method and system for robots based on visual perception, wherein the method mainly includes the following steps:

[0029] S01. Use a visual sensor to acquire image information of the target object and its adjacent objects, and identify the feature type and location information of the target object and its adjacent objects.

[0030] S02. Based on the image information, identify the characteristic state of the contact surface of the robotic arm and the characteristic state of the grasping surface of the target object. Based on the two characteristic states, adaptively correct the friction coefficient of the grasping action and adjust the grasping force in the corresponding direction according to the corrected friction coefficient.

[0031] S03. Based on the image information, identify and detect the contact state between each target object and adjacent objects, and construct an interactive dynamics model based on the contact state and the motion characteristics of adjacent objects.

[0032] S04. Pre-construct multiple simulated grasping actions and trajectories, determine the disturbance risk of each grasping action and trajectory to adjacent objects of the target object based on the interactive dynamics model, and use machine learning to adjust the robotic arm grasping actions and trajectories according to the disturbance risk.
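Steps S01 to S04 end with a choice among pre-constructed candidates. The following Python sketch illustrates only that final selection. `GraspCandidate`, `select_grasp`, and the risk scores are hypothetical names and numbers introduced for illustration; the risk function stands in for the interactive dynamics model and learned predictor that the text describes.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class GraspCandidate:
    """Hypothetical record for one simulated grasp action and trajectory."""
    name: str
    action_params: Tuple[float, ...]
    trajectory_params: Tuple[float, ...]


def select_grasp(
    candidates: Sequence[GraspCandidate],
    risk_fn: Callable[[GraspCandidate], float],
) -> GraspCandidate:
    """Return the candidate with the lowest predicted disturbance risk."""
    if not candidates:
        raise ValueError("at least one candidate grasp is required")
    return min(candidates, key=risk_fn)


# Illustrative candidates and made-up risk scores (assumptions, not patent data).
candidates = [
    GraspCandidate("side_approach", (0.8,), (0.10, 0.00)),
    GraspCandidate("top_approach", (0.6,), (0.00, 0.15)),
]
assumed_risk = {"side_approach": 1.3, "top_approach": 0.4}
best = select_grasp(candidates, lambda c: assumed_risk[c.name])
print(best.name)
```

Keeping the risk function as a parameter lets the same selection logic work with a purely analytical risk estimate or with a trained model.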

[0033] Specifically, this invention requires a pre-built multimodal perception system, based primarily on visual perception, for the target object and its adjacent objects. The multimodal perception system includes pressure sensing through tactile sensors at the contact points of the robotic arm, and the robotic arm also carries a force-sensing device for disturbances to the surrounding environment. Visual perception acquires images of the target object and adjacent objects using devices including, but not limited to, RGB-D cameras. The invention also provides a target recognition model, which may be, but is not limited to, a YOLO model, to identify and distinguish the target object from its adjacent objects. The target recognition model can also identify the material characteristics and the adhesion state of the current contact portions of the robotic arm and the target object. Because the friction coefficient between the surface material of the robotic arm and a target object depends on the materials involved, the invention requires base friction coefficient values to be pre-built for different material pairs. Because the base friction coefficient of a material is also influenced by surface morphology, texture, and adhering substances, the invention applies multi-dimensional corrections to the base friction coefficient, which makes the robotic arm's estimate of the maximum static friction force more accurate and reduces the effect of different surface states on the estimated friction coefficient.

[0034] It should be noted that in this invention a single instruction denotes the complete sequence of actions the robotic arm performs to execute the corresponding task instruction, including the rotation direction and speed of each of the six degrees of freedom of a conventional robotic arm, the extension direction and speed of the arm, and so on; these comprise different motion parameters and trajectory parameters.

[0035] Based on the identified target object and adjacent objects, this invention further employs a vocabulary segmentation network for semantic segmentation and combines it with a physical attribute estimation network to detect and define the contact state between adjacent objects. The contact states include point contact, line contact, and surface contact. Each contact is defined as a contact pair (i, j), representing contact between object i and object j. An interactive dynamics model is constructed from the contact pairs among the adjacent objects and the target object. This model includes a mechanical transfer equation, which describes the mechanical motion relationship between the adjacent objects that are linked by contact pairs to the object the robotic arm is currently contacting. Specifically, the invention uses the mechanical transfer equation to assess the chain effect across multiple contact pairs of adjacent objects. Because the mechanical transfer equation is an estimate, it may contain errors caused by unknown environmental influences or by differences in the characteristics of each object, and these errors cannot be computed accurately from specific measurable parameters. The invention therefore treats, without limitation, the output of the mechanical transfer equation as a simulated quantity and the actual mechanical motion state detected by the visual sensor as a detected quantity. A machine learning model fits the simulated quantity to the detected quantity, enabling the invention to accurately predict how each grasping action and path of the robotic arm affects the motion state of adjacent objects. The machine learning model may be an existing, mature model such as a CNN and is not described in detail here.

[0036] It is worth mentioning that the present invention provides calculations of contact surface area correction factor, sliding correction factor and adhesion correction factor, which are used to correct the basic friction factor of the material pair. The corrected friction factor is used to reasonably and accurately estimate the magnitude of the normal pressure of the robot arm, thereby enabling the present invention to significantly improve the robot arm's grasping control of the target object and reduce the problem of inaccurate force control caused by abnormal friction factor.

[0037] Specifically, the area correction factor is calculated as follows: a recognition model determines whether the contact surface of the current target object is a non-uniform contact surface; if it is, the area correction factor is calculated according to the following formula:

[0038] [formula image not available]

[0039] The formula uses the pressure distribution function measured by the pressure probe at the contact portion of the robotic arm, the contact area $A$, the local friction coefficient function, the average pressure $P_{avg}$, and the spatial coordinates x, y, and z on the contact surface. Note that the area correction factor assumes the contact surface may consist of several different structures formed from different materials, for example a contact surface that is partly metal and partly glass. Applying the area correction factor to such surfaces makes the robotic arm more stable when grasping target objects with non-uniform contact surfaces. The contact surface structure can be analyzed by finite element analysis, for example by generating 3D or 2D point clouds from images and then performing finite element analysis on them; this is not elaborated further here.
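The exact area correction formula is not available in this text. One reading consistent with the listed quantities (pressure distribution, local friction coefficient, contact area, average pressure) is a pressure-weighted mean of the local friction coefficient over the contact patches. The sketch below implements that reading; the function names, the discretization into patches, and the normalization by a base coefficient `mu_base` are all assumptions made for illustration.

```python
from typing import Sequence, Tuple

# Each patch is (area_m2, pressure_pa, local_friction_coefficient).
Patch = Tuple[float, float, float]


def effective_friction(patches: Sequence[Patch]) -> float:
    """Pressure-weighted mean of the local friction coefficient.

    Discrete form of (1 / (A * P_avg)) * integral of p * mu over the area,
    using the identity A * P_avg = sum(p_k * dA_k), the total normal load.
    """
    total_load = sum(area * p for area, p, _ in patches)
    if total_load <= 0.0:
        raise ValueError("total normal load must be positive")
    friction_load = sum(area * p * mu for area, p, mu in patches)
    return friction_load / total_load


def area_correction_factor(patches: Sequence[Patch], mu_base: float) -> float:
    """Assumed normalization: effective coefficient divided by the base one."""
    if mu_base <= 0.0:
        raise ValueError("mu_base must be positive")
    return effective_friction(patches) / mu_base


# Two equal-area patches: a lightly loaded one with higher friction and a
# heavily loaded one with lower friction (illustrative numbers only).
patches = [(1e-4, 1000.0, 0.5), (1e-4, 3000.0, 0.3)]
factor = area_correction_factor(patches, mu_base=0.5)
print(round(factor, 3))
```

In practice the patch values would come from the finite element analysis and tactile pressure readings mentioned in the text.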

[0040] The sliding correction factor is calculated as follows:

[0041] [formula image not available]

[0042] The formula uses a material-dependent coefficient, the relative tangential velocity $v$, and the reference velocity $v_0$, which is preferably 0.001 m/s. The sliding correction factor accounts for the influence of micro-slip on the friction coefficient in certain special structures, such as spherical structures. When the target object surface is identified as spherical, the sliding correction factor under micro-slip is calculated, and the spherical contact under micro-slip is calculated using a modified Cattaneo-Mindlin model. The maximum static friction force of this spherical contact is calculated using the following formula:

[0043] [formula image not available]

[0044] Where $a$ is the radius of the contact circle, $a_s$ is the radius of the sliding zone, $U$ is the equivalent elastic modulus, $R$ is the equivalent radius of curvature, and $\mu_s$ is the corrected static friction coefficient.
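The modified Cattaneo-Mindlin formula used by the patent is not available in this text. For orientation only, the sketch below evaluates the classical, unmodified relations: the Hertz contact-circle radius for a sphere and the classical Cattaneo-Mindlin radius of the inner stick region under partial slip. It is a swapped-in textbook model, not the patent's formula, and the function and parameter names (`e_star` for the equivalent elastic modulus, for example) are assumptions.

```python
def hertz_contact_radius(normal_force: float, radius: float, e_star: float) -> float:
    """Hertz contact-circle radius: a = (3 N R / (4 E*)) ** (1/3)."""
    if normal_force < 0.0 or radius <= 0.0 or e_star <= 0.0:
        raise ValueError("need N >= 0, R > 0 and E* > 0")
    return (3.0 * normal_force * radius / (4.0 * e_star)) ** (1.0 / 3.0)


def stick_zone_radius(
    contact_radius: float, tangential_force: float, mu_s: float, normal_force: float
) -> float:
    """Classical Cattaneo-Mindlin stick radius: c = a * (1 - Q / (mu N)) ** (1/3).

    The annulus between c and a is the micro-slip region. When the tangential
    load Q reaches mu * N the stick region vanishes and gross sliding begins.
    """
    limit = mu_s * normal_force  # Coulomb limit, i.e. maximum static friction
    if limit <= 0.0:
        raise ValueError("mu_s * normal_force must be positive")
    q = abs(tangential_force)
    if q >= limit:
        return 0.0
    return contact_radius * (1.0 - q / limit) ** (1.0 / 3.0)


# Illustrative values: 10 N normal load, 10 mm sphere, E* = 1 GPa.
a = hertz_contact_radius(normal_force=10.0, radius=0.01, e_star=1.0e9)
c = stick_zone_radius(a, tangential_force=3.5, mu_s=0.4, normal_force=10.0)
print(a, c)
```

A shrinking stick radius as the tangential load approaches the Coulomb limit is the effect a micro-slip correction is meant to capture.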

[0045] The adhesion correction factor is calculated as follows: the influence coefficients of different adhesion types on friction are pre-built, the magnitude level of each adhesion is determined, and the adhesion correction factor is calculated using the following formula:

[0046] [formula image not available] Here γ is the frictional influence coefficient of the adhering substance, γ ∈ [-1, 1], and C is the magnitude level, C ∈ [0, 1]. The magnitude level is the coverage ratio of the adhering substance: 0 indicates a clean surface with no adhesion, and 1 indicates almost complete coverage.
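Because the adhesion formula itself is not available here, the sketch below assumes the simplest form compatible with the stated ranges: a linear factor `1 + gamma * coverage`, which equals 1 for a clean surface and lowers or raises friction depending on the sign of γ. Both the linear form and the function name are assumptions, not the patent's formula.

```python
def adhesion_correction_factor(gamma: float, coverage: float) -> float:
    """Assumed linear form: k_adh = 1 + gamma * coverage.

    gamma    -- frictional influence coefficient of the adhering substance, in [-1, 1]
    coverage -- coverage ratio C of the adhering substance, in [0, 1]
    """
    if not -1.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [-1, 1]")
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must lie in [0, 1]")
    return 1.0 + gamma * coverage


# A lubricating film (negative gamma, illustrative value) covering 40% of the surface.
k_wet = adhesion_correction_factor(gamma=-0.5, coverage=0.4)
print(round(k_wet, 3))
```

Under this assumed form the factor stays within [0, 2], so a fully covered surface with γ = -1 would drive the estimated friction to zero.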

[0047] The material-pair base static friction coefficient $\mu_i$ is corrected by multiplying it by the adhesion correction factor, the sliding correction factor, and the area correction factor; the product is the corrected static friction coefficient $\mu_s$. The maximum static friction force is then estimated together with the estimated weight $mg$ of the target object, and the normal force $N$ at the corresponding contact surface is set so that the corresponding maximum static friction force is greater than $mg$.
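As a worked illustration of this paragraph, the sketch below multiplies a base coefficient by the three correction factors and then picks a normal force whose Coulomb friction limit exceeds the object's weight. The function names, the `safety_margin` parameter, and the numbers are assumptions; the single-contact treatment follows the wording of the text rather than any particular gripper geometry.

```python
G = 9.81  # gravitational acceleration in m/s^2


def corrected_static_friction(
    mu_base: float, k_adhesion: float, k_sliding: float, k_area: float
) -> float:
    """Corrected coefficient as the product of the base value and the three factors."""
    mu_s = mu_base * k_adhesion * k_sliding * k_area
    if mu_s <= 0.0:
        raise ValueError("corrected friction coefficient must be positive")
    return mu_s


def required_normal_force(mass_kg: float, mu_s: float, safety_margin: float = 1.2) -> float:
    """Normal force N such that mu_s * N = safety_margin * m * g.

    A margin above 1 keeps the maximum static friction strictly above the weight.
    """
    if safety_margin <= 1.0:
        raise ValueError("safety_margin must be greater than 1")
    return safety_margin * mass_kg * G / mu_s


# Illustrative case: base coefficient 0.5, wet surface factor 0.8, no other corrections.
mu_s = corrected_static_friction(0.5, k_adhesion=0.8, k_sliding=1.0, k_area=1.0)
n_required = required_normal_force(mass_kg=0.2, mu_s=mu_s, safety_margin=1.5)
print(round(mu_s, 3), round(n_required, 4))
```

A lower corrected coefficient directly raises the required normal force, which is why the wet-surface case in the background section calls for a firmer grip.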

[0048] It is worth mentioning that the interactive dynamics model of this invention describes how grasping the target object disturbs adjacent objects, and it comprises a mechanical transfer equation and a disturbance propagation matrix. The mechanical transfer equation is obtained by: constructing a contact network with the contact pairs as edges and each adjacent object and the corresponding target object as vertices; calculating, for each grasping action and trajectory, the relative displacement and relative velocity with respect to the adjacent objects at the contact point, contact line, or contact surface; and constructing the mechanical transfer equation:

[0049] $$F_{ij} = K_{ij}\,\Delta x_{ij} + D_{ij}\,\Delta v_{ij}$$

[0050] Where the subscripts i and j denote the mechanical interaction between object i and object j, $K_{ij}$ is the contact stiffness matrix of object i with respect to object j, $D_{ij}$ is the damping matrix of object i with respect to object j, $\Delta x_{ij}$ is the relative displacement at the contact position, $\Delta v_{ij}$ is the relative velocity at the contact position, and $F_{ij}$ is the force transmitted by object i to object j. In this invention, the tangential force may be calculated preferentially. The above mechanical transfer equation is based on the interaction at the contact surfaces. The mechanical transfer equation also needs to account for the bottom gravity contact surface, especially in scenarios with sliding friction. In the extended formula that includes the friction of the bottom gravity contact surface [formula image not available], k is the identifier of the previous-level contact pair in the contact network, r is the identifier of another contact pair in contact with i, $m_k$ is the estimated mass of object k, and $\mu_d$ is the coefficient of static friction of the gravity contact surface. When there is sliding between the corresponding lower contact objects, the corresponding force can be obtained by multiplying the gravity by the friction coefficient of the bottom gravity contact surface.
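The contact network and transfer equation can be sketched as a small graph whose edges carry stiffness and damping matrices. The code assumes the linear stiffness-plus-damping form suggested by the definitions of the stiffness matrix and damping matrix; the object names, the isotropic matrices, and all numeric values are illustrative assumptions, and the extended bottom-surface friction term is not modeled.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def diag(value: float, size: int = 3) -> Matrix:
    """Isotropic matrix with `value` on the diagonal (a simplifying assumption)."""
    return [[value if r == c else 0.0 for c in range(size)] for r in range(size)]


def mat_vec(matrix: Matrix, vector: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def transfer_force(stiffness: Matrix, damping: Matrix, rel_disp: Vector, rel_vel: Vector) -> Vector:
    """Force passed across a contact pair: F_ij = K_ij * dx_ij + D_ij * dv_ij."""
    elastic = mat_vec(stiffness, rel_disp)
    viscous = mat_vec(damping, rel_vel)
    return [e + d for e, d in zip(elastic, viscous)]


# Contact network: vertices are object ids, edges are contact pairs (i, j)
# carrying (K_ij, D_ij). Names and values are hypothetical.
contact_network: Dict[Tuple[str, str], Tuple[Matrix, Matrix]] = {
    ("target", "cup"): (diag(1000.0), diag(5.0)),
    ("cup", "bottle"): (diag(800.0), diag(4.0)),
}

k_ij, d_ij = contact_network[("target", "cup")]
force = transfer_force(k_ij, d_ij, rel_disp=[0.001, 0.0, 0.0], rel_vel=[0.02, 0.0, 0.0])
print(force)
```

Walking the edges outward from the target object and feeding each transmitted force into the next pair is one way to follow the chain effect described in the text.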

[0051] Furthermore, after the material characteristics of the objects adjacent to the target object are identified using the aforementioned target recognition algorithm, the mass of each adjacent object is estimated in combination with the object volume, and the disturbance propagation matrix is then calculated according to the following formula:

[0052] $$P_{ij} = \frac{\partial a_j}{\partial F_i}$$

[0053] Where $P_{ij}$ is the degree of influence of object i on the acceleration of object j, $a_j$ is the acceleration of object j, and $F_i$ is the force acting on object i. The accelerations in the disturbance propagation matrix can be calculated using the aforementioned mechanical transfer equation and its extended formula.
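Assuming the propagation matrix entry is the sensitivity of object j's acceleration to the force on object i, it can be estimated numerically by central finite differences on any function that maps forces to accelerations. The sketch below uses scalar (one-dimensional) forces and a made-up two-object model; `toy_accelerations`, its masses, and its coupling value are hypothetical stand-ins for the mechanical transfer equations.

```python
from typing import Callable, List, Sequence


def disturbance_propagation_matrix(
    accel_fn: Callable[[Sequence[float]], List[float]],
    forces: Sequence[float],
    step: float = 1e-3,
) -> List[List[float]]:
    """Estimate P[i][j] = d a_j / d F_i by central finite differences."""
    matrix: List[List[float]] = []
    for i in range(len(forces)):
        plus, minus = list(forces), list(forces)
        plus[i] += step
        minus[i] -= step
        a_plus, a_minus = accel_fn(plus), accel_fn(minus)
        matrix.append([(ap - am) / (2.0 * step) for ap, am in zip(a_plus, a_minus)])
    return matrix


def toy_accelerations(forces: Sequence[float]) -> List[float]:
    """Hypothetical model: half of the force on object 0 is passed on to object 1."""
    m0, m1 = 1.0, 2.0  # assumed estimated masses in kg
    coupling = 0.5     # assumed transmitted fraction
    return [forces[0] / m0, (forces[1] + coupling * forces[0]) / m1]


P = disturbance_propagation_matrix(toy_accelerations, forces=[1.0, 0.0])
print(P)
```

Large off-diagonal entries flag neighbors whose motion is strongly coupled to a push on another object.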

[0054] It should be noted that the disturbance risk to the objects adjacent to the target object in this invention includes a sliding risk and a tipping risk. Assessing the sliding risk requires obtaining the forces across each contact surface from the mechanical transfer equation, calculating the normal pressure and tangential force on the contact surfaces, determining the friction coefficient of each contact surface material, calculating the maximum static friction force according to the normal pressure, calculating the ratio of the tangential force to the maximum static friction force, and determining the sliding risk of the current adjacent object from that ratio. The specific calculation method includes:

[0055] When an external force vector $F_{ext}$ exists, its tangential component $F_t$ and normal component $N$ are calculated, and the friction coefficient $\mu_s$ of the corresponding gravity contact surface is looked up through the material mapping. The maximum static friction force is then $F_{\max} = \mu_s N$, and the slip risk is the ratio $F_t / F_{\max}$. When the ratio is greater than 1, object j has a high risk of sliding; when it equals 1, object j is in the critical state; when it is less than 1, object j has a low risk of sliding.
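The slip check reduces to comparing the tangential load with the Coulomb limit. A minimal sketch follows; the function names and the tolerance used to detect the critical state are assumptions.

```python
import math


def slip_risk_ratio(tangential_force: float, normal_force: float, mu_s: float) -> float:
    """Ratio of tangential force to maximum static friction, F_t / (mu_s * N)."""
    f_max = mu_s * normal_force
    if f_max <= 0.0:
        raise ValueError("mu_s * normal_force must be positive")
    return abs(tangential_force) / f_max


def classify_ratio(ratio: float, tol: float = 1e-9) -> str:
    """Map a risk ratio to 'low', 'critical' or 'high' around the threshold 1."""
    if math.isclose(ratio, 1.0, abs_tol=tol):
        return "critical"
    return "high" if ratio > 1.0 else "low"


# Illustrative loads on a neighbor resting with 10 N normal force and mu_s = 0.4.
for f_t in (2.0, 6.0):
    ratio = slip_risk_ratio(f_t, normal_force=10.0, mu_s=0.4)
    print(f_t, classify_ratio(ratio))
```

Because measured forces are noisy, an exact ratio of 1 is rare; a real system would more likely treat a band around 1 as critical.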

[0056] In one preferred embodiment of the present invention, assessing the tipping risk requires obtaining the external tipping moment acting on each adjacent object and the gravity restoring moment of that object. The external tipping moment is the dot product of the total moment of the external forces on the contact surface of the adjacent object with the unit vector along the boundary line. The gravity restoring moment is the product of the estimated weight of the adjacent object (its estimated mass times the gravitational acceleration) and the perpendicular distance from its center of mass to the boundary line. The moment ratio of each adjacent object is calculated from the external tipping moment and the gravity restoring moment, and the tipping risk of each adjacent object is determined from the moment ratio. The specific calculation method includes:

[0057] $R_{j,e}^{tip} = M_{ext} / M_{g}$, where $R_{j,e}^{tip}$ represents the tipping moment ratio of object $j$ about boundary line $e$, $M_{ext}$ represents the external tipping moment, and $M_{g}$ represents the gravity restoring moment. When the tipping moment ratio is less than 1, the current risk of tipping is low; when the ratio equals 1, the object is in a critical state with a moderate risk of tipping; when the ratio is greater than 1, the risk of tipping is relatively high. The external tipping moment is calculated as $M_{ext} = (r \times F_{ext}) \cdot \hat{e}$, where $r$ represents the lever arm vector from a point on the boundary line to the point of application of the external force, $F_{ext}$ represents the external force vector, and $\hat{e}$ represents the unit vector along the boundary line. The gravity restoring moment is calculated as $M_{g} = m_j g d_j$, where $m_j$ represents the estimated mass of object $j$, $g$ represents the acceleration due to gravity, and $d_j$ represents the perpendicular distance from the centroid of object $j$ to the boundary line.
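As an illustration of the tipping check, the following Python sketch computes the moment ratio from a lever arm, an external force, a boundary-line direction, an estimated mass, and a centroid distance. The function names and the example numbers are assumptions, and the sketch further assumes the boundary-line direction is oriented so that a positive moment about it tips the object outward over that edge.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2


def tipping_ratio(r, f_ext, edge_dir, mass, centroid_dist):
    """Ratio of external tipping moment to gravity restoring moment about one edge."""
    e = np.asarray(edge_dir, dtype=float)
    e_hat = e / np.linalg.norm(e)                  # unit vector along the boundary line
    moment = np.cross(np.asarray(r, dtype=float),  # moment of the force about a point on the edge
                      np.asarray(f_ext, dtype=float))
    m_ext = float(np.dot(moment, e_hat))           # component of the moment about the edge
    m_g = mass * G * centroid_dist                 # gravity restoring moment
    return m_ext / m_g


def tipping_level(ratio, tol=1e-6):
    """Map the moment ratio to the three risk levels."""
    if abs(ratio - 1.0) <= tol:
        return "critical"
    return "high" if ratio > 1.0 else "low"


# Example: 5 N horizontal push applied 0.2 m above the tipping edge,
# object mass 1 kg, centroid 0.05 m from the edge (horizontally).
ratio = tipping_ratio([0.0, 0.0, 0.2], [5.0, 0.0, 0.0], [0.0, 1.0, 0.0], 1.0, 0.05)
print(ratio, tipping_level(ratio))
```

With this sign convention a negative ratio means the external force presses the object back onto its base, so it is reported as low risk. When several external forces act, their moments would be summed before taking the dot product with the edge direction.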

[0058] It is worth mentioning that the simulated quantities calculated with the above formulas are used to make an initial determination of the simulated motion state of the target object and its adjacent objects. The present invention can also obtain the real motion state of the target object and its adjacent objects from image data, and performs machine learning on the real detected quantities and the simulated quantities so that the simulated quantities become more consistent with the detected motion state. In this way, the influence of the robotic arm on adjacent objects for a given grasping action and path can be predicted accurately even though the scene is largely a black box. It should be noted that in the present invention, only the sets of action parameters and trajectory parameters from successful grasps are used as effective data for machine learning; if a grasp fails, its data is not used to form the training data.
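The rule that only successful grasps contribute training data can be sketched as a simple filter. The record layout below (`GraspRecord` and its fields) is a hypothetical structure chosen for illustration; the patent does not specify how the data is stored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GraspRecord:
    """One grasp attempt (hypothetical layout, not taken from the patent)."""
    action_params: Tuple[float, ...]       # grasping action parameters
    trajectory_params: Tuple[float, ...]   # grasping trajectory parameters
    simulated_motion: Tuple[float, ...]    # motion of adjacent objects predicted by the formulas
    observed_motion: Tuple[float, ...]     # motion of adjacent objects detected from images
    success: bool                          # whether the grasp succeeded


def build_training_set(records: List[GraspRecord]):
    """Keep successful grasps only; pair input features with the observed motion."""
    samples = []
    for rec in records:
        if not rec.success:
            continue  # failed grasps are not used as training data
        features = rec.action_params + rec.trajectory_params + rec.simulated_motion
        samples.append((features, rec.observed_motion))
    return samples


records = [
    GraspRecord((0.5,), (0.1, 0.2), (0.01,), (0.012,), True),
    GraspRecord((0.9,), (0.3, 0.1), (0.05,), (0.2,), False),
]
print(build_training_set(records))
```

Pairing the simulated quantity with the observed one in each sample reflects the stated goal of making the simulated quantities track the detected motion state; how the model consumes these pairs is left open here.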

[0059] The specific implementation method includes: obtaining the force vector of the disturbance that the manipulator itself applies to adjacent objects; calculating the force vector exerted by the current adjacent object on another adjacent object based on the estimated mass of each adjacent object, the force vector of the disturbance to adjacent objects, and the ground friction factor; using this force vector to calculate the corresponding disturbance propagation matrix; inputting the set of single-instruction parameters corresponding to each grasping action and grasping trajectory of the manipulator, together with the disturbance propagation matrix, as feature parameters into a machine learning model; training the machine learning model using, as the loss function, the squared difference between the actual motion state of adjacent objects detected by the image detection module and the prediction of the machine learning model; and selecting the grasping action and grasping trajectory with the minimum disturbance as the output grasping strategy. It should be noted that the minimum disturbance can be found by directly calculating and comparing the sum of all elements of the disturbance propagation matrix for each candidate.
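Two pieces of this procedure lend themselves to a short Python sketch: the squared-difference loss and the selection of the candidate whose disturbance propagation matrix has the smallest element sum. The function names, candidate labels, and matrix values are illustrative assumptions, and the machine learning model itself is not shown.

```python
import numpy as np


def squared_error_loss(predicted, observed):
    """Sum of squared differences between predicted and detected motion states."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.sum(diff ** 2))


def select_min_disturbance(candidates):
    """Pick the candidate whose disturbance propagation matrix has the smallest element sum.

    `candidates` maps a candidate label to its matrix. A plain sum follows the text;
    summing absolute values instead may be preferable if entries can be negative.
    """
    scores = {name: float(np.sum(np.asarray(p, dtype=float)))
              for name, p in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Two hypothetical grasp candidates with made-up propagation matrices.
candidates = {
    "top_grasp": [[0.10, 0.20], [0.00, 0.30]],
    "side_grasp": [[0.05, 0.10], [0.00, 0.10]],
}
best, scores = select_min_disturbance(candidates)
print(best, scores)
print(squared_error_loss([1.0, 2.0], [1.0, 4.0]))
```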

[0060] This invention also provides a terminal device, which includes a processor, a memory, and a computer program stored in the memory and executable on the processor. The processor executes the computer program to implement the steps of any of the data processing methods in the embodiments of this application. Specifically, this terminal device integrates any of the data processing methods provided in the embodiments of this application.

[0061] The terminal device may include a processor with one or more processing cores, a memory with one or more computer-readable storage media, a power supply, and input units, etc. Those skilled in the art will understand that the terminal device structure does not constitute a limitation on the terminal device, and may include more or fewer components than shown in the figures, or combine certain components, or have different component arrangements. Wherein:

[0062] The processor is the control center of the terminal device. It connects various parts of the terminal device via various interfaces and lines, and performs various functions and processes data by running or executing software programs and / or modules stored in memory, and by calling data stored in memory, thereby providing overall monitoring of the terminal device. Optionally, the processor may include one or more processing cores; the processor may be a Central Processing Unit (CPU), or other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field-Programmable Gate Arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or any conventional processor. Preferably, the processor may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, and applications, and the modem processor mainly handles wireless communication. It is understood that the modem processor may also not be integrated into the processor.

[0063] Memory can be used to store software programs and modules. The processor executes various functional applications and data processing methods by running the software programs and modules stored in memory. Memory can mainly include a program storage area and a data storage area. The program storage area can store the operating system, application programs required for at least one function (such as sound playback, image playback, etc.), etc.; the data storage area can store data created based on the use of the terminal device. Furthermore, memory can include high-speed random access memory, and can also include non-volatile memory, such as at least one disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, memory can also include a memory controller to provide the processor with access to the memory.

[0064] The terminal device also includes a power supply for powering various components. Preferably, the power supply can be connected to the processor logic through a power management system, thereby enabling functions such as charging, discharging, and power consumption management through the power management system. The power supply may also include one or more DC or AC power supplies, recharging systems, power fault detection circuits, power converters or inverters, power status indicators, and other arbitrary components. The terminal device may also include an input unit, which can be used to receive input digital or character information and generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.

[0065] Although not shown, the terminal device may also include a display unit, etc., which will not be described in detail here. Specifically, in this embodiment, the processor in the terminal device loads the executable files corresponding to the processes of one or more applications into the memory 402 according to the following instructions, and the processor runs the applications stored in the memory.

[0066] Those skilled in the art should understand that the embodiments of the present invention described above and shown in the accompanying drawings are merely examples and do not limit the present invention. The purpose of the present invention has been fully and effectively achieved. The functions and structural principles of the present invention have been shown and explained in the embodiments. Without departing from the stated principles, the implementation of the present invention may have any variations or modifications.

Claims

1. A single-instruction grasping method for robots based on visual perception, characterized in that the method includes: using a visual sensor to acquire image information of the target object and its adjacent objects, and identifying the feature types and position information of the target object and its adjacent objects; identifying, based on the image information, the feature state of the contact surface of the robotic arm and the feature state of the grasping surface of the target object, adaptively correcting the friction coefficient of the grasping action based on the two feature states, and adjusting the grasping force in the corresponding direction according to the corrected friction coefficient; identifying, based on the image information, the contact state between each target object and its adjacent objects, and constructing an interactive dynamics model based on the contact state and the motion characteristics of the adjacent objects; pre-constructing multiple simulated grasping actions and trajectories, judging the disturbance risk of each grasping action and trajectory to the objects adjacent to the target object based on the interactive dynamics model, and adjusting the grasping action and trajectory of the robotic arm using machine learning according to the disturbance risk; wherein the disturbance risk to the objects adjacent to the target object includes a sliding risk and a tipping risk; the sliding risk is determined by obtaining the forces between the contact surfaces according to the mechanical transfer equation, calculating the normal pressure and tangential force of each contact surface, calculating the friction coefficient of each contact surface material based on the normal pressure, calculating the maximum static friction force, calculating the ratio of the tangential force to the maximum static friction force, and judging the sliding risk of the current adjacent object based on the ratio;
the tipping risk is determined by obtaining the external tipping moment of the adjacent objects and the gravity restoring moment of the adjacent objects themselves, wherein the external tipping moment is obtained as the dot product of the total moment of the external forces on the contact surface of the adjacent object and the unit vector along the boundary line, the gravity restoring moment is obtained as the product of the estimated mass of the adjacent object and the perpendicular distance from its center of mass to the boundary line, the moment ratio of each adjacent object is calculated based on the external tipping moment and the gravity restoring moment, and the tipping risk of each adjacent object is determined based on the moment ratio.

2. The single-instruction grasping method for robots based on visual perception according to claim 1, characterized in that the surface feature states of the target object and of the contact portion of the robot arm are identified using a target recognition model; the surface feature states of the target object include the material feature state of the surface and the feature state of the adhesives on the surface; the surface feature states of the contact portion of the robot arm include the material feature state of the contact portion surface and the feature state of the adhesives on the contact surface; a pre-configured material reference friction coefficient for the corresponding material pair is obtained, and an adhesive correction factor is obtained based on the identified feature states of the adhesives; the adhesive correction factor is used to correct the material reference friction coefficient; and the normal force of the corresponding contact surface is estimated based on the corrected material reference friction coefficient and the maximum static friction force, and is used to adaptively control the force and direction of the contact portion of the robot arm.

3. The single-instruction grasping method for robots based on visual perception according to claim 1, characterized in that the adaptive correction method for the friction coefficient includes: performing finite element analysis on the contact surface between the contact part of the robot and the surface of the target object to obtain the pressure distribution function of the contact surface under non-uniform pressure; and obtaining the area correction factor of the overall contact surface based on the pressure distribution function and the corresponding local friction coefficient, combined with integration over the contact area from the finite element analysis; wherein the area correction factor gives the complete contact surface under non-uniform pressure a uniformly corrected friction coefficient.

4. The single-instruction grasping method for robots based on visual perception according to claim 1, characterized in that the adaptive correction method for the friction coefficient includes: when the surface of the target object is identified as a sphere, calculating a sliding correction factor under minute sliding, wherein the spherical contact under minute sliding is calculated using a modified Cattaneo-Mindlin model; further calculating, based on the sliding correction factor, the maximum static friction force under the normal force of the corresponding contact surface when the target object is a sphere; and estimating and controlling the magnitude of the normal force of the corresponding contact surface based on the maximum static friction force and the mass of the target object.

5. The single-instruction grasping method for robots based on visual perception according to claim 1, characterized in that the interactive dynamics model construction method includes: acquiring the contact states between multiple adjacent objects of the target object, wherein the contact states include point contact, line contact, and surface contact; constructing corresponding contact pairs for the point contact, line contact, and surface contact; constructing a contact network using the contact pairs as edges and each adjacent object and the corresponding target object as vertices; calculating the relative displacement and relative velocity of each grasping action and trajectory relative to the adjacent objects of the target object at the contact point, contact surface, or contact line; and constructing a mechanical transfer equation, wherein the mechanical transfer equation is as follows: $F_{ij} = K_{ij}\,\Delta x_{ij} + D_{ij}\,\Delta v_{ij}$; where the subscripts $i$ and $j$ denote the mechanical interaction between object $i$ and object $j$, $K_{ij}$ represents the contact stiffness matrix of object $i$ with respect to object $j$, $D_{ij}$ represents the damping matrix of object $i$ with respect to object $j$, $\Delta x_{ij}$ represents the relative displacement at the contact position, $\Delta v_{ij}$ represents the relative velocity at the contact position, and $F_{ij}$ represents the force transmitted by object $i$ to object $j$.

6. The single-instruction grasping method for robots based on visual perception according to claim 1, characterized in that the interactive dynamics model method includes: after identifying the material characteristics of the adjacent objects of the target object using a target recognition algorithm, estimating the mass of each adjacent object in conjunction with the object's volume, and further calculating the disturbance propagation matrix according to the following formula: $P_{ij} = \partial a_j / \partial F_i$; where $P_{ij}$ represents the degree of influence of object $i$ on the acceleration of object $j$, $a_j$ represents the acceleration of object $j$, and $F_i$ represents the force acting on object $i$.

7. The single-instruction grasping method for robots based on visual perception according to claim 1, characterized in that the force vector of the disturbance that the robot arm itself applies to adjacent objects is obtained, and the force vector exerted by the current adjacent object on another adjacent object is calculated based on the estimated mass of each adjacent object, the force vector of the disturbance to adjacent objects, and the ground friction factor; this force vector is used to calculate the corresponding disturbance propagation matrix; the robot arm's grasping action parameter set, grasping trajectory parameter set, and the disturbance propagation matrix are input as feature parameters into the machine learning model; the machine learning model is trained using, as the loss function, the squared difference between the actual motion state of adjacent objects detected by the image detection module and the prediction of the machine learning model; and the robot arm grasping action and grasping trajectory with the least disturbance are selected as the output grasping strategy.

8. A single-instruction grasping system for robots based on visual perception, characterized in that the system executes the single-instruction grasping method for robots based on visual perception as described in any one of claims 1-7.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed by a processor, implements the single-instruction grasping method for robots based on visual perception as described in any one of claims 1-7.