Robot teaching method, apparatus, device, and computer-readable storage medium

CN116079703B · Active · Publication Date: 2026-06-30 · SOUTH CHINA UNIV OF TECH +1


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: SOUTH CHINA UNIV OF TECH
Filing Date: 2021-11-05
Publication Date: 2026-06-30

Application Information


IPC: B25J9/16

AI Tagging

Technology Topics

Simulation · Control engineering

[Figure: CN116079703B abstract drawing]

Abstract

This application discloses a robot teaching method, apparatus, device, and computer-readable storage medium, belonging to the field of robot technology. The method includes: during the process of controlling a physical robot to perform a first teaching action, providing tactile feedback in response to a collision between the physical robot and an obstacle; adjusting the first teaching action using adjustment instructions obtained after providing the tactile feedback to obtain a target teaching action that satisfies non-collision conditions; and completing the teaching of the physical robot based on the target teaching action. This approach adds automatic detection of collision events and provides tactile feedback to indicate the presence of a collision event when one is detected. Automatic collision detection is highly reliable, and the adjustment instructions obtained after providing tactile feedback are generated by the teacher based on that feedback and are therefore more reliable, resulting in higher-quality adjustments to the teaching action and improving the effectiveness of robot teaching.


Description

Technical Field

[0001] This application relates to the field of robotics, and in particular to a robot teaching method, apparatus, device, and computer-readable storage medium.

Background Technology

[0002] With the development of robotics technology, more and more physical robots can perform various tasks such as handling, assembly, and trajectory tracking. Before using a physical robot to complete a task, a teacher needs to teach the physical robot to determine the target teaching action that the physical robot needs to perform to complete the task, and then complete the teaching of the physical robot based on the target teaching action.

[0003] During the teaching process of a physical robot, the teacher can observe in real time the process of the computer device controlling the physical robot to perform a certain taught action. In related technologies, the teacher manually observes collision events between the physical robot and obstacles, and generates adjustment commands after observing the collision events. The computer device uses these adjustment commands to adjust the taught actions to obtain the final target taught action.

[0004] Human observation of collision events has low reliability: it easily misses collision events that occur in visual blind spots or mistakes non-collision events for collision events. The reliability of the adjustment instructions generated by the teacher after observing a collision event is therefore also low, resulting in poor-quality adjustments to the teaching actions and a poor robot teaching effect.

Summary of the Invention

[0005] This application provides a robot teaching method, apparatus, device, and computer-readable storage medium, which can be used to improve the quality of adjusting teaching actions, thereby improving the effectiveness of robot teaching. The technical solution is as follows:

[0006] On one hand, embodiments of this application provide a robot teaching method, the method comprising:

[0007] During the process of controlling the physical robot to perform the first taught action, in response to the collision between the physical robot and the obstacle, tactile feedback is provided, which is used to indicate the existence of a collision event;

[0008] The first teaching action is adjusted using the adjustment instructions obtained after providing the tactile feedback to obtain a target teaching action that meets the non-collision condition, and the teaching of the physical robot is completed based on the target teaching action.

[0009] On the other hand, a robot teaching device is provided, the device comprising:

[0010] The control unit is configured to provide tactile feedback in response to a collision between the physical robot and an obstacle during the process of controlling the physical robot to perform a first teaching action, the tactile feedback being used to indicate the presence of a collision event;

[0011] An adjustment unit is used to adjust the first teaching action using the adjustment instructions obtained after providing the tactile feedback, to obtain a target teaching action that satisfies the non-collision condition, and to complete the teaching of the physical robot based on the target teaching action.

[0012] In one possible implementation, the control unit is configured to determine a target current based on the collision force between the physical robot and the obstacle; and apply the target current to the haptic feedback device so that the haptic feedback device provides the haptic feedback under the action of the target current.

[0013] In one possible implementation, the device further includes:

[0014] The acquisition unit is used to acquire the second teaching action corresponding to the virtual robot; and to acquire the first teaching action based on the second teaching action.

[0015] In one possible implementation, the acquisition unit is configured to acquire teaching information, the teaching information including at least one of gesture information and voice information; acquire teaching instructions based on the teaching information, the teaching instructions being used to instruct sub-teaching actions; perform virtual teaching on the virtual robot using the sub-teaching actions indicated by the teaching instructions; and obtain the second teaching action in response to the virtual teaching process satisfying a first condition.

[0016] In one possible implementation, the teaching information includes gesture information and voice information. The acquisition unit is used to acquire fused text based on the gesture information and the voice information; call a classification model to classify the fused text to obtain the matching probability of each candidate instruction, wherein the classification model is trained based on the sample text and the instruction label corresponding to the sample text; and acquire the teaching instruction based on the candidate instruction whose matching probability satisfies the selection condition.

[0017] In one possible implementation, the acquisition unit is used to correct the second teaching action to obtain a corrected second teaching action; and to acquire the first teaching action based on the corrected second teaching action.

[0018] In one possible implementation, the virtual robot is constructed by an augmented reality device based on the physical robot.

[0019] In one possible implementation, the acquisition unit is configured to send the teaching instruction to an augmented reality device, which then controls the virtual robot to execute the sub-teaching action indicated by the teaching instruction.

[0020] In one possible implementation, a force sensor is mounted on the physical robot, and the control unit is further configured to determine that the physical robot has collided with the obstacle in response to the force detected by the force sensor satisfying a collision detection condition.

[0021] In one possible implementation, the classification model is a maximum entropy model.
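The instruction-selection step described in the implementations above can be sketched as follows. This is only an illustrative sketch: a maximum entropy classifier over discrete labels takes a softmax form, but the feature weights, the candidate instruction names, the whitespace tokenization, and the 0.5 threshold used as the "selection condition" are all hypothetical and are not specified in this application.

```python
import math

# Hypothetical feature weights per candidate instruction; a trained maximum
# entropy model would learn these from sample texts and instruction labels.
WEIGHTS = {
    "move_left": {"left": 2.0, "move": 1.0},
    "move_right": {"right": 2.0, "move": 1.0},
    "stop": {"stop": 3.0},
}


def matching_probabilities(fused_text):
    """Return P(instruction | fused_text) in softmax (maximum entropy) form."""
    tokens = fused_text.lower().split()
    scores = {
        label: sum(weights.get(tok, 0.0) for tok in tokens)
        for label, weights in WEIGHTS.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def select_instruction(fused_text, threshold=0.5):
    """Return the most probable candidate if it meets the assumed threshold, else None."""
    probs = matching_probabilities(fused_text)
    label, p = max(probs.items(), key=lambda kv: kv[1])
    return label if p >= threshold else None
```

Returning `None` when no candidate is probable enough is one possible design choice; it lets the caller ask the teacher to repeat the gesture or voice command instead of acting on an ambiguous instruction.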

[0022] On the other hand, a computer device is provided, the computer device including a processor and a memory, the memory storing at least one computer program, the at least one computer program being loaded and executed by the processor to enable the computer device to implement any of the robot teaching methods described above.

[0023] On the other hand, a computer-readable storage medium is also provided, wherein at least one computer program is stored in the computer-readable storage medium, the at least one computer program being loaded and executed by a processor to enable a computer to implement any of the robot teaching methods described above.

[0024] On the other hand, a computer program product is also provided, which includes a computer program or computer instructions, the computer program or computer instructions being loaded and executed by a processor to enable a computer to implement any of the robot teaching methods described above.

[0025] The technical solution provided in this application has at least the following beneficial effects:

[0026] The technical solution provided in this application automatically detects collision events between the robot and obstacles during the process of controlling a physical robot to perform teaching actions. Upon detecting a collision event, tactile feedback is provided to indicate its presence, allowing the teacher to intuitively perceive the collision event based on the tactile feedback. The automatic collision detection is highly reliable, and the adjustment instructions obtained after providing tactile feedback are more reliable adjustment instructions generated by the teacher based on the tactile feedback. The quality of adjusting the teaching actions using these adjustment instructions is high, which is beneficial for improving the effectiveness of robot teaching.

Attached Figure Description

[0027] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0028] Figure 1 is a schematic diagram of the implementation environment of a robot teaching method provided in an embodiment of this application;

[0029] Figure 2 is a schematic diagram of a robot teaching environment provided in an embodiment of this application;

[0030] Figure 3 is a flowchart of a robot teaching method provided in an embodiment of this application;

[0031] Figure 4 is a schematic diagram of a three-dimensional hand model provided in an embodiment of this application;

[0032] Figure 5 is a schematic diagram of the coordinate systems referenced by various subjects in a teaching environment provided in an embodiment of this application;

[0033] Figure 6 is a schematic diagram of a haptic feedback device provided in an embodiment of this application;

[0034] Figure 7 is a schematic diagram of a robot teaching device provided in an embodiment of this application;

[0035] Figure 8 is a schematic diagram of a robot teaching device provided in an embodiment of this application;

[0036] Figure 9 is a schematic diagram of the structure of a terminal provided in an embodiment of this application;

[0037] Figure 10 is a schematic diagram of the structure of a server provided in an embodiment of this application.

Detailed Implementation

[0038] To make the objectives, technical solutions, and advantages of this application clearer, the embodiments of this application will be described in further detail below with reference to the accompanying drawings.

[0039] It should be noted that the terms "first," "second," etc., used in this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of this application described herein can be implemented in orders other than those illustrated or described herein. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.

[0040] Figure 1 shows a schematic diagram of the implementation environment of a robot teaching method provided in an embodiment of this application. The implementation environment includes: a computer device 11, a physical robot 12, an augmented reality device 13, a motion-sensing device 14, and a haptic feedback device 15.

[0041] In this application, the physical robot 12 refers to a device used to perform certain tasks in place of humans. For example, the physical robot is a robot used in industry, also known as an industrial robot. This application does not limit the types of tasks that the physical robot 12 can perform, as this depends on the actual application scenario and the structure of the physical robot. For example, the tasks that the physical robot 12 can perform include, but are not limited to, assembly tasks and trajectory tracking tasks. This application does not limit the structure of the physical robot 12. For example, the physical robot 12 is a robot with six degrees of freedom, that is, a robot with six joints. Of course, the physical robot 12 can also be a robot with other structures.

[0042] Augmented Reality (AR) device 13 is a device integrating augmented reality technology. Augmented reality technology is a new technology that seamlessly integrates real and virtual environments. Its goal is to overlay a virtual environment onto a real environment on a screen for user interaction. In this embodiment, the augmented reality device 13 can construct a virtual robot based on a physical robot located in a real environment, add motion attributes to the virtual robot, and then control the movement of the virtual robot in the same way as controlling a physical robot. This embodiment does not limit the type of augmented reality device 13; exemplarily, augmented reality device 13 refers to AR glasses, such as HoloLens (a head-mounted display). Exemplarily, the augmented reality device 13 is worn by a teacher, who can observe not only the movement of the virtual robot but also the movement of the physical robot 12.

[0043] The motion-sensing device 14 is used to acquire teaching information from the teacher, such as gesture information and voice information. Exemplarily, the motion-sensing device 14 has multiple functions such as real-time motion capture, image recognition, microphone input, and voice recognition. This application embodiment does not limit the type of motion-sensing device 14; exemplarily, the motion-sensing device 14 is a Kinect, a Leap Motion, or a similar motion-sensing device. Exemplarily, the motion-sensing device 14 is connected to the physical robot 12 to facilitate the acquisition of teaching information from the teacher. This application embodiment does not limit the position at which the motion-sensing device 14 is connected to the physical robot 12. Exemplarily, if the physical robot 12 is a robot with six joints, the fifth joint being a translational joint and the sixth joint being a rotational joint, the motion-sensing device 14 is connected to the fifth joint of the physical robot 12 to ensure the stability of the motion-sensing device 14. Of course, in exemplary embodiments, the motion-sensing device 14 may not be connected to the physical robot 12; for example, it may be placed on a fixed object.

[0044] The haptic feedback device 15 is used to provide haptic feedback to the teacher. The teacher can receive haptic feedback by wearing the haptic feedback device 15. This application embodiment does not limit the way the haptic feedback device 15 is worn. Exemplarily, the haptic feedback device 15 is worn on the teacher's finger. In this case, the haptic feedback device 15 can also be called a fingertip haptic feedback device.

[0045] Computer device 11 is used to implement the robot teaching method provided in the embodiments of this application. In an exemplary embodiment, computer device 11 can acquire teaching information collected by motion-sensing device 14, convert the teaching information into teaching instructions for controlling a virtual robot, and send the teaching instructions to augmented reality device 13 so that augmented reality device 13 controls the virtual robot according to the teaching instructions. Computer device 11 can also acquire teaching actions corresponding to the virtual robot, convert them into teaching actions corresponding to the physical robot 12, and control the physical robot 12 to perform the teaching actions. When the physical robot 12 collides with an obstacle while performing the teaching actions, computer device 11 can also control haptic feedback device 15 to provide haptic feedback to the teacher.

[0046] The computer device 11 can be a terminal or a server, and this application embodiment does not limit this. For example, the terminal can be any electronic product capable of human-computer interaction with a user through one or more methods such as a keyboard, touchpad, touchscreen, remote control, voice interaction, or handwriting device, such as a PC (Personal Computer), mobile phone, smartphone, PDA (Personal Digital Assistant), wearable device, PPC (Pocket PC), tablet computer, smart car system, smart TV, smart speaker, in-vehicle terminal, etc. The server can be a single server, a server cluster consisting of multiple servers, or a cloud computing service center. The computer device 11 establishes communication connections with the physical robot 12, augmented reality device 13, motion sensing device 14, and haptic feedback device 15 via wired or wireless networks.

[0047] Those skilled in the art should understand that the computer device 11, physical robot 12, augmented reality device 13, motion-sensing device 14, and haptic feedback device 15 described above are merely examples. Other existing or future devices that are applicable to this application should also be included within the scope of protection of this application, and are hereby incorporated by reference.

[0048] In an exemplary embodiment, for the case where a physical robot performs a trajectory tracking task on a production line, a robot teaching environment is shown in Figure 2. The teaching environment includes a physical robot, a motion-sensing device, an augmented reality device, a haptic feedback device, a production line, and a virtual robot. The virtual robot is constructed from the physical robot by the augmented reality device, with its base overlapping the physical robot's base. The motion-sensing device is mounted on the fifth joint of the physical robot. The augmented reality device is worn over the teacher's eyes, providing visual feedback. The haptic feedback device is worn on the teacher's finger, providing tactile feedback. The production line includes trajectory-tracking accessories.

[0049] Based on the implementation environment shown in Figure 1, this application embodiment provides a robot teaching method, which is executed by the computer device 11. As shown in Figure 3, the robot teaching method provided in this application embodiment includes the following steps 301 and 302.

[0050] In step 301, during the process of controlling the physical robot to perform the first teaching action, tactile feedback is provided in response to the collision between the physical robot and the obstacle. The tactile feedback is used to indicate the existence of a collision event.
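Step 301 can be illustrated with a minimal sketch that combines the two optional implementations mentioned in the summary: detecting a collision from a force sensor reading, and deriving a target current for the haptic feedback device from the collision force. The threshold-based detection condition, the linear force-to-current mapping, and all numeric values below are assumptions made for illustration; this application does not specify them.

```python
def collision_detected(force_newtons, threshold_newtons=5.0):
    """Assumed collision detection condition: force magnitude exceeds a threshold."""
    return abs(force_newtons) > threshold_newtons


def target_current(force_newtons, gain_amps_per_newton=0.01, max_current_amps=0.2):
    """Assumed mapping: current proportional to collision force, clamped to a maximum."""
    current = gain_amps_per_newton * abs(force_newtons)
    return min(current, max_current_amps)


def haptic_command(force_newtons):
    """Return the current to apply to the haptic feedback device (0.0 if no collision)."""
    if not collision_detected(force_newtons):
        return 0.0
    return target_current(force_newtons)
```

Clamping the current is a precaution added in this sketch so that a large collision force cannot drive the wearable device beyond a safe level.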

[0051] The first teaching action refers to a specific teaching action that the physical robot needs to perform, determined during the teaching process. For example, the first teaching action consists of one or more sub-teaching actions. This first teaching action may be the final teaching action or it may be a teaching action that requires further adjustment; this application embodiment does not limit this.

[0052] In an exemplary embodiment, a first teaching action needs to be obtained before performing step 301. This embodiment does not limit the method of obtaining the first teaching action. Exemplarily, the first teaching action is obtained by the teacher manually programming on the teach pendant. Exemplarily, the first teaching action is obtained based on a second teaching action corresponding to the virtual robot.

[0053] This application embodiment uses the example of obtaining the first teaching action based on the second teaching action corresponding to the virtual robot. In this case, it can be regarded as using an offline-online fusion teaching method to realize the teaching of the physical robot. Through offline-online fusion teaching, the teacher can safely perform virtual teaching of the virtual robot in a real scene. Subsequently, the physical robot reproduces the movement of the virtual robot to complete the teaching of the physical robot.

[0054] Before using the offline-online fusion teaching method, a virtual robot needs to be constructed based on the physical robot. For example, the construction of the virtual robot is performed by an augmented reality (AR) device; that is, the virtual robot is constructed by the AR device based on the physical robot. By wearing the AR device, the teacher can observe not only the physical robot in the real environment but also the virtual robot superimposed on it. For example, within the AR device's field of view, the base of the constructed virtual robot coincides with the base of the physical robot in the real environment.

[0055] When the first teaching action is obtained based on the second teaching action corresponding to the virtual robot, the steps to be performed before executing step 301 include: obtaining the second teaching action corresponding to the virtual robot; and obtaining the first teaching action based on the second teaching action.

[0056] The second teaching action corresponding to the virtual robot refers to a specific teaching action that the virtual robot needs to perform in order to complete the virtual task. The virtual task is obtained by virtualizing the task that the physical robot needs to perform. For example, if the task that the physical robot needs to perform is for the end effector of the physical robot to track a target trajectory, then the virtual task is for the end effector of the virtual robot to track a virtual trajectory, where the virtual trajectory coincides with the target trajectory.

[0057] In one possible implementation, the process of obtaining the second teaching action corresponding to the virtual robot includes the following steps 3001 to 3004.

[0058] Step 3001: Obtain teaching information, which includes at least one of gesture information and voice information.

[0059] Gesture information is used to represent the teacher's gestures, and voice information is used to represent the teacher's voice. The teacher teaches the physical robot through at least one natural interaction method, either gestures or voice. When the teacher teaches the physical robot through gestures alone, the teaching information acquired by the computer device includes only gesture information; when the teacher teaches the physical robot through voice alone, the teaching information acquired by the computer device includes only voice information; when the teacher teaches the physical robot through both gestures and voice, the teaching information acquired by the computer device includes both gesture and voice information. Exemplarily, the teacher's gestures can refer to dynamic gestures used to draw the trajectory the robot needs to follow, or static gestures used to indicate the direction of the robot's movement; this application embodiment does not limit this.

[0060] This application does not limit the natural interaction method used by the teacher to teach the physical robot, as long as it can achieve the teaching of the physical robot.

[0061] For example, the teaching information obtained in step 3001 refers to the teaching information obtained within a certain time period. If the teaching information includes gesture information, it may include one or more pieces of gesture information; if the teaching information includes voice information, it may include one or more pieces of voice information.

[0062] For example, the gesture information is acquired as follows: the motion-sensing device captures gesture images of the teacher at an image acquisition frequency and sends the captured gesture images to the computer device; the computer device performs gesture recognition on the gesture images and acquires gesture information based on the recognized information. The number of pieces of gesture information acquired by the computer device within a certain time period equals the number of gesture images captured by the motion-sensing device within that time period. For example, if the time period lasts 3 seconds and the image acquisition frequency is 40 images per second, the computer device acquires 3 × 40 = 120 pieces of gesture information within that time period.

[0063] The image acquisition frequency is set based on experience or adjusted flexibly according to the type of motion-sensing device. This embodiment does not limit this; for example, the image acquisition frequency is 40 images per second. Exemplarily, the motion-sensing device coordinate system referenced by the motion-sensing device differs from the hand coordinate system referenced by the teacher's hand. During the acquisition of gesture images, the motion-sensing device transforms the teacher's hand position from the hand coordinate system to the motion-sensing device coordinate system based on the transformation relationship between the two coordinate systems, ensuring the accuracy of subsequent teaching. The transformation relationship between the hand coordinate system and the motion-sensing device coordinate system is pre-calibrated.

[0064] For example, the hand coordinate system is a coordinate system determined based on the three-dimensional hand model, which is shown in Figure 4. The three-dimensional hand model provides the positions of the index fingertip A, the metacarpal joint B, and the center of the palm C. Connecting A to B yields vector AB. C is taken as the origin of the hand coordinate system, the axis parallel to vector AB is taken as the Y-axis of the hand coordinate system ($Y_H$), and the axis perpendicular to vector AB within plane ABC is taken as the X-axis of the hand coordinate system ($X_H$). Since the X, Y, and Z axes of the hand coordinate system are mutually perpendicular, the Z-axis of the hand coordinate system ($Z_H$) can be determined from $Y_H$ and $X_H$. It should be noted that $X_{BR}$, $Y_{BR}$, and $Z_{BR}$ shown in Figure 4 are the three coordinate axes of the base coordinate system referenced by the base of the physical robot.
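The construction of the hand coordinate system from points A, B, and C can be sketched as follows. This is an illustrative sketch rather than the method of this application: the text fixes the Y-axis along vector AB and places the X-axis in plane ABC perpendicular to AB, but it does not state the sign of the X-axis or the handedness of the frame. The sketch assumes the X-axis points toward the side of C and that the frame is right-handed; the function and helper names are hypothetical.

```python
import math


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def normalize(u):
    n = math.sqrt(dot(u, u))
    if n == 0.0:
        raise ValueError("degenerate vector: points are coincident or collinear")
    return [a / n for a in u]


def hand_frame(a, b, c):
    """Return (origin, x_h, y_h, z_h) from fingertip A, metacarpal joint B, palm center C."""
    y_h = normalize(sub(b, a))  # Y axis parallel to vector AB
    ac = sub(c, a)  # an in-plane vector that is not parallel to AB
    # Remove the component along Y so that X lies in plane ABC, perpendicular to AB.
    x_h = normalize(sub(ac, [dot(ac, y_h) * t for t in y_h]))
    z_h = cross(x_h, y_h)  # right-handed completion (assumed convention)
    return c, x_h, y_h, z_h
```

If A, B, and C are collinear, plane ABC is undefined and the sketch raises an error instead of returning an arbitrary frame.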

[0065] For example, the motion-sensing device is a cuboid-shaped device. The motion-sensing device coordinate system has its origin at the center point of the device. The axis parallel to the optical axis of the device is taken as the Z-axis ($Z_K$), the axis parallel to the long side of the device is taken as the Y-axis ($Y_K$), and the axis perpendicular to both $Z_K$ and $Y_K$ is taken as the X-axis ($X_K$).

[0066] For example, a computer device uses a gesture recognition model to recognize gestures from images. The gesture recognition model can output information representing the recognized gestures. The gesture recognition model can be trained using sample gesture images and information labels through supervised training.

[0067] For example, the computer device integrates a gesture tracking system (e.g., 3Gear Systems) and uses it to perform gesture recognition on gesture images. For example, the gesture tracking system operates in client/server mode: the computer device communicates with the gesture recognition server via a UDP (User Datagram Protocol) socket and obtains information representing the gesture through the API (Application Programming Interface) provided by the gesture recognition server, thereby achieving gesture recognition of the gesture image.

[0068] After performing gesture recognition on the gesture image, information representing the gesture in the gesture image can be obtained. This application embodiment does not limit the representation form of the gesture information. For example, the information representing a gesture includes the state of the index fingertip. For example, the state of the index fingertip includes, but is not limited to, the position, velocity, and acceleration of the index fingertip in the motion-sensing device coordinate system. The velocity and acceleration can be calculated from two adjacent gesture images. Each of the position, velocity, and acceleration of the index fingertip in the motion-sensing device coordinate system is uniquely determined by its components on the X-axis, Y-axis, and Z-axis of that coordinate system. For example, let $x_k$ denote the information representing the gesture at time $k$; $x_k$ is expressed by equation (1):

[0069] $$x_k = [\,p_{x,k},\ v_{x,k},\ a_{x,k},\ p_{y,k},\ v_{y,k},\ a_{y,k},\ p_{z,k},\ v_{z,k},\ a_{z,k}\,]^{T} \qquad (1)$$

[0070] where $p_{x,k}$, $v_{x,k}$, and $a_{x,k}$ represent the position, velocity, and acceleration of the index fingertip on the X-axis of the motion-sensing device coordinate system, respectively; $p_{y,k}$, $v_{y,k}$, and $a_{y,k}$ represent the position, velocity, and acceleration of the index fingertip on the Y-axis of the motion-sensing device coordinate system, respectively; $p_{z,k}$, $v_{z,k}$, and $a_{z,k}$ represent the position, velocity, and acceleration of the index fingertip on the Z-axis of the motion-sensing device coordinate system, respectively; and the superscript $T$ denotes the transpose of a matrix.
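As a rough illustration of how the fingertip state described for equation (1) might be assembled from adjacent gesture images, the sketch below uses backward differences. It assumes that fingertip positions are already expressed in the motion-sensing device coordinate system, that the frame interval `dt` is the reciprocal of the image acquisition frequency (1/40 s in the example given earlier), and that the velocity estimate from the previous frame is carried over to compute acceleration. The function name and argument layout are hypothetical.

```python
def fingertip_state(p_prev, p_curr, v_prev, dt):
    """Build the 9-element state: position, velocity, acceleration on X, then Y, then Z."""
    if dt <= 0.0:
        raise ValueError("dt must be positive")
    state = []
    for axis in range(3):
        # Backward-difference velocity from the two adjacent fingertip positions.
        v = (p_curr[axis] - p_prev[axis]) / dt
        # Backward-difference acceleration from the current and previous velocities.
        a = (v - v_prev[axis]) / dt
        state.extend([p_curr[axis], v, a])
    return state
```

Finite differences amplify measurement noise, which is one reason the recognized information may be filtered before it is used as gesture information.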

[0071] In an exemplary embodiment, the information used to characterize the gesture may include not only the state of the index fingertip, but also the direction the index finger is pointing, etc. This application embodiment does not limit this.

[0072] After performing gesture recognition on the gesture image, gesture information is obtained based on the recognized information. In one possible implementation, obtaining gesture information based on the recognized information means directly using the recognized information as the gesture information. In this case, the obtained gesture information is the original recognized information itself, requiring no additional processing, resulting in high efficiency in obtaining gesture information.

[0073] In another possible implementation, the gesture information is obtained based on the recognized information by filtering the recognized information and using the filtered information as the gesture information. The recognized information may have certain measurement errors and noise. By filtering the recognized information, more reliable gesture information can be obtained, facilitating the acquisition of more accurate teaching commands. It should be noted that when there are multiple gesture images, there are also multiple pieces of recognized information. Filtering the recognized information means filtering each piece of recognized information. This embodiment uses one piece of recognized information as an example for illustration.

[0074] The filtering method for the recognized information can be set based on experience or flexibly adjusted according to the actual application scenario, and this application embodiment does not limit it in this regard. For example, linear filtering (e.g., Kalman filtering) or nonlinear filtering (e.g., particle filtering) can be applied to the recognized information.

[0075] In an exemplary embodiment, the process of filtering the recognized information to obtain gesture information is implemented based on equation (2):

[0076] (2)

[0077] where the terms of equation (2) represent, respectively: the gesture information obtained after filtering; the recognized information, expressed by equation (1); the state transition matrix; the input matrix; the control input model applied to the input matrix; the system input matrix; and the process noise matrix.

[0078] For example, during the filtering of the recognized information, the state transition matrix is as shown in equation (3):

[0079] (3)

[0080] Where t represents the image acquisition time interval of the motion sensing device.

[0081] Since there is no control input for the position state, the system input matrix is as shown in equation (4):

[0082] (4)

[0083] The process noise matrix is as shown in equation (5):

[0084] (5)

[0085] where the entries of the process noise matrix represent the process noise of the index fingertip's acceleration along the X-axis, Y-axis, and Z-axis of the motion-sensing device coordinate system, respectively. Their values can be set based on experience or adjusted flexibly according to the actual application scenario. This application embodiment does not limit this.

[0086] By substituting equations (1), (3), (4) and (5) into equation (2), the gesture information can be obtained.
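As a non-limiting illustration of linear filtering of the fingertip state, the following sketch runs one Kalman predict/update step for a single axis. It is a sketch under stated assumptions, not the filter defined by equations (2) to (5): it assumes a constant-acceleration motion model, process noise entering only through the acceleration, and a measurement of position only. The function names and the parameters q and r are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def kalman_step(x, P, z, t, q, r):
    """One predict/update step for one axis.

    x: [position, velocity, acceleration]; P: 3x3 covariance;
    z: measured position; t: image acquisition interval;
    q: assumed acceleration process-noise variance; r: assumed measurement variance.
    """
    # Assumed constant-acceleration state transition over one interval t.
    F = [[1.0, t, 0.5 * t * t],
         [0.0, 1.0, t],
         [0.0, 0.0, 1.0]]
    # Predict.
    x_pred = [sum(F[i][k] * x[k] for k in range(3)) for i in range(3)]
    P_pred = mat_mul(mat_mul(F, P), transpose(F))
    P_pred[2][2] += q  # noise is added to the acceleration state only
    # Update with measurement matrix H = [1, 0, 0] (position only).
    s = P_pred[0][0] + r                      # innovation variance
    K = [P_pred[i][0] / s for i in range(3)]  # Kalman gain
    resid = z - x_pred[0]
    x_new = [x_pred[i] + K[i] * resid for i in range(3)]
    P_new = [[P_pred[i][j] - K[i] * P_pred[0][j] for j in range(3)]
             for i in range(3)]
    return x_new, P_new
```

Under these assumptions, the same step would be applied independently to the X-axis, Y-axis, and Z-axis, and repeated for each new gesture image.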

[0087] For example, the method of acquiring voice information is as follows: a voice acquisition device acquires the voice spoken by the teacher within a certain period of time, and sends the voice information used to represent the acquired voice to a computer device; the computer device acquires the voice information. This application embodiment does not limit the representation form of the voice information used to represent the voice, and it can refer to audio composed of voice. This application embodiment does not limit the type of voice acquisition device; for example, a voice acquisition device refers to a microphone array. For example, the voice acquisition device is built into a device that establishes a communication connection with the computer device, the voice acquisition device is built into a motion-sensing device, or the voice acquisition device is built into an augmented reality device.

[0088] Based on the above methods of acquiring gesture and voice information, and the specific circumstances of teaching information including gesture and voice information, teaching information can be acquired.

[0089] Step 3002: Based on the teaching information, obtain the teaching instruction, which is used to instruct the sub-teaching action.

[0090] After acquiring the teaching information, teaching instructions for controlling the virtual robot are obtained based on the teaching information. These teaching instructions instruct sub-teaching actions, which are actions that the virtual robot needs to perform. In an exemplary embodiment, since the connection relationships between the joints of the physical robot are known, and the connection relationships between the joints of the virtual robot are the same as those of the physical robot, the connection relationships between the joints of the virtual robot are also known. Based on the connection relationships between the joints and the sub-teaching actions performed by the end effector, the actions of each joint of the virtual robot can be calculated based on inverse kinematics. Exemplarily, the connection relationships between the joints of the physical robot are determined by modeling the physical robot, for example, using a Denavit-Hartenberg (DH) model.
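As a non-limiting illustration of how a DH model encodes the connection relationships between joints, the following sketch chains classic DH link transforms to obtain the end effector pose from joint values. It shows only the forward direction; the inverse kinematics used to compute joint actions is not shown. The planar two-link arm and its parameters are hypothetical and do not describe the physical robot of this application.

```python
import math


def dh_transform(theta, d, a, alpha):
    """Homogeneous transform of one link from classic DH parameters."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [[ct, -st * ca, st * sa, a * ct],
            [st, ct * ca, -ct * sa, a * st],
            [0.0, sa, ca, d],
            [0.0, 0.0, 0.0, 1.0]]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward_kinematics(dh_rows):
    """Chain link transforms; dh_rows is a list of (theta, d, a, alpha)."""
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for row in dh_rows:
        T = mat_mul(T, dh_transform(*row))
    return T


# Hypothetical planar two-link arm with unit link lengths.
T = forward_kinematics([(math.pi / 2, 0.0, 1.0, 0.0), (0.0, 0.0, 1.0, 0.0)])
end_effector_position = (T[0][3], T[1][3], T[2][3])
```

With the first joint rotated by 90 degrees and the second joint at zero, both links point along the Y-axis of the base frame, so the end effector lies approximately at (0, 2, 0).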

[0091] This application embodiment does not limit the representation of teaching instructions, as long as a teaching instruction can clearly indicate a sub-teaching action. For example, a teaching instruction is defined by four attributes ($C_{opt}$, $C_{dir}$, $C_{val}$, $C_{unit}$), where $C_{opt}$ indicates the type of the sub-teaching action; $C_{dir}$ indicates the direction of the sub-teaching action; $C_{val}$ indicates the action value; and $C_{unit}$ indicates the action unit. As another example, the teaching instruction is represented by a coordinate point that indicates the position of the virtual robot's end effector, which is a coordinate point in the coordinate system referenced by the virtual robot's end effector.
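As a non-limiting illustration, a four-attribute teaching instruction could be held in a small record type. In the following sketch the class name, field names, the unit table, and the helper that converts the instruction into a displacement are all hypothetical; the direction is assumed to be a unit vector.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeachingInstruction:
    opt: str          # type of the sub-teaching action, e.g. "move"
    direction: tuple  # direction of the sub-teaching action (assumed unit vector)
    val: float        # action value
    unit: str         # action unit, e.g. "mm"

    def displacement_mm(self):
        """Displacement vector in millimetres (assumed unit table)."""
        scale = {"mm": 1.0, "cm": 10.0, "m": 1000.0}[self.unit]
        return tuple(self.val * scale * d for d in self.direction)


instruction = TeachingInstruction("move", (1.0, 0.0, 0.0), 3.0, "mm")
```

Keeping the unit as a separate attribute lets the same action value be interpreted at different scales without changing the rest of the instruction.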

[0092] For example, the process of obtaining teaching instructions based on teaching information refers to the process of obtaining teaching instructions by comprehensively considering all information in the teaching information. This method can improve the accuracy of the obtained teaching instructions. For example, when the teaching information only includes gesture information, the teaching instruction is obtained by considering the gesture information; when the teaching information only includes voice information, the teaching instruction is obtained by considering the voice information; when the teaching information includes both gesture information and voice information, the teaching instruction is obtained by comprehensively considering both gesture information and voice information.

[0093] In an exemplary embodiment, when the teaching information includes gesture information and voice information, the process of obtaining teaching instructions based on the teaching information includes the following steps 1 and 2.

[0094] Step 1: Obtain the fused text based on gesture and voice information.

[0095] The fused text is the text on which the teaching instruction is based, obtained by comprehensively considering gesture information and voice information. In an exemplary embodiment, the fused text can be obtained from all acquired gesture information and all acquired voice information; it can also be obtained from the most recent pieces of gesture information, up to a reference number, and all acquired voice information. This application embodiment does not limit this. The reference number is set based on experience or can be flexibly adjusted according to the application scenario. For example, the reference number is 2. The most recent reference number of pieces of gesture information refers to the gesture information obtained from the most recent reference number of gesture images collected by the motion-sensing device within a certain period of time.

[0096] In an exemplary embodiment, the method for obtaining fused text based on gesture information and voice information is as follows: obtaining first text based on gesture information; obtaining second text based on voice information; and fusing the first text and the second text to obtain fused text. Exemplarily, the method for obtaining the first text based on gesture information is as follows: converting the direction in the gesture information into directional text, converting the coordinate points in the gesture information into location text, and constructing the first text from the directional text and location text. Exemplarily, the method for obtaining the second text based on voice information is as follows: recognizing the second text corresponding to the voice information. This application embodiment does not limit the method for recognizing the text corresponding to the voice information. Exemplarily, a speech recognition SDK (Software Development Kit) is invoked to recognize the text corresponding to the voice information. Exemplarily, a speech recognition model is invoked to recognize the text corresponding to the voice information.

[0097] It should be noted that when there are multiple gesture information pieces, a first text is obtained based on each gesture information piece, and there are multiple first text pieces; when there are multiple voice information pieces, a second text is obtained based on each voice information piece, and there are multiple second text pieces.

[0098] After obtaining the first text and the second text, the first text and the second text are merged to obtain the merged text. For example, if there are multiple instances of both the first text and the second text, the multiple instances of the first text and the multiple instances of the second text are merged to obtain the merged text.

[0099] For example, the method of fusing the first text and the second text is as follows: input the first text and the second text into a text fusion model or text fusion program, and then use the text fusion model or text fusion program to fuse the first text and the second text. For example, the text fusion model or text fusion program is constructed based on fusion experience and is capable of fusing text obtained based on gesture information and text obtained based on speech information.
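As a non-limiting illustration of a text fusion program, the following sketch builds a first text from a pointing direction and a coordinate point, and then substitutes it for deictic phrases in the second text. The rule-based substitution, the phrase list, and the text formats are assumptions made for illustration; this application does not specify how the text fusion model or program works internally.

```python
import re


def gesture_to_text(direction, point):
    """Build the first text as (directional text, location text)."""
    directional_text = "direction({:.2f}, {:.2f}, {:.2f})".format(*direction)
    location_text = "point({:.1f}, {:.1f}, {:.1f})".format(*point)
    return directional_text, location_text


def fuse_texts(first_text, second_text):
    """Replace deictic phrases in the speech text with gesture-derived text."""
    directional_text, location_text = first_text
    fused = re.sub(r"\bthis direction\b", lambda m: directional_text, second_text)
    fused = re.sub(r"\b(here|there)\b", lambda m: location_text, fused)
    return fused


first = gesture_to_text((0, 0, 1), (10, 20, 30))
fused = fuse_texts(first, "move 3 mm in this direction")
```

Word-boundary matching is used so that "here" is not replaced inside longer words.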

[0100] Step 2: Call the classification model to classify the fused text and obtain the matching probability of each candidate instruction. The classification model is trained based on the sample text and the instruction label corresponding to the sample text. Based on the candidate instructions that meet the selection criteria according to the matching probability, the teaching instructions are obtained.

[0101] After acquiring the fused text, a classification model is invoked to classify the fused text, obtaining a classification result. This classification result includes the matching probability of each candidate instruction. Each candidate instruction refers to a pre-set, selectable teaching instruction, which can be set based on experience or flexibly adjusted according to the actual application scenario of robot teaching. This embodiment of the application does not limit this.

[0102] In an exemplary embodiment, the method of calling the classification model to classify the fused text is as follows: extract the text features of the fused text, input the text features of the fused text into the classification model for classification processing, and obtain the classification result output by the classification model. This application embodiment does not limit the method of extracting the text features of a certain text. For example, the text features of a certain text are extracted based on the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm. For example, the process of extracting the text features of a certain text based on the TF-IDF algorithm is implemented based on equation (6):

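[0103] $$tf_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}, \qquad idf_i = \log \frac{|D|}{|\{ j : t_i \in d_j \}|}, \qquad tfidf_{i,j} = tf_{i,j} \times idf_i \qquad (6)$$

[0104] where $tf_{i,j}$ represents the term frequency of text i; $n_{i,j}$ represents the number of times text i appears in document j of the corpus; $\sum_{k} n_{k,j}$ represents the total number of texts contained in document j; $idf_i$ represents the inverse document frequency of text i; $|D|$ represents the number of documents in the corpus; $|\{ j : t_i \in d_j \}|$ represents the number of documents in the corpus containing text i; and $tfidf_{i,j}$ represents the text feature of text i.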
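As a non-limiting illustration of TF-IDF feature extraction, the following sketch computes the feature of one term in one document. It assumes each document is a list of tokens, uses the natural logarithm, and returns 0 for a term that appears in no document; these choices and the sample corpus are assumptions made for illustration.

```python
import math


def tf_idf(term, doc, corpus):
    """TF-IDF of `term` in `doc`, where `corpus` is a list of token lists."""
    tf = doc.count(term) / len(doc)
    n_containing = sum(1 for d in corpus if term in d)
    if n_containing == 0:
        return 0.0
    idf = math.log(len(corpus) / n_containing)
    return tf * idf


# Hypothetical corpus of tokenized command texts.
corpus = [["move", "left", "3", "mm"], ["move", "right"], ["stop"]]
weight = tf_idf("move", corpus[0], corpus)
```

A term that occurs in every document receives a weight of zero, which reflects that it does not help distinguish one candidate instruction from another.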

[0105] It should be noted that the above description of extracting text features of a certain text is an exemplary description, and the embodiments of this application are not limited to this. For example, the word2vec model (a word vector model) can also be used to extract text features of a certain text.

[0106] The classification results include the matching probability of each candidate instruction. The higher the matching probability of a candidate instruction, the higher the degree of matching between the candidate instruction and the fused text. After obtaining the matching probabilities of each candidate instruction, teaching instructions are obtained based on the candidate instructions whose matching probabilities meet the selection criteria. The matching probability selection criteria are set based on experience or can be flexibly adjusted according to the application scenario; this embodiment does not limit this.

[0107] In an exemplary embodiment, a matching probability satisfying the selection criterion means that the matching probability is among the top K (K is an integer not less than 1) largest matching probabilities. In this case, the number of candidate instructions whose matching probabilities satisfy the selection criterion is K. In an exemplary embodiment, a matching probability satisfying the selection criterion means that the matching probability is not less than a probability threshold. The probability threshold is set empirically or flexibly adjusted according to the application scenario; for example, the probability threshold is 0.8.
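As a non-limiting illustration of the two selection criteria, the following sketch selects candidate instructions either by the top K matching probabilities or by a probability threshold. The function name, the dictionary input format, and the sample probabilities are hypothetical.

```python
def select_candidates(probs, top_k=None, threshold=None):
    """Return candidate instructions that satisfy the selection criterion.

    probs: mapping from candidate instruction to matching probability.
    top_k: keep the K largest probabilities, if given.
    threshold: keep probabilities not less than the threshold, if given.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    if threshold is not None:
        ranked = [(c, p) for c, p in ranked if p >= threshold]
    return [c for c, _ in ranked]


probs = {"move_left": 0.85, "move_right": 0.10, "stop": 0.05}
by_top_k = select_candidates(probs, top_k=2)
by_threshold = select_candidates(probs, threshold=0.8)
```

The top K criterion always returns K candidates when at least K exist, whereas the threshold criterion may return none, so a caller would need to handle an empty result.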

[0108] The number of candidate instructions whose matching probability satisfies the selection criterion may be one or more, and this embodiment does not limit this. If there is exactly one such candidate instruction, it is used directly as the teaching instruction. If there are multiple such candidate instructions, they are fused to obtain the teaching instruction. For example, fusing the multiple candidate instructions means fusing those candidate instructions that belong to the target type, where the target type is the instruction type that occurs most frequently among the candidate instructions whose matching probability satisfies the selection criterion.
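As a non-limiting illustration of determining the target type, the following sketch counts instruction types among the selected candidates and keeps only the candidates of the most frequent type. The tuple layout and the sample data are hypothetical, and how the retained candidates are then fused into one teaching instruction is left open, as this application does not specify it.

```python
from collections import Counter


def target_type_candidates(selected):
    """selected: list of (instruction_type, instruction, probability)."""
    counts = Counter(t for t, _, _ in selected)
    target_type = counts.most_common(1)[0][0]
    kept = [(i, p) for t, i, p in selected if t == target_type]
    return target_type, kept


selected = [("move", "move 3 mm", 0.5),
            ("rotate", "rotate 10 deg", 0.3),
            ("move", "move 1 mm", 0.2)]
target_type, kept = target_type_candidates(selected)
```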

[0109] The classification model is trained in a supervised manner based on sample text and its corresponding instruction labels. The instruction label corresponding to the sample text is one of the candidate instructions from among various candidate instructions. Before performing step 2 above, the classification model needs to be trained first. This application embodiment does not limit the type of classification model; the training process differs for different types of classification models. For example, the classification model may be a neural network model, a support vector machine model, a Naive Bayes model, a maximum entropy model, etc.

[0110] For example, the maximum entropy model is used as the classification model for illustration. The core idea of the maximum entropy model is that, when predicting the probability distribution of a random variable, the prediction should satisfy all known conditions while making no assumptions about what is unknown. The information entropy of the resulting probability distribution is then the largest, which preserves all possibilities and minimizes the risk of prediction. Suppose x is a text feature of a sample text and y is the corresponding instruction label. The maximum entropy model models the conditional probability p(y|x) to obtain the most uniform distribution model. In the maximum entropy model, the conditional entropy H(p(y|x)) is introduced to measure the uniformity of the distribution of the conditional probability p(y|x). The formula for calculating H(p(y|x)) is shown in equation (7):

[0111] $$H(p(y|x)) = -\sum_{x,y} \tilde{p}(x)\, p(y|x) \log p(y|x) \qquad (7)$$

[0112] where $\tilde{p}(x)$ represents the empirical distribution of the text feature x in the training set, and $p(y|x)$ represents the conditional probability distribution that needs to be solved in the maximum entropy model.

[0113] The problem of solving the maximum entropy model can be summarized as the optimization problem represented by equation (8):

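[0114] $$\max_{p}\; H(p(y|x)) \quad \text{s.t.} \quad \sum_{x,y} \tilde{p}(x)\, p(y|x)\, f_i(x,y) = \sum_{x,y} \tilde{p}(x,y)\, f_i(x,y),\ \ i = 1, \dots, n; \qquad \sum_{y} p(y|x) = 1 \qquad (8)$$

[0115] where $f_i(x,y)$ represents the i-th feature function constructed based on the sample text; n (n is an integer not less than 1) is the number of feature functions; and $\tilde{p}(x,y)$ represents the empirical joint distribution of text features and instruction labels in the training set.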

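[0116] According to the Lagrange multiplier method, the maximum entropy probability distribution $p^{*}(y|x)$ can be obtained under the constraints of equation (8). The calculation formula is shown in equation (9):

[0117] $$p^{*}(y|x) = \frac{1}{Z(x)} \exp\left( \sum_{i=1}^{n} \lambda_i f_i(x,y) \right) \qquad (9)$$

[0118] where $f_i(x,y)$ represents the i-th feature function; $\lambda_i$ represents the weight of $f_i(x,y)$; and $Z(x)$ represents the normalization factor, whose calculation formula is shown in equation (10):

[0119] $$Z(x) = \sum_{y} \exp\left( \sum_{i=1}^{n} \lambda_i f_i(x,y) \right) \qquad (10)$$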

[0120] By learning from the sample text, the weight of each feature function can be obtained. Based on the weights of the feature functions and equations (9) and (10), the maximum entropy probability distribution can be calculated, which yields the maximum entropy model.
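As a non-limiting illustration of evaluating a maximum entropy model once the weights are known, the following sketch computes the matching probability of each candidate label as an exponentiated weighted sum of feature functions divided by a normalization factor, which is the standard form referred to by equations (9) and (10). The two feature functions, their weights, and the labels are hypothetical; weight learning is not shown.

```python
import math


def maxent_probability(x, labels, feature_functions, weights):
    """Return p(y|x) for every label y under a maximum entropy model."""
    scores = {
        y: math.exp(sum(w * f(x, y) for f, w in zip(feature_functions, weights)))
        for y in labels
    }
    z = sum(scores.values())  # normalization factor Z(x)
    return {y: s / z for y, s in scores.items()}


# Hypothetical binary feature functions over a set of tokens x and a label y.
def f_left(x, y):
    return 1.0 if "left" in x and y == "move_left" else 0.0


def f_stop(x, y):
    return 1.0 if "stop" in x and y == "stop" else 0.0


p = maxent_probability({"move", "left"}, ["move_left", "stop"],
                       [f_left, f_stop], [2.0, 2.0])
```

The returned probabilities sum to one over the candidate labels, so they can be used directly as the matching probabilities of the candidate instructions.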

[0121] The above description only uses the example of teaching information including gesture information and voice information to introduce the implementation method of obtaining teaching instructions. When the teaching information only includes gesture information, the teaching instructions are directly obtained based on the first text; when the teaching information only includes voice information, the teaching instructions are directly obtained based on the second text.

[0122] By way of example, with the method provided in the embodiments of this application, the teacher does not need to give a complete command every time during the teaching process, because appropriate default values are allowed. The computer device can fill in the missing semantics from the context of the command. For example, the teacher first issues the command "move 3mm in this direction" while pointing in a direction P. If the next command is "continue moving 1mm", the computer device combines this command with the previous one and interprets it as "move 1mm along direction P". In this way, the teacher does not need to pay too much attention to ensuring the semantic integrity of each command, which is more in line with the habits of daily human communication and improves the naturalness of the teaching process.
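As a non-limiting illustration of filling in missing semantics from context, the following sketch completes a command by inheriting any attribute it leaves unset from the previous command. The dictionary keys and the use of None to mark a missing attribute are assumptions made for illustration.

```python
def complete_command(command, previous):
    """Fill attributes missing from `command` using the previous command."""
    completed = dict(previous) if previous else {}
    completed.update({k: v for k, v in command.items() if v is not None})
    return completed


# "move 3mm in this direction" while pointing in direction P.
previous = {"opt": "move", "dir": "P", "val": 3.0, "unit": "mm"}
# "continue moving 1mm": no direction is given.
command = {"opt": "move", "dir": None, "val": 1.0, "unit": "mm"}
completed = complete_command(command, previous)
```

In this example the completed command keeps the new action value of 1 mm and inherits direction P from the previous command.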

[0123] In an exemplary embodiment, when the teaching information includes gesture information, dynamic coordinate registration is required before obtaining teaching instructions based on the teaching information, so that coordinate points in the hand coordinate system can be transformed to other coordinate systems. This embodiment takes the motion-sensing device mounted on the fifth joint of a physical robot as an example. The coordinate systems referenced by each subject (a device or the teacher's hand) in the teaching environment are shown in Figure 5.

[0124] The reference coordinate system of the robot's base is consistent with the world coordinate system, and the three axes of the reference coordinate system are denoted as $X_B^R$, $Y_B^R$, and $Z_B^R$. The three axes of the fifth joint coordinate system referenced by the fifth joint of the physical robot are denoted as $X_5^R$, $Y_5^R$, and $Z_5^R$. The three axes of the physical end effector coordinate system referenced by the end effector of the physical robot are denoted as $X_E^R$, $Y_E^R$, and $Z_E^R$. The three axes of the motion-sensing device coordinate system referenced by the motion-sensing device are denoted as $X_K$, $Y_K$, and $Z_K$. The three axes of the augmented reality device coordinate system referenced by the augmented reality device are denoted as $X_A$, $Y_A$, and $Z_A$. The three axes of the calibration box coordinate system referenced by the calibration box are denoted as $X_C$, $Y_C$, and $Z_C$. The three axes of the hand coordinate system referenced by the teacher's hand are denoted as $X_H$, $Y_H$, and $Z_H$. The three axes of the virtual end effector coordinate system referenced by the virtual robot's end effector are denoted as $X_E^V$, $Y_E^V$, and $Z_E^V$ (not shown in the figure). The calibration box is used to determine the position of the physical robot so that the augmented reality device can, based on the position of the calibration box, build and display a virtual robot in the teacher's field of vision with its base overlapping the base of the physical robot.

[0125] To convert coordinates between the coordinate systems shown in Figure 5, the transformation relationships between them need to be established. For example, the transformation order between different coordinate systems is as follows: hand coordinate system, motion-sensing device coordinate system, fifth joint coordinate system, reference coordinate system, calibration box coordinate system, and augmented reality device coordinate system. The motion-sensing device is fixed on the fifth joint of the physical robot, and the transformation relationship between the motion-sensing device coordinate system and the fifth joint coordinate system has been pre-calibrated. Using the forward kinematics model of the physical robot, the transformation relationship between the fifth joint coordinate system and the reference coordinate system can be established. The transformation relationship between the calibration box coordinate system and the reference coordinate system has also been pre-calibrated. After the augmented reality device captures the calibration box, the transformation relationship between the augmented reality device coordinate system and the calibration box coordinate system can be calibrated.

[0126] The reference coordinate system of the virtual robot's base coincides with the reference coordinate system of the physical robot's base. Therefore, based on the transformation relationship between the coordinate systems, the coordinate values in the hand coordinate system can be converted into coordinate values in the virtual robot's reference coordinate system, and used for teaching the virtual robot. For example, after converting the coordinate values in the hand coordinate system to coordinate values in the reference coordinate system, based on the relationships between the virtual robot's joints, the coordinate values in the reference coordinate system can be further converted into coordinate values in the virtual end effector coordinate system referenced by the virtual robot's end effector.
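As a non-limiting illustration of converting coordinate values along a chain of coordinate systems, the following sketch composes homogeneous transforms from the hand coordinate system to the reference coordinate system and applies the result to a point. The pure-translation transforms and their numeric values are hypothetical placeholders; in practice each transform would come from calibration or from the forward kinematics model and would generally include rotation.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(x, y, z):
    """Homogeneous transform with identity rotation (placeholder)."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def compose(*transforms):
    """Multiply transforms left to right, outermost frame first."""
    result = translation(0.0, 0.0, 0.0)
    for t in transforms:
        result = mat_mul(result, t)
    return result


def apply(transform, point):
    """Apply a homogeneous transform to a 3D point."""
    p = [point[0], point[1], point[2], 1.0]
    return tuple(sum(transform[i][k] * p[k] for k in range(4)) for i in range(3))


# Hypothetical transforms: reference <- fifth joint <- motion-sensing device <- hand.
T_ref_joint5 = translation(0.0, 0.0, 1.0)
T_joint5_sensor = translation(0.1, 0.0, 0.0)
T_sensor_hand = translation(0.0, 0.0, 0.5)

T_ref_hand = compose(T_ref_joint5, T_joint5_sensor, T_sensor_hand)
point_in_ref = apply(T_ref_hand, (0.0, 0.0, 0.0))
```

Because the transform between the fifth joint and the reference coordinate system changes as the robot moves, that factor would need to be recomputed for each new robot configuration.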

[0127] Step 3003: Perform virtual teaching on the virtual robot using the sub-teaching actions indicated by the teaching instructions.

[0128] After receiving the teaching instruction, the virtual robot is virtually taught using the sub-teaching actions indicated by the teaching instruction, so that the virtual robot executes the sub-teaching actions indicated by the teaching instruction. For example, the implementation of virtual teaching of the virtual robot using sub-teaching actions indicated by the teaching instruction includes: a computer device sending the teaching instruction to an augmented reality device, which then controls the virtual robot to execute the sub-teaching actions indicated by the teaching instruction.

[0129] Since the virtual robot is constructed from a physical robot using augmented reality (AR) devices, the process of controlling the virtual robot to perform actions is implemented by the AR devices. After receiving a teaching instruction from a computer, the AR device can identify the sub-teaching action indicated by the instruction and then control the virtual robot to execute that sub-teaching action. For example, the sub-teaching action indicated by the instruction refers to the action that the virtual robot's end effector needs to perform, and controlling the virtual robot to execute the sub-teaching action means controlling the virtual robot's end effector to execute the sub-teaching action indicated by the instruction. During the process of controlling the virtual robot's end effector to execute the sub-teaching action indicated by the instruction, one or more joints of the virtual robot also perform actions. The actions performed by one or more joints are obtained by analyzing the sub-teaching action indicated by the instruction through an inverse kinematics model.

[0130] Step 3004: In response to the virtual teaching process satisfying the first condition, the second teaching action is obtained.

[0131] After virtually teaching the virtual robot using the sub-teaching action indicated by the teaching instruction, it is determined whether the virtual teaching process meets the first condition. If the virtual teaching process does not meet the first condition, a new teaching instruction is obtained according to steps 3001 to 3002, and the virtual robot is then virtually taught using the sub-teaching action indicated by the new teaching instruction according to step 3003, until the virtual teaching process meets the first condition. When the virtual teaching process meets the first condition, the second teaching action corresponding to the virtual robot is obtained. The second teaching action consists of the sub-teaching actions executed sequentially by the virtual robot during the virtual teaching process.

[0132] In exemplary embodiments, the first condition is set based on experience or flexibly adjusted according to the actual application scenario, and this application embodiment does not limit this. For example, the virtual teaching process satisfies the first condition when no teaching information is obtained within a reference time period. The reference time period is set based on experience or flexibly adjusted according to the actual application scenario, and this application embodiment does not limit this; for example, the reference time period is 3 minutes. As another example, the virtual teaching process satisfies the first condition when the end effector of the virtual robot reaches the target location point. The target location point is the location where the end effector of the virtual robot needs to be when completing the virtual task. The target location point is determined according to the type of virtual task, and this application embodiment does not limit this.
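As a non-limiting illustration of checking the first condition, the following sketch tests both example conditions: no teaching information within a reference time period, or the end effector reaching the target location point. Combining the two examples with a logical OR, the distance tolerance, and all names are assumptions made for illustration.

```python
import math


def first_condition_met(last_info_time, now, end_effector, target,
                        reference_period=180.0, tolerance=1.0):
    """Return True when virtual teaching can stop.

    last_info_time, now: timestamps in seconds.
    end_effector, target: positions as coordinate tuples.
    reference_period: 3 minutes by default, following the example value.
    tolerance: assumed distance within which the target counts as reached.
    """
    timed_out = (now - last_info_time) >= reference_period
    reached = math.dist(end_effector, target) <= tolerance
    return timed_out or reached
```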

[0133] After acquiring the second teaching action, a first teaching action that the robot needs to execute is acquired based on the second teaching action. The number of sub-teaching actions included in the first teaching action is the same as the number of sub-teaching actions included in the second teaching action. In one possible implementation, acquiring the first teaching action based on the second teaching action may mean directly acquiring the first teaching action based on the second teaching action; or it may mean first acquiring the corrected second teaching action, and then acquiring the first teaching action based on the corrected second teaching action. This application embodiment does not limit this.

[0134] For example, obtaining the first teaching action based on the second teaching action refers to obtaining the first teaching action directly based on the second teaching action. The process of obtaining the first teaching action is as follows: the second teaching action is transformed from the virtual end effector coordinate system to the physical end effector coordinate system to obtain the first teaching action. For example, the second teaching action is first transformed from the virtual end effector coordinate system to the reference coordinate system to obtain the third teaching action, and then the third teaching action is transformed from the reference coordinate system to the physical end effector coordinate system to obtain the first teaching action.

[0135] For example, when the first teaching action is obtained by first obtaining the corrected second teaching action and then using the corrected second teaching action, the second teaching action needs to be corrected before the first teaching action is obtained. The second teaching action may contain one or more sub-teaching actions with lower accuracy. Correcting the second teaching action corrects these sub-teaching actions and yields a more accurate second teaching action. For example, the computer device can correct the second teaching action automatically based on human intent, or it can correct it based on correction instructions from a staff member; this application embodiment does not limit this. After obtaining the corrected second teaching action, the first teaching action is obtained based on the corrected second teaching action. The principle of obtaining the first teaching action based on the corrected second teaching action is the same as the principle of directly obtaining the first teaching action based on the second teaching action, and will not be repeated here.

[0136] In an exemplary embodiment, the process of correcting the second teaching action includes, but is not limited to, workpiece alignment and trajectory correction. Workpiece alignment refers to automatically aligning the virtual workpiece with the real hole position for assembly tasks that involve placing a workpiece into a hole. First, a segmentation algorithm (such as the watershed algorithm) is applied to the image of the real workpiece to divide it into different regions. Then, the regions and their corresponding edges and centroids are extracted. Based on the depth image, the coordinates of the edge points in the motion-sensing device coordinate system are calculated and then converted to the world coordinate system. Finally, the position and orientation of the virtual workpiece are automatically adjusted to align it with the hole, thus completing workpiece alignment. The second teaching action is then corrected so that the virtual workpiece is aligned with the hole position. Trajectory correction refers to inferring a teaching trajectory for certain trajectory tracking tasks and correcting the second teaching action based on the difference between the inferred teaching trajectory and the trajectory used to execute the second teaching action.

[0137] In an exemplary embodiment, after acquiring the second teaching action, it is determined whether the second teaching action satisfies the second condition. If the second teaching action satisfies the second condition, the process of acquiring the first teaching action based on the second teaching action is executed. That is, in response to the second teaching action satisfying the second condition, the first teaching action is acquired based on the second teaching action. In this case, if the second teaching action does not satisfy the second condition, virtual teaching of the virtual robot continues until a second teaching action that satisfies the second condition is obtained. The second condition is set based on experience or flexibly adjusted according to the application scenario. For example, the second teaching action satisfies the second condition if the number of sub-teaching actions included in the second teaching action is not greater than a quantity threshold, which is set based on experience. Setting the second condition in this way prevents the task execution process from becoming too cumbersome.

[0138] For example, the second teaching action satisfies the second condition if the difference between the task completed by the virtual robot through the execution of the second teaching action and the virtual task is not greater than a difference threshold. The method for determining the difference between the two tasks is related to the type of task, and this application embodiment does not limit this. For example, the difference between two trajectory tracking tasks refers to the difference between the tracked trajectories. The difference threshold is set based on experience, and this method of setting the second condition can ensure the accuracy of the second teaching action.
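The two example forms of the second condition can be sketched as simple checks. This is a sketch under stated assumptions: the application does not fix how the difference between two trajectories is measured, so the mean point-to-point Euclidean distance used here is only one possible choice, and the function names and thresholds are hypothetical.

```python
import math


def within_quantity_threshold(sub_actions, quantity_threshold):
    """Second condition, first example: not too many sub-teaching actions."""
    return len(sub_actions) <= quantity_threshold


def trajectory_difference(executed, reference):
    """Mean Euclidean distance between corresponding trajectory points.

    Assumed metric; both trajectories must be non-empty and equally long.
    """
    if not executed or len(executed) != len(reference):
        raise ValueError("trajectories must be non-empty and of equal length")
    total = sum(math.dist(a, b) for a, b in zip(executed, reference))
    return total / len(executed)


def within_difference_threshold(executed, reference, difference_threshold):
    """Second condition, second example: tracked trajectory is close enough."""
    return trajectory_difference(executed, reference) <= difference_threshold
```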

[0139] After acquiring the first teaching action, the computer device controls the physical robot to execute the first teaching action. In one possible implementation, controlling the physical robot to execute the first teaching action means controlling the end effector of the physical robot to execute the first teaching action. While the end effector executes the first teaching action, one or more joints of the physical robot also move. The motions of these joints are obtained analytically from the first teaching action through an inverse kinematics model.
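As an illustration of obtaining joint motions analytically from an end-effector target, the following sketch solves the inverse kinematics of a two-link planar arm. The two-link arm is a textbook stand-in chosen for brevity; the application does not specify the robot's kinematic structure, and the link lengths and function names here are hypothetical.

```python
import math


def two_link_ik(x, y, l1, l2):
    """Closed-form inverse kinematics of a planar two-link arm.

    Returns joint angles (q1, q2) in radians for one of the two solutions.
    """
    cos_q2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if cos_q2 < -1.0 or cos_q2 > 1.0:
        raise ValueError("target is outside the reachable workspace")
    q2 = math.acos(cos_q2)
    q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
    return q1, q2


def two_link_fk(q1, q2, l1, l2):
    """Forward kinematics, used here only to check the inverse solution."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y
```

Feeding the result of two_link_ik back into two_link_fk should reproduce the requested end-effector position, which is a quick sanity check for any analytic solution of this kind.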

[0140] In an exemplary embodiment, the physical robot is directly connected to the computer device, in which case the computer device can directly control the physical robot to perform the first teaching action. In another exemplary embodiment, the physical robot is connected to the computer device via a controller, in which case the computer device sends the first teaching action to the controller, which then controls the physical robot to perform the first teaching action.

[0141] During the process of controlling the physical robot to perform the first teaching action, the computer device monitors in real time whether the physical robot collides with an obstacle. In response to a collision, tactile feedback is provided to indicate the presence of a collision event. For example, providing tactile feedback means providing tactile feedback to the teacher. With tactile feedback, the teacher can intuitively perceive the collision event, which helps the teacher generate more targeted and reliable adjustment instructions, thereby improving the effectiveness of teaching the physical robot.

[0142] In one possible implementation, a force sensor is mounted on the physical robot. In response to the force detected by the force sensor satisfying the collision detection conditions, it is determined that the physical robot has collided with an obstacle. The mounting location of the force sensor depends on which part of the robot is to be monitored for possible collisions with the obstacle. For example, a collision between the robot and an obstacle refers to a collision between the robot's end effector and the obstacle; that is, the part to be monitored is the end effector, and in this case the force sensor is mounted on the end effector. For example, an obstacle refers to an object with which the robot is not intended to collide while performing its task. For example, in a trajectory tracking task, that object can be a workpiece. In this case, a collision between the robot and the workpiece may occur because the robot grasps the workpiece, and the collision force between the robot and the obstacle can be called a grasping force.

[0143] The force sensor detects force and sends the detected force to the computer device, which determines whether the force meets the collision detection conditions. If it does, it is determined that the physical robot has collided with the obstacle. For example, the force meeting the collision detection conditions means that the magnitude of the force is not less than a first threshold. The first threshold is set empirically; if the first threshold is 0, then all forces detected by the force sensor meet the collision detection conditions. As another example, the force meeting the collision detection conditions means that the stage in which the force is detected is a specific stage in the process of controlling the physical robot to perform the first teaching action. This specific stage is a stage in which a collision with the obstacle is not desired, and depends on the actual task to be performed; this application embodiment does not limit this aspect.
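The two example collision detection conditions can be sketched as follows. The application presents the magnitude threshold and the stage check as separate examples; combining them in one function, the three-component force reading, and the stage names are assumptions made for illustration.

```python
import math


def force_magnitude(force):
    """Euclidean norm of a force reading, e.g. (fx, fy, fz) in newtons."""
    return math.sqrt(sum(c * c for c in force))


def is_collision(force, first_threshold, stage=None, guarded_stages=None):
    """Return True if the reading meets the (assumed, combined) conditions.

    - The magnitude must be not less than first_threshold.
    - If guarded_stages is given, the current stage must be one in which
      contact with the obstacle is not desired (hypothetical stage labels).
    """
    if force_magnitude(force) < first_threshold:
        return False
    if guarded_stages is None:
        return True
    return stage in guarded_stages
```

With a first threshold of 0, every reading passes the magnitude check, which matches the remark in the paragraph above.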

[0144] When it is determined that the physical robot has collided with an obstacle, tactile feedback is provided. In an exemplary embodiment, the teacher wears a tactile feedback device, and the computer device providing tactile feedback means that the computer device controls the tactile feedback device to provide tactile feedback. In one possible implementation, the process of controlling the tactile feedback device to provide tactile feedback includes: determining a target current based on the collision force between the physical robot and the obstacle; and applying the target current to the tactile feedback device so that the tactile feedback device provides tactile feedback under the action of the target current. The collision force between the physical robot and the obstacle is the force detected by the force sensor. In one possible implementation, the target current is calculated from the collision force according to Maxwell's equations.

[0145] In an exemplary embodiment, the haptic feedback device includes a coil and a magnet. The target current refers to the current to be applied to the coil in the haptic feedback device. Exemplarily, in the case where the haptic feedback device includes multiple coils, the number of target currents is also multiple. After current is applied, the coil in the haptic feedback device enables the magnet in the haptic feedback device to generate a magnetic force, which is used to provide haptic feedback to the teacher.
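The mapping from collision force to coil current can be illustrated with a deliberately simplified sketch. It does not reproduce the Maxwell-equation-based calculation mentioned above; instead it substitutes a proportional mapping with saturation. The gain, the current limit, the identical current per coil, and the function names are all hypothetical.

```python
def target_current(collision_force_n, gain_a_per_n, max_current_a):
    """Map a collision force magnitude (N) to a coil current (A).

    Simplified stand-in: proportional gain, clamped to [0, max_current_a].
    """
    current = collision_force_n * gain_a_per_n
    return min(max(current, 0.0), max_current_a)


def target_currents(collision_force_n, gain_a_per_n, max_current_a, n_coils):
    """One target current per coil; equal currents are an assumption."""
    value = target_current(collision_force_n, gain_a_per_n, max_current_a)
    return [value] * n_coils
```

Clamping is included in this sketch to keep the commanded current within a safe range for the coil; any real limit would depend on the hardware.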

[0146] In an exemplary embodiment, a schematic diagram of the haptic feedback device is shown in Figure 6. Figure 6 (1) shows a schematic diagram of a haptic feedback device worn on the finger of the teacher; Figure 6 (2) shows a line diagram of the haptic feedback device; Figure 6 (3) shows a product diagram of the haptic feedback device. As shown in Figure 6 (2) and (3), the haptic feedback device includes a coil, a magnet, and a pad. The coil and magnet are located at four rotatable points, and the pad is located at the point of contact with the finger. The magnet can generate magnetic force under the action of an electric current applied to the coil.

[0147] This application does not limit the size and structure of the coils, nor the size and type of the magnets, and can be flexibly adjusted according to the actual application scenario. For example, each coil has a diameter of 12.5 mm and a height of 34.5 mm, and the coil is composed of 350 turns of enameled copper. The magnets are cylindrical neodymium magnets with a diameter of 10 mm and a height of 2 mm. This haptic feedback device adjusts the fingertip tactile sensation by controlling the current passing through the coils. The instructor can feel the collision between the end effector of the physical robot and obstacles simply by wearing the haptic feedback device, without affecting the immersion of the operation.

[0148] This application embodiment enables the use of haptic feedback devices to provide haptic feedback to the teacher in human-computer interaction. The haptic feedback device allows for haptic feedback through a non-contact gesture interface, transmitting remotely measured collision events to the teacher via sensory input. The teacher can teach the robot using gestures, then receive visual feedback through an augmented reality device, and obtain haptic feedback through a haptic feedback device fixed to the fingertip for remote operation. During teaching, relying solely on visual feedback can easily lead to accidental collisions and damage to the workpiece within blind spots. With haptic feedback, the teacher can quickly detect collision events and adjust teaching actions accordingly.

[0149] In an exemplary embodiment, during the process of a computer device controlling a physical robot to perform a first taught action, the teacher can receive not only tactile feedback but also visual feedback through an augmented reality device. Based on the visual feedback, the teacher can observe the movement of the physical robot in real time. The combination of visual and tactile feedback provides the teacher with more comprehensive feedback, thereby improving the reliability of the adjustment commands generated by the teacher.

[0150] In step 302, the first teaching action is adjusted using the adjustment instructions obtained after providing tactile feedback to obtain the target teaching action that meets the non-collision condition, and the teaching of the physical robot is completed based on the target teaching action.

[0151] After providing tactile feedback, the computer device can adjust the first teaching action using the adjustment instructions obtained after providing the tactile feedback. These adjustment instructions are generated by the teacher based on the tactile feedback prompts. Because the teacher can intuitively perceive a collision between the robot and an obstacle during the execution of the first teaching action, the teacher can generate more reliable adjustment instructions based on the feedback. This allows the computer device to obtain an adjusted teaching action that effectively reduces the probability of collisions between the robot and obstacles.

[0152] In an exemplary embodiment, since the teacher can observe the movement of the physical robot in real time based on visual feedback, the teacher can, after receiving tactile feedback, combine the tactile feedback prompts with the observed movement to determine the possible causes of the collision, thereby generating an adjustment command that can effectively reduce the probability of collision.

[0153] The ultimate goal of adjusting the first teaching action using the adjustment instructions obtained after providing tactile feedback is to obtain a target teaching action that satisfies the non-collision condition. This target teaching action is the action that the physical robot needs to perform to achieve the task. After the target teaching action that satisfies the non-collision condition is obtained, the teaching of the physical robot is completed based on the target teaching action. For example, completing the teaching of the physical robot based on the target teaching action means controlling the physical robot to execute the target teaching action and having the physical robot record how the target teaching action is executed, so that the physical robot can later complete the corresponding task by automatically executing the target teaching action.

[0154] In an exemplary instance, adjustment instructions can be generated through at least one natural interaction method, such as gestures or voice, and this application embodiment does not limit this. After providing haptic feedback, the computer device can interact with a motion-sensing device or an augmented reality device to obtain adjustment instructions generated by the teacher based on the haptic feedback prompts.

[0155] In an exemplary embodiment, the adjustment instruction generated by the teacher based on tactile feedback is an instruction for directly adjusting the first teaching action, which clearly indicates how to adjust the first teaching action. In this case, the computer device adjusting the first teaching action using the adjustment instruction obtained after providing tactile feedback means directly adjusting the first teaching action using the adjustment instruction obtained after providing tactile feedback.

[0156] In an exemplary embodiment, when the first teaching action is obtained based on the second teaching action corresponding to the virtual robot, the adjustment instruction generated by the teacher based on the tactile feedback may also be an instruction for adjusting the second teaching action, clearly indicating how to adjust the second teaching action. In this case, the computer device adjusting the first teaching action using the adjustment instruction obtained after providing tactile feedback means that the computer device adjusts the second teaching action using the adjustment instruction obtained after providing tactile feedback, and obtains the adjusted first teaching action based on the adjusted second teaching action, thereby indirectly realizing the adjustment of the first teaching action.

[0157] After the first teaching action is adjusted using the adjustment instructions obtained after providing tactile feedback, it is determined whether the adjusted first teaching action satisfies the non-collision condition. In an exemplary embodiment, a teaching action satisfies the non-collision condition if the physical robot does not collide with an obstacle while executing the teaching action. In another exemplary embodiment, a teaching action satisfies the non-collision condition if, in addition, the difference between the task completed by the physical robot after executing the teaching action and the actual task is less than a threshold. In either case, a teaching action that satisfies the non-collision condition at least guarantees that the physical robot will not collide with an obstacle while executing it.

[0158] After determining the adjusted first teaching action, it can be judged whether the adjusted first teaching action meets the non-collision condition. If the adjusted first teaching action meets the non-collision condition, it is taken as the target teaching action that meets the non-collision condition. If the adjusted first teaching action does not meet the non-collision condition, it means that the adjusted first teaching action still needs to be adjusted. For example, in the subsequent adjustment process, if the computer device provides tactile feedback to the teacher, the teacher generates an adjustment command by combining visual feedback and tactile feedback. If the computer device does not provide tactile feedback to the teacher, the teacher can generate an adjustment command based on visual feedback. This application embodiment does not limit this. After adjusting the adjusted first teaching action, it is judged whether the adjusted first teaching action meets the non-collision condition, and so on, until a target teaching action that meets the non-collision condition is obtained.
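The repeated adjust-and-check process described above can be sketched as a loop. This is a sketch under stated assumptions: the callables execute, request_adjustment, and apply_adjustment are hypothetical placeholders for robot execution with collision monitoring, teacher input, and action adjustment, and the max_rounds safeguard is added for illustration and is not part of this application.

```python
def refine_until_collision_free(action, execute, request_adjustment,
                                apply_adjustment, max_rounds=10):
    """Adjust a teaching action until it satisfies the non-collision condition.

    execute(action) -> True if a collision occurred during execution.
    request_adjustment(action) -> adjustment instruction from the teacher.
    apply_adjustment(action, instruction) -> adjusted action.
    """
    for _ in range(max_rounds):
        collided = execute(action)
        if not collided:
            return action  # this is the target teaching action
        instruction = request_adjustment(action)
        action = apply_adjustment(action, instruction)
    raise RuntimeError("no collision-free action found within max_rounds")
```

As a toy usage example, if an action is just a clearance height that collides below 3 and each adjustment raises it by 1, the loop starting from 0 returns 3.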

[0159] It should be noted that this application embodiment is described using the case in which the physical robot collides with an obstacle while being controlled to perform the first teaching action, but is not limited to this case. In an exemplary embodiment, during the process of controlling the physical robot to perform the first teaching action, the physical robot may not collide with the obstacle. In this case, the computer device does not need to provide tactile feedback. The computer device can determine whether the first teaching action meets the non-collision condition. If the first teaching action meets the non-collision condition, it can be directly used as the target teaching action; if the first teaching action does not meet the non-collision condition, a prompt requiring adjustment can be sent to the teacher, and the first teaching action can then be adjusted using the adjustment instructions generated by the teacher based on visual feedback. In either case, the ultimate goal is to obtain the target teaching action that meets the non-collision condition, so as to complete the teaching of the physical robot based on the target teaching action.

[0160] For example, embodiments of this application can employ an online-offline fusion approach to teach robots. Through virtual-real fusion interaction technology, the teacher can safely and directly teach the virtual robot in a real-world scenario, after which the physical robot reproduces the virtual robot's movements to complete the teaching process. This teaching method combines gesture and voice interaction, providing both visual and tactile feedback to the teacher. Furthermore, this teaching method can quickly verify the teaching results while ensuring the teacher's safety and avoiding damage to the physical robot or workpiece. If an error exists between the physical robot's movement and the movement to be taught, the teacher can fine-tune the physical robot's movement in real time using a gesture and voice teaching fusion algorithm.

[0161] The robot teaching method provided in this application automatically detects collision events between the robot and obstacles during the process of controlling a physical robot to perform teaching actions. When a collision event is detected, tactile feedback is provided to indicate its presence, allowing the teacher to intuitively perceive the collision event based on the tactile feedback. The automatic collision event detection is highly reliable, and the adjustment instructions obtained after providing tactile feedback are more reliable adjustment instructions generated by the teacher based on the tactile feedback. The quality of adjusting the teaching actions using these adjustment instructions is high, which is beneficial for improving the effectiveness of robot teaching.

[0162] Referring to Figure 7, this application provides a robot teaching device, which includes:

[0163] The control unit 701 is used to provide tactile feedback in response to a collision between the physical robot and an obstacle during the process of controlling the physical robot to perform the first teaching action. The tactile feedback is used to indicate the existence of a collision event.

[0164] The adjustment unit 702 is used to adjust the first teaching action using the adjustment instructions obtained after providing tactile feedback, so as to obtain the target teaching action that meets the non-collision condition, and to complete the teaching of the physical robot based on the target teaching action.

[0165] In one possible implementation, the control unit 701 is used to determine a target current based on the collision force between the physical robot and the obstacle; and to apply the target current to the haptic feedback device so that the haptic feedback device provides haptic feedback under the action of the target current.

[0166] In one possible implementation, referring to Figure 8, the device also includes:

[0167] The acquisition unit 703 is used to acquire the second teaching action corresponding to the virtual robot; and acquire the first teaching action based on the second teaching action.

[0168] In one possible implementation, the acquisition unit 703 is used to acquire teaching information, which includes at least one of gesture information and voice information; based on the teaching information, acquire teaching instructions, which are used to instruct sub-teaching actions; perform virtual teaching on the virtual robot using the sub-teaching actions indicated by the teaching instructions; and obtain a second teaching action in response to the virtual teaching process satisfying a first condition.

[0169] In one possible implementation, the teaching information includes gesture information and voice information. The acquisition unit 703 is used to acquire fused text based on the gesture information and voice information; call a classification model to classify the fused text to obtain the matching probability of each candidate instruction. The classification model is trained based on the sample text and the instruction label corresponding to the sample text; and acquire the teaching instruction based on the candidate instruction whose matching probability meets the selection condition.
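The classification and selection steps can be sketched as follows. A maximum entropy classifier has the same form as multinomial logistic regression, so this sketch scores each candidate instruction with a weighted sum of word features and normalizes with a softmax. The weights, the whitespace tokenization, the instruction names, and the selection rule (highest probability, not below a minimum) are hypothetical; the application does not specify them here, and a real model would learn its weights from the labeled sample text.

```python
import math


def match_probabilities(fused_text, weights):
    """Return {candidate instruction: matching probability} for a fused text.

    weights maps each candidate instruction to {token: weight}.
    """
    tokens = fused_text.lower().split()
    scores = {
        instr: sum(w.get(tok, 0.0) for tok in tokens)
        for instr, w in weights.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {instr: math.exp(s - top) for instr, s in scores.items()}
    total = sum(exps.values())
    return {instr: e / total for instr, e in exps.items()}


def select_instruction(probabilities, min_probability=0.5):
    """Pick the most probable candidate, or None if it is not confident enough."""
    best = max(probabilities, key=probabilities.get)
    return best if probabilities[best] >= min_probability else None
```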

[0170] In one possible implementation, the acquisition unit 703 is used to correct the second teaching action to obtain the corrected second teaching action; and to acquire the first teaching action based on the corrected second teaching action.

[0171] In one possible implementation, the virtual robot is constructed by augmented reality devices based on a physical robot.

[0172] In one possible implementation, the acquisition unit 703 is used to send teaching instructions to the augmented reality device, which then controls the virtual robot to execute the sub-teaching actions indicated by the teaching instructions.

[0173] In one possible implementation, a force sensor is mounted on the physical robot, and the control unit 701 is further configured to determine that the physical robot has collided with an obstacle in response to the force detected by the force sensor satisfying the collision detection conditions.

[0174] In one possible implementation, the classification model is the maximum entropy model.

[0175] The robot teaching device provided in this application automatically detects collision events between the robot and obstacles during the process of controlling a physical robot to perform teaching actions. When a collision event is detected, it provides tactile feedback to indicate the presence of the collision, allowing the teacher to intuitively perceive the collision event based on the tactile feedback. The automatic collision detection is highly reliable, and the adjustment instructions obtained after providing tactile feedback are more reliable adjustment instructions generated by the teacher based on the tactile feedback. The quality of adjusting the teaching actions using these adjustment instructions is high, which is beneficial for improving the effectiveness of robot teaching.

[0176] It should be noted that the apparatus provided in the above embodiments is only illustrated by the division of the above functional units. In practical applications, the above functions can be assigned to different functional units as needed, that is, the internal structure of the device can be divided into different functional units to complete all or part of the functions described above. In addition, the apparatus and method embodiments provided in the above embodiments belong to the same concept, and their specific implementation process can be found in the method embodiments, which will not be repeated here.

[0177] In an exemplary embodiment, a computer device is also provided, comprising a processor and a memory, wherein at least one computer program is stored in the memory. The at least one computer program is loaded and executed by one or more processors to enable the computer device to implement any of the robot teaching methods described above. The computer device can be a terminal or a server, and this embodiment does not limit this. The structures of the terminal and the server will be described separately below.

[0178] Figure 9 is a schematic diagram of the structure of a terminal provided in an embodiment of this application. The terminal can be: a PC, mobile phone, smartphone, PDA, wearable device, PPC, tablet computer, smart car infotainment system, smart TV, smart speaker, or in-vehicle terminal. The terminal may also be referred to as user equipment, portable terminal, laptop terminal, desktop terminal, or other names.

[0179] Typically, a terminal includes a processor 901 and a memory 902.

[0180] Processor 901 may include one or more processing cores, such as a quad-core processor or an octa-core processor. Processor 901 may be implemented using at least one hardware form selected from DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). Processor 901 may also include a main processor and a coprocessor. The main processor, also known as the CPU, is used to process data in the wake-up state; the coprocessor is a low-power processor used to process data in the standby state. In some embodiments, processor 901 may integrate a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the screen. In some embodiments, processor 901 may also include an AI (Artificial Intelligence) processor, which is used to handle computational operations related to machine learning.

[0181] The memory 902 may include one or more computer-readable storage media, which may be non-transitory. The memory 902 may also include high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash memory devices. In some embodiments, the non-transitory computer-readable storage media in the memory 902 are used to store at least one instruction, which is executed by the processor 901 to cause the terminal to implement the robot teaching method provided in the method embodiments of this application.

[0182] In some embodiments, the terminal may also optionally include: a peripheral device interface 903 and at least one peripheral device. The processor 901, memory 902, and peripheral device interface 903 can be connected via a bus or signal line. Each peripheral device can be connected to the peripheral device interface 903 via a bus, signal line, or circuit board. Specifically, the peripheral device includes at least one of: a radio frequency circuit 904, a display screen 905, a camera assembly 906, an audio circuit 907, and a power supply 909.

[0183] Peripheral device interface 903 can be used to connect at least one I / O (Input / Output) related peripheral device to processor 901 and memory 902. In some embodiments, processor 901, memory 902 and peripheral device interface 903 are integrated on the same chip or circuit board; in some other embodiments, any one or two of processor 901, memory 902 and peripheral device interface 903 can be implemented on separate chips or circuit boards, which is not limited in this embodiment.

[0184] The radio frequency (RF) circuit 904 is used to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The RF circuit 904 communicates with communication networks and other communication devices via electromagnetic signals. The RF circuit 904 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals back into electrical signals. Optionally, the RF circuit 904 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a user identity module card, etc. The RF circuit 904 can communicate with other terminals through at least one wireless communication protocol. This wireless communication protocol includes, but is not limited to: metropolitan area networks (MANs), various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks (WLANs), and / or WiFi (Wireless Fidelity) networks. In some embodiments, the RF circuit 904 may also include circuitry related to NFC (Near Field Communication), which is not limited in this application.

[0185] Display screen 905 is used to display a UI (User Interface). This UI may include graphics, text, icons, videos, and any combination thereof. When display screen 905 is a touch display screen, it also has the ability to collect touch signals on or above its surface. These touch signals can be input as control signals to processor 901 for processing. In this case, display screen 905 can also be used to provide virtual buttons and / or a virtual keyboard, also known as soft buttons and / or a soft keyboard. In some embodiments, there may be one display screen 905, located on the front panel of the terminal; in other embodiments, there may be at least two display screens 905, located on different surfaces of the terminal or in a folded design; in still other embodiments, display screen 905 may be a flexible display screen, located on a curved or folded surface of the terminal. Furthermore, display screen 905 may be configured as a non-rectangular, irregular shape, i.e., a non-rectangular screen. Display screen 905 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).

[0186] The camera assembly 906 is used to acquire images or videos. Optionally, the camera assembly 906 includes a front-facing camera and a rear-facing camera. Typically, the front-facing camera is located on the front panel of the terminal, and the rear-facing camera is located on the back of the terminal. In some embodiments, there are at least two rear-facing cameras, each of which is any one of a main camera, a depth-sensing camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-sensing camera can be fused for background blurring, the main camera and the wide-angle camera can be fused for panoramic shooting and VR (Virtual Reality) shooting, or other fusion shooting functions can be achieved. In some embodiments, the camera assembly 906 may also include a flash. The flash can be a single-color temperature flash or a dual-color temperature flash. A dual-color temperature flash refers to a combination of a warm-light flash and a cool-light flash, which can be used for light compensation at different color temperatures.

[0187] The audio circuit 907 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment, converting them into electrical signals that are input to the processor 901 for processing, or to the radio frequency circuit 904 for voice communication. For stereo sound acquisition or noise reduction purposes, multiple microphones may be used, each positioned at a different location on the terminal. The microphone may also be an array microphone or an omnidirectional microphone. The speaker is used to convert electrical signals from the processor 901 or the radio frequency circuit 904 into sound waves. The speaker may be a traditional film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can convert electrical signals not only into audible sound waves but also into inaudible sound waves for purposes such as distance measurement. In some embodiments, the audio circuit 907 may also include a headphone jack.

[0188] The power supply 909 is used to power the various components in the terminal. The power supply 909 can be AC power, DC power, a disposable battery, or a rechargeable battery. When the power supply 909 includes a rechargeable battery, the rechargeable battery can support wired or wireless charging. The rechargeable battery can also be used to support fast charging technology.

[0189] In some embodiments, the terminal further includes one or more sensors 910. The one or more sensors 910 include, but are not limited to: an acceleration sensor 911, a gyroscope sensor 912, a pressure sensor 913, an optical sensor 915, and a proximity sensor 916.

[0190] The acceleration sensor 911 can detect the magnitude of acceleration along the three coordinate axes of a coordinate system established by the terminal. For example, the acceleration sensor 911 can be used to detect the components of gravitational acceleration along the three coordinate axes. The processor 901 can control the display screen 905 to display the user interface in either a landscape or portrait view based on the gravitational acceleration signal acquired by the acceleration sensor 911. The acceleration sensor 911 can also be used for games or for acquiring user motion data.

[0191] The gyroscope sensor 912 can detect the terminal's orientation and rotation angle. The gyroscope sensor 912, in conjunction with the acceleration sensor 911, can collect the user's 3D movements on the terminal. Based on the data collected by the gyroscope sensor 912, the processor 901 can perform the following functions: motion sensing (e.g., changing the UI based on the user's tilt), image stabilization during shooting, game control, and inertial navigation.

[0192] The pressure sensor 913 can be disposed on the side bezel of the terminal and / or on the lower layer of the display screen 905. When the pressure sensor 913 is disposed on the side bezel of the terminal, it can detect the user's grip signal on the terminal, and the processor 901 can perform left / right hand recognition or shortcut operations based on the grip signal collected by the pressure sensor 913. When the pressure sensor 913 is disposed on the lower layer of the display screen 905, the processor 901 can control the operable controls on the UI based on the user's pressure operation on the display screen 905. The operable controls include at least one of button controls, scroll bar controls, icon controls, and menu controls.

[0193] The optical sensor 915 is used to collect ambient light intensity. In one embodiment, the processor 901 can control the display brightness of the display screen 905 based on the ambient light intensity collected by the optical sensor 915. Specifically, when the ambient light intensity is high, the display brightness of the display screen 905 is increased; when the ambient light intensity is low, the display brightness of the display screen 905 is decreased. In another embodiment, the processor 901 can also dynamically adjust the shooting parameters of the camera assembly 906 based on the ambient light intensity collected by the optical sensor 915.
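By way of non-limiting illustration only, the brightness control described above can be sketched as a clamped linear mapping from ambient light intensity to display brightness. The function name `display_brightness`, the linear form of the mapping, and the default values (1000 lux full scale, brightness levels between 0.1 and 1.0) are assumptions made for this sketch and do not come from this application.

```python
def display_brightness(ambient_lux: float,
                       max_lux: float = 1000.0,
                       min_level: float = 0.1,
                       max_level: float = 1.0) -> float:
    """Map ambient light intensity to a display brightness level.

    Assumption: brightness rises linearly with ambient light and is clamped
    to [min_level, max_level]; the application only states that brightness
    increases when ambient light is high and decreases when it is low.
    """
    # Normalise the reading to [0, 1] so out-of-range values are clamped.
    ratio = min(max(ambient_lux / max_lux, 0.0), 1.0)
    return min_level + ratio * (max_level - min_level)


for lux in (0.0, 250.0, 2000.0):
    print(lux, display_brightness(lux))
```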

[0194] The proximity sensor 916, also known as a distance sensor, is typically located on the front panel of the terminal. The proximity sensor 916 is used to detect the distance between the user and the front of the terminal. In one embodiment, when the proximity sensor 916 detects that the distance between the user and the front of the terminal is gradually decreasing, the processor 901 controls the display screen 905 to switch from a screen-on state to a screen-off state; when the proximity sensor 916 detects that the distance between the user and the front of the terminal is gradually increasing, the processor 901 controls the display screen 905 to switch from a screen-off state to a screen-on state.

[0195] Those skilled in the art will understand that the structure shown in Figure 9 does not constitute a limitation on the terminal, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.

[0196] Figure 10 is a schematic diagram of a server structure provided in an embodiment of this application. The server can vary significantly depending on its configuration or performance. It may include one or more processors (Central Processing Units, CPUs) 1001 and one or more memories 1002. The one or more memories 1002 store at least one computer program, which is loaded and executed by the one or more processors 1001 to enable the server to implement the robot teaching methods provided in the various method embodiments described above. Of course, the server may also have wired or wireless network interfaces, a keyboard, and input / output interfaces for input and output. The server may also include other components for implementing device functions, which will not be elaborated upon here.

[0197] In an exemplary embodiment, a computer-readable storage medium is also provided, which stores at least one computer program that is loaded and executed by a processor of a computer device to enable the computer to implement any of the robot teaching methods described above.

[0198] In one possible implementation, the aforementioned computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.

[0199] In an exemplary embodiment, a computer program product is also provided, which includes a computer program or computer instructions that are loaded and executed by a processor to enable a computer to implement any of the robot teaching methods described above.

[0200] It should be understood that "multiple" as used herein refers to two or more. "And / or" describes the relationship between related objects and indicates that three relationships can exist. For example, A and / or B can represent: A alone, both A and B, or B alone. The character " / " generally indicates that the related objects before and after it are in an "or" relationship.

[0201] The above description is merely an exemplary embodiment of this application and is not intended to limit this application. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the protection scope of this application.

Claims

1. A robot teaching method, characterized in that the method includes:
acquiring teaching information from an instructor, the teaching information including voice information and a gesture image;
performing gesture recognition on the gesture image to obtain information representing the gesture, the information representing the gesture including the position, velocity, and acceleration of the fingertip of the instructor's index finger and the direction in which the fingertip points;
filtering the information representing the gesture to obtain gesture information;
converting the direction in the gesture information into directional text and the coordinate points in the gesture information into position text, the directional text and the position text constituting a first text;
obtaining a second text corresponding to the voice information;
merging the first text and the second text to obtain merged text as the text information corresponding to the teaching information, wherein, for multiple pieces of teaching information acquired consecutively, if the command corresponding to the current teaching information lacks semantics, the command corresponding to the previous teaching information is used to complete the missing semantics of the command corresponding to the current teaching information;
classifying the text information using a classification model to obtain the matching probability of each candidate instruction;
obtaining a teaching instruction based on the candidate instructions whose matching probabilities meet the selection criteria;
determining, based on the teaching instruction, a second teaching action corresponding to a virtual robot;
obtaining, based on the second teaching action, a first teaching action corresponding to a physical robot;
during the process of controlling the physical robot to perform the first teaching action, providing tactile feedback in response to a collision between the physical robot and an obstacle, the tactile feedback being used to indicate the existence of a collision event;
adjusting the first teaching action using adjustment instructions obtained after providing the tactile feedback, to obtain a target teaching action that meets the non-collision condition, the adjustment instructions being instructions generated by the instructor, after receiving the tactile feedback, based on the prompt given by the tactile feedback; and
completing the teaching of the physical robot based on the target teaching action.
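By way of non-limiting illustration only, two steps of claim 1 can be sketched as follows: completing the missing semantics of the current command from the previous command, and obtaining a teaching instruction from the matching probabilities of the candidate instructions. The slot-based command representation, the function names `complete_command` and `select_teaching_instruction`, and the selection criterion (highest probability, subject to a minimum of 0.5) are assumptions made for this sketch; the claim does not fix any of them.

```python
from typing import Dict, Optional

Command = Dict[str, Optional[str]]


def complete_command(current: Command, previous: Command) -> Command:
    """Fill slots that the current command leaves empty (None) with the
    values of the same slots in the previous command (assumed representation)."""
    return {slot: value if value is not None else previous.get(slot)
            for slot, value in current.items()}


def select_teaching_instruction(match_probs: Dict[str, float],
                                min_probability: float = 0.5) -> Optional[str]:
    """Return the candidate instruction with the highest matching probability,
    or None if even the best candidate falls below min_probability.

    Assumption: this threshold rule stands in for the unspecified
    "selection criteria" of the claim.
    """
    if not match_probs:
        return None
    best = max(match_probs, key=match_probs.get)
    return best if match_probs[best] >= min_probability else None


previous = {"action": "move", "direction": "left", "position": "(10, 20)"}
current = {"action": None, "direction": "right", "position": None}
print(complete_command(current, previous))

probs = {"move_left": 0.72, "move_up": 0.20, "stop": 0.08}
print(select_teaching_instruction(probs))  # prints move_left
```

Returning `None` when no candidate is confident enough leaves room for the system to ask the instructor to repeat the teaching information, which is one possible design choice rather than a requirement of the claim.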

2. The method according to claim 1, characterized in that providing the tactile feedback includes: determining a target current based on the collision force between the physical robot and the obstacle; and applying the target current to a tactile feedback device so that the tactile feedback device provides the tactile feedback under the action of the target current.
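By way of non-limiting illustration only, the determination of the target current from the collision force can be sketched as a proportional mapping with saturation. The function name `target_current`, the linear relation, and the default gain (0.05 A/N) and current limit (1.0 A) are assumptions made for this sketch; the claim only requires that the target current be determined based on the collision force.

```python
def target_current(collision_force_n: float,
                   gain_a_per_n: float = 0.05,
                   max_current_a: float = 1.0) -> float:
    """Map a collision force (newtons) to a drive current (amperes).

    Assumption: the current grows in proportion to the force magnitude and is
    limited to max_current_a to protect the tactile feedback device.
    """
    # Use the magnitude so the sign convention of the force reading
    # does not matter in this sketch.
    current = gain_a_per_n * abs(collision_force_n)
    return min(current, max_current_a)


for force in (0.0, 10.0, 100.0):
    print(force, target_current(force))
```

With such a mapping a stronger collision produces stronger feedback, so the instructor can sense not only that a collision occurred but also roughly how severe it was.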

3. The method according to claim 1, characterized in that the teaching instruction is used to indicate a sub-teaching action, and determining the second teaching action corresponding to the virtual robot based on the teaching instruction includes: performing virtual teaching on the virtual robot using the sub-teaching action indicated by the teaching instruction; and obtaining the second teaching action in response to the virtual teaching process satisfying a first condition.

4. The method according to claim 3, characterized in that performing virtual teaching on the virtual robot using the sub-teaching action indicated by the teaching instruction includes: sending the teaching instruction to an augmented reality device, which then controls the virtual robot to execute the sub-teaching action indicated by the teaching instruction.

5. The method according to any one of claims 1 to 4, characterized in that obtaining the first teaching action corresponding to the physical robot based on the second teaching action includes: correcting the second teaching action to obtain a corrected second teaching action; and obtaining, based on the corrected second teaching action, the first teaching action corresponding to the physical robot.

6. The method according to any one of claims 1 to 4, characterized in that the virtual robot is constructed by an augmented reality device based on the physical robot.

7. The method according to any one of claims 1 to 4, characterized in that the physical robot is equipped with a force sensor, and before providing the tactile feedback in response to the collision between the physical robot and the obstacle, the method further includes: determining that the physical robot has collided with the obstacle in response to the force detected by the force sensor satisfying a collision detection condition.
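By way of non-limiting illustration only, a collision detection condition of the kind recited in claim 7 can be sketched as a threshold on the magnitude of the measured force. The function name `collision_detected`, the three-axis force reading, and the 15 N default threshold are assumptions made for this sketch; the claim does not specify the form of the condition.

```python
import math
from typing import Sequence


def collision_detected(force_xyz: Sequence[float],
                       threshold_n: float = 15.0) -> bool:
    """Return True when the magnitude of the sensed force exceeds the threshold.

    Assumption: the collision detection condition is "force magnitude greater
    than threshold_n"; other conditions (rate of change, duration) are possible.
    """
    magnitude = math.sqrt(sum(component * component for component in force_xyz))
    return magnitude > threshold_n


print(collision_detected((3.0, 4.0, 0.0), threshold_n=4.9))  # magnitude is 5.0
print(collision_detected((3.0, 4.0, 0.0), threshold_n=5.0))
```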

8. The method according to any one of claims 1 to 4, characterized in that the classification model is a maximum entropy model.
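By way of non-limiting illustration only, the way a maximum entropy model assigns matching probabilities to candidate instructions can be sketched as follows: each candidate receives a score equal to the sum of the weights of the features present in the text, and the scores are normalised with an exponential (softmax) form. The function name `maxent_probabilities`, the word-level features, and the hand-set weights are assumptions made for this sketch; in practice the weights would be learned from training data.

```python
import math
from typing import Dict, Iterable


def maxent_probabilities(features: Iterable[str],
                         weights: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Compute p(label | features) = exp(score(label)) / sum of exp(scores),
    where score(label) sums the weights of the active features for that label."""
    active = list(features)
    scores = {label: sum(w.get(f, 0.0) for f in active)
              for label, w in weights.items()}
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    exp_scores = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exp_scores.values())
    return {label: e / total for label, e in exp_scores.items()}


# Hypothetical weights for two candidate instructions.
weights = {
    "move_left": {"move": 1.0, "left": 2.0},
    "move_right": {"move": 1.0, "right": 2.0},
}
probs = maxent_probabilities(["move", "left"], weights)
print(probs)
```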

9. A robot teaching apparatus, characterized in that the apparatus includes:
an acquisition unit, configured to: acquire teaching information from an instructor, the teaching information including voice information and a gesture image; perform gesture recognition on the gesture image to obtain information representing the gesture, the information representing the gesture including the position, velocity, and acceleration of the fingertip of the instructor's index finger and the direction in which the fingertip points; filter the information representing the gesture to obtain gesture information; convert the direction in the gesture information into directional text and the coordinate points in the gesture information into position text, the directional text and the position text constituting a first text; acquire a second text corresponding to the voice information; merge the first text and the second text to obtain merged text as the text information corresponding to the teaching information, wherein, for multiple pieces of teaching information acquired consecutively, if the command corresponding to the current teaching information lacks semantics, the command corresponding to the previous teaching information is used to complete the missing semantics of the command corresponding to the current teaching information; invoke a classification model to classify the text information to obtain the matching probability of each candidate instruction; obtain a teaching instruction based on the candidate instructions whose matching probabilities meet the selection criteria; determine, based on the teaching instruction, a second teaching action corresponding to a virtual robot; and obtain, based on the second teaching action, a first teaching action corresponding to a physical robot;
a control unit, configured to provide tactile feedback in response to a collision between the physical robot and an obstacle during the process of controlling the physical robot to perform the first teaching action, the tactile feedback being used to indicate the existence of a collision event; and
an adjustment unit, configured to adjust the first teaching action using adjustment instructions obtained after providing the tactile feedback, to obtain a target teaching action that meets the non-collision condition, the adjustment instructions being instructions generated by the instructor, after receiving the tactile feedback, based on the prompt given by the tactile feedback, and to complete the teaching of the physical robot based on the target teaching action.

10. The apparatus according to claim 9, characterized in that the control unit is configured to determine a target current based on the collision force between the physical robot and the obstacle, and to apply the target current to a tactile feedback device so that the tactile feedback device provides the tactile feedback under the action of the target current.

11. The apparatus according to claim 9, characterized in that the teaching instruction is used to indicate a sub-teaching action, and the acquisition unit is configured to perform virtual teaching on the virtual robot using the sub-teaching action indicated by the teaching instruction, and to obtain the second teaching action in response to the virtual teaching process satisfying a first condition.

12. The apparatus according to any one of claims 9 to 11, characterized in that the physical robot is equipped with a force sensor, and the control unit is further configured to determine that the physical robot has collided with the obstacle in response to the force detected by the force sensor satisfying a collision detection condition.

13. A computer device, characterized in that the computer device includes a processor and a memory, the memory storing at least one computer program, which is loaded and executed by the processor to enable the computer device to implement the robot teaching method as described in any one of claims 1 to 8.

14. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one computer program, which is loaded and executed by a processor to enable a computer to implement the robot teaching method as described in any one of claims 1 to 8.

15. A computer program product, characterized in that the computer program product includes a computer program that is loaded and executed by a processor to enable a computer to implement the robot teaching method as described in any one of claims 1 to 8.