A voice-programmed multi-legged robot system

By establishing a relative coordinate system and a speech recognition module in the multi-legged robot system, combined with a programming module and servo kinematic design, natural language action recognition and real-time control were achieved. This solved the problems of high threshold and poor interactivity for user-defined gait, and improved the flexibility and interactivity of robot movements.

CN117001685B · Active · Publication Date: 2026-06-23 · SUZHOU HUAMAI ROBOT TECH CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: SUZHOU HUAMAI ROBOT TECH CO LTD
Filing Date: 2023-08-24
Publication Date: 2026-06-23

Application Information

IPC: B25J11/00; B62D57/032; B25J13/00

AI Tagging

Application Domain

Manipulator Vehicles

Technology Topics

Robotic systems · Legged robot

Similar Technology Patents

Robotic systems, devices, and methods for vascular access
US12661194B2 · Guide needles · Surgical furniture · Robotic systems · Robotic arm
Hand and robot system
US12661796B2 · Programme-controlled manipulator · Gripping heads · Robotic systems · Robotic hand
Robot systems and methods for controlling robot systems
JP2026101475A · Manipulator · Robotic systems · Simulation
Robot system
WO2026126696A1 · Arc welding apparatus · Robotic systems · Medicine
Integrated robotic insufflation and smoke evacuation
US12661200B2 · Cannulas · Diagnostics · Robotic systems · Physical medicine and rehabilitation

AI Technical Summary

Technical Problem

Existing multi-legged robot products present a high barrier to entry for users who want to develop custom gaits. Existing voice-programmed robots have difficulty recognizing natural language, offer poor interactivity and few degrees of freedom, and struggle to provide real-time control and a good human-computer interaction experience for everyone.

Method used

The system establishes a relative coordinate system so that voice commands are recognized with the speaker as the reference, and uses that coordinate system to obtain the gait trajectory of the robot's feet. A voice recognition module and a programming module implement natural language actions; the feet of the multi-legged robot are numbered; static and dynamic action editing units generate various actions; and gait trajectories are designed by combining the forward and inverse kinematic equations of the servo motors.

Benefits of technology

It lowers the barrier to voice programming, enhances the interactivity between robots and users, enables natural language action recognition and real-time control, and improves the flexibility and completeness of robot actions.

Abstract

The present application relates to the technical field of voice-programmed robots, and in particular to a voice-programmed multi-legged robot system having a voice recognition module and a programming module. The voice recognition module analyzes and processes the collected voice information to obtain voice instructions. The programming module establishes a relative coordinate system according to the voice instructions, obtains the coordinates of each point of the robot foot through the relative coordinate system, and obtains the gait trajectory of the robot foot. By establishing a relative coordinate system, the present application lets the robot interpret voice instructions with the speaker as the reference object, so that its actions better match natural language and interactivity is enhanced.

Description

Technical Field

[0001] This invention relates to the field of voice-programmed robot technology, and in particular to a voice-programmed multi-legged robot system.

Background Technology

[0002] Among existing walking robots, many can perform a variety of complex gait movements, and their gait control is quite mature. However, these mature products present a certain barrier to entry for users with ideas for custom gait development. Ordinary users need to spend a significant amount of time and effort learning the relevant programming languages and basic robotics principles to achieve custom control. Conversely, some products with a lower development barrier for users inherently restrict the robot's degrees of freedom, either by reducing the number of servos in some joints or changing the adjustable angle of the servos, sacrificing the product's flexibility and stability for the user's customization needs. Furthermore, most users currently control multi-legged robots using handles, and some products cannot even achieve real-time control, making it difficult to provide a good human-machine interaction experience for everyone. The use of handles also places a high demand on the user's operational skills. Moreover, existing voice-programmed robots can only recognize relatively mechanical and fixed language, have difficulty recognizing natural language, cannot perform the actions required by natural language, and have poor interactivity.

Summary of the Invention

[0003] This invention establishes a relative coordinate system, enabling the robot to recognize voice commands with the voice sender as the reference, so that its actions are more in line with natural language and interactivity is enhanced.

[0004] To address the aforementioned problems in the prior art, this invention provides a voice-programmable multi-legged robot system comprising:

[0005] a voice recognition module; and

[0006] a programming module.

[0007] The voice recognition module analyzes and processes the collected voice information to obtain voice commands. The programming module identifies the initial location of the voice command issuer based on the voice command, constructs a coordinate system based on the initial location, obtains a relative coordinate system, obtains the toe coordinates and joint coordinates of the robot foot through the relative coordinate system, obtains the robot foot gait trajectory using the toe coordinates and joint coordinates, and calls the robot foot gait trajectory according to the voice command.

[0008] Furthermore, the relative coordinate system is constructed based on the speaker of the voice information as the reference object.

[0009] Furthermore, the relative coordinate system includes a global coordinate system and a local coordinate system. The global coordinate system is a coordinate system constructed based on the initial orientation of the robot as a whole and the voice command, and obtains the initial coordinates of each toe. The local coordinate system is a coordinate system constructed with the initial coordinates of the toes as the origin.

[0010] Furthermore, the programming module includes an action editing unit, which comprises a static action editing unit, a dynamic action editing unit, and a timing action editing unit. The static action editing unit is used to generate static target actions of the robot's legs based on voice commands; the dynamic action editing unit is used to generate motion trajectories of the robot's legs based on voice commands; and the timing action editing unit is used to generate timing actions based on voice commands and music information.

[0011] Furthermore, the static motion editing unit includes a numbering assignment subunit, a static target motion storage subunit, a static target motion presentation subunit, and a static motion correction subunit. The numbering assignment subunit is used to assign a voice number to each robot leg according to a voice command, forming a mapping table that corresponds one-to-one between the robot leg and the assigned voice number. The static target motion storage subunit is used to set the angle and orientation of the robot leg according to a voice command and record the angle and orientation. The static target motion presentation subunit is used to call the angle and orientation of the robot leg in the static target motion storage subunit according to a voice command and present the static target motion. The static motion correction subunit is used to record the static target motion after the robot leg's motion has changed when the robot leg is in a suspended state and an external force is applied to the robot leg.

[0012] Furthermore, the dynamic motion editing unit includes an intermediate motion generation subunit and an intermediate motion storage subunit. The intermediate motion generation subunit is used to establish multiple sub-motions between two adjacent static target actions at a certain time interval, wherein the sub-motions are static actions captured by the motion trajectories of the two adjacent static target actions; the intermediate motion storage subunit is used to record multiple sub-motions.

[0013] Furthermore, the timing action editing unit includes a static target action dwell time assignment subunit, a time interval assignment subunit between two adjacent static target actions, and a sub-action time assignment subunit. The static target action dwell time assignment subunit is used to assign the dwell time of the static target action; the time interval assignment subunit between two adjacent static target actions is used to assign the time of the process between two adjacent static target actions; and the sub-action time assignment subunit is used to assign the time of all sub-actions between two static target actions.

[0014] Furthermore, when the dwell time of the second static target action in two adjacent static target actions accounts for 70%-90% of the interval between the two static target actions, the motion process of the two adjacent static target actions is a rigid action process.

[0015] Furthermore, when the dwell time of the second static target action in two adjacent static target actions accounts for 0%-69% of the interval between the two adjacent static target actions, the action process of the two adjacent static target actions is a flexible action process.

[0016] Furthermore, the robot has n legs, where n is greater than 3, and each robot leg has 3 servo motors, for a total of 3n servo motors.

[0017] The beneficial effects of the present invention are as follows: (1) The present invention establishes a relative coordinate system based on the voice speaker, obtains the gait trajectory of each robot foot through the relative coordinate system, and makes actions that conform to natural language, thereby reducing the threshold of voice programming and making it easy to popularize.

[0018] (2) This invention simplifies voice commands by assigning numbers to the robot's feet and assigning action commands through the numbers, making voice programming easier and allowing multiple static actions to be set, making the robot's actions more complete.

[0019] (3) The present invention sets multiple sub-actions between two adjacent static actions to make the two actions coherent, making the robot's actions smoother and improving interactivity.

[0020] (4) By setting the proportion of static motion dwell time in the entire motion process time, the present invention can make the two motion processes flexible or rigid, enabling the robot to perform more complex actions and improve interactivity.

Attached Figure Description

[0021] Figure 1 is the voice programming flowchart provided by the present invention;

[0022] Figure 2 is a schematic diagram of the voice recognition module provided by the present invention;

[0023] Figure 3 is a schematic diagram of the robot foot joint coordinates provided by the present invention;

[0024] Figure 4 is a schematic diagram of the time allocation for the timing action programming provided by the present invention;

[0025] Figure 5 is a schematic diagram of the overall structure of the robot provided by the present invention.

Detailed Implementation

[0026] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0027] Example 1:

[0028] As shown in Figure 1, this invention discloses a voice-programmable multi-legged robot system, comprising a voice recognition module and a programming module. The programming module identifies the initial position of the voice command sender based on the voice command, constructs a coordinate system based on the initial position, obtains a relative coordinate system, acquires the toe coordinates and joint coordinates of the robot's legs using the relative coordinate system, and uses the toe coordinates and joint coordinates to obtain the robot's leg gait trajectory. The robot then invokes the gait trajectory according to the voice command to realize the voice command action. By establishing a relative coordinate system with the voice information sender as the reference object, the robot can recognize the user's natural language and more accurately perform the actions requested by the user, thus lowering the barrier to voice programming.

[0029] The above system is applied to a multi-legged robot. Referring to Figure 5, the robot has n legs (n > 3), and each leg has 3 servo motors, giving 3n servo motors in total and 3 degrees of freedom per leg. The gait movements of these 3n servo motors are encapsulated for user-defined gait programming. The robot includes a voice acquisition unit, a voice recognition chip, a microcontroller, and an embedded development chip. As shown in Figure 2, the voice acquisition unit is an omnidirectional microphone. The voice recognition chip is an LD3320 (LDV7), which includes modules for spectrum analysis, feature extraction, voice recognition, and keyword listing. It receives voice information from the microphone, recognizes it, and generates a recognition result. This result is sent to a microcontroller (STC11L08), which processes it and outputs the corresponding command over a serial port to the embedded development chip. The embedded development chip connects to the voice module via UART and GPIO and parses its output. By modifying the module's built-in project program, keywords are added to the LDV7 module and the `User_handle` function is executed; the embedded development chip then recognizes the command and instructs the servo motors to perform the corresponding action. The official OpenAI Python library is also installed and configured: the ChatGPT API is accessed via HTTP requests, the desired model is selected, and the necessary parameters are set to obtain a text response. Finally, the text is converted to speech and played back to the user.

[0030] This invention utilizes 3n servos with different degrees of freedom, encapsulates common and special movements using gait, and implements real-time control via a handle and voice commands based on the FreeRTOS operating system and an embedded development chip. The design of the 3n servos requires obtaining their forward and inverse kinematic equations and gait trajectories. The specific method is as follows:

[0031] For gait design:

[0032] First, establish a global coordinate system with the robot's center as the origin and the plane containing the toes of all robot feet as the horizontal plane. This gives the coordinates of each robot foot's toe. Then set control points, two of which are the toe's landing points, and use Matlab software to obtain the required Bézier curve and its equation. This Bézier curve equation is the gait trajectory of the toe.
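The Bézier step above can be sketched in a few lines. The patent obtains the curve in Matlab and does not give the control points, so this Python version is only an illustration: the function names and the four control points (a cubic curve, in millimetres) are assumptions, with the first and last points standing in for the toe's two landing points.

```python
from math import comb

def bezier_point(control_points, t):
    """Evaluate a Bezier curve of any degree at parameter t in [0, 1]."""
    n = len(control_points) - 1
    dims = len(control_points[0])
    point = [0.0] * dims
    for i, cp in enumerate(control_points):
        # Bernstein basis polynomial B_{i,n}(t)
        weight = comb(n, i) * (1 - t) ** (n - i) * t ** i
        for d in range(dims):
            point[d] += weight * cp[d]
    return tuple(point)

def swing_trajectory(control_points, samples):
    """Sample the curve at evenly spaced parameters, endpoints included."""
    return [bezier_point(control_points, k / (samples - 1))
            for k in range(samples)]

# Hypothetical control points (x, y, z): the first and last are the toe's
# landing points on the ground plane (z = 0); the middle two set the lift.
CONTROL_POINTS = [(0.0, 100.0, 0.0), (10.0, 100.0, 40.0),
                  (50.0, 100.0, 40.0), (60.0, 100.0, 0.0)]

trajectory = swing_trajectory(CONTROL_POINTS, 11)
for point in trajectory:
    print(point)
```

The curve starts and ends exactly on the two landing points, which is why they are chosen as the outer control points; the inner control points shape the height of the swing without being touched by the toe.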

[0033] Next, by designing the swing phase, support phase, gait frequency, stride length, and duty cycle of the required gait, the gait principle of each foot is obtained, the timing diagram of the overall multi-foot gait is drawn up, and the gait program is written.
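As a rough illustration of how a duty cycle and per-leg phase offsets produce a timing diagram, the sketch below tabulates swing and support phases over one gait cycle. The patent does not specify a particular gait; the tripod pattern for a six-legged robot, the function names, and the numeric values are all assumptions.

```python
def leg_phase(t, period, duty_cycle, offset):
    """Return 'support' or 'swing' for one leg at time t.

    period     -- duration of one gait cycle in seconds (1 / gait frequency)
    duty_cycle -- fraction of the cycle the foot spends on the ground
    offset     -- phase offset of this leg as a fraction of the cycle
    """
    phase = (t / period + offset) % 1.0
    return "support" if phase < duty_cycle else "swing"

def timing_diagram(offsets, period, duty_cycle, steps):
    """Tabulate each leg's phase at evenly spaced instants over one cycle."""
    rows = {}
    for leg, offset in offsets.items():
        rows[leg] = [leg_phase(k * period / steps, period, duty_cycle, offset)
                     for k in range(steps)]
    return rows

# Assumed tripod gait: legs 1, 3, 5 move together, and legs 2, 4, 6 are
# half a cycle out of phase with them.
TRIPOD_OFFSETS = {1: 0.0, 2: 0.5, 3: 0.0, 4: 0.5, 5: 0.0, 6: 0.5}

diagram = timing_diagram(TRIPOD_OFFSETS, period=1.0, duty_cycle=0.5, steps=4)
for leg, phases in diagram.items():
    print(leg, phases)
```

With a duty cycle of 0.5 the two leg groups alternate exactly; raising the duty cycle would make the support phases overlap, which is the usual way to trade speed for stability.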

[0034] Then, a coordinate transformation is performed: the toe coordinates in the global coordinate system are transformed to the toe coordinates in the corresponding local coordinate system of the foot.
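A minimal sketch of this global-to-local transformation follows. The patent states only that the local coordinate system has its origin at the toe's initial coordinates; the optional rotation about the vertical axis (`yaw_deg`) and the function name are assumptions added for illustration.

```python
from math import cos, sin, radians

def global_to_local(point, origin, yaw_deg=0.0):
    """Express a global-frame point in a foot's local frame.

    origin  -- local frame origin, given in global coordinates
               (for example, the toe's initial coordinates)
    yaw_deg -- assumed rotation of the local frame about the global z axis
    """
    dx, dy, dz = (p - o for p, o in zip(point, origin))
    yaw = radians(yaw_deg)
    # Rotate the offset by -yaw to undo the local frame's rotation.
    x = cos(yaw) * dx + sin(yaw) * dy
    y = -sin(yaw) * dx + cos(yaw) * dy
    return (x, y, dz)

# Hypothetical values in millimetres: a toe that started at (120, 80, 0)
# and should move to (150, 80, 30) in the global frame.
print(global_to_local((150.0, 80.0, 30.0), (120.0, 80.0, 0.0)))
```

With no rotation the transformation reduces to subtracting the local origin, so the target above becomes (30, 0, 30) in the foot's own frame.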

[0035] For obtaining the forward kinematics equations:

[0036] First, establish the initial angles of the 3 joints of each foot (3n joints in total), and then establish a local coordinate system for that foot, as shown in Figure 3. The DH parameter method is used to derive the DH parameter table for the coordinate frames from joint 1 to joint 3 ({A1} to {A3}). Substituting these parameters into the DH transformation matrix yields three homogeneous transformation matrices. Multiplying these three matrices (using Matlab for analysis and processing) yields the final 4x4 transformation matrix from {A1} to {A3}. The 3x3 block in its upper left corner is the rotation matrix of the end effector in the base coordinate system, and the 3x1 block in its upper right corner gives the spatial position coordinates of the end effector in the base coordinate system. This 3x1 block expresses the relationship between the rotation angles of each joint and the toe coordinates, which is the required forward kinematic equation.
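The chain of DH matrices described above can be illustrated as follows. The patent does not publish its DH table, so the link lengths and the table below are hypothetical (a common layout for a 3-joint leg), and plain Python stands in for the Matlab processing. The standard DH convention is used.

```python
from math import cos, sin, radians

def dh_matrix(theta, d, a, alpha):
    """Standard DH homogeneous transform between consecutive joint frames."""
    ct, st, ca, sa = cos(theta), sin(theta), cos(alpha), sin(alpha)
    return [[ct, -st * ca, st * sa, a * ct],
            [st, ct * ca, -ct * sa, a * st],
            [0.0, sa, ca, d],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Hypothetical link lengths in mm and DH table rows (d, a, alpha).
L1, L2, L3 = 40.0, 80.0, 120.0
DH_TABLE = [(0.0, L1, radians(90.0)), (0.0, L2, 0.0), (0.0, L3, 0.0)]

def forward_kinematics(joint_angles_deg):
    """Return (3x3 rotation, 3x1 position) of the toe in the base frame."""
    transform = [[float(i == j) for j in range(4)] for i in range(4)]
    for angle, (d, a, alpha) in zip(joint_angles_deg, DH_TABLE):
        transform = matmul(transform, dh_matrix(radians(angle), d, a, alpha))
    rotation = [row[:3] for row in transform[:3]]   # upper-left 3x3 block
    position = tuple(row[3] for row in transform[:3])  # upper-right 3x1 block
    return rotation, position

rotation, position = forward_kinematics((0.0, 90.0, -90.0))
print(position)
```

For the assumed geometry, joint angles of (0, 90, -90) degrees place the toe at approximately (160, 0, 80) mm, up to floating-point rounding.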

[0037] For obtaining the inverse kinematics equations:

[0038] First, establish the initial coordinates of the toe of each of the n feet and establish a local coordinate system. Then take the 3x1 matrix obtained by the forward kinematics method together with the initial coordinates of the toes, and use Matlab to solve for the rotation angle of each joint as a function of the toe coordinates, which is the required inverse kinematic equation.
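The patent solves the inverse problem in Matlab and does not give the resulting equations. As a stand-in, the sketch below uses the textbook closed-form solution for a leg with one hip joint about the vertical axis and two joints in the vertical plane. The link lengths, the geometry, and the choice of solution branch are all assumptions; a small closed-form forward function is included only to check the round trip.

```python
from math import acos, atan2, cos, degrees, hypot, radians, sin

# Hypothetical link lengths in mm (hip, thigh, shank).
L1, L2, L3 = 40.0, 80.0, 120.0

def toe_position(t1, t2, t3):
    """Closed-form forward kinematics of the assumed leg (angles in degrees)."""
    a1, a2, a3 = radians(t1), radians(t2), radians(t3)
    reach = L1 + L2 * cos(a2) + L3 * cos(a2 + a3)
    height = L2 * sin(a2) + L3 * sin(a2 + a3)
    return (reach * cos(a1), reach * sin(a1), height)

def inverse_kinematics(x, y, z):
    """Return joint angles in degrees that place the toe at (x, y, z)."""
    t1 = atan2(y, x)
    r = hypot(x, y) - L1          # horizontal distance past the hip link
    dist_sq = r * r + z * z
    cos_t3 = (dist_sq - L2 * L2 - L3 * L3) / (2.0 * L2 * L3)
    if not -1.0 <= cos_t3 <= 1.0:
        raise ValueError("target is outside the leg's reachable workspace")
    t3 = -acos(cos_t3)            # one of two solutions; the other uses +acos
    t2 = atan2(z, r) - atan2(L3 * sin(t3), L2 + L3 * cos(t3))
    return (degrees(t1), degrees(t2), degrees(t3))

target = (150.0, 50.0, -60.0)
angles = inverse_kinematics(*target)
print(angles, toe_position(*angles))
```

Feeding the computed angles back through `toe_position` reproduces the target, which is a convenient sanity check whichever solver produces the inverse equations.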

[0039] This invention incorporates initially established coordinates, angles, forward and inverse kinematic equations, and multi-legged gait trajectory design into the program. Based on the designed and acquired trajectory programs for various gaits and movements, these gaits and movements are encapsulated. The invention allows for the updating and downloading of gait movements via a self-developed Android app, then connects to the upper and lower control units via serial port transmission (interface converter) to download the updated gait movements from the app to the embedded system of the lower control unit (robot). The gait movements stored in the Android app can be used for user-defined programming, while those transmitted to the lower control unit can be used for voice control and remote control via a controller. User-defined gait movements can also be downloaded to the lower control unit for execution. In addition to serial port transmission, users can also transmit data between the upper and lower control units via Bluetooth or Wi-Fi.

[0040] Furthermore, to address the high complexity and low degree of freedom of robot voice programming, this embodiment allows static actions to be set for the robot based on the aforementioned system. First, the robot's feet are numbered. The user can assign a number to each robot foot directly by voice command, establishing a robot foot-number mapping table. This mapping table is temporarily stored, and the numbers correspond one-to-one with the robot feet. For example, in this embodiment there are six robot feet, which can be numbered "first foot," "second foot," "third foot," "fourth foot," "fifth foot," and "sixth foot" in a clockwise direction. After the numbers are assigned, the user can program a robot foot by issuing voice information containing its number and an associated action; on receiving it, the robot foot corresponding to that number performs the requested action. For example, if the user says "Lift the first foot," the robot foot numbered "first foot" will lift. The user can also have the robot record an action by voice, thus forming a static target action. For example, if the user says "Lift the first foot, this is the first action," the robot will record this static target action and mark it as the first action. The user can then continue programming the second action by saying "Lift the leg behind this leg, this is the second action." The robot will then use the local coordinate system of the first foot as a reference to identify the foot named in the voice information as the second foot, lift the second foot accordingly, record the action, and mark it as the second action. This process can be repeated to program and record multiple static actions for the robot, thus forming a complete set of actions, such as a complete broadcast gymnastics routine.
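The numbering and recording flow in this paragraph can be mimicked with a toy model operating on already-recognized text. Everything here is hypothetical: the class and method names, the phrase patterns, and in particular the rule that "behind this leg" means the next number in clockwise order, which stands in for the patent's lookup through the first foot's local coordinate system.

```python
import re

ORDINALS = {"first": 1, "second": 2, "third": 3,
            "fourth": 4, "fifth": 5, "sixth": 6}

class StaticActionRecorder:
    """Toy model of number assignment and static-action recording."""

    def __init__(self, leg_count=6):
        # One-to-one mapping table: spoken number -> physical leg index
        self.leg_by_number = {n: n - 1 for n in range(1, leg_count + 1)}
        self.last_number = None
        self.actions = {}  # action number -> (leg number, verb)

    def handle(self, utterance):
        text = utterance.lower()
        match = re.search(r"(\w+) the (\w+) (?:foot|leg)", text)
        if match and match.group(2) in ORDINALS:
            verb, number = match.group(1), ORDINALS[match.group(2)]
        elif "behind this leg" in text and self.last_number is not None:
            # Assumption: "behind" means the next number in clockwise order
            verb = text.split()[0]
            number = self.last_number % len(self.leg_by_number) + 1
        else:
            raise ValueError(f"unrecognized command: {utterance!r}")
        self.last_number = number
        label = re.search(r"this is the (\w+) action", text)
        if label and label.group(1) in ORDINALS:
            self.actions[ORDINALS[label.group(1)]] = (number, verb)
        return self.leg_by_number[number], verb

recorder = StaticActionRecorder()
print(recorder.handle("Lift the first foot, this is the first action"))
print(recorder.handle("Lift the leg behind this leg, this is the second action"))
print(recorder.actions)
```

Keeping the last referenced number in the recorder is what lets a relative phrase resolve to a concrete foot; a real system would replace that rule with the geometric lookup the patent describes.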

[0041] To increase the accuracy of the robot's movements, when the robot's foot makes a corresponding static movement, the robot can control the servo motor to make the robot's foot hover. The angle and orientation of the servo motor in this hovering state are recorded. However, the robot's foot movement at this time may not be the movement required by the user. At this time, the servo motor changes to editable state, and the user can directly correct the robot's foot. The correction movement is recorded, and the already recorded movement is replaced.

[0042] To make two adjacent static actions more coherent and to make the robot's movements more closely resemble the user's desired actions, this embodiment sets multiple sub-actions between two adjacent static actions. Five to eight sub-actions can be set. Specifically, a sub-action is a static action captured from the motion trajectory of two adjacent static target actions. For example, five sub-actions can be set between the first and second actions. These five sub-actions are actions captured during the movement from the first action to the second action. These actions can be captured at the same or different intervals, and the angle and orientation of the servo motor corresponding to each of the five sub-actions are recorded. This ensures that the robot performs the five sub-actions during the transition from the first action to the second action.
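One simple way to produce such sub-actions is to interpolate the servo angles between the two static target actions. The patent says only that sub-actions are captured along the motion trajectory, at equal or different intervals; the linear interpolation at equal intervals, the function name, and the example angles below are assumptions.

```python
def sub_actions(start_angles, end_angles, count=5):
    """Return `count` intermediate poses between two static target actions.

    Each pose is a list of servo angles in degrees. Linear interpolation at
    equal intervals is an assumption made for illustration.
    """
    poses = []
    for k in range(1, count + 1):
        fraction = k / (count + 1)
        poses.append([a + (b - a) * fraction
                      for a, b in zip(start_angles, end_angles)])
    return poses

# Hypothetical servo angles for one leg (hip, thigh, shank).
first_action = [0.0, 30.0, -60.0]
second_action = [0.0, 90.0, -120.0]

steps = sub_actions(first_action, second_action, count=5)
for pose in steps:
    print(pose)
```

The two target actions themselves are excluded from the result, so the five poses sit strictly between them and can be stored as the recorded sub-actions.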

[0043] To enhance the rhythm of the robot's movements and enable it to perform more complex and diverse actions and movement variations, this embodiment, referring to Figure 4, provides voice programming of timing actions. Users can issue rhythmic voice commands that assign the robot a rhythm between two adjacent static actions, for example a 4-beat rhythm (1, 2, 3, 4) between the first and second actions. Within this rhythm, the robot needs to complete the first action, the second action, and all sub-actions in between. Different timing actions can be achieved by changing the duration of the intermediate sub-actions. Specifically, taking the first and second actions as an example, let the time interval between the first and second actions be T. On the T time axis, define the dwell time of the first action as T01 and the dwell time of the second action as T10. The time between T01 and T10 is the duration of all sub-actions. By setting the ratio of T10 to T, the intensity of the change from the first action to the second action can be adjusted. Specifically, setting T10 to 70%-90% of T gives a rigid action process, producing mechanical movements; setting T10 to 0%-69% of T gives a flexible action process, producing gentler movements. After defining the beat by voice, users can further define by voice the proportion of T10 in the entire beat, thereby changing the action process to flexible or rigid and improving interactivity.
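The time allocation above reduces to a little arithmetic, sketched here. The thresholds come from the text (70%-90% rigid, 0%-69% flexible); the function names, the even split of the remaining time among sub-actions, and the example durations are assumptions. Ratios outside both stated ranges are reported as unspecified because the patent assigns them no category.

```python
def classify_transition(total, dwell_second):
    """Classify a transition by the share of T taken by the second dwell (T10)."""
    ratio = dwell_second / total
    if 0.70 <= ratio <= 0.90:
        return "rigid"
    if 0.0 <= ratio <= 0.69:
        return "flexible"
    return "unspecified"  # the source defines no category for other ratios

def sub_action_slots(total, dwell_first, dwell_second, count):
    """Split the time left after both dwells evenly among the sub-actions."""
    remaining = total - dwell_first - dwell_second
    if remaining < 0:
        raise ValueError("dwell times exceed the interval T")
    return [remaining / count] * count

# Hypothetical 4-beat interval of 2.0 s with T01 = 0.2 s and T10 = 1.6 s.
print(classify_transition(2.0, 1.6))
print(sub_action_slots(2.0, 0.2, 1.6, 5))
```

A long second dwell leaves little time for the sub-actions, so the leg snaps to the new pose and then holds it, which matches the "rigid" label; a short second dwell spreads the motion over more of the interval.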

[0044] In the description of the embodiments of the present invention, it should be understood that the terms "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "center," "top," "bottom," "inner," "outer," "inner side," and "outer side," etc., indicating the orientation or positional relationship, are based on the orientation or positional relationship shown in the accompanying drawings and are only for the convenience of describing the present invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation, and therefore should not be construed as a limitation of the present invention. "Inner side" refers to the interior or enclosed area or space. "Outer perimeter" refers to the area surrounding a specific component or specific area.

[0045] In the description of embodiments of the present invention, the terms "first," "second," "third," and "fourth" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of indicated technical features. Thus, a feature defined as "first," "second," "third," or "fourth" may explicitly or implicitly include one or more of that feature. In the description of the present invention, unless otherwise stated, "a plurality of" means two or more.

[0046] In the description of the embodiments of the present invention, it should be noted that, unless otherwise explicitly specified and limited, the terms "installation," "connection," "joining," and "assembly" should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral connection; they can refer to a direct connection or an indirect connection through an intermediate medium; and they can refer to the internal communication between two components. Those skilled in the art can understand the specific meaning of the above terms in the present invention based on the specific circumstances.

[0047] In the description of embodiments of the present invention, specific features, structures, materials or characteristics may be combined in any suitable manner in one or more embodiments or examples.

[0048] In the description of the embodiments of the present invention, it should be understood that "-" and "~" represent a range of two values, and this range includes the endpoints. For example, "A-B" represents a range greater than or equal to A and less than or equal to B. "A~B" represents a range greater than or equal to A and less than or equal to B.

[0049] In the description of embodiments of the present invention, the term "and / or" is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, and B existing alone. Additionally, the character " / " in this document generally indicates that the preceding and following related objects have an "or" relationship.

[0050] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.

Claims

1. A voice-programmable multi-legged robot system, characterized in that it comprises: a voice recognition module; and a programming module; wherein the voice recognition module analyzes and processes the collected voice information to obtain voice commands. The programming module identifies the initial position of the voice command issuer based on the voice command, constructs a coordinate system based on the initial position, obtains a relative coordinate system, obtains the toe coordinates and joint coordinates of the robot foot through the relative coordinate system, obtains the robot foot gait trajectory using the toe coordinates and joint coordinates, and calls the robot foot gait trajectory according to the voice command. The programming module includes an action editing unit, which comprises a static action editing unit, a dynamic action editing unit, and a timing action editing unit. The static action editing unit is used to generate static target actions for the robot's legs based on voice commands; the dynamic action editing unit is used to generate motion trajectories for the robot's legs based on voice commands; and the timing action editing unit is used to generate timing actions based on voice commands and music information. The static motion editing unit includes a numbering assignment subunit, a static target motion storage subunit, a static target motion presentation subunit, and a static motion correction subunit. The numbering assignment subunit assigns a voice number to each robot leg according to a voice command, forming a mapping table that corresponds one-to-one between the robot leg and the assigned voice number. The static target motion storage subunit sets the angle and orientation of the robot leg according to a voice command and records the angle and orientation.
The static target motion presentation subunit retrieves the angle and orientation of the robot leg from the static target motion storage subunit according to a voice command and presents the static target motion. The static motion correction subunit records the static target motion after the robot leg's motion has changed when the robot leg is in a suspended state and an external force is applied to the robot leg. The dynamic motion editing unit includes an intermediate motion generation subunit and an intermediate motion storage subunit. The intermediate motion generation subunit is used to create multiple sub-motions between two adjacent static target actions at a certain time interval, wherein the sub-motions are static actions captured by the motion trajectories of two adjacent static target actions; the intermediate motion storage subunit is used to record multiple sub-motions. The timing action editing unit includes a static target action dwell time assignment subunit, a time interval assignment subunit between two adjacent static target actions, and a sub-action time assignment subunit. The static target action dwell time assignment subunit is used to assign the dwell time of the static target action; the time interval assignment subunit between two adjacent static target actions is used to assign the time of the process between two adjacent static target actions; and the sub-action time assignment subunit is used to assign the time of all sub-actions between two static target actions.

2. The voice-programmable multi-legged robot system according to claim 1, characterized in that: The relative coordinate system is constructed with the voice command issuer as the reference object.

3. The voice-programmable multi-legged robot system according to claim 2, characterized in that: The relative coordinate system includes a global coordinate system and a local coordinate system. The global coordinate system is a coordinate system constructed based on the initial orientation of the robot as a whole and the voice command, and obtains the initial coordinates of each toe. The local coordinate system is a coordinate system constructed with the initial coordinates of the toes as the origin.

4. The voice-programmable multi-legged robot system according to claim 1, characterized in that: When the dwell time of the second static target action in two adjacent static target actions accounts for 70%-90% of the interval between the two adjacent static target actions, the motion process of the two adjacent static target actions is a rigid action process.

5. The voice-programmable multi-legged robot system according to claim 1, characterized in that: When the dwell time of the second static target action in two adjacent static target actions accounts for 0%-69% of the interval between the two adjacent static target actions, the action process of the two adjacent static target actions is a flexible action process.

6. The voice-programmable multi-legged robot system according to claim 1, characterized in that: The robot has n legs, where n is greater than 3, and each leg has 3 servo motors, for a total of 3n servo motors.