Hand joint angle estimation method
By extracting short- and long-term features of surface electromyography signals using a parallel deep learning network structure, the long-range dependency problem in hand joint angle estimation is solved, achieving accurate estimation of multiple degrees of freedom and improving computational speed, which is applicable to robotic hand control and rehabilitation training.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI
- Filing Date
- 2022-11-18
- Publication Date
- 2026-06-23
Figure CN116108337B_ABST (abstract drawing)
Description
Technical Field
[0001] This invention relates to a method for estimating hand joint angles.
Background Technology
[0002] For decades, robotic hands have been widely researched and applied in search and rescue, industry, and prosthetics. Surface electromyography (EMG) signals, with their characteristics of anticipating movement and being easy to collect, are ideal physiological signals for extracting human movement intentions. They have numerous applications in rehabilitation medicine and human-computer interaction, and are widely used worldwide for the dexterous control of robotic hands. For a long time, EMG control often used discrete motion recognition to achieve robotic hand control. However, the hand, as the most distinctive human organ, possesses high flexibility. Multi-degree-of-freedom real-time continuous control is the future direction of robotic hand development, as it can provide more natural and intuitive control. Many recent studies have established algorithms to map EMG signals to finger joint angles in order to achieve continuous real-time control strategies. Furthermore, considering applications for amputees and the difficulty in collecting hand EMG signals, arm EMG is often used to establish regression algorithms to estimate joint angles during hand movements. Since the muscles in the arm are active during hand movement, estimating hand joint angles using arm EMG signals is feasible.
[0003] Current research suggests two main approaches to real-time joint angle estimation using electromyography (EMG) signals: model-based methods and data-based methods. Model-based methods most commonly utilize physiological models such as the Hill model and the Huxley model. These methods explain the generation process of human movement, with parameters representing properties of the skeletal and muscular systems, such as muscle fiber length and tendon length. Data-based methods, on the other hand, primarily employ supervised methods based on machine learning and deep learning. These methods directly establish regression algorithms between EMG signals from the skin surface and continuous movement, offering simplicity and reliability. Currently, regression algorithms such as Gaussian process regression and long short-term memory networks are widely used in the task of real-time estimation of hand joint angles during movement.
[0004] While model-based joint angle estimation methods offer strong interpretability, the models contain numerous parameters that are difficult to measure directly, and they are currently only applicable to motion estimation of joints with few degrees of freedom. Human hand movements, however, require the coordination of many joints, which calls for estimation methods that support more degrees of freedom. Data-driven methods can usually estimate relatively more degrees of freedom. Existing inventions use Gaussian process regression to map skin surface electromyography (EMG) signals to multi-degree-of-freedom finger joint angles. However, during movement, the current joint angle may be correlated with earlier EMG signals. Most existing deep learning networks for estimating hand joint angles from EMG incorporate Long Short-Term Memory (LSTM) networks to learn the overall temporal characteristics of the EMG sequence. LSTM networks are recurrent neural network (RNN) structures. When the sequence is long, the long-range dependency problem of RNNs means that some information from the input sequence is easily lost. Furthermore, recurrent structures cannot fully exploit the speed advantage of GPU parallel computing.
Summary of the Invention
[0005] In view of this, it is necessary to provide a hand joint angle estimation method that can overcome the long-range dependency problem and use a fully parallel structure to optimize the computation speed.
[0006] This invention provides a method for estimating hand joint angles, comprising the following steps: a. acquiring surface electromyography (SEMG) data and hand joint angle data; b. processing the acquired SEMG data and hand joint angle data; c. generating training set samples and test set samples based on the processed SEMG data and joint angle data; d. training a prediction model using the generated training set samples to estimate continuous finger movements; e. saving the best trained prediction model, feeding it the test set samples, and evaluating its regression performance.
[0007] Specifically, step a includes:
[0008] The electromyography (EMG) sensor uses the differential EMG electrodes in the Delsys EMG acquisition system to acquire surface EMG data. The acquired surface EMG data comes from the EMG signals of the extensor muscles of the fingers, flexor muscles of the fingers, biceps brachii, triceps brachii, and a ring of muscles in the forearm 2-6 cm from the elbow.
[0009] Hand joint angle data were collected using the CyberGlove II data glove.
[0010] Specifically, step b includes:
[0011] For surface electromyography (EMG) data: a fourth-order Butterworth band-pass filter (5–450 Hz) is applied to the EMG signal for baseline correction and noise removal. The surface EMG signal is then amplified using μ-law logarithmic scaling. The μ-law formula is as follows:
[0012] $$F(x_t) = \operatorname{sign}(x_t)\,\frac{\ln\left(1 + \mu\,|x_t|\right)}{\ln\left(1 + \mu\right)}$$
[0013] where t is time, $x_t$ is the surface electromyography signal, and μ is a manually set parameter.
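As an illustration of the band-pass stage in [0011], the following is a minimal sketch using SciPy. The function name `bandpass_emg`, the 2000 Hz sampling rate (taken from the embodiment), causal filtering, and passing `order=4` directly to the design routine are assumptions; the source does not say whether filtering is causal or how "fourth-order" is counted, and SciPy's band-pass design doubles the order passed in.

```python
import numpy as np
from scipy.signal import butter, sosfilt


def bandpass_emg(emg, fs=2000.0, low=5.0, high=450.0, order=4):
    """Band-pass filter sEMG of shape (n_samples, n_channels) along time.

    Assumptions (not from the source): causal filtering, and `order`
    handed unchanged to SciPy's Butterworth design.
    """
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfilt(sos, emg, axis=0)


# Synthetic check: a 100 Hz tone riding on a constant baseline offset.
fs = 2000.0
t = np.arange(int(2 * fs)) / fs
raw = 0.5 + np.sin(2 * np.pi * 100.0 * t)
filtered = bandpass_emg(raw[:, None], fs=fs)[:, 0]

tail = filtered[-1000:]                          # last 0.5 s, after the start-up transient
tail_mean = float(tail.mean())                   # baseline offset should be gone
tail_rms = float(np.sqrt(np.mean(tail ** 2)))    # in-band tone should pass nearly unchanged
```

In this synthetic case the constant offset is removed while the in-band tone keeps its amplitude, which is the baseline-correction behaviour the paragraph describes.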
[0014] Specifically, step b further includes:
[0015] For hand joint angle data: First, the data is resampled to 2000Hz to ensure that the electromyography (EMG) and joint angle sequences are synchronized in time; then, a 2Hz zero-phase low-pass filter is used to smooth the original joint angle signal to avoid step jitter in the signal and make it more like the normal movement curve of the human body; finally, the maximum and minimum values of the collected EMG and joint angle data are recorded for normalization of training and testing data.
[0016] Specifically, step c includes:
[0017] A sliding window is used to generate a surface electromyography (EMG) signal sequence and a joint angle sequence with a window length of 2000 sampling points. The sliding window step size is 100 sampling points. The joint angle and surface EMG signal data in each sliding window form one sample. The dimension of the joint angle vector equals the number of joint angles to be estimated.
[0018] Specifically, the prediction model consists of four long-term and short-term feature fusion modules, a multi-scale convolution module, and a joint angle regression module;
[0019] The long-term and short-term feature fusion module is used to extract the long-term and short-term time-series features of the sequence and fuse them.
[0020] The multi-scale convolution module is a multi-scale one-dimensional convolutional neural network, which can fuse information from convolutional kernels of different sizes and achieve dimensionality transformation; finally, the estimated multi-degree-of-freedom joint angles are output through the joint angle regression module.
[0021] The joint angle regression module consists of two convolution operations, one dimension averaging operation, and three fully connected layers. The convolution kernel size is 3, the dimension averaging operation calculates the average value of the channel dimensions of the sequence, and the last fully connected layer generates the joint angle sequence.
[0022] Specifically, the long-term and short-term feature fusion module mainly consists of two branches. The data is first encoded by features, then passed through the long-term feature branch and the short-term feature branch, and finally the output result is obtained through feature decoding. The multi-scale convolution module has four branches, each consisting of two dilated convolutions. The kernel size of the four branches is 3, and the dilation rate parameters of the dilated convolutions of the four branches are 1, 2, 4 and 8, respectively. Finally, the sequences obtained from each branch are concatenated along the feature direction and input into the joint angle regression module.
[0023] Specifically, the long-term feature branch first divides the surface electromyography signal features into separate non-overlapping regions and applies a standard multi-head self-attention mechanism (MSA) within each region, followed by a two-layer multilayer perceptron (MLP). An MSA with shifted windows is then used to extract cross-region connectivity information between every two consecutive regions. The output of the shifted-window MSA likewise passes through a two-layer MLP to obtain the output result.
[0024] Specifically, the short-term feature branch mainly consists of convolution operations. The input features are first fed into two parallel branches. The input of the first branch is linearly encoded through a convolution with a kernel size of 1, and then directly connected to the end of the short-term feature branch. The input of the other branch is first linearly encoded through a convolution with a kernel size of 1, then through N sub-modules, and then through a convolution with a kernel size of 1 to obtain the output. Each sub-module consists of two convolutions with a kernel size of 3 and a residual connection. Finally, the outputs of the two branches are concatenated along the channel dimension to obtain the final output of the short-term feature branch.
[0025] Specifically, step d includes:
[0026] The continuously estimated joint angle curves are compared with the actual joint angle curves obtained by the joint angle sensors. The Pearson correlation coefficient, root mean square error, and coefficient of determination are used as the evaluation criteria for this regression task to evaluate the regression performance.
[0027] The beneficial effects of this application include:
[0028] Firstly, it enables the estimation of multi-degree-of-freedom joint angles during hand movements: more precise estimation of hand joint angles is achieved using a supervised nonparametric network, which is easy to implement and can estimate the angles of hand joints with more degrees of freedom.
[0029] Secondly, compared with the currently commonly used data-driven models, the accuracy of joint angle estimation is improved: This application extracts long-term and short-term features in the time dimension simultaneously through parallel branches, and obtains higher estimation accuracy by extracting richer features from surface electromyography signals and performing feature fusion.
[0030] Third, it enables parallel computation and improves computing speed. Due to its dual-branch parallel design, compared to traditional neural networks, it can achieve parallel computation and improve real-time computing speed.
Description of the Drawings
[0031] Figure 1 is a flowchart of the hand joint angle estimation method of the present invention;
[0032] Figure 2 is a schematic diagram of a hand joint angle estimation method provided in an embodiment of the present invention;
[0033] Figure 3 is a schematic diagram of the prediction model provided in an embodiment of the present invention;
[0034] Figure 4(a) is a schematic diagram of the structure of the long-short time feature fusion module provided in an embodiment of the present invention;
[0035] Figure 4(b) is a schematic diagram of the structure of the long-term feature branch provided in the embodiment of the present invention;
[0036] Figure 4(c) is a schematic diagram of the structure of the short-time feature branch provided in the embodiment of the present invention.
Detailed Implementation
[0037] The present invention will now be described in further detail with reference to the accompanying drawings and specific embodiments.
[0038] Figure 1 is a flowchart of a preferred embodiment of the hand joint angle estimation method of the present invention.
[0039] Referring also to Figure 2, step S1: acquire surface electromyography data and hand joint angle data. Specifically:
[0040] In this embodiment, the electromyography (EMG) sensor uses differential EMG electrodes in the Delsys EMG acquisition system to collect surface EMG data. The collected surface EMG data comes from the EMG signals of the extensor and flexor muscles of the fingers, biceps brachii, triceps brachii, and a ring of muscles in the forearm 2-6 cm from the elbow. The EMG signal sampling frequency is 2000 Hz. Hand joint angle data is collected using the CyberGlove II data glove at a sampling frequency of 20 Hz.
[0041] Step S2 involves processing the acquired surface electromyography data and hand joint angle data. Specifically:
[0042] For surface electromyography (EMG) data: a fourth-order Butterworth band-pass filter (5–450 Hz) is applied to the EMG signals for baseline correction and noise removal. Because some channels of the surface EMG signals have small amplitudes, μ-law logarithmic scaling is used to amplify the surface EMG signals. The μ-law formula is as follows:
[0043] $$F(x_t) = \operatorname{sign}(x_t)\,\frac{\ln\left(1 + \mu\,|x_t|\right)}{\ln\left(1 + \mu\right)}$$
[0044] where t is time, $x_t$ is the surface electromyography signal, and μ is a manually set parameter. In this embodiment, μ = 256.
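The scaling step can be sketched as follows, assuming the standard μ-law companding form sign(x)·ln(1 + μ|x|)/ln(1 + μ) and an input already scaled to [-1, 1]. Both the function name `mu_law_scale` and the input range are assumptions rather than details stated in the source.

```python
import numpy as np


def mu_law_scale(x, mu=256.0):
    """Apply mu-law logarithmic scaling: sign(x) * ln(1 + mu*|x|) / ln(1 + mu)."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)


# Hypothetical normalized EMG amplitudes, including two very small ones.
emg = np.array([-1.0, -0.01, 0.0, 0.01, 1.0])
scaled = mu_law_scale(emg, mu=256.0)
```

With μ = 256, an amplitude of 0.01 is mapped to roughly 0.23 while ±1 stays at ±1, which shows how low-amplitude channels are boosted without changing sign or ordering.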
[0045] For hand joint angle data: First, resample to 2000Hz to ensure that the electromyography and joint angle sequences are synchronized in time; then, use a 2Hz zero-phase low-pass filter to smooth the original joint angle signal to avoid step jitter in the signal and make it more like the normal movement curve of the human body; finally, further record the maximum and minimum values of the collected electromyography data and joint angle data for normalization of training and testing data.
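A minimal sketch of the joint angle processing in [0045] is given below. Linear interpolation for the resampling, a fourth-order Butterworth design for the 2 Hz low-pass filter, and the function name `preprocess_angles` are assumptions; the source specifies only the target rate, the cutoff, the zero-phase property, and min-max normalization.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def preprocess_angles(angles, fs_in=20.0, fs_out=2000.0, cutoff=2.0, order=4):
    """Upsample glove angles (n_samples, n_joints), smooth them, and min-max scale."""
    n_in = angles.shape[0]
    t_in = np.arange(n_in) / fs_in
    n_out = int(round((n_in - 1) * fs_out / fs_in)) + 1
    t_out = np.arange(n_out) / fs_out
    # Linear interpolation onto the EMG time base (interpolation type is an assumption).
    up = np.column_stack(
        [np.interp(t_out, t_in, angles[:, j]) for j in range(angles.shape[1])]
    )
    # Forward-backward filtering cancels the phase shift of the low-pass filter.
    sos = butter(order, cutoff, btype="lowpass", fs=fs_out, output="sos")
    smooth = sosfiltfilt(sos, up, axis=0)
    # Record per-joint extrema so the same scaling can be reused on test data.
    lo, hi = smooth.min(axis=0), smooth.max(axis=0)
    return (smooth - lo) / (hi - lo), lo, hi


# Hypothetical 5 s recording at 20 Hz with two joints.
t = np.arange(100) / 20.0
glove = np.column_stack([45 + 45 * np.sin(2 * np.pi * 0.5 * t), 10.0 * t])
norm, lo, hi = preprocess_angles(glove)
```

A joint whose angle never changes would give `hi == lo` and a division by zero, so a practical version would need to guard against that case.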
[0046] Step S3: Based on the processed surface electromyography data and joint angle data, generate training set samples and test set samples. Specifically:
[0047] In this embodiment, a sliding window is used to generate a surface electromyography (EMG) signal sequence and a joint angle sequence with a window length of 2000 sampling points. The sliding window step size is 100 sampling points. The joint angle and surface EMG signal data in each sliding window form one sample. The dimension of the joint angle vector equals the number of joint angles to be estimated.
[0048] In this embodiment, 80% of the data samples are used as training set samples, and 20% of the data samples are used as test set samples.
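The sample generation in step S3 can be sketched as follows. The chronological 80/20 split, the array layout, and the channel and joint counts are assumptions for illustration; the source gives only the window length, the step size, and the split ratio.

```python
import numpy as np


def make_windows(emg, angles, window=2000, step=100):
    """Cut time-aligned EMG (T, C) and angles (T, D) into overlapping windows."""
    assert emg.shape[0] == angles.shape[0], "signals must be time-aligned"
    starts = range(0, emg.shape[0] - window + 1, step)
    x = np.stack([emg[s:s + window] for s in starts])      # (n, window, C)
    y = np.stack([angles[s:s + window] for s in starts])   # (n, window, D)
    return x, y


def split_train_test(x, y, train_fraction=0.8):
    """Chronological split; the source does not say how the 80/20 split is drawn."""
    n_train = int(len(x) * train_fraction)
    return (x[:n_train], y[:n_train]), (x[n_train:], y[n_train:])


# Hypothetical sizes: 2.5 s of data, 8 EMG channels, 10 joint angles.
T, C, D = 5000, 8, 10
emg = np.random.default_rng(0).standard_normal((T, C))
angles = np.zeros((T, D))
x, y = make_windows(emg, angles)
(train_x, train_y), (test_x, test_y) = split_train_test(x, y)
```

For a recording of T samples this yields (T - 2000) // 100 + 1 windows, so the 5000-sample example produces 31 samples, 24 of which go to training.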
[0049] Step S4: Train the prediction model using the generated training set samples to estimate continuous finger movements. Specifically:
[0050] As shown in Figure 3, the prediction model consists of four long-term and short-term feature fusion modules, a multi-scale convolution module, and a joint angle regression module.
[0051] The long-term and short-term feature fusion module extracts and fuses the long-term and short-term temporal features of the sequence. The multi-scale convolution module is a multi-scale one-dimensional convolutional neural network that can fuse information from convolutional kernels of different sizes and achieve dimensionality transformation. Finally, the estimated multi-degree-of-freedom joint angles are output through the joint angle regression module. The long-term and short-term feature fusion module mainly consists of two branches. The data is first encoded through features, then processed through the long-term feature branch and the short-term feature branch, and finally decoded to obtain the output result. The multi-scale convolution module has four branches, each consisting of two dilated convolutions. The kernel size of all four branches is 3, and the dilation parameters of the dilated convolutions in the four branches are 1, 2, 4, and 8, respectively. Finally, the sequences obtained from each branch are concatenated along the feature direction and input into the joint angle regression module. The joint angle regression module consists of two convolution operations, one dimensionality averaging operation, and three fully connected layers. The kernel size of all convolutions is 3, the dimensionality averaging operation calculates the average value of the channel dimensions of the sequence, and the last fully connected layer generates the joint angle sequence.
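To make the effect of the four dilation rates concrete, the short calculation below estimates how many input samples each branch of the multi-scale convolution module covers. It assumes stride-1 convolutions and that both convolutions in a branch use that branch's dilation rate; neither detail is stated explicitly in the source.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked stride-1 dilated 1-D convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Assumption: both convolutions inside a branch share that branch's dilation rate.
branch_fields = {d: receptive_field(3, [d, d]) for d in (1, 2, 4, 8)}
# branch_fields == {1: 5, 2: 9, 4: 17, 8: 33}
```

Under these assumptions the four branches cover 5, 9, 17, and 33 samples respectively, so concatenating their outputs mixes several temporal scales.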
[0052] In this embodiment, the last vector of the sequence is used as the joint angle of the network output, and the mean squared error loss is calculated from this vector and then the network is iterated and updated through a gradient optimization algorithm.
[0053] The specific structure of the long-term and short-term feature fusion module is shown in Figure 4(a). First, a reconstruction (reshaping) operation transforms the input sequence $X \in \mathbb{R}^{c \times l}$: the time dimension is compressed to half its original length while the channel dimension is doubled, giving an output in $\mathbb{R}^{2c \times l/2}$. The reconstructed data is encoded using a convolutional operation and then split into two equal-sized parts along the channel dimension. These parts are fed into two parallel branches: a long-term feature branch and a short-term feature branch. The long-term feature branch captures sparse global features from the entire time series, while the short-term feature branch extracts local information from adjacent regions. The outputs of the two parallel branches are then concatenated along the channel dimension, together with a residual connection. Finally, a convolutional decoding operation is performed to obtain multi-level features.
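One way to realize the reconstruction step is sketched below: even and odd time steps are stacked as separate channels, which halves the time axis and doubles the channel axis. The exact rearrangement used in the patent is not specified, so this layout and the function name `fold_time_into_channels` are assumptions; the convolutional encoding between the reshaping and the channel split is omitted.

```python
import numpy as np


def fold_time_into_channels(x):
    """Map (c, l) -> (2c, l // 2) by stacking even and odd time steps as channels."""
    c, l = x.shape
    assert l % 2 == 0, "sequence length must be even"
    return np.concatenate([x[:, 0::2], x[:, 1::2]], axis=0)


x = np.arange(2 * 8).reshape(2, 8)                 # c = 2 channels, l = 8 time steps
folded = fold_time_into_channels(x)                # shape (4, 4)
# Equal halves along the channel axis, one per parallel branch.
long_in, short_in = np.split(folded, 2, axis=0)
```

No samples are discarded by this rearrangement, so the halved time axis reduces the sequence length seen by later layers without losing information.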
[0054] The specific structure of the long-term feature branch is shown in Figure 4(b). First, the surface electromyography signal features are divided into separate non-overlapping regions, and a standard multi-head self-attention mechanism (MSA) computes self-attention within each region. Then, a two-layer multilayer perceptron (MLP) is applied. Next, an MSA with shifted windows is used to extract cross-region connectivity information between every two consecutive regions. The output of the shifted-window MSA is likewise passed through a two-layer MLP to obtain the output result. It should be noted that each MSA and each MLP is followed by a layer normalization (LN) operation and a residual connection. In this embodiment, the receptive field of every MSA and shifted-window MSA is set to 25.
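The difference between the regular and the shifted partition can be illustrated with a small NumPy sketch. It assumes that the receptive field of 25 is the window length and that the shift is a cyclic roll by half a window, as in Swin-style attention; the source does not state the shift size, and the attention computation itself is left out.

```python
import numpy as np


def partition(tokens, window):
    """Split (length, channels) into non-overlapping (n_windows, window, channels)."""
    length, channels = tokens.shape
    assert length % window == 0, "length must be a multiple of the window size"
    return tokens.reshape(length // window, window, channels)


def shifted_partition(tokens, window):
    """Cyclically shift by half a window before partitioning (Swin-style assumption)."""
    return partition(np.roll(tokens, -(window // 2), axis=0), window)


window = 25
tokens = np.arange(100, dtype=float)[:, None]   # time index stored as the only feature
regular = partition(tokens, window)
shifted = shifted_partition(tokens, window)
first_regular = regular[0, :, 0]   # time steps 0..24: a single region
first_shifted = shifted[0, :, 0]   # time steps 12..36: straddles regions 0 and 1
```

Because each shifted window overlaps two neighbouring regular windows, attention computed inside it can exchange information across the region boundary. The last shifted window wraps around to the start of the sequence, which a full implementation would typically mask.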
[0055] The specific structure of the short-time feature branch is shown in Figure 4(c), mainly consisting of convolution operations. The input features are first fed into two parallel branches. The input of the first branch is linearly encoded through a convolution with a kernel size of 1, and then directly connected to the end of the short-time feature branch. The input of the other branch is first linearly encoded through a convolution with a kernel size of 1, then passed through N sub-modules, and then through a convolution with a kernel size of 1 to obtain the output. Each sub-module consists of two convolutions with a kernel size of 3 and a residual connection. Finally, the outputs of the two branches are concatenated along the channel dimension to obtain the final output of the short-time feature branch.
[0056] Step S5: Save the best trained prediction model and feed it the test set samples. Compare the continuously estimated joint angle curves with the actual joint angle curves obtained from the joint angle sensors. Use the Pearson correlation coefficient (CC), root mean square error (RMSE), and coefficient of determination (R²) as three performance indicators to evaluate the regression performance. Specifically:
[0057] 1) Pearson correlation coefficient (CC). CC measures the linear correlation between the estimated finger joint angles and the corresponding measured angles. The formula is as follows:
[0058] $$\mathrm{CC} = \frac{\sum_{i=1}^{N}(p_i - \bar{p})(g_i - \bar{g})}{\sqrt{\sum_{i=1}^{N}(p_i - \bar{p})^2}\,\sqrt{\sum_{i=1}^{N}(g_i - \bar{g})^2}}$$
[0059] The closer the CC value is to 1, the closer the predicted finger movement trajectory is to the actual trajectory, and the higher the estimation accuracy.
[0060] 2) Root Mean Square Error (RMSE). RMSE evaluates the deviation between the estimated and measured finger joint angles, in degrees (°). The formula for calculating RMSE is:
[0061] $$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(p_i - g_i)^2}$$
[0062] 3) Coefficient of Determination (R²). As a comprehensive index of the overall accuracy of the model, R² normally ranges from 0 to 1. R² represents the proportion of the variation in the true values that is explained by the estimated values. The larger the R² value, the better the estimation performance. Its formula is:
[0063] $$R^2 = 1 - \frac{\sum_{i=1}^{N}(g_i - p_i)^2}{\sum_{i=1}^{N}(g_i - \bar{g})^2}$$
[0064] In the above formulas, N is the number of samples, $p_i$ are the sample points of the predicted finger joint angles, $g_i$ are the sample points of the actual finger joint angles, and $\bar{p}$ and $\bar{g}$ are their respective means.
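The three indicators can be computed as in the sketch below, assuming the standard definitions of CC, RMSE, and R² in terms of the predicted samples p and the measured samples g. The function name `regression_metrics` and the example values are hypothetical.

```python
import numpy as np


def regression_metrics(pred, true):
    """Return (CC, RMSE, R2) for one joint angle trajectory."""
    p = np.asarray(pred, dtype=float)
    g = np.asarray(true, dtype=float)
    dp, dg = p - p.mean(), g - g.mean()
    cc = np.sum(dp * dg) / np.sqrt(np.sum(dp ** 2) * np.sum(dg ** 2))
    rmse = np.sqrt(np.mean((p - g) ** 2))
    r2 = 1.0 - np.sum((g - p) ** 2) / np.sum(dg ** 2)
    return float(cc), float(rmse), float(r2)


true = np.array([10.0, 20.0, 30.0, 40.0])   # measured angles in degrees
pred = true + 2.0                           # estimate with a constant 2 degree offset
cc, rmse, r2 = regression_metrics(pred, true)
```

In this example CC is 1 because the curves have the same shape, while RMSE is 2° and R² is 0.968, which illustrates why the three indicators are reported together: CC alone is blind to a constant offset.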
[0065] It is worth noting that this application can be used for industrial and aerospace robot control and operation, manipulating a robotic hand to grasp objects of different sizes based on electromyographic information from the human body surface. Continuous estimation of hand joint angles using surface electromyographic signals makes the entire control process natural and dexterous, enabling efficient interaction between humans and robotic hands.
[0066] This application can provide more intelligent and humane rehabilitation training equipment for patients with motor dysfunction. It introduces electromyography pattern recognition technology into the control link of the rehabilitation training system, and trains the model by using the patient's residual surface electromyography signals and the virtual hand joint angle information when the patient imagines completing the action, so as to help patients with motor dysfunction complete rehabilitation training more actively.
[0067] This application employs a parallelizable deep learning network method built using a one-dimensional convolutional neural network and a multi-head attention mechanism. It establishes a mapping relationship between surface electromyography (EMG) signals and multiple hand joint angles, enabling real-time estimation of hand joint angles. One-dimensional convolution is used to extract short-term temporal features from the input sequence, while an attention mechanism is utilized to extract long-term temporal features. The long-term and short-term features are then fused to improve estimation accuracy. The entire network is free from long-range dependencies, and due to its fully parallel structure, further optimization of computational speed can be achieved through hardware and software modifications.
[0068] Although the present invention has been described with reference to the present preferred embodiments, those skilled in the art should understand that the above preferred embodiments are only used to illustrate the present invention and are not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A method for estimating the angle of a hand joint, characterized in that, The method includes the following steps: a. Obtain surface electromyography data and hand joint angle data; b. Process the acquired surface electromyography data and hand joint angle data; c. Based on the processed surface electromyography data and joint angle data, generate training set samples and test set samples; d. Train a prediction model using the generated training set samples to estimate continuous finger movements; e. Save the best-performing prediction model from the training dataset, input the test set samples, and evaluate the regression performance; where: The prediction model consists of four long-term and short-term feature fusion modules, a multi-scale convolution module, and a joint angle regression module. The long-term and short-term feature fusion module is used to extract the long-term and short-term time-series features of the sequence and fuse them. The multi-scale convolution module is a multi-scale one-dimensional convolutional neural network, which can fuse information from convolutional kernels of different sizes and achieve dimensionality transformation; finally, the estimated multi-degree-of-freedom joint angles are output through the joint angle regression module. The joint angle regression module consists of two convolution operations, one dimension averaging operation, and three fully connected layers. The convolution kernel size is 3, the dimension averaging operation calculates the average value of the channel dimensions of the sequence, and the last fully connected layer generates the joint angle sequence. The long-term and short-term feature fusion module mainly consists of two branches. The data is first encoded by features, then processed through the long-term feature branch and the short-term feature branch, and finally decoded to obtain the output result. 
The multi-scale convolution module has four branches, each consisting of two dilated convolutions. The kernel size of the four branches is 3, and the dilation rate parameters of the dilated convolutions of the four branches are 1, 2, 4 and 8, respectively. Finally, the sequences obtained from each branch are concatenated along the feature direction and input into the joint angle regression module. The long-term feature branch first uses a standard multi-head self-attention mechanism to divide the surface electromyography signal features into separate non-overlapping regions and calculates the self-attention value in each region. Then, a two-layer multilayer perceptron is used. A multi-head self-attention mechanism with displacement windows is used to extract cross-regional connectivity information between every two consecutive regions. The output of the displacement multi-head self-attention mechanism is also passed through a two-layer multilayer perceptron to obtain the output result.
2. The hand joint angle estimation method as described in claim 1, characterized in that, Step a specifically includes: The electromyography (EMG) sensor uses the differential EMG electrodes in the Delsys EMG acquisition system to acquire surface EMG data. The acquired surface EMG data comes from the EMG signals of the extensor muscles of the fingers, flexor muscles of the fingers, biceps brachii, triceps brachii, and a ring of muscles in the forearm 2-6 cm from the elbow. Hand joint angle data were collected using the CyberGlove II data glove.
3. The hand joint angle estimation method as described in claim 2, characterized in that, Step b specifically includes: For surface electromyography (EMG) data: A fourth-order Butterworth filter (5–450 Hz) was used for bandpass filtering of the EMG signal for baseline correction and noise removal. The surface EMG signal was amplified using the μ-law logarithmic scaling method. The μ-law formula is as follows: $$F(x_t) = \operatorname{sign}(x_t)\,\frac{\ln\left(1 + \mu\,|x_t|\right)}{\ln\left(1 + \mu\right)}$$ where t is time, $x_t$ is the surface electromyography signal, and μ is a manually set parameter.
4. The hand joint angle estimation method as described in claim 3, characterized in that, Step b further includes: For hand joint angle data: First, the data is resampled to 2000Hz to ensure that the electromyography (EMG) and joint angle sequences are synchronized in time; then, a 2Hz zero-phase low-pass filter is used to smooth the original joint angle signal to avoid step jitter in the signal and make it more like the normal movement curve of the human body; finally, the maximum and minimum values of the collected EMG and joint angle data are recorded for normalization of training and testing data.
5. The hand joint angle estimation method as described in claim 4, characterized in that, Step c specifically includes: A sliding window is used to generate a surface electromyography (EMG) signal sequence and a joint angle sequence with a window length of 2000 sampling points. The sliding window step size is 100 sampling points. The joint angle and surface EMG signal data in each sliding window are used as a sample data. The dimension of the joint angle vector represents the estimated number of joint angles.
6. The hand joint angle estimation method as described in claim 5, characterized in that, The short-term feature branch mainly consists of convolution operations. The input features are first fed into two parallel branches. The input of the first branch is linearly encoded by a convolution with a kernel size of 1, and then directly connected to the end of the short-term feature branch. The input of the other branch is first linearly encoded by a convolution with a kernel size of 1, then passed through N sub-modules, and then through a convolution with a kernel size of 1 to obtain the output. Each sub-module consists of two convolutions with a kernel size of 3 and a residual connection. Finally, the outputs of the two branches are concatenated along the channel dimension to obtain the final output of the short-term feature branch.
7. The hand joint angle estimation method as described in claim 6, characterized in that, Step d specifically includes: The continuously estimated joint angle curves are compared with the actual joint angle curves obtained by the joint angle sensors. The Pearson correlation coefficient, root mean square error, and coefficient of determination are used as the evaluation criteria for this regression task to evaluate the regression performance.