Human pose recognition method and system based on double-attention structured position encoding

By combining dual-attention structured position coding and a global self-attention model with a pose capture device and processing terminal, the problem of human pose recognition accuracy in situations with multiple people and occlusion was solved, achieving high-accuracy pose state judgment.

CN121600558B · Active · Publication Date: 2026-06-19 · SHANGHAI YUANKONG AUTOMATION TECH

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: SHANGHAI YUANKONG AUTOMATION TECH
Filing Date: 2026-01-29
Publication Date: 2026-06-19

Abstract

This application relates to a human posture recognition method and system based on dual-attention structured position coding, belonging to the technical field of computer vision. The method includes: controlling a posture capture device to acquire human posture images; mapping each human posture image to an image sequence number to obtain a human posture image library; inputting the human posture image library into a human keypoint detection network to obtain a human joint coordinate library; performing structured position coding on the human joint coordinate library to obtain a standard human joint position type relation library; inputting the standard human joint position type relation library into a global self-attention model to obtain a dual-attention structured position coding library; inputting the human posture image library into a main coding network to obtain a global visual feature vector library; interacting the dual-attention structured position coding library and the global visual feature vector library to obtain overall posture state results; and analyzing the overall posture state results to obtain the target rehabilitation posture result. This application improves the accuracy of posture analysis results.

Description

Technical Field

[0001] This application relates to the field of computer vision technology, and in particular to a method and system for human pose recognition based on dual-attention structured position coding.

Background Technology

[0002] Dual-attention structured position coding refers to an encoding method that integrates a dual-attention mechanism with structured position information. A global self-attention model includes two attention branches, each capturing features from a different dimension; the outputs of the two branches are fused into a single data set, giving a more comprehensive processing model. Structured position coding is a position coding scheme adapted to the inherent structure of the data. By fusing the three-dimensional coordinates of human joints, human skeletal data, and the connections between joints, it makes the three-dimensional joint coordinates retrievable while also exposing the relationships between different joints.

[0003] In related technologies, human posture recognition technology refers to image processing technology that extracts and analyzes feature data such as human skeletal nodes, human contours, and action sequences from acquired images or videos to identify the human posture and movement state of a target person.

[0004] Regarding the aforementioned technologies, when observing the posture of a target observer, if there is more than one person in the observation scene at the same time, the system will be unable to identify the target observer, thus failing to accurately determine the human posture state of the target observer. Furthermore, if some or all of the people in the observation scene are obstructed by fixed equipment, the joint data output during the analysis and processing of the image data of each person will be incomplete, affecting the identification of the target observer and consequently reducing the accuracy of the output human posture state of the target observer. There is still room for improvement.

Summary of the Invention

[0005] To improve the accuracy of pose analysis results, this application provides a human pose recognition method and system based on dual-attention structured position coding.

[0006] Firstly, this application provides a human pose recognition method based on dual-attention structured position coding, employing the following technical solution:

[0007] A human pose recognition method based on dual-attention structured position coding includes:

[0008] Control the preset posture capture device to acquire human posture images in the preset rehabilitation room;

[0009] Human pose images are matched one-to-one with preset image numbers to generate a human pose image library;

[0010] Input the human pose image library into a preset human key point detection network to generate a human joint coordinate library;

[0011] The data in the human joint coordinate library is processed by structured position encoding to generate a standard human joint position type relation library;

[0012] Input the data from the standard human joint position type relation library into the preset global self-attention model to generate a dual-attention structured position encoding library;

[0013] The human pose image library is input into a preset master encoding network to generate a global visual feature vector library;

[0014] The dual-attention structured position encoding library is interacted with the global visual feature vector library to generate the overall pose state results;

[0015] The results of all posture states are analyzed to generate the target rehabilitation posture results.

[0016] Optionally, the steps of controlling a preset posture capture device to acquire human posture images of a preset rehabilitation room include:

[0017] Collect individual human posture images and the number of people in the rehabilitation room;

[0018] A single human pose image is input into a human keypoint detection network to generate the location and number of joints in a single human body.

[0019] Determine whether the number of joints in a single human body matches the preset standard number of joint positions in a human body;

[0020] If they match, the single human posture image is determined as the actual single human posture image, and the next single human posture image is collected and the judgment is repeated according to the number of people in the rehabilitation room.

[0021] If they are inconsistent, the position of a single human joint will be compared with the preset standard human joint position to generate a single occluded joint position.

[0022] The posture capture device is adjusted based on the position of a single occluded joint, and the single human posture image is continuously acquired for cyclical judgment.

[0023] The actual single human pose images are aggregated to generate human pose images.
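The completeness check in the steps above can be illustrated with a minimal Python sketch. It assumes a hypothetical standard table of 17 joint category IDs (the application does not state the standard number of joints) and represents a detection as a mapping from joint category ID to image coordinates; the names `STANDARD_JOINT_IDS`, `is_complete`, and `find_occluded_joints` are illustrative only.

```python
from typing import Dict, Set, Tuple

# Assumption: a standard skeleton with 17 joint category IDs (0..16).
# The application only speaks of a "preset standard number of joint positions".
STANDARD_JOINT_IDS: Set[int] = set(range(17))


def is_complete(detected: Dict[int, Tuple[float, float]]) -> bool:
    """Return True when the number of detected joints matches the standard number."""
    return len(detected) == len(STANDARD_JOINT_IDS)


def find_occluded_joints(detected: Dict[int, Tuple[float, float]]) -> Set[int]:
    """Compare detected joints with the standard set and return the missing IDs."""
    return STANDARD_JOINT_IDS - set(detected)


# Example: joints 6 and 10 are hidden behind equipment.
detected = {i: (0.0, 0.0) for i in range(17) if i not in (6, 10)}
occluded = set() if is_complete(detected) else find_occluded_joints(detected)
```

In this sketch a non-empty `occluded` set is what would trigger an adjustment of the posture capture device and a new capture.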

[0024] Optionally, the steps of adjusting the pose capture device based on the position of a single occluded joint and continuing to acquire single human pose images for cyclical judgment include:

[0025] The actual capture position of the attitude capture device is obtained;

[0026] The locations of single occluded joints are classified according to the preset target rehabilitation areas to generate the locations of primary and non-primary occluded nodes.

[0027] Count the primary occlusion nodes and the non-primary occlusion nodes to generate the total number of primary occlusion nodes and the total number of non-primary occlusion nodes;

[0028] Multiply the total number of primary occlusion nodes and the total number of non-primary occlusion nodes by the corresponding preset node weight parameters, and add the products together to generate the total correction parameters;

[0029] The positions of major occlusion nodes, non-major occlusion nodes, total correction parameters, and actual capture positions are analyzed to generate adjustment parameters for the attitude capture device.

[0030] Adjust the posture capture device according to the adjustment parameters, and control the posture capture device to acquire single human posture images.
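The classification and weighting steps above amount to a weighted count. The following Python sketch shows one way to compute it, assuming occluded joints are identified by integer IDs and the target rehabilitation area is given as a set of such IDs; the function name and the weight values in the example are hypothetical, since the application does not give concrete node weight parameters.

```python
from typing import Iterable, Set


def total_correction(
    occluded: Iterable[int],
    target_area: Set[int],
    w_primary: float,
    w_non_primary: float,
) -> float:
    """Weighted count of occluded joints inside and outside the target area."""
    occluded_set = set(occluded)
    n_primary = len(occluded_set & target_area)      # occluded joints in the target area
    n_non_primary = len(occluded_set - target_area)  # all other occluded joints
    return n_primary * w_primary + n_non_primary * w_non_primary


# Example with assumed weights: joint 6 lies in the target area, joints 3 and 10 do not.
correction = total_correction({3, 6, 10}, target_area={6, 7}, w_primary=2.0, w_non_primary=1.0)
```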

[0031] Optionally, the steps of analyzing the positions of major occlusion nodes, non-major occlusion nodes, total correction parameters, and actual capture positions to generate adjustment parameters for the attitude capture device include:

[0032] The positions of the main occluded nodes and the actual capture positions are substituted into a preset arcsine function for calculation and summarization to generate the angle difference of the main occluded nodes;

[0033] Substitute the positions of non-primary occlusion nodes and the actual capture positions into the arcsine function for calculation and summation to generate the angle difference of non-primary occlusion nodes;

[0034] Multiply the angle difference of the main occluded nodes and the angle difference of the non-main occluded nodes by the node weight parameters, and sum the products to generate the total node correction angle;

[0035] Add the total node correction angle, the total correction parameter, and the preset right angle to generate the horizontal adjustment angle;

[0036] The preset safety detection distance, the location of the main obstruction node, the location of the non-main obstruction node, the actual capture location, and the total correction parameters are substituted into the preset distance adjustment calculation formula to generate the device adjustment distance.

[0037] The total node correction angle, the preset height of the rehabilitation patient, and the total correction parameters are substituted into the preset vertical angle calculation formula to generate the vertical adjustment angle.

[0038] The horizontal adjustment angle, equipment adjustment distance, and vertical adjustment angle are summarized to generate adjustment parameters.

[0039] Optionally, the steps of analyzing all posture state results to generate the target rehabilitation posture result include:

[0040] Determine whether the overall posture state results meet the preset requirements for a single posture state result;

[0041] If the conditions are met, then the overall posture state results will be determined as the target rehabilitation posture results.

[0042] If the conditions are not met, retrieve the overall posture data from the overall posture state results.

[0043] Search the preset normal human posture database based on the overall posture data to generate matching posture data.

[0044] The matched pose data and the overall pose data are analyzed to generate abnormal pose data;

[0045] Collect rehabilitation schedule;

[0046] Based on the rehabilitation schedule, the target rehabilitation posture data range is found in the preset rehabilitation schedule posture data relationship;

[0047] The range of abnormal posture data and target rehabilitation posture data is analyzed to generate the target rehabilitation posture results.

[0048] Optionally, the steps of analyzing the matched pose data and the overall pose data to generate anomalous pose data include:

[0049] Find the single pose data and the single matched pose data in the total pose data and the matched pose data respectively;

[0050] Calculate the difference between the single pose data and the corresponding data in the corresponding single matched pose data and summarize them to generate pose data difference values;

[0051] Calculate the average value of the attitude data differences and summarize them to generate the attitude data difference rate;

[0052] The attitude data difference rate is mapped one-to-one with all attitude data to generate attitude data difference relationships;

[0053] Based on the differences in attitude data, the attitude data difference rate is compared with the preset range of normal attitude data difference rate and summarized to generate abnormal attitude data.
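The difference-rate steps above can be sketched in Python as follows. The sketch assumes each pose is a list of numeric values (for example joint angles), that "difference" means the absolute element-wise difference, and that the normal range is a closed interval; these choices, and the names used, are assumptions rather than details given in the application.

```python
from statistics import mean
from typing import Dict, List, Tuple


def abnormal_poses(
    poses: Dict[str, List[float]],
    matched: Dict[str, List[float]],
    normal_range: Tuple[float, float],
) -> Dict[str, float]:
    """Return the poses whose mean difference from the matched normal pose is out of range."""
    low, high = normal_range
    abnormal = {}
    for pose_id, values in poses.items():
        # Element-wise differences against the matched normal pose.
        differences = [abs(a - b) for a, b in zip(values, matched[pose_id])]
        rate = mean(differences)  # the "difference rate" for this pose
        if not low <= rate <= high:
            abnormal[pose_id] = rate
    return abnormal


poses = {"p0": [90.0, 45.0], "p1": [90.0, 45.0]}
matched = {"p0": [88.0, 44.0], "p1": [60.0, 15.0]}
result = abnormal_poses(poses, matched, normal_range=(0.0, 5.0))
```

Here `p0` differs from its matched pose by 1.5 on average and stays inside the assumed normal range, while `p1` differs by 30.0 and is reported as abnormal.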

[0054] Optionally, the steps of analyzing the abnormal posture data and the target rehabilitation posture data range to generate the target rehabilitation posture results include:

[0055] Find the target joint range of motion and the target joint frequency of motion within the target rehabilitation posture data range;

[0056] Find the actual joint range of motion and the actual joint frequency of motion in the abnormal posture data;

[0057] The actual joint range of motion and the actual joint frequency of motion are compared with the corresponding target joint range of motion and target joint frequency of motion to generate comparison results.

[0058] Determine whether the comparison results are consistent with the preset data determination results;

[0059] If they are inconsistent, continue generating comparison results and iteratively judging;

[0060] If they match, the abnormal posture data corresponding to the comparison results will be identified as rehabilitation posture data.

[0061] The target rehabilitation posture is identified from the overall posture results based on the rehabilitation posture data.

[0062] Secondly, this application provides a human pose recognition system based on dual-attention structured position coding, employing the following technical solution:

[0063] A human pose recognition system based on dual-attention structured position coding, comprising:

[0064] The acquisition module is used to acquire images of human posture.

[0065] A memory for storing a program of the human pose recognition method based on dual attention structured position coding as described in any of the preceding claims;

[0066] A processor, wherein the program in the memory can be loaded and executed by the processor to implement the human pose recognition method based on dual-attention structured position coding as described in any of the above.

[0067] In summary, this application includes at least one of the following beneficial technical effects:

[0068] 1. By mapping human posture images one-to-one with preset image numbers, a human posture image library is generated. The human posture image library is input into a human keypoint detection network to obtain a human joint coordinate library. The data in the human joint coordinate library is processed by structured position encoding to obtain a standard human joint position type relation library. The data in the standard human joint position type relation library is input into a global self-attention model to obtain a dual-attention structured position encoding library. The human posture image library is input into a main encoding network to obtain a global visual feature vector library. The dual-attention structured position encoding library and the global visual feature vector library are interacted to obtain the overall posture state results. The overall posture state results are then filtered and analyzed to obtain the target rehabilitation posture results for the target rehabilitation personnel, thereby improving the accuracy of posture state judgment for the target rehabilitation personnel.

[0069] 2. By acquiring individual human posture images and the number of people in the rehabilitation room, the individual human posture images are input into the human keypoint detection network to obtain the joint positions and number of joints. It is then determined whether the number of joints matches the preset standard number of joint positions. If they match, the individual human posture image is identified as the actual individual human posture image, and the process continues to acquire the next individual human posture image based on the number of people in the rehabilitation room, repeating the judgment. If they do not match, the joint positions of the individual human posture images are compared with the standard joint positions to obtain the positions of the occluded joints. The posture capture device is adjusted based on the positions of the occluded joints, and the process continues to acquire individual human posture images, repeating the judgment. The actual individual human posture images are then summarized to obtain the human posture image. This allows for adjustments to the posture capture device based on the actual occlusion of each person in the rehabilitation room, and individual images are taken of each person to obtain unoccluded human posture images, thereby improving the accuracy of posture state analysis for the target rehabilitation personnel.

[0070] 3. By determining whether the overall posture state results meet the requirements of a single posture state result, if they do, the overall posture state results are determined as the target rehabilitation posture results; if they do not, the overall posture data is retrieved from the overall posture state results, and a matching search is performed in the normal human posture database based on the overall posture data to obtain matching posture data. The matching posture data and the overall posture data are analyzed to obtain abnormal posture data. The rehabilitation schedule is collected, and the target rehabilitation posture data range is found in the preset rehabilitation schedule posture data relationship based on the rehabilitation schedule. The abnormal posture data and the target rehabilitation posture data range are then analyzed to obtain the target rehabilitation posture results. In this way, through multi-layer matching screening, the accuracy of identifying the target rehabilitation personnel is improved.

Attached Figure Description

[0071] Figure 1 is a flowchart of a human pose recognition method based on dual-attention structured position coding in an embodiment of this application.

[0072] Figure 2 is a flowchart of the steps, in an embodiment of this application, of controlling a preset posture capture device to acquire human posture images in a preset rehabilitation room.

[0073] Figure 3 is a flowchart of the steps, in an embodiment of this application, of adjusting the posture capture device based on the position of a single occluded joint and continuing to collect single human posture images for cyclical judgment.

[0074] Figure 4 is a flowchart of the steps, in an embodiment of this application, of analyzing the positions of the main occlusion nodes, the positions of the non-main occlusion nodes, the total correction parameters, and the actual capture positions to generate the adjustment parameters of the posture capture device.

[0075] Figure 5 is a flowchart of the steps, in an embodiment of this application, of analyzing the overall posture state results to generate the target rehabilitation posture result.

[0076] Figure 6 is a flowchart of the steps, in an embodiment of this application, of analyzing the matched posture data and the overall posture data to generate abnormal posture data.

[0077] Figure 7 is a flowchart of the steps, in an embodiment of this application, of analyzing the abnormal posture data and the target rehabilitation posture data range to generate the target rehabilitation posture result.

Detailed Implementation

[0078] To make the purpose, technical solution, and advantages of this application clearer, the application is described in further detail below with reference to Figures 1 to 7 and the embodiments. It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the scope of the application.

[0079] This application discloses a human posture recognition method based on dual-attention structured position coding. This method mainly addresses the problems of multiple targets and joint occlusion when recognizing human posture states. Specifically, it discloses a rehabilitation room, a posture capture device, and a processing terminal. The processing terminal is communicatively connected to the posture capture device to realize data interaction and control. After the posture capture device sends the image data of each person in the rehabilitation room to the processing terminal, the processing terminal inputs the image data into the relevant calculation model and interacts with the generated data. The aim is to quickly and reasonably obtain the posture state detection results of the target rehabilitation person, thereby avoiding the problem of inaccurate posture detection results caused by multiple people in the rehabilitation room or occlusion.

[0080] Referring to Figure 1, this application discloses a human pose recognition method based on dual-attention structured position coding, comprising the following steps:

[0081] Step S100: Control the preset posture capture device to acquire human posture images of the preset rehabilitation room.

[0082] Here, the human posture image refers to a set of full-frame RGB images of the appearance of all personnel in the rehabilitation room. It is obtained by capturing images of each person with the posture capture device and then summarizing all the captured images. For the specific method, see the steps shown in Figure 2. This provides data support for the subsequent processing of the human posture images to determine the posture information of the target person.

[0083] Posture capture equipment refers to a device used in rehabilitation rooms to capture images of human appearance. It consists of a robotic arm that can change the horizontal position, horizontal angle, and vertical elevation angle of the camera, as well as a camera for capturing images.

[0084] A rehabilitation room is an enclosed space used to enable patients who need rehabilitation to undergo rehabilitation training using relevant rehabilitation equipment, and to assess the patient's posture during training.

[0085] Step S101: Match the human posture images with preset image numbers to generate a human posture image library.

[0086] The human posture image library refers to a collection of numbered images generated from the images of all personnel in the rehabilitation room. The processing terminal maps the image sequence number to each image in the human posture image library and embeds the corresponding image sequence number into the image. Then, all the embedded images are summarized to obtain the human posture image library. In one embodiment, if there are 5 images stored in the human posture image library, the five sequence numbers 0, 1, 2, 3, and 4 are taken from the image sequence number and embedded into the five images without repetition. The embedded images are then summarized to obtain the human posture image library, thereby providing data support for the subsequent determination of the joint coordinates of each person.

[0087] An image number is one of a set of values used to number the images. In one embodiment, the image numbers are the 100 non-negative integers from 0 to 99.
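As a minimal Python sketch of step S101, the image library can be modeled as a mapping from image number to image. The sketch assumes images are held as raw bytes and numbered in capture order; the limit of 100 numbers follows the embodiment above, while the function name and the sample data are hypothetical.

```python
from typing import Dict, List

MAX_IMAGE_NUMBERS = 100  # the embodiment uses the integers 0..99


def build_image_library(images: List[bytes]) -> Dict[int, bytes]:
    """Pair each image with a unique image number, without repetition."""
    if len(images) > MAX_IMAGE_NUMBERS:
        raise ValueError("more images than available image numbers")
    return {number: image for number, image in enumerate(images)}


# Five images receive the numbers 0, 1, 2, 3, and 4.
library = build_image_library([b"img-a", b"img-b", b"img-c", b"img-d", b"img-e"])
```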

[0088] Step S102: Input the human pose image library into the preset human key point detection network to generate a human joint coordinate library.

[0089] The human joint coordinate library refers to a database that stores the coordinates of human skeletal joints and joint IDs in each image of the human pose image library according to the image sequence number. The processing terminal inputs the images in the human pose image library into the human keypoint detection network according to the image sequence number, and then assigns the corresponding joint category ID to the output joint coordinates. The corresponding joint category ID is embedded into the corresponding joint coordinate. The joint coordinates of the embedded ID under the image sequence number are summarized to obtain the human joint coordinate library for each person. Finally, all the libraries are summarized according to the image sequence number to obtain the human joint coordinate library.

[0090] The joint category ID refers to the correspondence between different types of joints and their corresponding serial numbers. In one embodiment, when the joint is the right knee, the corresponding joint category ID is 6, and when the joint is the left shoulder, the corresponding joint category ID is 10.
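One possible in-memory layout for the human joint coordinate library is sketched below in Python. Only the two joint category IDs named in the embodiment (right knee 6, left shoulder 10) are taken from the text; the three-dimensional coordinates, the joint names, and the `Joint` and `build_joint_library` names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Only these two IDs appear in the embodiment; a full table is not given.
JOINT_CATEGORY_IDS = {"right_knee": 6, "left_shoulder": 10}


@dataclass(frozen=True)
class Joint:
    category_id: int  # joint category ID embedded with the coordinate
    x: float
    y: float
    z: float


def build_joint_library(
    detections: Dict[int, Dict[str, Tuple[float, float, float]]]
) -> Dict[int, List[Joint]]:
    """Group ID-tagged joint coordinates by image sequence number."""
    library = {}
    for image_number in sorted(detections):
        joints = [
            Joint(JOINT_CATEGORY_IDS[name], *coords)
            for name, coords in detections[image_number].items()
        ]
        library[image_number] = sorted(joints, key=lambda joint: joint.category_id)
    return library


detections = {0: {"left_shoulder": (0.1, 0.2, 0.3), "right_knee": (0.4, 0.5, 0.6)}}
joint_library = build_joint_library(detections)
```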

[0091] Human keypoint detection networks are neural network models used to automatically locate joints and endpoints of the human skeleton in images and output the corresponding joint or endpoint coordinates. They consist of an input layer, a feature extraction layer, a feature fusion layer, and a prediction layer. The input layer receives the acquired image; the feature extraction layer extracts feature data from low to high levels; the feature fusion layer optimizes feature representation by combining feature details and semantic advantages from different levels; and the prediction layer maps the fused features to coordinates, thus outputting the final human joint coordinates.

[0092] Step S103: Perform structured position encoding on the data in the human joint coordinate library to generate a standard human joint position type relation library.

[0093] The standard human joint position type relation library refers to a coordinate database that standardizes the coordinates and corresponding joint category IDs in the human joint coordinate library and reflects the relationships between human joints. For each image sequence number, the processing terminal projects each group of joint coordinates in the human joint coordinate library into a set feature vector. The corresponding joint category ID is then converted into a category embedding vector through a trainable embedding lookup table. The set feature vector and the category embedding vector are projected to a unified dimension through an MLP to obtain the database for the current image sequence number. Finally, the databases corresponding to all image sequence numbers are summarized to obtain the standard human joint position type relation library.
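The data flow of step S103 can be sketched in plain Python as below. This is an illustration under several assumptions, not the trained network of the application: the weights are random rather than learned, the dimensions and the table size of 17 joint types are arbitrary, and the coordinate features are concatenated with the category embedding before a two-layer MLP. All names are hypothetical.

```python
import random
from typing import List

random.seed(0)

NUM_JOINT_TYPES = 17  # assumed size of the joint category table
COORD_DIM, FEATURE_DIM, EMBED_DIM, HIDDEN_DIM, MODEL_DIM = 3, 4, 4, 16, 8


def random_matrix(rows: int, cols: int) -> List[List[float]]:
    """Stand-in for trained weights."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(vector: List[float], weights: List[List[float]]) -> List[float]:
    """Multiply a row vector by an (in_dim x out_dim) weight matrix."""
    return [sum(v * w for v, w in zip(vector, column)) for column in zip(*weights)]


W_coord = random_matrix(COORD_DIM, FEATURE_DIM)             # coordinate projection
embedding_table = random_matrix(NUM_JOINT_TYPES, EMBED_DIM)  # embedding lookup table
W_hidden = random_matrix(FEATURE_DIM + EMBED_DIM, HIDDEN_DIM)
W_out = random_matrix(HIDDEN_DIM, MODEL_DIM)


def encode_joint(coords: List[float], category_id: int) -> List[float]:
    """Project coordinates, look up the category embedding, fuse both with an MLP."""
    coord_features = linear(coords, W_coord)
    category_embedding = embedding_table[category_id]
    fused = coord_features + category_embedding  # list concatenation
    hidden = [max(0.0, h) for h in linear(fused, W_hidden)]  # ReLU
    return linear(hidden, W_out)


encoded = [encode_joint([0.4, 0.5, 0.6], 6), encode_joint([0.1, 0.2, 0.3], 10)]
```

Each joint ends up as a vector of length `MODEL_DIM`, which is the "unified dimension" in this sketch.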

[0094] Step S104: Input the data from the standard human joint position type relation library into the preset global self-attention model to generate a dual-attention structured position coding library.

[0095] The dual-attention structured position coding library refers to the library obtained by applying the dual-attention mechanism and structured position coding to the standard human joint position type relation library; it reflects the relationship between each person's joints and the environment, as well as the connections between each person's joints. The processing terminal inputs each set of data from the standard human joint position type relation library into the global self-attention model in order of image sequence number. Each category embedding vector in the set attends to the other category embedding vectors in the same set and aggregates their feature information, so that each category embedding vector incorporates global human skeletal pattern information. These category embedding vectors are then concatenated with the set feature vectors and compressed using an MLP to obtain the dual-attention structured position coding for that image sequence number. The same operation is performed on the data corresponding to each remaining image sequence number. All resulting dual-attention structured position codings are then mapped one-to-one to the image sequence numbers and summarized to obtain the dual-attention structured position coding library.

[0096] A global self-attention model refers to a model that includes two attention branches, each capable of capturing information features from different dimensions. Through data integration, the two attention data are fused into a single dataset, resulting in a more comprehensive processing model. The global self-attention model comprises an input encoding layer, a global attention interaction layer, a feature enhancement layer, and an output layer. The input encoding layer converts the received data from a standard human joint position type relational database into fixed-dimensional embedding vectors. Then, the global attention interaction layer uses a multi-head self-attention mechanism to compute the global dependency weights of all vector elements in parallel, aggregating other related information to output a feature vector that integrates the global context. The feature enhancement layer performs a non-linear transformation on the feature vector output from the previous layer to enhance the model's expressive power. The non-linearly transformed vector is then processed through layer normalization and residual connections to output standardized data that retains the original features. Finally, the output layer converts the enhanced features into the final output format according to the task objective and outputs the data.
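The core of the global attention interaction layer can be illustrated with a small Python sketch of scaled dot-product self-attention followed by a residual connection. It is deliberately simplified and rests on stated assumptions: a single head instead of multiple heads, no learned query, key, or value projections, and no layer normalization or feed-forward transformation. The function names are illustrative.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def self_attention(tokens: List[Vector]) -> List[Vector]:
    """Let every token attend to all tokens in the same set, then add a residual."""
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in tokens:
        # Dependency weight of this token on every token in the set.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        # Weighted aggregation of the other tokens' features (global context).
        context = [
            sum(weight * value[i] for weight, value in zip(weights, tokens))
            for i in range(dim)
        ]
        # Residual connection keeps the original features.
        outputs.append([q + c for q, c in zip(query, context)])
    return outputs


joint_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextualized = self_attention(joint_embeddings)
```

Each output vector mixes information from every joint embedding in the set, which is the sense in which each embedding comes to carry global skeletal pattern information.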

[0097] Step S105: Input the human pose image library into the preset master encoding network to generate a global visual feature vector library.

[0098] The global visual feature vector library refers to the data set that stores, for each image number in the human posture image library, the human body edge image, joint motion angle, and joint activity frequency of the corresponding human posture image. The processing terminal groups, stores, and summarizes the human body edge image, joint motion angle, and joint activity frequency of each human posture image according to the image number to obtain the global visual feature vector library.

[0099] Human body edge image refers to an image that can represent the outline of the human body in a human posture image; joint motion angle refers to the set of motion angles of each joint in a human posture image. By processing the human posture images in the human posture image library sequentially according to the image number, the human body edge image and joint motion angle of each image can be extracted.

[0100] Joint activity frequency refers to the data set of the number of times each joint moves per unit time in a human posture image. In one embodiment, the processing terminal inputs multiple frames of human posture images corresponding to the image sequence number collected per unit time into the MMAction2 model, and matches and summarizes the output frequency data with the corresponding joint coordinates to obtain the joint activity frequency corresponding to the image sequence number.

[0101] The master encoder network (MEN) is a fundamental data processing model capable of extracting core information from human pose images. The MEN consists of an input layer, a multi-level hidden coding layer, and an output adaptation layer. The input layer receives the original image or video data and directly passes it to the next layer. The multi-level hidden coding layer contains multiple hidden layers, which perform non-linear transformations and feature compression on the input data layer by layer, continuously extracting high-order abstract features, and finally outputting a low-dimensional core encoding vector. Finally, the output adaptation layer outputs the core encoding vector from the previous layer.

[0102] Step S106: Interact the dual-attention structured position encoding library with the global visual feature vector library to generate the overall pose state results.

[0103] The overall pose state result refers to the data set used to store human pose information corresponding to the image sequence number. The processing terminal interacts with the dual attention structured position encoding library and the corresponding global visual feature vector library in sequence according to the image sequence number, and stores and summarizes them according to the image sequence number to obtain the overall pose state result.

[0104] Step S107: Analyze the overall posture results to generate the target rehabilitation posture results.

[0105] The target rehabilitation posture result refers to the data set used to store the human posture information of the target rehabilitation personnel. After the processing terminal determines the overall posture state result, it analyzes that result to determine the target rehabilitation posture result. The specific method follows the steps shown in Figure 5.

[0106] Target rehabilitation personnel refer to patients in the rehabilitation room who require rehabilitation training.

[0107] Referring to Figure 2, the steps for controlling a preset posture capture device to acquire human posture images in a preset rehabilitation room include:

[0108] Step S200: Acquire images of individual human postures and the number of people in the rehabilitation room.

[0109] Among them, a single human posture image refers to an image of the appearance of a person in the rehabilitation room. A single human posture image can be obtained by controlling the posture capture device to take pictures of the person in the rehabilitation room.

[0110] The number of people in the rehabilitation room refers to the total number of people present in the rehabilitation room. In one embodiment, the number of people in the rehabilitation room can be obtained by controlling the posture capture device to capture a panoramic image of the rehabilitation room and then inputting the panoramic image of the rehabilitation room into the number of people statistics model.

[0111] Step S201: Input a single human pose image into a human keypoint detection network to generate the single human joint position and the single human joint number.

[0112] The single human joint position refers to the data set containing the position coordinates of human joints and the corresponding joint category IDs captured from a single human pose image. The processing terminal inputs the single human pose image into the human keypoint detection network, assigns the corresponding joint category ID to each output coordinate, and embeds that ID into the coordinate. Finally, all coordinates with embedded IDs are collected to obtain the single human joint position.

[0113] The single human joint number refers to the number of joint entries stored in the single human joint position. It can be obtained by the processing terminal counting the joint positions in the single human joint position.

[0114] Step S202: Determine whether the number of joints in a single human body is consistent with the preset standard number of joint positions in a human body.

[0115] The standard number of human joint positions refers to the number of joint positions that need to be collected on a person without any obstruction in order to obtain accurate posture data. In one embodiment, the standard number of human joint positions is 33.

[0116] After the processing terminal determines the number of joints in a single human body, it determines whether the number of joints in a single human body is consistent with the number of joint positions in a standard human body, thereby determining whether the single human body pose image is occluded.

[0117] Step S2021: If they match, the single human posture image is determined as the actual single human posture image, and the next single human posture image is collected and cyclically judged according to the number of people in the rehabilitation room.

[0118] If the processing terminal determines that the number of joints in a single human body is consistent with the number of joint positions in a standard human body, it means that the single human body posture image is not occluded. Therefore, the processing terminal determines the single human body posture image as the actual single human body posture image, and controls the posture capture device to collect the next single human body posture image in the rehabilitation room according to the number of people in the rehabilitation room for cyclic judgment, thereby providing support for the subsequent determination of human body posture images.

[0119] The actual single human posture image refers to an unoccluded image of a person's appearance captured by the posture capture device.

[0120] The next single human posture image refers to the appearance image obtained after taking pictures of other people in the rehabilitation room while the posture capture device is currently running. The processing terminal counts the single human posture images that have been judged and subtracts the number of judged images from the number of people in the rehabilitation room to obtain the remaining number of judgments. It is then determined whether the remaining number of judgments is greater than 0, thereby determining whether it is necessary to continue controlling the posture capture device to collect images. If the processing terminal determines that the remaining number of judgments is equal to 0, the collection of image data in the rehabilitation room is stopped; if the processing terminal determines that the remaining number of judgments is greater than 0, the image data of the remaining people is collected again according to the remaining number of judgments, and the number of single human posture images that have been judged continues to accumulate, thus providing data support for the subsequent cyclical judgment of the next single human posture image to determine whether the image is occluded.
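
The stopping rule described above can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, and the capture and judgment of each image is replaced by a simple counter increment.

```python
def remaining_judgments(people_in_room: int, judged_images: int) -> int:
    """Remaining number of judgments = people in the room - images already judged."""
    return people_in_room - judged_images


def acquisition_rounds(people_in_room: int) -> int:
    """Count how many single human posture images are judged before collection stops."""
    judged = 0
    # Keep collecting while the remaining number of judgments is greater than 0.
    while remaining_judgments(people_in_room, judged) > 0:
        # Placeholder for: capture the next single human posture image and judge it.
        judged += 1
    return judged
```

With four people in the room, the loop in this sketch runs four times and then stops because the remaining number of judgments reaches 0.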

[0121] Step S2022: If they are inconsistent, compare the position of a single human joint with the preset standard human joint position to generate a single occluded joint position.

[0122] If the processing terminal determines that the number of joints in a single human body is inconsistent with the number of joint positions in a standard human body, it indicates that the single human body posture image is occluded. Therefore, the processing terminal compares the position of the single human body joint with the preset standard human body joint position to generate the position of the single occluded joint, thereby providing data support for subsequent adjustment of the posture capture device.

[0123] A single occluded joint position refers to the data set of occluded joint position coordinates and corresponding joint category IDs in a single human pose image. The processing terminal maps the joint category IDs in the single human joint position to the joint category IDs in the standard human joint position. The joint category IDs that are not matched in the standard human joint position are extracted and summarized to generate occluded joint category IDs. The occluded joint category IDs and the single human joint position are then input into the Occlusion-Net model, and the output coordinates are mapped to the occluded joint category IDs and summarized to obtain the single occluded joint position.
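
As a rough illustration of the ID matching step only (not of the Occlusion-Net inference), the occluded joint category IDs can be found as the standard IDs that have no match among the detected IDs. The data layout (a dictionary from joint category ID to coordinates) and the function names are assumptions made for this sketch; the count of 33 is the value given in one embodiment.

```python
STANDARD_JOINT_COUNT = 33  # standard number of human joint positions in one embodiment


def is_occluded(detected: dict[int, tuple[float, float]]) -> bool:
    """An image is treated as occluded when the joint count differs from the standard."""
    return len(detected) != STANDARD_JOINT_COUNT


def occluded_joint_ids(detected: dict[int, tuple[float, float]],
                       standard_ids: set[int]) -> list[int]:
    """Return the standard joint category IDs that were not matched by any detected joint."""
    return sorted(standard_ids - set(detected))
```

In this sketch the returned IDs, together with the detected joints, would be what is passed on to the occlusion inference model.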

[0124] The Occlusion-Net model is a model used to infer the coordinates of occluded joints in a single human pose image. The Occlusion-Net model consists of an initial keypoint detection layer, a graph encoder network layer, a graph decoder network layer, and a 3D graph network layer. It starts by inputting the location of a single human joint into the initial keypoint detection layer for preliminary detection, outputting the visible initial joint location information to the next layer. The graph encoder network layer then identifies invisible connections within the initial joint location information, analyzing the connections between joints through the graph network structure. It categorizes occluded joints and their associated paths, thus clarifying the association logic of occluded areas and providing a basis for subsequent layer-level analysis of occluded joint locations. The graph decoder network layer receives the association logic output from the graph encoder network layer and combines it with the initial joint location information from the initial keypoint detection layer to accurately correct the coordinates of occluded joints. Finally, the 3D graph network layer utilizes a self-supervised reprojection loss function to learn and upgrade the joint coordinates output from the previous layer from two dimensions to three dimensions, which are then output.

[0125] Standard human joint position refers to the set of data on the position coordinates of human joints and their corresponding joint type IDs, collected by a posture capture device under unobstructed conditions.

[0126] Step S20221: Adjust the posture capture device according to the position of a single occluded joint, and continue to collect single human posture images for cyclic judgment.

[0127] After the processing terminal determines the single occluded joint position, it adjusts the posture capture device based on that position. The specific method follows the steps shown in Figure 3. The processing terminal then controls the posture capture device to collect single human posture images for cyclic judgment, thereby obtaining the actual single human posture image.

[0128] Step S203: Summarize the actual single human pose images to generate a human pose image.

[0129] The process involves counting the actual single human posture images determined by the processing terminal to generate the actual number of images acquired, and comparing this number with the number of people in the rehabilitation room. When the number of people in the rehabilitation room matches the actual number of images acquired, the processing terminal summarizes all the actual single human posture images to obtain the human posture image.

[0130] Referring to Figure 3, the steps of adjusting the posture capture device based on the position of a single occluded joint and continuing to acquire single human posture images for cyclic judgment include:

[0131] Step S300: Collect the actual capture position of the attitude capture device.

[0132] The actual capture position refers to the three-dimensional coordinates of the center of the posture capture device. In one embodiment, the two vertical sides of the horizontal floor in the rehabilitation room are defined as the x-axis and y-axis, and the intersection is taken as the origin of the coordinate system. The vertical distance from the center of the posture capture device to the x-axis and y-axis, as well as the height between the center of the posture capture device and the horizontal floor, are measured by a laser rangefinder. The vertical distance from the x-axis is then defined as the x-axis coordinate, the vertical distance from the y-axis is defined as the y-axis coordinate, and the height between the center of the posture capture device and the horizontal floor is defined as the z-axis coordinate. The actual capture position is obtained by summarizing the data.

[0133] Step S301: Classify the single occluded joint positions according to the preset target rehabilitation area to generate the main occluded node positions and non-main occluded node positions.

[0134] The main occlusion node position refers to the single occlusion joint position within the range of the target rehabilitation part of the target rehabilitation person. The target rehabilitation part is mapped onto a single human posture image by the processing terminal, and the single occlusion joint position is highlighted in the single human posture image. The single occlusion joint position within the range of the target rehabilitation part is found and extracted in the processed single human posture image. Finally, all the data is summarized to obtain the main occlusion node position.

[0135] Non-primary occlusion node locations refer to single occlusion joint locations that are not within the range of the target rehabilitation site of the target rehabilitation patient. By matching the primary occlusion node locations with the single occlusion joint locations through the processing terminal, and then extracting and summarizing the unmatched single occlusion joint locations, the non-primary occlusion node locations can be obtained.

[0136] The target rehabilitation site refers to the body part that the target rehabilitation person needs to rehabilitate. In one embodiment, the target rehabilitation site is the lower limb.

[0137] Step S302: Count the number of main occlusion node positions and non-main occlusion node positions to generate the number of main occlusion nodes and non-main occlusion nodes.

[0138] The number of primary occlusion nodes refers to the number of position coordinates in the primary occlusion node locations. The number of primary occlusion nodes can be obtained by the processing terminal performing a non-repeating count of the position coordinates in the primary occlusion node locations.

[0139] The number of non-primary occlusion nodes refers to the number of position coordinates in the non-primary occlusion node locations. The number of non-primary occlusion nodes can be obtained by the processing terminal counting the position coordinates in the non-primary occlusion node locations without repetition.

[0140] Step S303: Multiply the number of primary occlusion nodes and the number of non-primary occlusion nodes by their corresponding preset node weight parameters, and add the products together to generate the total correction parameter.

[0141] The total correction parameter refers to the relevant calculation data required to adjust the acquisition position of the attitude capture device. The processing terminal finds the main weight parameter and non-main weight parameter in the node weight parameter, multiplies the number of main occluded nodes with the main weight parameter to obtain the main correction parameter, multiplies the number of non-main occluded nodes with the non-main weight parameter to obtain the non-main correction parameter, and finally adds the main correction parameter and non-main correction parameter to obtain the total correction parameter.

[0142] Node weight parameters refer to the data set used to store primary weight parameters and non-primary weight parameters. In one embodiment, the primary weight parameter is 1 and the non-primary weight parameter is 0.7.

[0143] The primary weight parameter refers to the corrected weight corresponding to the location of the primary occluded node; the non-primary weight parameter refers to the corrected weight corresponding to the location of the non-primary occluded node.
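
The total correction parameter of step S303 is a weighted count, which can be sketched as below. The weights 1 and 0.7 are the values given in one embodiment; the function and constant names are hypothetical.

```python
MAIN_WEIGHT = 1.0      # primary weight parameter in one embodiment
NON_MAIN_WEIGHT = 0.7  # non-primary weight parameter in one embodiment


def total_correction_parameter(n_main: int, n_non_main: int,
                               w_main: float = MAIN_WEIGHT,
                               w_non_main: float = NON_MAIN_WEIGHT) -> float:
    """Main correction parameter plus non-main correction parameter."""
    main_correction = n_main * w_main
    non_main_correction = n_non_main * w_non_main
    return main_correction + non_main_correction
```

For example, two primary occlusion nodes and three non-primary occlusion nodes give 2 × 1 + 3 × 0.7 = 4.1.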

[0144] Step S304: Analyze the positions of the main occlusion nodes, the non-main occlusion nodes, the total correction parameters, and the actual capture positions to generate adjustment parameters for the attitude capture device.

[0145] The adjustment parameters refer to the adjustment data used to adjust the posture capture device so that the captured single human posture image is unobstructed. These adjustment parameters are obtained by analyzing the positions of major occlusion nodes, non-major occlusion nodes, total correction parameters, and actual capture positions using a processing terminal. The specific method follows the steps shown in Figure 4. This provides data support for subsequent adjustments to the attitude capture device.

[0146] Step S305: Adjust the posture capture device according to the adjustment parameters, and control the posture capture device to acquire a single human posture image.

[0147] After the processing terminal determines the adjustment parameters, it controls the posture capture device to make adjustments based on the adjustment parameters, and controls the posture capture device to continue to acquire single human posture images, thereby providing data support for subsequent determination of whether a single human posture image is occluded, so as to continue to adjust the posture capture device.

[0148] Referring to Figure 4, the steps for generating adjustment parameters for the attitude capture device by analyzing the positions of major occlusion nodes, non-major occlusion nodes, total correction parameters, and actual capture positions include:

[0149] Step S400: Substitute the main occlusion node position and the actual capture position into the preset arcsine function for calculation and summarization to generate the main occlusion node angle difference.

[0150] The main occlusion node angle difference refers to the set of horizontal angles between all nodes in the main occlusion node position and the actual capture position. The processing terminal calculates the sine value of the horizontal angle between each coordinate in the main occlusion node position and the actual capture position, and then substitutes the sine value into the arcsine function $\theta = \arcsin(s)$. By performing the calculation for every node and summarizing the results, the main occlusion node angle difference is obtained, where $\theta$ refers to the horizontal angle between a given main occlusion node position and the actual capture position, and $s$ refers to the sine value of that horizontal angle.

[0151] Step S401: Substitute the positions of the non-primary occlusion nodes and the actual capture positions into the arcsine function for calculation and summarization to generate the angle difference of the non-primary occlusion nodes.

[0152] The non-primary occlusion node angle difference refers to the set of horizontal angles between all nodes in the non-primary occlusion node position and the actual capture position. The processing terminal calculates the sine value of the horizontal angle between each coordinate in the non-primary occlusion node position and the actual capture position, and then substitutes the sine value into the arcsine function $\theta = \arcsin(s)$. By performing the calculation for every node and summarizing the results, the non-primary occlusion node angle difference is obtained, where $\theta$ refers to the horizontal angle between a given non-primary occlusion node position and the actual capture position, and $s$ refers to the sine value of that horizontal angle.
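
The arcsine step used in steps S400 and S401 can be sketched as follows. The text does not state how the sine value is derived from the coordinates, so this sketch assumes it is the y-axis offset divided by the horizontal distance between the node and the capture device; that choice, and all function names, are assumptions made for illustration.

```python
import math


def horizontal_angle_deg(node_xy: tuple[float, float],
                         capture_xy: tuple[float, float]) -> float:
    """Horizontal angle, in degrees, between one node and the capture position."""
    dx = node_xy[0] - capture_xy[0]
    dy = node_xy[1] - capture_xy[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return 0.0  # node directly at the capture position: no angle to correct
    # Assumption: sine value = y offset / horizontal distance.
    sine = max(-1.0, min(1.0, dy / dist))
    return math.degrees(math.asin(sine))


def angle_differences(nodes: list[tuple[float, float]],
                      capture_xy: tuple[float, float]) -> list[float]:
    """Summarize the horizontal angles of all nodes into one set."""
    return [horizontal_angle_deg(n, capture_xy) for n in nodes]
```

The same function would be applied once to the main occlusion node positions and once to the non-main occlusion node positions.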

[0153] Step S402: Multiply the main occlusion node angle difference and the non-main occlusion node angle difference by their corresponding node weight parameters, and add the products together to generate the total node correction angle.

[0154] The total node correction angle refers to the total angle correction amount for correcting the angles of major occlusion nodes and non-major occlusion nodes. The processing terminal multiplies all angle data in the angle difference of major occlusion points with the major weight parameter in the node weight parameters, and then sums the products to obtain the major occlusion angle correction amount. Then, the processing terminal multiplies all angle data in the angle difference of non-major occlusion points with the non-major weight parameter in the node weight parameters, and then sums the products to obtain the non-major occlusion angle correction amount. Finally, the major occlusion angle correction amount and the non-major occlusion angle correction amount are added together to obtain the total node correction angle.

[0155] Step S403: Divide the total node correction angle by the total correction parameter, and add the quotient to the preset right angle to generate the horizontal adjustment angle.

[0156] The horizontal adjustment angle refers to the angle value that the attitude capture device needs to adjust in the horizontal direction. The average correction angle is obtained by dividing the total node correction angle by the total correction parameter through the processing terminal. The horizontal adjustment angle is then obtained by adding the average correction angle to the right angle.

[0157] The right angle refers to the preset angle value, in degrees, of a right angle. In one embodiment, the right angle is 90°.
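
Steps S402 and S403 can be sketched together as a weighted average of the node angles plus the right angle. The weights (1 and 0.7) and the right angle (90°) are the values given in the embodiments; the function names are hypothetical, and the sketch assumes each list holds one angle per distinct occluded node, so that the list lengths equal the node counts.

```python
def total_node_correction_angle(main_angles: list[float], non_main_angles: list[float],
                                w_main: float = 1.0, w_non_main: float = 0.7) -> float:
    """Weighted sum of main and non-main occlusion angle differences (step S402)."""
    return w_main * sum(main_angles) + w_non_main * sum(non_main_angles)


def horizontal_adjustment_angle(main_angles: list[float], non_main_angles: list[float],
                                w_main: float = 1.0, w_non_main: float = 0.7,
                                right_angle: float = 90.0) -> float:
    """Average correction angle plus the right angle (step S403)."""
    total_angle = total_node_correction_angle(main_angles, non_main_angles,
                                              w_main, w_non_main)
    # Total correction parameter: weighted node count from step S303.
    total_param = len(main_angles) * w_main + len(non_main_angles) * w_non_main
    if total_param == 0:
        raise ValueError("no occluded nodes: nothing to correct")
    average_correction = total_angle / total_param
    return average_correction + right_angle
```

For a single main occlusion node at 30° and no non-main nodes, the sketch yields 30 / 1 + 90 = 120°.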

[0158] Step S404: Substitute the preset safety detection distance, main occlusion node position, non-main occlusion node position, actual capture position and total correction parameter into the preset distance adjustment calculation formula to generate the device adjustment distance.

[0159] The device adjustment distance refers to the distance that the attitude capture device needs to move when making adjustments. The processing terminal substitutes the preset safety detection distance, the main occlusion node position, the non-main occlusion node position, the actual capture position, and the total correction parameter into the distance adjustment calculation formula and performs the calculation to generate the device adjustment distance. The quantities in the formula are: the device adjustment distance; the safety detection distance, taken as 1.5 m as an example; the occlusion node center distance; the total number of occluded nodes, which the processing terminal obtains by adding the number of main occlusion nodes and the number of non-main occlusion nodes; and the standard number of human joint positions.

[0160] The occlusion node center distance refers to the horizontal distance between the center position of a single occlusion joint and the actual capture position. The occlusion node center distance is obtained by substituting the x-axis coordinates and y-axis coordinates of the center position of the single occlusion joint and the actual capture position into the Pythagorean theorem through the processing terminal, and then taking the square root of the calculated data.

[0161] The center position of a single occluded joint refers to the horizontal center position coordinates obtained after weighted averaging of the single occluded joint positions. The processing terminal multiplies the x-axis and y-axis coordinates of the main occluded node positions by the main weight parameter and sums them to obtain the total weighted x-axis coordinate and y-axis coordinate of the main occluded nodes. Then, the x-axis and y-axis coordinates of the non-main occluded node positions are multiplied by the non-main weight parameter and summed to obtain the total weighted x-axis coordinate and y-axis coordinate of the non-main occluded nodes. Next, the total weighted x-axis coordinate of the main occluded nodes and the total weighted x-axis coordinate of the non-main occluded nodes are summed and divided by the total correction parameter to obtain the x-axis coordinate of the single occluded joint center. Similarly, the total weighted y-axis coordinate of the main occluded nodes and the total weighted y-axis coordinate of the non-main occluded nodes are summed and divided by the total correction parameter to obtain the y-axis coordinate of the single occluded joint center. Finally, the x-axis coordinate and y-axis coordinate of the single occluded joint center are combined to obtain the center position of the single occluded joint.
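
The weighted center and the occlusion node center distance described in the two paragraphs above can be sketched as follows. The weights are those given in one embodiment, the function names are hypothetical, and the sketch assumes each node is given by its (x, y) coordinates in the room coordinate system with one entry per distinct node.

```python
import math

Point = tuple[float, float]


def occluded_joint_center(main_nodes: list[Point], non_main_nodes: list[Point],
                          w_main: float = 1.0, w_non_main: float = 0.7) -> Point:
    """Weighted average of the occluded node positions in the horizontal plane."""
    total_param = len(main_nodes) * w_main + len(non_main_nodes) * w_non_main
    if total_param == 0:
        raise ValueError("no occluded nodes")
    weighted_x = (w_main * sum(p[0] for p in main_nodes)
                  + w_non_main * sum(p[0] for p in non_main_nodes))
    weighted_y = (w_main * sum(p[1] for p in main_nodes)
                  + w_non_main * sum(p[1] for p in non_main_nodes))
    return weighted_x / total_param, weighted_y / total_param


def occlusion_node_center_distance(center: Point, capture_xy: Point) -> float:
    """Horizontal distance between the occluded joint center and the capture position."""
    return math.hypot(center[0] - capture_xy[0], center[1] - capture_xy[1])
```

Because the weights are divided out by the total correction parameter, nodes that coincide produce a center at that same point regardless of how they are classified.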

[0162] Step S405: Substitute the total node correction angle, the preset height of the rehabilitation patient, and the total correction parameters into the preset vertical angle calculation formula to generate the vertical adjustment angle.

[0163] The vertical adjustment angle refers to the vertical angle that needs to be adjusted when the posture capture device is adjusted. The processing terminal substitutes the total node correction angle, the height of the rehabilitation patient, and the total correction parameter into the vertical angle calculation formula and performs the calculation to obtain the vertical adjustment angle. The quantities in the formula are: the vertical adjustment angle; the height of the rehabilitation patient, taken as 1.7 m as an example; the total number of occluded nodes, which the processing terminal obtains by adding the number of main occlusion nodes and the number of non-main occlusion nodes; the node weight parameter; the node vertical height difference; the node horizontal distance difference; and the total correction parameter.

[0164] The node vertical height difference refers to the data set of vertical height differences between the coordinates of all positions in a single occluded joint position and the actual capture position. The node vertical height difference can be obtained by the processing terminal subtracting the z-axis coordinate of the actual capture position from the z-axis coordinate of each position in the single occluded joint position and summarizing the results.

[0165] The node horizontal distance difference refers to the data set of horizontal distances between all position coordinates in a single occluded joint position and the actual capture position. The processing terminal substitutes the x-axis and y-axis coordinates of each position coordinate in the single occluded joint position and the x-axis and y-axis coordinates in the actual capture position into the Pythagorean theorem for calculation, takes the square root of the calculated data, and then summarizes all the data to obtain the node horizontal distance difference.
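
The two per-node quantities defined above can be sketched as follows. The sketch assumes each occluded joint position and the actual capture position are given as (x, y, z) tuples in the same room coordinate system; the function names are hypothetical.

```python
import math

Point3D = tuple[float, float, float]


def node_vertical_height_differences(nodes: list[Point3D], capture: Point3D) -> list[float]:
    """z coordinate of each node minus the z coordinate of the capture position."""
    return [n[2] - capture[2] for n in nodes]


def node_horizontal_distance_differences(nodes: list[Point3D], capture: Point3D) -> list[float]:
    """Horizontal (x, y) distance of each node from the capture position."""
    return [math.hypot(n[0] - capture[0], n[1] - capture[1]) for n in nodes]
```

A negative height difference in this sketch means the occluded joint lies below the center of the capture device.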

[0166] Step S406: Summarize the horizontal adjustment angle, equipment adjustment distance, and vertical adjustment angle to generate adjustment parameters.

[0167] The adjustment parameters in this step are the same as those in step S304 above. The adjustment parameters can be obtained by summarizing the horizontal adjustment angle, the equipment adjustment distance, and the vertical adjustment angle through the processing terminal.

[0168] Referring to Figure 5, the steps for analyzing the overall posture state results to generate the target rehabilitation posture result include:

[0169] Step S500: Determine whether the overall attitude state results meet the preset requirements of a single attitude state result.

[0170] Among them, a single attitude state result refers to the result when there is only one set of data in the total attitude state results. The requirement for a single attitude state result is that there is only one set of data in the total attitude state results.

[0171] The processing terminal determines whether the overall attitude state results meet the requirements of a single attitude state result, thereby determining whether there is only one set of data in the overall attitude state results.

[0172] Step S5001: If satisfied, the overall posture state results are determined as the target rehabilitation posture results.

[0173] If the processing terminal determines that all posture state results meet the requirements of a single posture state result, it means that there is only one set of data in all posture state results. Therefore, the processing terminal determines all posture state results as the target rehabilitation posture result.

[0174] Step S5002: If not satisfied, then find all attitude data in the total attitude state results.

[0175] If the processing terminal determines that the overall posture state results do not meet the requirements of a single posture state result, it means that there is more than one set of data in the overall posture state results. Therefore, the processing terminal determines the overall posture data to provide data support for the subsequent determination of the target rehabilitation posture results.

[0176] The overall posture data refers to the data set used to store the human body edge images, joint motion angles, and joint activity frequencies corresponding to all image sequence numbers. By searching through the overall posture state results in the processing terminal, and grouping and storing them according to the image sequence number, the overall posture data can be obtained.

[0177] Step S50021: Match and search the entire posture data in the preset normal human posture database to generate matching posture data.

[0178] The matched posture data refers to the normal human posture data with the highest similarity to all posture data in the normal human posture database. The matching posture data is obtained by processing all posture data against the normal human posture database and then summarizing the data with the highest similarity.

[0179] A normal human posture database refers to a collection of data storing human postures for different age groups and genders.

[0180] Step S50022: Analyze the matched pose data and the overall pose data to generate abnormal pose data.

[0181] Abnormal pose data refers to the set of all pose data that have a low similarity to the matched pose data. After the processing terminal determines the matched pose data, it can obtain the abnormal pose data by analyzing the matched pose data and the entire pose data. The specific method follows the steps shown in Figure 6. This provides data support for determining the target rehabilitation posture result.

[0182] Step S50023: Collect the rehabilitation schedule.

[0183] The rehabilitation schedule refers to the current stage of rehabilitation for the target individual. In one embodiment, the operator can obtain the rehabilitation schedule by transmitting it to the processing terminal based on the actual situation.

[0184] Step S50024: Based on the rehabilitation schedule, find the target rehabilitation posture data range in the preset rehabilitation schedule posture data relationship.

[0185] The target rehabilitation posture data range refers to the range of posture data that the target rehabilitation person can achieve during rehabilitation training in the current rehabilitation schedule. The target rehabilitation posture data range can be obtained by the processing terminal by looking up the mapping table of posture data relationship in the rehabilitation schedule according to the rehabilitation schedule.

[0186] The rehabilitation schedule posture data relationship refers to the correspondence between the rehabilitation schedule and the rehabilitation posture data range. It is obtained by the operator by mapping the rehabilitation schedule to the corresponding rehabilitation posture data range one by one according to the actual situation.

[0187] Step S50025: Analyze the abnormal posture data and the target rehabilitation posture data range to generate the target rehabilitation posture result.

[0188] Specifically, after the processing terminal determines the abnormal posture data and the target rehabilitation posture data range, it analyzes them to determine the target rehabilitation posture result. The specific method follows the steps shown in Figure 7.

[0189] Referring to Figure 6, the steps for analyzing the matched pose data and all pose data to generate abnormal pose data include:

[0190] Step S600: Find the single attitude data and the single matched attitude data in the total attitude data and the matched attitude data respectively.

[0191] Among them, single posture data refers to the data set corresponding to each image number in the entire posture data, which stores the joint motion angle. By processing the entire posture data according to the image number, the data of each group is extracted and stored in a new storage unit to obtain single posture data.

[0192] Single matched posture data refers to the set of data corresponding to each image number in the matched posture data, which stores the joint motion angles. By processing the matched posture data according to the image number, the data of each group is extracted and stored in a new storage unit, thus obtaining the single matched posture data.

[0193] Step S601: Calculate and summarize the differences between the single attitude data and the corresponding data in the corresponding single matched attitude data to generate attitude data difference values.

[0194] The attitude data difference value refers to the set of angle differences between all single attitude data and the corresponding single matched attitude data. The processing terminal subtracts the joint angles in the corresponding single matched attitude data from the joint motion angles in the single attitude data according to the image sequence number. The obtained data are summarized to obtain the angle data difference value of each group. Then, according to the image sequence number, all the angle data difference values are grouped, stored and summarized to obtain the attitude data difference value.
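A minimal sketch of step S601, assuming each group is a dictionary of joint angles keyed by joint name (a hypothetical layout). It follows the subtraction order stated above: matched angle subtracted from observed angle, so the result is signed.

```python
from typing import Dict

PoseGroups = Dict[int, Dict[str, float]]  # image number -> {joint name: angle in degrees}


def pose_difference_values(single: PoseGroups, matched: PoseGroups) -> PoseGroups:
    """Per image number, subtract the matched joint angle from the observed joint angle."""
    diffs: PoseGroups = {}
    for image_number, joints in single.items():
        reference = matched[image_number]
        diffs[image_number] = {
            joint: angle - reference[joint] for joint, angle in joints.items()
        }
    return diffs
```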

[0195] Step S602: Calculate the average value of the attitude data difference values and summarize them to generate the attitude data difference rate.

[0196] The attitude data difference rate refers to the set of data used to quantify the degree of deviation between each group of attitude data difference values and normal data. The processing terminal accumulates the data in each group of attitude data difference values according to the image sequence number, and then divides the accumulated result by the maximum total offset to obtain the attitude data difference rate of that group. Then, the remaining attitude data difference values corresponding to the image sequence number are processed in the same way, and the data are grouped, stored and summarized according to the image sequence number to obtain the attitude data difference rate.

[0197] The maximum total offset refers to the sum of the maximum angles by which all joints can deviate from the normal data. In one embodiment, if a total of 33 joints are observed, the maximum total offset is 3300°, which corresponds to an average of 100° per joint.
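The rate calculation of step S602 can be sketched as below. Two points are assumptions rather than statements of the application: the per-joint limit of 100° is inferred from the 33-joint, 3300° embodiment, and absolute values are accumulated so that opposite-sign deviations do not cancel and the rate stays non-negative.

```python
from typing import Dict


def pose_difference_rate(
    diffs: Dict[int, Dict[str, float]],
    max_offset_per_joint: float = 100.0,  # assumed: 3300 degrees / 33 joints
) -> Dict[int, float]:
    """Per image number, accumulate joint differences and divide by the maximum total offset."""
    rates: Dict[int, float] = {}
    for image_number, joint_diffs in diffs.items():
        if not joint_diffs:
            raise ValueError(f"image {image_number} has no joint differences")
        max_total_offset = max_offset_per_joint * len(joint_diffs)
        # Assumption: absolute values are accumulated.
        accumulated = sum(abs(d) for d in joint_diffs.values())
        rates[image_number] = accumulated / max_total_offset
    return rates
```

For example, 33 joints that each deviate by 10° accumulate to 330°, and 330° / 3300° gives a rate of 0.1.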

[0198] Step S603: Match the attitude data difference rate with all attitude data to generate attitude data difference relationships.

[0199] The attitude data difference relationship refers to the correspondence between the attitude data difference rate and all attitude data. The processing terminal obtains it by mapping the data in the attitude data difference rate to the corresponding attitude data one by one according to the image sequence number, forming a mapping table.

[0200] Step S604: Compare and summarize the attitude data difference rate with the preset normal attitude data difference rate range according to the attitude data difference relationship to generate abnormal attitude data.

[0201] In this step, the abnormal posture data is consistent with the abnormal posture data in step S above. The processing terminal compares the data in the posture data difference rate with the normal posture data difference rate range according to the image sequence number, extracts and summarizes the image sequence number corresponding to the data that does not belong to the normal posture data difference rate range, and then searches and summarizes the extracted image sequence number in the posture data difference relationship to obtain the abnormal posture data.

[0202] The normal posture data difference rate range refers to the range of the posture data difference rate, that is, the ratio of the difference between the actual posture data collected and the normal posture data, when the human body is in a normal posture. In one embodiment, the normal posture data difference rate range is 0 to 0.3.
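Steps S603 and S604 can be sketched together. The dictionary layout and function name are hypothetical, and treating both ends of the 0 to 0.3 range as inclusive is an assumption, since the text does not say whether the bounds belong to the normal range.

```python
from typing import Dict, Tuple

PoseGroups = Dict[int, Dict[str, float]]  # image number -> {joint name: angle in degrees}


def find_abnormal_pose_data(
    rates: Dict[int, float],
    all_pose: PoseGroups,
    normal_range: Tuple[float, float] = (0.0, 0.3),  # embodiment value; bounds assumed inclusive
) -> PoseGroups:
    """Return the pose data whose difference rate falls outside the normal range."""
    low, high = normal_range
    # S603: map each difference rate to its pose data by image number.
    relationship = {n: (rate, all_pose[n]) for n, rate in rates.items()}
    # S604: keep only the image numbers whose rate is outside the normal range.
    return {
        n: pose
        for n, (rate, pose) in relationship.items()
        if not (low <= rate <= high)
    }
```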

[0203] Referring to Figure 7, the steps for analyzing the abnormal posture data and the target rehabilitation posture data range to generate the target rehabilitation posture result include:

[0204] Step S700: Find the target joint motion angle range and the target joint motion frequency in the target rehabilitation posture data range.

[0205] The target joint motion angle range refers to the set of normal ranges of motion of each joint during rehabilitation training for the target rehabilitation person, and the target joint motion frequency refers to the number of times each joint moves normally per unit time during rehabilitation training for the target rehabilitation person. The processing terminal obtains both by searching the target rehabilitation posture data range.

[0206] Step S701: Locate the actual joint movement angle and actual joint movement frequency in the abnormal posture data.

[0207] The actual joint motion angle refers to the motion angle of each joint in each group of abnormal posture data, and the actual joint motion frequency refers to the number of times each joint in each group of abnormal posture data moves per unit time. The processing terminal obtains both by searching the abnormal posture data.

[0208] Step S702: Compare the actual joint motion angle and the actual joint motion frequency with the corresponding target joint motion angle range and target joint motion frequency to generate comparison results.

[0209] The comparison result refers to the outcome of this comparison and is either a consistent result or an inconsistent result.

[0210] The processing terminal compares the actual joint motion angle and the actual joint motion frequency of each group in the abnormal posture data with the target joint motion angle range and the target joint motion frequency, respectively. If the actual joint motion angle is not within the target joint motion angle range, or the actual joint motion frequency is inconsistent with the target joint motion frequency, or both, the processing terminal determines the inconsistent result as the comparison result. If neither condition occurs, it determines the consistent result as the comparison result.
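The decision rule of step S702 can be sketched as a small function. The function name, the result strings, and the `freq_tolerance` parameter are hypothetical: the application does not state how closely the actual frequency must agree with the target frequency, so the default here requires an exact match.

```python
from typing import Tuple


def compare_joint(
    actual_angle: float,
    actual_freq: float,
    target_angle_range: Tuple[float, float],
    target_freq: float,
    freq_tolerance: float = 0.0,  # assumed parameter; 0.0 means exact agreement
) -> str:
    """Return 'consistent' only if the angle is in range and the frequency agrees."""
    low, high = target_angle_range
    angle_ok = low <= actual_angle <= high
    freq_ok = abs(actual_freq - target_freq) <= freq_tolerance
    # Either failed condition, or both, yields the inconsistent result.
    return "consistent" if angle_ok and freq_ok else "inconsistent"
```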

[0211] Step S703: Determine whether the comparison result is consistent with the preset data determination result.

[0212] The data determination result refers to the comparison result obtained when the actual joint motion angle is within the target joint motion angle range and the actual joint motion frequency is consistent with the target joint motion frequency, that is, the consistent result.

[0213] The processing terminal determines whether the comparison result is consistent with the data determination result, and thereby determines whether the abnormal posture data corresponding to the actual joint motion angle and the actual joint motion frequency is the rehabilitation posture data of the target rehabilitation person.

[0214] Step S7031: If they are inconsistent, continue to generate comparison results for cyclic judgment.

[0215] If the processing terminal determines that the comparison result is inconsistent with the data determination result, it means that the abnormal posture data corresponding to the actual joint movement angle and the actual joint movement frequency is not the rehabilitation posture data of the target rehabilitation person. Therefore, the processing terminal continues to compare the actual joint movement angle and the actual joint movement frequency of the next group in the abnormal posture data with the target joint movement angle range and the target joint movement frequency respectively, so as to generate comparison results for cyclic judgment, thereby determining the rehabilitation posture data.

[0216] Step S7032: If they are consistent, determine the abnormal posture data corresponding to the comparison result as rehabilitation posture data.

[0217] If the processing terminal determines that the comparison result is consistent with the data determination result, it means that the abnormal posture data corresponding to the actual joint movement angle and the actual joint movement frequency is the rehabilitation posture data of the target rehabilitation person. Therefore, the processing terminal determines the corresponding abnormal posture data as rehabilitation posture data, thereby providing data support for the subsequent determination of the target rehabilitation posture result.

[0218] Step S70321: Find the target rehabilitation posture result in the overall posture state results based on the rehabilitation posture data.

[0219] After the processing terminal determines the rehabilitation posture data, it obtains the target rehabilitation posture result by searching the overall posture state results based on the rehabilitation posture data.
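The cyclic judgment of steps S703 to S70321 can be sketched as a loop over the groups of abnormal posture data. Several details are assumptions: the overall posture state results are taken to be keyed by image number, the loop stops at the first consistent group, and the consistency check is passed in as a hypothetical callable `is_consistent`.

```python
from typing import Callable, Dict, Optional

JointData = Dict[str, float]


def select_target_result(
    abnormal_groups: Dict[int, JointData],
    overall_results: Dict[int, str],
    is_consistent: Callable[[JointData], bool],
) -> Optional[str]:
    """Return the posture state result of the first group judged consistent, else None."""
    for image_number, group in abnormal_groups.items():
        if not is_consistent(group):
            # S7031: inconsistent, so move on to the next group.
            continue
        # S7032: this group is the rehabilitation posture data.
        # S70321: look up its result in the overall posture state results.
        return overall_results.get(image_number)
    return None
```

If the intent is to keep every consistent group rather than only the first, the `return` inside the loop would be replaced by appending to a list.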

[0220] Based on the same inventive concept, embodiments of this application provide a human pose recognition system based on dual-attention structured position coding, including:

[0221] An acquisition module, used to acquire the human posture images, the single human posture image, the number of people in the rehabilitation room, the next single human posture image, the actual capture position, and the rehabilitation schedule;

[0222] A memory, used to store the program of the human pose recognition method based on dual-attention structured position coding;

[0223] A processor, wherein the program in the memory can be loaded and executed by the processor to implement the human pose recognition method based on dual-attention structured position coding.

[0224] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the above-described division of functional modules is used as an example. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. The specific working process of the system, device, and unit described above can be referred to the corresponding process in the foregoing method embodiments, and will not be repeated here.

[0225] This application provides a computer-readable storage medium storing a computer program that can be loaded by a processor to execute the human pose recognition method based on dual-attention structured position coding.

[0226] Computer storage media include, for example, USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

[0227] Based on the same inventive concept, embodiments of this application provide a smart terminal, including a memory and a processor, wherein the memory stores a computer program that can be loaded by the processor to execute the human pose recognition method based on dual-attention structured position coding.

[0228] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the above-described division of functional modules is used as an example. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. The specific working process of the system, device, and unit described above can be referred to the corresponding process in the foregoing method embodiments, and will not be repeated here.

[0229] The above are all preferred embodiments of this application and are not intended to limit the scope of protection of this application. Any feature disclosed in this specification (including the abstract and drawings) may be replaced by other equivalent or similar features unless specifically stated otherwise. That is, unless specifically stated otherwise, each feature is only one example of a series of equivalent or similar features.

Claims

1. A human pose recognition method based on dual-attention structured position encoding, characterized in that it includes: controlling a preset posture capture device to acquire human posture images in a preset rehabilitation room; matching the human posture images one-to-one with preset image numbers to generate a human posture image library; inputting the human posture image library into a preset human keypoint detection network to generate a human joint coordinate library; performing structured position encoding on the data in the human joint coordinate library to generate a standard human joint position type relation library; inputting the data in the standard human joint position type relation library into a preset global self-attention model to generate a dual-attention structured position encoding library; inputting the human posture image library into a preset main encoding network to generate a global visual feature vector library; interacting the dual-attention structured position encoding library with the global visual feature vector library to generate overall posture state results; and analyzing the overall posture state results to generate a target rehabilitation posture result; wherein the step of controlling the preset posture capture device to acquire human posture images in the preset rehabilitation room includes: collecting a single human posture image and the number of people in the rehabilitation room; inputting the single human posture image into the human keypoint detection network to generate the positions and the number of joints of the single human body; determining whether the number of joints of the single human body is consistent with a preset standard number of human joints; if they are consistent, determining the single human posture image as an actual single human posture image, and collecting the next single human posture image and repeating the determination according to the number of people in the rehabilitation room;
if they are inconsistent, comparing the single human joint positions with preset standard human joint positions to generate single occluded joint positions, adjusting the posture capture device based on the single occluded joint positions, and continuing to acquire single human posture images for cyclic determination; and aggregating the actual single human posture images to generate the human posture images; wherein the step of adjusting the posture capture device based on the single occluded joint positions and continuing to acquire single human posture images for cyclic determination includes: obtaining the actual capture position of the posture capture device; classifying the single occluded joint positions according to a preset target rehabilitation area to generate primary occluded node positions and non-primary occluded node positions; counting the primary occluded node positions and the non-primary occluded node positions to generate the number of primary occluded nodes and the number of non-primary occluded nodes; calculating the products of the number of primary occluded nodes and the number of non-primary occluded nodes with preset node weight parameters, and summing the products to generate a total correction parameter; analyzing the primary occluded node positions, the non-primary occluded node positions, the total correction parameter, and the actual capture position to generate adjustment parameters for the posture capture device; and adjusting the posture capture device according to the adjustment parameters and controlling the posture capture device to acquire a single human posture image;
wherein the step of analyzing the primary occluded node positions, the non-primary occluded node positions, the total correction parameter, and the actual capture position to generate adjustment parameters for the posture capture device includes: substituting the primary occluded node positions and the actual capture position into a preset arcsine function for calculation and summarization to generate primary occluded node angle differences; substituting the non-primary occluded node positions and the actual capture position into the arcsine function for calculation and summarization to generate non-primary occluded node angle differences; calculating the products of the primary occluded node angle differences and the non-primary occluded node angle differences with the node weight parameters, and summing the products to generate a total node correction angle; calculating the sum of the total node correction angle, the total correction parameter, and a preset right angle to generate a horizontal adjustment angle; substituting a preset safety detection distance, the primary occluded node positions, the non-primary occluded node positions, the actual capture position, and the total correction parameter into a preset distance adjustment calculation formula to generate a device adjustment distance; substituting the total node correction angle, a preset height of the rehabilitation person, and the total correction parameter into a preset vertical angle calculation formula to generate a vertical adjustment angle; and summarizing the horizontal adjustment angle, the device adjustment distance, and the vertical adjustment angle to generate the adjustment parameters.

2. The human pose recognition method based on dual-attention structured position encoding according to claim 1, characterized in that the step of analyzing the overall posture state results to generate the target rehabilitation posture result includes: determining whether the overall posture state results meet the preset requirement of a single posture state result; if so, determining the overall posture state results as the target rehabilitation posture result; if not, finding all posture data in the overall posture state results; searching a preset normal human posture database based on all posture data to generate matched posture data; analyzing the matched posture data and all posture data to generate abnormal posture data; collecting the rehabilitation schedule; finding the target rehabilitation posture data range in the preset rehabilitation schedule posture data relationship according to the rehabilitation schedule; and analyzing the abnormal posture data and the target rehabilitation posture data range to generate the target rehabilitation posture result.

3. The human pose recognition method based on dual-attention structured position encoding according to claim 2, characterized in that the step of analyzing the matched posture data and all posture data to generate abnormal posture data includes: finding the single posture data and the single matched posture data in all posture data and the matched posture data, respectively; calculating and summarizing the differences between the single posture data and the corresponding data in the corresponding single matched posture data to generate posture data difference values; calculating the average value of the posture data difference values and summarizing them to generate the posture data difference rate; matching the posture data difference rate one-to-one with all posture data to generate posture data difference relationships; and comparing the posture data difference rate with a preset normal posture data difference rate range according to the posture data difference relationships, and summarizing the comparison to generate the abnormal posture data.

4. The human pose recognition method based on dual-attention structured position encoding according to claim 2, characterized in that the step of analyzing the abnormal posture data and the target rehabilitation posture data range to generate the target rehabilitation posture result includes: finding the target joint motion angle range and the target joint motion frequency in the target rehabilitation posture data range; finding the actual joint motion angle and the actual joint motion frequency in the abnormal posture data; comparing the actual joint motion angle and the actual joint motion frequency with the corresponding target joint motion angle range and target joint motion frequency to generate comparison results; determining whether the comparison results are consistent with a preset data determination result; if they are inconsistent, continuing to generate comparison results for cyclic judgment; if they are consistent, determining the abnormal posture data corresponding to the comparison results as rehabilitation posture data; and finding the target rehabilitation posture result in the overall posture state results based on the rehabilitation posture data.

5. A human pose recognition system based on dual-attention structured position encoding, characterized in that it includes: an acquisition module, used to acquire human posture images; a memory, used to store a program of the human pose recognition method based on dual-attention structured position encoding as described in any one of claims 1 to 4; and a processor, wherein the program in the memory can be loaded and executed by the processor to implement the human pose recognition method based on dual-attention structured position encoding as described in any one of claims 1 to 4.