End-to-end multi-view three-dimensional human body posture estimation method and system and storage medium

Active Publication Date: 2021-03-26
UNIVERSITY OF CHINESE ACADEMY OF SCIENCES
3 Cites 1 Cited by

AI-Extracted Technical Summary

Problems solved by technology

[0006] In view of the above problems, the object of the present invention is to provide an end-to-end multi-view 3D human pose estimation method, system and storage medium, w...

Method used

In step 1 above, because the network of the present application can be trained end-to-end from the input RGB image Ic to the output prediction y, an optimized joint mean square error loss function is adopted to improve the network's robustness to outliers during training. The loss function is:
In the present embodiment, one-fifth of the complete training set, extracted at intervals of 4 frames, together with the two-dimensional human pose datasets COCO and MPII, is used as the training set of the two-dimensional human pose estimation network ResNet-152. The training samples thus have a distribution similar to that of the complete training data, and the network can learn better human priors, so that the model generalizes to other application scenarios and the training time of the two-dimensional human pose estimation network is greatly shortened. The training images are uniformly resized to 384×384 images I, and each batch of 16 images is fed to the network by random sampling. The loss function is set to Lmse and the Adam optimizer is used, with the learning rate set to 0.001 for epochs 1 to 20, 0.0001 for epochs 20 to 25, and 0.00001 for epochs 25 to 30. After the two-dimensional human pose estimation network ResNet-152 is trained, the post-processing method of linear algebraic triangulation is used to perform benchmark evaluation on the ...

Abstract

The invention relates to an end-to-end multi-view three-dimensional human pose estimation method and system and a storage medium. The method comprises the following steps: loading a pre-trained two-dimensional human pose estimation network and using the image of each current view as the input of the network; generating a heatmap with the two-dimensional human pose estimation network and using the heatmap as the input of an LSTM heatmap temporal information extraction network; depending on the value of the time step T, inputting the heatmap into an LSTM-initialization heatmap temporal information extraction network and an LSTM heatmap temporal information extraction network to obtain a cell state and a hidden state; feeding the obtained hidden state into a decoder network to obtain a decoded heatmap; fusing the heatmap and the decoded heatmap to obtain a heatmap Ht(p) that combines temporal and spatial information; sending the heatmap Ht(p) into a soft-argmax linear algebraic triangulation network to obtain the 2D point positions; and solving an overdetermined system of equations in the homogeneous three-dimensional coordinate vector, using a differentiable DLT-SII algorithm to obtain the final three-dimensional human pose estimation points.
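The last two steps of the abstract (soft-argmax on a heatmap, then triangulation from several views) can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the `beta` sharpness parameter and the use of plain Python lists are assumptions made for illustration, and the triangulation solves the inhomogeneous least-squares form through the normal equations rather than the differentiable DLT-SII solver named in the source.

```python
import math

def soft_argmax_2d(heatmap, beta=1.0):
    """Expected (x, y) position under a softmax over all heatmap cells.

    heatmap is a list of rows; beta (an assumed parameter) sharpens the softmax.
    """
    peak = max(max(row) for row in heatmap)  # subtract the peak for numerical stability
    weights = [[math.exp(beta * (v - peak)) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y

def solve_3x3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(i + 1, 3):
            f = a[r][i] / a[i][i]
            for c in range(i, 4):
                a[r][c] -= f * a[i][c]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        x[i] = (a[i][3] - sum(a[i][c] * x[c] for c in range(i + 1, 3))) / a[i][i]
    return x

def triangulate(projections, points_2d):
    """Least-squares 3D point from two or more views.

    Each 3x4 projection matrix P and 2D point (u, v) contributes two rows:
    (u * P[2] - P[0]) . X = 0 and (v * P[2] - P[1]) . X = 0, with X = (x, y, z, 1).
    """
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        rows.append([u * P[2][k] - P[0][k] for k in range(4)])
        rows.append([v * P[2][k] - P[1][k] for k in range(4)])
    # Normal equations A^T A x = A^T b, where A holds the first three columns
    # of each row and b is the negated fourth column.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [-sum(r[i] * r[3] for r in rows) for i in range(3)]
    return solve_3x3(ata, atb)
```

With four Human3.6M views, each joint would contribute eight rows to the system, which is what makes it overdetermined. A homogeneous solver such as the one named in the source avoids fixing the last coordinate to 1, which the sketch above does for simplicity.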



Examples


Example Embodiment

[0122] Example:
[0123] In this embodiment, the Human3.6M dataset (Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments), currently the largest multi-view 3D human pose estimation dataset, is used. It was captured by four time-synchronized 50 Hz cameras, with the 3D human pose data collected by a marker-based MoCap system. The dataset contains 3.6 million images in total, organized into 11 sets of data: 5 sets of female data and 6 sets of male data. Sets 1, 5, 6, 7 and 8, comprising 1.5 million images, are used as the training set, and sets 9 and 11 are used as the test set. One-fifth of the complete training set, extracted at intervals of 4 frames, together with the two-dimensional human pose datasets COCO and MPII, is used as the training set of the two-dimensional human pose estimation network ResNet-152. The training samples thus have a distribution similar to that of the complete training data, and the network can learn better human priors, so that the model generalizes to other application scenarios and the training time of the two-dimensional human pose estimation network is greatly shortened. The training images are uniformly resized to 384×384 images I, and each batch of 16 images is fed to the network by random sampling. The loss function is set to L_mse and the Adam optimizer is used, with the learning rate set to 0.001 for epochs 1 to 20, 0.0001 for epochs 20 to 25, and 0.00001 for epochs 25 to 30, to train the two-dimensional human pose estimation network ResNet-152. The network performance is then benchmarked on the MPJPE (Mean Per Joint Position Error) metric using the post-processing method of linear algebraic triangulation, and the weights of the two-dimensional human pose estimation network ResNet-152 are saved.
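The sampling and learning-rate settings of this first training stage can be written down as a short sketch. The function names are hypothetical, and two readings of the text are assumptions: "intervals of 4 frames" is taken to mean keeping every fifth frame (which matches the stated one-fifth), and the boundary epochs 20 and 25, which the text lists in two ranges, are assigned to the earlier range.

```python
def subsample_frames(frames, gap=4):
    """Keep one frame, skip `gap` frames, and repeat (gap=4 keeps one in five)."""
    return frames[::gap + 1]

def learning_rate(epoch):
    """Piecewise-constant Adam learning rate for a 1-based epoch index.

    Assumption: epochs 20 and 25 belong to the earlier range.
    """
    if not 1 <= epoch <= 30:
        raise ValueError("the described schedule covers epochs 1 to 30")
    if epoch <= 20:
        return 1e-3
    if epoch <= 25:
        return 1e-4
    return 1e-5
```

In a training loop, `learning_rate(epoch)` would be applied to the optimizer at the start of each epoch; most deep learning frameworks also offer a built-in multi-step scheduler that expresses the same idea.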
The pre-trained weights of the two-dimensional human pose estimation network ResNet-152 are then loaded, the training images are uniformly resized to 384×384 images I, and the images of the different views captured at the same instant are input in chronological order. Here, sets 1, 5, 6, 7 and 8 of the complete data, 1.5 million images, are used as the training set, and the time step T is set to 5. One batch consists of the images from the different views at the same instant; since Human3.6M has 4 views, 2 batches are set, that is, 8 images are fed to the network. The loss function is set to […], α is set to 0.0001, the Adam optimizer is used with a learning rate of 0.0001, and training runs for 5 epochs.
[0124] Through the above steps, the present invention realizes three-dimensional human pose estimation from multi-view images. To verify the validity and practicability of the proposed method, an example on the Human3.6M dataset is given below. Table 1 compares the detection results of the present method on the Human3.6M test set with those of method M (Multi-View Martinez), method T (Tome D, Toso M, Agapito L, et al. Rethinking pose in 3D: Multi-stage refinement and recovery for markerless motion capture[C]//2018 International Conference on 3D Vision (3DV). IEEE, 2018: 474-483.), method P (Pavlakos G, Zhou X, Derpanis K G, et al. Harvesting multiple views for marker-less 3D human pose annotations[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 6988-6997.), method K (Kadkhodamohammadi A, Padoy N. A generalizable approach for multi-view 3D human pose regression[J]. Machine Vision and Applications, 2020, 32(1): 1-14.) and other methods. The metric in every case is MPJPE (Mean Per Joint Position Error).
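For reference, MPJPE is the average Euclidean distance between predicted and ground-truth joint positions. The sketch below computes it for a single pose; the function name and the list-of-tuples input format are assumptions, and no alignment step is applied because the source does not describe one.

```python
import math

def mpjpe(predicted, ground_truth):
    """Mean per-joint position error for one pose.

    Both arguments are equal-length sequences of (x, y, z) joint coordinates;
    the result is in the same unit as the inputs (mm in Table 1).
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty poses with the same number of joints")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)
```

Averaging this value over all test frames of an action category would give one entry of a per-category comparison like Table 1.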
[0125] Table 1 MPJPE comparison results of the method of the present invention and other methods on the Human3.6M data set (unit: mm)
[0128] As can be seen from Table 1, compared with other multi-view three-dimensional human pose estimation methods, the improved algorithm proposed by the present invention performs better at multi-view three-dimensional human pose estimation. Most test categories improve substantially with this embodiment, which demonstrates the effectiveness of the invention. In addition, the visual detection results in Figure 3 also illustrate the superior performance of the present invention.
[0129] As will be appreciated by those skilled in the art, the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.