Gesture recognition method and gesture recognition device

A gesture recognition technology in the field of human-computer interaction that addresses the low accuracy of existing methods and achieves high accuracy and high precision.

Active Publication Date: 2015-05-06
SHENZHEN ORBBEC CO LTD

AI-Extracted Technical Summary

Problems solved by technology

[0006] The technical problem to be solved by the present invention is to provide a gesture recognition method and device that address the low accuracy of existing non-contact gesture ...

Method used

In the preferred embodiment of the present invention, the static gesture recognition step performed on each frame of the image to be recognized in the dynamic gesture recognition step has the following characteristic: the adaptive weighting used to calculate the weighted average takes the movement direction of the fingers, or of the entire hand, as the standard for setting the weights. Specifically, based on the determined precise hand contour, the depth information of the center point of the hand contour is used to preliminarily judge the direction of hand movement, classified as either a) mainly perpendicular to the optical axis of the depth camera, or b) mainly parallel to that optical axis. Adaptive weighting is then used to call the depth information and color information: in case a), the weight of the color information is greater than the weight of the depth information, that is, w1 > w2; in case b), the weight of the color information is smaller than that of the depth information, that is, w1 < w2.

Abstract

The invention relates to a gesture recognition method and a gesture recognition device. The gesture recognition method comprises a training step and a recognition step, wherein the training step comprises the sub-steps of: S1, synchronously acquiring a to-be-trained image with depth information and color information; S2, based on the depth information, determining a primary hand outline; S3, based on the color information, determining a precise hand outline; S4, calling the depth information and the color information of the precise hand outline, determining a weighted average by adaptive weighting to establish a three-dimensional gesture model, and training the three-dimensional gesture models of a plurality of to-be-trained images with a classifier method to obtain an optimized three-dimensional gesture model. The recognition step extracts the hand outline in the same manner, matches against the optimized three-dimensional gesture model obtained in S4, and recognizes the corresponding three-dimensional gesture. By using both depth information and color information for recognition, the method achieves high accuracy and high precision.

Application Domain

Technology Topic

Image

  • Gesture recognition method and gesture recognition device

Examples

  • Experimental program(1)

Example Embodiment

[0047] In order to make the objectives, technical solutions and advantages of the present invention clearer, the following further describes the present invention in detail with reference to the accompanying drawings and embodiments.
[0048] Refer to figure 1, which is a flowchart of a gesture recognition method according to a preferred embodiment of the present invention. As shown in figure 1, the gesture recognition method provided by the preferred embodiment of the present invention includes a training step and a recognition step:
[0049] Wherein, the training step further includes steps S1-S4.
[0050] First, in step S1, an image to be trained with depth information and color information is synchronously acquired. This step can be implemented by one depth camera, at least one color camera, and camera fixing components, with a controller providing synchronous control over the images collected by the depth camera and the color camera. In this step, a depth camera combined with a color camera can be used to match the depth image with the color image and obtain synchronized RGB-D color and depth images.
[0051] The methods for acquiring an image with depth information in this step include but are not limited to the following: (1) depth information acquired based on structured light, such as a light-coding structured-light depth camera, the laser-speckle depth camera PrimeSense, or Microsoft's Kinect depth camera, as well as depth maps obtained by projected-grating and fringe-scanning methods; (2) depth information obtained based on laser ranging; and (3) depth information obtained based on vision techniques, etc.
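As an illustration of this step, the following is a minimal sketch of synchronized RGB-D acquisition using OpenCV's OpenNI capture backend; the backend choice and the registration setting are assumptions, since the patent does not name a specific driver:

```python
import cv2

# A minimal sketch of step S1, assuming an OpenNI-compatible RGB-D camera
# (PrimeSense/Kinect-class) driven through OpenCV's OpenNI backend. The
# registration flag asks the driver to align the depth map to the color image.
cap = cv2.VideoCapture(cv2.CAP_OPENNI2)
cap.set(cv2.CAP_PROP_OPENNI_REGISTRATION, 1)

while cap.grab():  # one hardware grab keeps the depth and color frames in sync
    ok_d, depth = cap.retrieve(flag=cv2.CAP_OPENNI_DEPTH_MAP)  # 16-bit depth, mm
    ok_c, color = cap.retrieve(flag=cv2.CAP_OPENNI_BGR_IMAGE)  # 8-bit BGR image
    if ok_d and ok_c:
        break  # depth and color now form one synchronized RGB-D training frame
```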
[0052] Subsequently, in step S2, based on the depth information of the image to be trained, the hand contour of the image to be trained acquired in step S1 is determined as the primary hand contour.
[0053] In an embodiment of the present invention, the hand area can be detected directly based on the depth image formed by the depth information of the image to be trained to realize the extraction of the hand contour.
[0054] In another embodiment of the present invention, this step can be implemented in two steps:
[0055] First, in step S21, the depth information of the image to be trained is used to perform human body detection, obtain the contour of the human body region, and extract the depth information and color information of that region. The depth information of the image to be trained constitutes the depth image. When the human body region is separated from the background environment based on the depth information, the Laplacian-of-Gaussian operator and similar methods can be used to filter the depth image to remove noise, together with noise-threshold processing. During noise processing, when there are obvious noise points in the depth image, an OpenCV erosion function with appropriately defined structuring elements can erode the source image to remove spurious noise points, after which a dilation function expands the resulting image, eliminating most of the remaining noise. After noise removal, the entire depth image can be called, and edge detection, dynamic depth-threshold setting, and human-target feature-point classification can be performed with OpenCV functions to segment the human body region from the entire depth image, while the color information of the corresponding region is segmented on the same basis, thereby achieving human body detection and extracting the color and depth information of the human body region. In this embodiment, the color and depth images of the human body region are extracted first, so that in the subsequent processing flow only the data of the human body region need be transmitted, reducing the computational load and increasing the processing speed.
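The following hedged sketch illustrates one way step S21 could look in OpenCV/Python; the Laplacian-of-Gaussian approximation, the working depth band, and all thresholds are illustrative assumptions, not values from the patent:

```python
import cv2
import numpy as np

def segment_body(depth_mm, near=500, far=2500):
    """Sketch of step S21: denoise the depth map, then keep pixels inside an
    assumed working depth band as the human-body region."""
    # Gaussian blur followed by a Laplacian approximates Laplacian-of-Gaussian
    blur = cv2.GaussianBlur(depth_mm.astype(np.float32), (5, 5), 0)
    log = cv2.Laplacian(blur, cv2.CV_32F)  # noise shows up as strong responses
    clean = np.where(np.abs(log) > 200, 0, depth_mm)  # noise-threshold processing

    # Binary mask of the depth band, then morphological cleanup
    mask = ((clean > near) & (clean < far)).astype(np.uint8) * 255
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.erode(mask, kernel)   # remove isolated noise points
    mask = cv2.dilate(mask, kernel)  # restore the eroded body silhouette
    return mask
```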
[0056] Subsequently, in step S22, the data of the human body region extracted in step S21 are used to identify the hand region according to the depth information, acquire the contour of the hand region as the primary hand contour, and extract the color information and depth information of the hand region.
[0057] In one embodiment of the present invention, the human body area is first extracted using the depth information and hand detection is then performed. A classifier method can be used to train and recognize a human-body model based on hand-region characteristics in order to detect the hand region. Specifically, the depth information is used to detect the approximate position of the hand region; an OpenCV function can process the depth information to further segment a more accurate hand-region contour. By setting a contour-area threshold and comparing candidate contours against it, the matching results can be filtered so that only the human-hand contour is retained for segmentation, yielding the hand contour.
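A hedged sketch of step S22, under the assumption that the hand is the nearest part of the body region toward the camera; the depth-slice width and contour-area limits are illustrative:

```python
import cv2
import numpy as np

def find_hand_contour(body_mask, depth_mm, min_area=800, max_area=20000):
    """Sketch of step S22: within the body mask, keep the nearest depth layer
    (assuming the hand leads toward the camera) and filter contours by area."""
    body_depth = np.where(body_mask > 0, depth_mm, 0)
    vals = body_depth[body_depth > 0]
    if vals.size == 0:
        return None
    nearest = vals.min()
    hand_mask = ((body_depth > 0) &
                 (body_depth < nearest + 150)).astype(np.uint8) * 255  # 15 cm slice

    contours, _ = cv2.findContours(hand_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Compare each candidate against the contour-area threshold
    hands = [c for c in contours if min_area < cv2.contourArea(c) < max_area]
    return max(hands, key=cv2.contourArea) if hands else None
```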
[0058] Subsequently, in step S3, the color information within the primary hand contour of the image to be trained determined in step S2 is called, the hand contour is recognized, and the precise hand contour of the image to be trained is segmented. In some embodiments of the present invention, based on the primary hand contour obtained in step S2, the color information of the corresponding hand region can be called, and the accurate contour of the hand, including the finger edges, can be obtained through methods such as skin-value thresholding and edge extraction, enabling precise segmentation of the hand.
[0059] The hand contour recognized from depth information alone is rough and not smooth, whereas the color information of the image is generally obtained by a high-resolution color camera with very high image resolution; combining the color information of the palm area therefore yields highly accurate hand information. In this step, the region of the color image corresponding to the hand region obtained from the depth image is first extracted. Skin-color thresholding then excludes objects that are not human hands and other unqualified contours, filtering the matching results so that only the human-hand contour remains and the interference of irrelevant information is reduced. Edge extraction on the result then gives a high-precision hand contour, including precise finger contours.
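One possible realization of the skin-threshold-plus-edge refinement in step S3 is sketched below; the YCrCb skin range is a common heuristic and an assumption here, not a value from the patent:

```python
import cv2
import numpy as np

def refine_hand_contour(color_bgr, primary_mask):
    """Sketch of step S3: inside the primary (depth-based) hand region, apply a
    skin-color threshold in YCrCb, then extract edges for a precise contour."""
    roi = cv2.bitwise_and(color_bgr, color_bgr, mask=primary_mask)
    ycrcb = cv2.cvtColor(roi, cv2.COLOR_BGR2YCrCb)
    skin = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))  # skin-value threshold
    skin = cv2.medianBlur(skin, 5)                 # suppress speckle noise
    edges = cv2.Canny(skin, 50, 150)               # finger-level edges
    contours, _ = cv2.findContours(skin, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    precise = max(contours, key=cv2.contourArea) if contours else None
    return precise, edges
```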
[0060] Finally, in step S4, the depth information and color information within the precise hand contour of the image to be trained determined in step S3 are called, the weighted average is calculated by an adaptive weighting method, a three-dimensional gesture model is established, and a classifier method is used to train the three-dimensional gesture models of multiple images to be trained to obtain the optimized three-dimensional gesture model. The formula for the adaptively weighted average is:
[0061] T = w1·C_color + w2·D_depth (1)
[0062] where w1 is the adaptive weighting coefficient of the color information, w2 is the adaptive weighting coefficient of the depth information, C_color is the color information, and D_depth is the depth information.
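Formula (1) is straightforward to express in code. The sketch below assumes the color and depth features are pre-normalized to a common scale and that w1 + w2 = 1, which the patent does not state explicitly:

```python
import numpy as np

def fuse(color_feat, depth_feat, w1, w2):
    """Formula (1) as code: T = w1*C_color + w2*D_depth. Assumes the two
    feature vectors share a common scale and that the weights sum to 1."""
    assert abs(w1 + w2 - 1.0) < 1e-6
    return (w1 * np.asarray(color_feat, dtype=np.float32)
            + w2 * np.asarray(depth_feat, dtype=np.float32))
```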
[0063] In some embodiments of the present invention, the three-dimensional gesture models that can be established include but are not limited to one or more of the following: (a) a feature-point connection model; (b) a model with skin texture information; (c) a depth point-cloud mesh model; (d) a geometric model.
[0064] The preferred embodiment of the present invention focuses on (a), the feature-point connection model, which is established mainly through the following steps. First, the depth information and color information within the precise hand contour of the image to be trained are called, and adaptive weighting is used to find the concave and convex defects of the precise hand contour and determine the positions of the fingertips and of the finger-palm junctions; each finger is then marked with a line segment carrying depth information, and finger-joint marker points are set proportionally to establish the feature-point connection model. Then, by collecting enough training samples, the model can be trained to obtain an optimized three-dimensional gesture model. The present invention can also set boundary conditions for the feature-point connection model: 1. set the range of movement angles of the finger joints; 2. set the hand-movement associations. The boundary conditions are related to the degrees of freedom of the feature-point connection model. In an embodiment of the present invention, a feature-point connection model with 38 degrees of freedom is established by setting boundary conditions.
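A sketch of detecting fingertips and finger-palm junctions via contour convexity defects, which is one standard way to find the concave and convex defects named above; the defect-depth threshold is an assumption:

```python
import cv2

def finger_keypoints(precise_contour):
    """Sketch of building the feature-point connection model: convexity
    defects of the precise contour give fingertip candidates (hull points)
    and finger-palm junction candidates (farthest defect points)."""
    hull_idx = cv2.convexHull(precise_contour, returnPoints=False)
    defects = cv2.convexityDefects(precise_contour, hull_idx)
    tips, valleys = [], []
    if defects is not None:
        for s, e, f, depth in defects[:, 0]:
            if depth > 10000:  # fixed-point depth (1/256 px units), illustrative
                tips.append(tuple(precise_contour[s][0]))     # fingertip candidate
                valleys.append(tuple(precise_contour[f][0]))  # finger-palm junction
    return tips, valleys
```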
[0065] The present invention can collect real depth-color maps and/or depth-color maps virtually generated by computer-vision methods as training samples, i.e., images to be trained, to perform accurate hand contour recognition and three-dimensional gesture model establishment. The computer-vision methods can use a virtual depth-map generator and computer-vision-based 3D animation technology to generate large numbers of depth maps, combined with color maps, as training samples. The three-dimensional gesture model corresponds to static gestures. SVM classification and the AdaBoost algorithm can be used to classify static gestures, and with enough training samples an optimized three-dimensional gesture model can be established. The general steps of training data generation are: 1) collect a large number of common gestures and generate hand depth images through key-frame clustering as static training gestures; 2) randomly generate camera parameters within a certain range, align the rendering with real-world coordinates, and use computer-graphics rendering technology to generate the person's depth image and part-identification map; 3) post-process the hand depth images, including adding noise and resampling, to bring them closer to real pictures taken by the depth camera.
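A hedged sketch of the classifier training named above, using scikit-learn's SVM and AdaBoost implementations; the feature layout and hyperparameters are assumptions, not from the patent:

```python
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier

def train_static_gestures(features, labels):
    """Sketch: each training image is reduced to a fused feature vector T
    (formula (1)) plus model parameters, and an SVM and an AdaBoost classifier
    are fit over the static-gesture classes."""
    svm = SVC(kernel="rbf", probability=True).fit(features, labels)
    boost = AdaBoostClassifier(n_estimators=100).fit(features, labels)
    return svm, boost
```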
[0066] The recognition step in the gesture recognition method provided by the preferred embodiment of the present invention further includes steps S5-S8.
[0067] In step S5, the image to be recognized with depth information and color information is synchronously acquired. This step is the same as the aforementioned step S1, except that the acquired image is a target image that needs to be identified.
[0068] In step S6, the primary hand contour of the image to be recognized is determined based on the depth information of the image to be recognized acquired in step S5. This step is the same as the aforementioned step S2, except that the primary hand contour is extracted with the image to be recognized as the object.
[0069] In step S7, the color information in the primary hand contour determined in step S6 is called to segment the precise hand contour of the image to be recognized. This step is the same as the aforementioned step S3, except that the image to be recognized is used as the object to perform accurate hand contour extraction.
[0070] In step S8, the depth information and color information within the precise hand contour determined in step S7 are called, the weighted average is calculated through adaptive weighting and matched against the optimized three-dimensional gesture model obtained by training in step S4, and the corresponding three-dimensional gesture is identified. The process of calculating the weighted average by adaptive weighting can likewise be implemented by the aforementioned formula (1); the difference is that the values of the color-information adaptive weighting coefficient w1 and the depth-information adaptive weighting coefficient w2 are set as needed. In a preferred embodiment of the present invention, step S8 further includes a static gesture recognition step and/or a dynamic gesture recognition step.
[0071] The static gesture recognition step can be implemented as follows. First, set the boundary conditions of the optimized feature-point connection model obtained by training in step S4 to generate the corresponding model parameter space; then call the depth information and color information within the precise hand contour of the image to be recognized, calculate the weighted average by adaptive weighting, determine the corresponding point in the model parameter space, and recognize the static gesture. In the preferred embodiment of the present invention, methods including but not limited to model matching, decision trees, random forests, regression forests, nonlinear clustering, and artificial neural networks can be used for static gesture recognition training. When performing static gesture recognition, since the optimized three-dimensional gesture model has been established, adaptive weighting is used to call the depth information and color information, and w1 and w2 are chosen so that, in the feature-point connection model corresponding to the static gesture, the information of the feature points and connecting lines is maximized; that is, the gesture model has the most feature points and the most complete connecting lines.
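A minimal sketch of one way the recognition could map the fused feature vector onto the model parameter space, here as a nearest-neighbor lookup; the patent does not prescribe this particular matcher:

```python
import numpy as np

def recognize_static(fused_vec, param_space_points, gesture_labels):
    """Sketch: the weighted-average feature vector is matched to the nearest
    point of the model parameter space generated from the boundary-constrained
    feature-point connection model."""
    d = np.linalg.norm(param_space_points - fused_vec, axis=1)
    i = int(np.argmin(d))
    return gesture_labels[i], d[i]
```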
[0072] The dynamic gesture recognition step performs the aforementioned static gesture recognition step on multiple frames of images to be recognized; that is, static gesture recognition is performed on a sequence of color images and depth images, the movement changes between the static gestures corresponding to each frame are tracked (i.e., the gesture movement is tracked), and these movement trajectories are then recognized. This can be achieved through the following steps. First, the static gestures corresponding to the images to be recognized in each frame are identified, and the trajectory formed by the points that these static gestures correspond to in the model parameter space is obtained: a static gesture corresponds to a point in the model parameter space, and a dynamic gesture corresponds to a trajectory in the model parameter space. Subsequently, the obtained trajectories are classified into subsets of the model parameter space, and each subset corresponds to a dynamic gesture; to define such a subset is to define a dynamic gesture. In an embodiment of the present invention, the definition of dynamic gestures uses the grammatical rules of sign language. According to the defined subsets of dynamic gestures, the corresponding dynamic gesture can be determined. In a preferred embodiment of the present invention, dynamic gesture recognition includes finger motion trajectory tracking and recognition, and hand motion trajectory tracking and recognition. In other preferred embodiments of the present invention, the static gesture corresponding to the depth-color image of the previous frame can be used to predict the static gesture corresponding to the depth-color image of the next frame during recognition, thereby improving the processing speed of gesture tracking.
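The membership test below is a sketch of this trajectory classification; `inside` is a hypothetical membership predicate for a defined subset of the parameter space, and the 80% coverage criterion is an assumption:

```python
def recognize_dynamic(frame_points, trajectory_subsets):
    """Sketch: per-frame static gestures form a trajectory of points in the
    model parameter space; a dynamic gesture is the defined subset whose
    region contains most of that trajectory."""
    best, best_hits = None, 0
    for name, subset in trajectory_subsets.items():
        hits = sum(1 for p in frame_points if subset.inside(p))  # hypothetical
        if hits > best_hits:
            best, best_hits = name, hits
    return best if best_hits >= 0.8 * len(frame_points) else None  # 80%: assumed
```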
[0073] In a preferred embodiment of the present invention, the static gesture recognition step performed on each frame of the image to be recognized within the dynamic gesture recognition step has the following characteristic: the adaptive weighting used to calculate the weighted average takes the movement direction of the fingers, or of the entire hand, as the standard for setting the weights. Specifically, based on the determined precise hand contour, the depth information of the center point of the hand contour is first used to preliminarily determine the direction of hand movement, classified as: a) the direction of hand movement is mainly perpendicular to the optical axis of the depth camera, or b) the direction of hand movement is mainly parallel to the optical axis of the depth camera. Adaptive weighting is then used to call the depth information and color information: in case a), when the hand moves mainly perpendicular to the optical axis, the weight of the color information is greater than the weight of the depth information, that is, w1 > w2; in case b), when the hand moves mainly parallel to the optical axis, the weight of the color information is less than the weight of the depth information, that is, w1 < w2. For example, when the entire hand (palm and fingers) moves in a plane perpendicular to the optical axis, the color information may be weighted 80% and the depth information 20%; when the entire hand moves mainly along the optical axis, the depth information may be weighted 80% and the color information 20%. By adaptively and dynamically setting the weights of the depth information and color information when recognizing dynamic gestures, the matching between the three-dimensional gesture of each frame's color and depth images and the three-dimensional gesture model is better optimized, improving the speed and accuracy of dynamic gesture recognition.
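A sketch of how this adaptive weight selection could be implemented by comparing the hand-center displacement along the optical axis with its displacement in the image plane; it assumes both displacements are expressed in comparable units, and the 80/20 split follows the example above:

```python
import numpy as np

def adaptive_weights(center_depth_prev, center_depth_cur,
                     center_xy_prev, center_xy_cur):
    """Sketch of the adaptive weighting rule: a larger in-plane displacement
    means motion mainly perpendicular to the optical axis (case a); a larger
    depth change means motion mainly parallel to it (case b)."""
    dz = abs(center_depth_cur - center_depth_prev)        # along the optical axis
    dxy = np.hypot(center_xy_cur[0] - center_xy_prev[0],  # in the image plane
                   center_xy_cur[1] - center_xy_prev[1])
    # Assumes dz and dxy have been converted to comparable units beforehand.
    if dxy >= dz:           # case a) motion mainly perpendicular to the axis
        w1, w2 = 0.8, 0.2   # color weighted more: w1 > w2
    else:                   # case b) motion mainly parallel to the axis
        w1, w2 = 0.2, 0.8   # depth weighted more: w1 < w2
    return w1, w2
```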
[0074] Refer to figure 2, which is a module block diagram of a gesture recognition device according to a preferred embodiment of the present invention. As shown in figure 2, the gesture recognition device provided by the preferred embodiment of the present invention mainly includes a training module and a recognition module.
[0075] The training module further includes a first image acquisition unit 201, a first primary contour extraction unit 202, a first accurate contour extraction unit 203, and a model establishment unit 204.
[0076] The first image acquisition unit 201 is used to synchronously acquire an image to be trained with depth information and color information. The first image acquisition unit 201 can be implemented by a depth camera, at least one color camera, and camera fixing components, with a controller providing synchronous control over the images collected by the depth camera and the color camera. The first image acquisition unit 201 may use a depth camera combined with a color camera to match the depth image with the color image and obtain synchronized RGB-D color and depth images.
[0077] The methods for acquiring an image with depth information in the first image acquisition unit 201 include, but are not limited to, the following: (1) depth information acquired based on structured light, such as a light-coding structured-light depth camera, the laser-speckle depth camera PrimeSense, or Microsoft's Kinect depth camera, as well as depth maps obtained by projected-grating and fringe-scanning methods; (2) depth information obtained based on laser ranging; and (3) depth information obtained based on vision techniques, etc.
[0078] The first primary contour extraction unit 202 is connected to the first image acquisition unit 201 and is configured to determine, based on the depth information, the hand contour of the image to be trained acquired by the first image acquisition unit 201 as the primary hand contour.
[0079] In an embodiment of the present invention, the hand area can be detected directly based on the depth image formed by the depth information of the image to be trained, so as to realize the extraction of the hand contour.
[0080] In another embodiment of the present invention, the first primary contour extraction unit 202 can be implemented with two sub-units:
[0081] First, the first human-body-region extraction subunit uses the depth information of the image to be trained to perform human body detection, obtain the contour of the human body region, and extract the depth information and color information of that region. The depth information of the image to be trained constitutes the depth image. When the human body region is separated from the background environment based on the depth information, the Laplacian-of-Gaussian operator and similar methods can be used to filter the depth image to remove noise, together with noise-threshold processing. During noise processing, when there are obvious noise points in the depth image, an OpenCV erosion function with appropriately defined structuring elements can erode the source image to remove spurious noise points, after which a dilation function expands the resulting image, eliminating most of the remaining noise. After noise removal, the entire depth image can be called, and edge detection, dynamic depth-threshold setting, and human-target feature-point classification can be performed with OpenCV functions to segment the human body region from the entire depth image, while the color information of the corresponding region is segmented on the same basis, thereby achieving human body detection and extracting the color and depth information of the human body region. In this embodiment, the color and depth images of the human body region are extracted first, so that in the subsequent processing flow only the data of the human body region need be transmitted, reducing the computational load and increasing the processing speed.
[0082] Subsequently, the first hand-region extraction subunit uses the data of the human body region extracted by the first human-body-region extraction subunit to identify the hand region according to the depth information, acquire the contour of the hand region as the primary hand contour, and extract the color information and depth information of the hand region.
[0083] In one embodiment of the present invention, the human body area is first extracted using the depth information and hand detection is then performed. A classifier method can be used to train and recognize a human-body model based on hand-region characteristics in order to detect the hand region. Specifically, the depth information is used to detect the approximate position of the hand region; an OpenCV function can process the depth information to further segment a more accurate hand-region contour. By setting a contour-area threshold and comparing candidate contours against it, the matching results can be filtered so that only the human-hand contour is retained for segmentation, yielding the hand contour.
[0084] The first precise contour extraction unit 203 is connected to the first primary contour extraction unit 202 and is used to call the color information within the primary hand contour of the image to be trained determined by the first primary contour extraction unit 202, recognize the hand contour, and segment the precise hand contour of the image to be trained. In some embodiments of the present invention, based on the primary hand contour acquired by the first primary contour extraction unit 202, the color information of the corresponding hand region can be called, and the precise contour of the hand, including the finger edges, can be obtained through methods such as skin-value thresholding and edge extraction, enabling precise segmentation of the hand.
[0085] The hand contour recognized from depth information alone is rough and not smooth, whereas the color information of the image is generally obtained by a high-resolution color camera with very high image resolution; combining the color information of the palm area therefore yields highly accurate hand information. In this process, the region of the color image corresponding to the hand region obtained from the depth image is first extracted. Skin-color thresholding then excludes objects that are not human hands and other unqualified contours, screening the matching results so that only the human-hand contour remains and the interference of irrelevant information is reduced. Edge extraction on the result then gives a high-precision hand contour, including precise finger contours.
[0086] The model building unit 204 is connected to the first precise contour extraction unit 203 and is used to call the depth information and color information within the precise hand contour of the image to be trained determined by the first precise contour extraction unit 203, calculate the weighted average by the adaptive weighting method, establish a three-dimensional gesture model, and use a classifier method to train the three-dimensional gesture models of multiple images to be trained to obtain an optimized three-dimensional gesture model. The model building unit 204 includes a subunit for calculating the weighted average through adaptive weighting; the adaptive weighting can likewise use the aforementioned formula (1) to calculate the weighted average.
[0087] As in some embodiments of the present invention, the three-dimensional gesture models that can be established include but are not limited to one or more of the following: (a) a feature-point connection model; (b) a model with skin texture information; (c) a depth point-cloud mesh model; (d) a geometric model.
[0088] The preferred embodiment of the present invention focuses on (a), the feature-point connection model, which is established mainly through the following steps. First, the depth information and color information within the precise hand contour of the image to be trained are called, and adaptive weighting is used to find the concave and convex defects of the precise hand contour and determine the positions of the fingertips and of the finger-palm junctions; each finger is then marked with a line segment carrying depth information, and finger-joint marker points are set proportionally to establish the feature-point connection model. Then, by collecting enough training samples, the model can be trained to obtain an optimized three-dimensional gesture model. The present invention can also set boundary conditions for the feature-point connection model: 1. set the range of motion angles of the finger joints; 2. set the hand-movement associations. The boundary conditions are related to the degrees of freedom of the feature-point connection model. In an embodiment of the present invention, a feature-point connection model with 38 degrees of freedom is established by setting boundary conditions.
[0089] The present invention can collect real depth-color maps and/or depth-color maps virtually generated by computer-vision methods as training samples, i.e., images to be trained, to perform accurate hand contour recognition and three-dimensional gesture model establishment. The computer-vision methods can use a virtual depth-map generator and computer-vision-based 3D animation technology to generate large numbers of depth maps, combined with color maps, as training samples. The three-dimensional gesture model corresponds to static gestures. SVM classification and the AdaBoost algorithm can be used to classify static gestures, and with enough training samples an optimized three-dimensional gesture model can be established. The general steps of training data generation are: 1) collect a large number of common gestures and generate hand depth images through key-frame clustering as static training gestures; 2) randomly generate camera parameters within a certain range, align the rendering with real-world coordinates, and use computer-graphics rendering technology to generate the person's depth image and part-identification map; 3) post-process the hand depth images, including adding noise and resampling, to bring them closer to real pictures taken by the depth camera.
[0090] The recognition module in the gesture recognition device provided by the preferred embodiment of the present invention further includes a second image acquisition unit 211, a second primary contour extraction unit 212, a second precise contour extraction unit 213, and a gesture recognition unit 214.
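As a structural illustration only, the wiring of the two modules could be expressed as below; the callables stand in for the processing units described in this section, and all names are hypothetical:

```python
from dataclasses import dataclass
from typing import Callable

# Sketch mirroring the module diagram (unit numbers 201-204, 211-214).
@dataclass
class TrainingModule:
    acquire: Callable   # 201: synchronized RGB-D acquisition
    primary: Callable   # 202: depth-based primary hand contour
    precise: Callable   # 203: color-refined precise hand contour
    build:   Callable   # 204: adaptive weighting + classifier training

@dataclass
class RecognitionModule:
    acquire:   Callable  # 211: same acquisition principle as 201
    primary:   Callable  # 212: same extraction principle as 202
    precise:   Callable  # 213: same refinement principle as 203
    recognize: Callable  # 214: match fused features to the trained model
```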
[0091] The second image acquisition unit 211 is used to synchronously acquire the image to be recognized with depth information and color information. The implementation principle of the second image acquisition unit 211 is the same as that of the aforementioned first image acquisition unit 201, except that the acquired image is a target image that needs to be identified. Preferably, the second image acquisition unit 211 and the aforementioned first image acquisition unit 201 are implemented using the same software and firmware.
[0092] The second primary contour extraction unit 212 is connected to the second image acquisition unit 211, and is configured to determine the primary hand contour of the image to be identified based on the depth information of the image to be identified acquired by the second image acquisition unit 211. The implementation principle of the second primary contour extraction unit 212 is the same as that of the aforementioned first primary contour extraction unit 202, except that the primary hand contour is extracted with the image to be recognized as an object. Preferably, the second primary contour extraction unit 212 and the aforementioned first primary contour extraction unit 202 can be implemented using the same software and firmware.
[0093] The second precise contour extraction unit 213 is connected to the second primary contour extraction unit 212 and is used to call the color information within the primary hand contour determined by the second primary contour extraction unit 212 to segment the precise hand contour of the image to be recognized. The implementation principle of the second precise contour extraction unit 213 is the same as that of the aforementioned first precise contour extraction unit 203, except that precise hand contour extraction is performed with the image to be recognized as the object. Preferably, the second precise contour extraction unit 213 and the aforementioned first precise contour extraction unit 203 can be implemented using the same software and firmware.
[0094] The gesture recognition unit 214 is connected to the second precise contour extraction unit 213 and the model establishment unit 204; it calls the depth information and color information within the precise hand contour determined by the second precise contour extraction unit 213, calculates the weighted average through adaptive weighting, matches against the optimized three-dimensional gesture model obtained by training in the model establishment unit 204, and identifies the corresponding three-dimensional gesture. The gesture recognition unit 214 also includes a subunit for calculating the weighted average through adaptive weighting; this subunit can likewise use the aforementioned formula (1), the difference being that the values of the color-information adaptive weighting coefficient w1 and the depth-information adaptive weighting coefficient w2 are set as needed. In a preferred embodiment of the present invention, the gesture recognition unit 214 further includes a static gesture recognition subunit and/or a dynamic gesture recognition subunit.
[0095] The static gesture recognition subunit can be implemented as follows. First, set the boundary conditions of the optimized feature-point connection model obtained by training in the model building unit to generate the corresponding model parameter space; then call the depth information and color information within the precise hand contour of the image to be recognized, calculate the weighted average by adaptive weighting, determine the corresponding point in the model parameter space, and recognize the static gesture. In the preferred embodiment of the present invention, methods including but not limited to model matching, decision trees, random forests, regression forests, nonlinear clustering, and artificial neural networks can be used for static gesture recognition training. When performing static gesture recognition, since the optimized three-dimensional gesture model has been established, adaptive weighting is used to call the depth information and color information, and w1 and w2 are chosen so that, in the feature-point connection model corresponding to the static gesture, the information of the feature points and connecting lines is maximized; that is, the gesture model has the most feature points and the most complete connecting lines.
[0096] The dynamic gesture recognition subunit performs the aforementioned static gesture recognition on multiple frames of images to be recognized; that is, static gesture recognition is performed on a sequence of color images and depth images, the movement changes between the static gestures corresponding to each frame are tracked (i.e., the gesture movement is tracked), and these movement trajectories are then recognized. This can be achieved through the following steps. First, the static gestures corresponding to the images to be recognized in each frame are recognized, and the trajectory formed by the points that these static gestures correspond to in the model parameter space is obtained: a static gesture corresponds to a point in the model parameter space, and a dynamic gesture corresponds to a trajectory in the model parameter space. Subsequently, the obtained trajectories are classified into subsets of the model parameter space, and each subset corresponds to a dynamic gesture; to define such a subset is to define a dynamic gesture. In an embodiment of the present invention, the definition of dynamic gestures uses the grammatical rules of sign language. According to the defined subsets of dynamic gestures, the corresponding dynamic gesture can be determined. In a preferred embodiment of the present invention, dynamic gesture recognition includes finger motion trajectory tracking and recognition, and hand motion trajectory tracking and recognition. In other preferred embodiments of the present invention, the static gesture corresponding to the depth-color image of the previous frame can be used to predict the static gesture corresponding to the depth-color image of the next frame during recognition, thereby improving the processing speed of gesture tracking.
[0097] In a preferred embodiment of the present invention, the dynamic gesture recognition subunit includes a subunit for calculating the weighted average through adaptive weighting when performing static gesture recognition on each frame to be recognized. This static gesture recognition has the following characteristic: the adaptive weighting used to calculate the weighted average takes the movement direction of the fingers, or of the entire hand, as the standard for setting the weights. Specifically, based on the determined precise hand contour, the depth information of the center point of the hand contour is first used to preliminarily determine the direction of hand movement, classified as: a) the direction of hand movement is mainly perpendicular to the optical axis of the depth camera, or b) the direction of hand movement is mainly parallel to the optical axis of the depth camera. Adaptive weighting is then used to call the depth information and color information: in case a), the weight of the color information is greater than the weight of the depth information, that is, w1 > w2; in case b), the weight of the color information is smaller than the weight of the depth information, that is, w1 < w2. For example, when the entire hand (palm and fingers) moves in a plane perpendicular to the optical axis, the color information may be weighted 80% and the depth information 20%; when the entire hand moves mainly along the optical axis, the depth information may be weighted 80% and the color information 20%. By adaptively and dynamically setting the weights of the depth information and color information when recognizing dynamic gestures, the matching between the three-dimensional gesture of each frame's color and depth images and the three-dimensional gesture model is better optimized, improving the speed and accuracy of dynamic gesture recognition.
[0098] In summary, the gesture recognition method and device provided by the present invention use depth information and color information simultaneously for recognition, with an adaptive weighting method applied during recognition. The depth information is used to accurately resolve front-to-back distances, avoiding the defect that color information alone cannot distinguish distance along the optical axis, while the high-resolution, high-pixel-count color image is used to accurately call the depth information of the corresponding area. The method and device therefore have the advantages of high accuracy and high precision.
[0099] It should be noted that the principles and implementations of the gesture recognition method and device in the present invention are the same, so the detailed description of the embodiments of the gesture recognition method is also applicable to the gesture recognition device. The present invention is described based on specific embodiments, but those skilled in the art should understand that various changes and equivalent substitutions can be made without departing from the scope of the present invention. In addition, in order to adapt to specific occasions or materials of the technology of the present invention, many modifications can be made to the present invention without departing from its protection scope. Therefore, the present invention is not limited to the specific embodiments disclosed herein, but includes all embodiments falling within the protection scope of the claims.