Hand-raising recognition method, electronic device and computer program product
By acquiring the gravitational acceleration information of the mobile terminal, utilizing the attitude detection unit and target attitude classification model, and combining neural networks and classification algorithms, the problem of low accuracy in existing hand-raising recognition methods under diverse scenarios is solved, achieving higher recognition accuracy and user experience.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- ZTE CORP
- Filing Date
- 2025-12-16
- Publication Date
- 2026-07-02
AI Technical Summary
Existing hand-raising recognition methods are based on a single logical rule, which cannot adapt to diverse user scenarios and have low recognition accuracy.
By acquiring the gravitational acceleration information of the mobile terminal, the attitude detection unit confirms the terminal's attitude, and the hand-raising recognition is performed based on the target attitude classification model. The recognition accuracy is improved by combining neural networks and classification algorithms. The introduction of preset thresholds and a judgment process of tending to be still ensures the accuracy of the recognition results.
It improves the accuracy and robustness of hand-raise recognition, provides a better user experience, reduces the possibility of erroneous operations, and adapts to hand-raise operations in various postures.
Description
Methods, electronic devices, and computer program products for hand-raise recognition
[0001] Cross-reference of related applications
[0002] This disclosure is based on and claims priority to Chinese patent application CN2024119818881, filed on December 26, 2024, entitled “Method for Hand Raise Recognition, Electronic Device and Computer Program Product”, the entire contents of which are incorporated herein by reference.
Technical Field
[0003] This disclosure relates to the field of data processing technology, and in particular to a method, electronic device, and computer program product for hand-raising recognition.
Background Technology
[0004] With the development of smartphones, features that recognize hand-raising gestures, such as raising the hand to wake the screen, have become standard on smartphones as a convenient form of human-computer interaction.
[0005] Currently, commonly used hand-raise recognition methods are mainly based on preset logical judgment rules. For example, they determine whether a user has raised their hand by judging whether the data sensed by the accelerometer meets a preset threshold. With the rapid development of smart terminals, users' demands for interactive experiences are constantly increasing. However, existing recognition methods based on single logical rules cannot adapt to diverse user scenarios, and their recognition accuracy is also low.
Summary of the Invention
[0006] Based on this, in order to better meet user needs and provide a better interactive experience, this disclosure provides a new hand-raise recognition scheme, which can at least achieve more accurate hand-raise recognition based on different postures.
[0007] According to a first aspect of this disclosure, a method for hand-raise recognition is provided, comprising: acquiring gravity acceleration information of a mobile terminal; confirming the posture of the mobile terminal when the gravity acceleration information meets preset conditions; confirming a target posture classification model corresponding to the posture based on the posture; and confirming the hand-raise recognition result based on the target posture classification model.
[0008] According to a second aspect of this disclosure, an electronic device is provided, including a memory storing one or more programs and a processor electrically coupled to the memory and configured to execute one or more programs to perform any method or step or combination thereof in this disclosure.
[0009] According to a third aspect of this disclosure, a computer program product is provided, comprising a computer program that, when run on a computer, causes the computer to perform any of the methods provided in the first aspect.
[0010] According to a fourth aspect of this disclosure, a computer-readable storage medium is provided that stores a computer program which, when executed by a processor, causes the processor to perform any of the methods or steps or combinations thereof disclosed herein.
Attached Figure Description
[0011] To more clearly illustrate the technical solutions in the embodiments of this disclosure, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this disclosure. For those skilled in the art, other drawings can be obtained based on these drawings without exceeding the scope of protection claimed by this disclosure.
[0012] Figure 1 is a schematic flowchart of a hand-raising recognition method according to an embodiment of the present disclosure.
[0013] Figure 2 is a flowchart of training a target pose classification model according to an embodiment of the present disclosure.
[0014] Figure 3 is a schematic flowchart of a method for training a target pose classification model according to an embodiment of the present disclosure.
[0015] Figure 4 is a schematic diagram of the results of a sample of the gravitational acceleration components collected according to an embodiment of the present disclosure.
[0016] Figure 5 is a schematic diagram of smoothing processing according to an embodiment of the present disclosure.
[0017] Figure 6 is a schematic diagram of determining the point of greatest change from a sample of gravitational acceleration components according to an embodiment of the present disclosure.
[0018] Figure 7 is a schematic diagram of the structure of an electronic device provided in this disclosure.
Detailed Implementation
[0019] The technical solutions of the embodiments of this disclosure will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this disclosure. All other embodiments obtained by those skilled in the art based on the embodiments of this disclosure without creative effort are within the scope of protection of this disclosure.
[0020] Throughout the specification and claims, terms may have nuanced meanings suggested or implied in context beyond an explicitly stated meaning. Similarly, the phrases “in one embodiment” or “in some embodiments” as used herein do not necessarily refer to the same embodiment, and the phrases “in another embodiment” or “in other embodiments” as used herein do not necessarily refer to different embodiments. The phrases “in one implementation” or “in some implementations” as used herein do not necessarily refer to the same implementation, and the phrases “in another implementation” or “in other implementations” as used herein do not necessarily refer to different implementations. It is intended, for example, that the claimed subject matter include combinations of exemplary embodiments or implementations in whole or in part.
[0021] Generally, terms can be understood at least in part from their use in context. For example, terms used herein, such as “and,” “or,” and “and / or,” can include a variety of meanings, which can depend at least in part on the context in which they are used. Typically, “or,” when used to associate a list, such as A, B, or C, is intended to mean A, B, and C, here used in the inclusive sense, as well as A, B, or C, here used in the exclusive sense. Furthermore, the terms “one or more” or “at least one,” as used herein, depend at least in part on the context and can be used to describe any feature, structure, or characteristic in a singular sense, or can be used to describe a combination of features, structures, and characteristics in a plural sense. Similarly, terms such as “a,” “an,” or “the” depend at least in part on the context and can be understood to convey either singular or plural usage. Moreover, also depending at least in part on the context, the terms “based on” or “determined by” can be understood not necessarily to indicate an exclusive set of factors; rather, they may allow for the presence of other factors that are not necessarily explicitly described.
[0022] Figure 1 is a schematic flowchart of a hand-raise recognition method according to an embodiment of the present disclosure. As shown in Figure 1, the method includes the following steps:
[0023] Step S101: Obtain the gravity acceleration information of the mobile terminal.
[0024] In some embodiments, the gravitational acceleration information of the mobile terminal is measured by sensors of the mobile terminal, such as a gravity sensing system or an accelerometer. The gravitational acceleration information may include the gravitational acceleration components of the mobile terminal in the X, Y, and Z axes, or may include the gravitational acceleration component in at least one of the three axes.
[0025] Step S102: If the gravity acceleration information meets the preset conditions, confirm the attitude of the mobile terminal.
[0026] In some embodiments, it is first necessary to confirm whether the gravitational acceleration information meets preset conditions. During the prediction phase, the gravitational acceleration components in the Y-axis direction for the current time and the previous n time points can be processed using a smoothing method. After processing, the maximum and minimum values of the gravitational acceleration components in the Y-axis direction for the previous (na) time points are calculated to determine the gravitational acceleration component interval, such as [y_min, y_max]. If the Y-axis data for the current time and the previous n time points all fall outside this interval, it can be considered that the mobile terminal has undergone a change different from the previous stable state, triggering confirmation of the mobile terminal's attitude. For example, if the current time is time 13, the maximum and minimum values of the gravitational acceleration components in the Y-axis direction for time points from time 0 to time 9 are calculated to determine the gravitational acceleration component interval. If the Y-axis data for times 13, 12, 11, and 10 all fall outside this interval, confirmation of the mobile terminal's attitude can be triggered.
[0027] In some embodiments, step S102 may include: acquiring a first preset number of gravitational acceleration components in the Y-axis direction and determining a gravitational acceleration component interval; and confirming that the gravitational acceleration information meets preset conditions if the values of a second preset number of gravitational acceleration components in the Y-axis direction immediately following the first preset number of gravitational acceleration components are all outside the interval.
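The trigger condition described above can be sketched in Python as follows. This is only an illustrative reading of the text, not the disclosed implementation: the function name `meets_preset_condition` and the parameter names `first_count` and `second_count` are hypothetical, and the sketch assumes the Y-axis values have already been smoothed.

```python
from typing import Sequence


def meets_preset_condition(y_values: Sequence[float],
                           first_count: int,
                           second_count: int) -> bool:
    """Check the trigger condition on the most recent Y-axis samples.

    The interval [y_min, y_max] is taken from `first_count` samples; the
    condition holds when the `second_count` samples that immediately follow
    them (the newest ones) all lie outside that interval.
    """
    if first_count <= 0 or second_count <= 0:
        raise ValueError("counts must be positive")
    needed = first_count + second_count
    if len(y_values) < needed:
        return False  # not enough history collected yet
    window = list(y_values[-needed:])
    baseline = window[:first_count]   # samples that define the interval
    recent = window[first_count:]     # samples tested against the interval
    y_min, y_max = min(baseline), max(baseline)
    return all(y < y_min or y > y_max for y in recent)
```

With `first_count=10` and `second_count=4`, this mirrors the example in which times 0 to 9 define the interval and times 10 to 13 are tested against it.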
[0028] In some embodiments, the first-order difference value of the Y-axis in the gravitational acceleration information is calculated and binarized. The difference result is used to determine whether the user is raising their hand or remaining stationary. Analysis shows that when raising the mobile terminal while sitting, standing, walking, running, or lying down, if the absolute value of the first-order difference value of the gravitational acceleration component in the Y-axis direction of the mobile terminal is greater than a preset threshold, the mobile terminal can be confirmed to be in motion; conversely, if the absolute value of the first-order difference value of the gravitational acceleration component in the Y-axis direction of the mobile terminal is not greater than the preset threshold, the mobile terminal can be confirmed to be stationary. Preset thresholds can be obtained by analyzing data collected from different mobile terminals to distinguish between the motion and stationary states of the mobile terminal's posture, corresponding to raising the hand while moving or raising it while stationary. The posture of the mobile terminal is determined based on the absolute value of the first-order difference value of the gravitational acceleration component in the Y-axis direction in the gravitational acceleration information.
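A minimal sketch of the first-order difference and binarization step is given below. The function names and the threshold value used in the usage note are hypothetical; the disclosure only states that the threshold is obtained by analyzing data from different mobile terminals.

```python
from typing import List, Sequence


def binarize_first_difference(y_values: Sequence[float],
                              threshold: float) -> List[int]:
    """Return 1 where |y[i+1] - y[i]| exceeds the threshold (motion), else 0."""
    return [1 if abs(b - a) > threshold else 0
            for a, b in zip(y_values, y_values[1:])]


def latest_state(y_values: Sequence[float], threshold: float) -> str:
    """Label the newest sample as 'moving' or 'stationary' (assumed labels)."""
    flags = binarize_first_difference(y_values, threshold)
    if not flags:
        raise ValueError("at least two samples are required")
    return "moving" if flags[-1] == 1 else "stationary"
```

For instance, with an assumed threshold of 0.5, the sequence `[0.0, 0.05, 1.0, 1.02]` has differences 0.05, 0.95, and 0.02, so only the middle step is flagged as motion.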
[0029] In some embodiments, the mobile terminal may include a posture detection unit capable of detecting and acquiring multiple different postures of the mobile terminal. The posture detection unit may include devices such as a gyroscope sensor and an accelerometer. The posture of the mobile terminal may include a first posture and a second posture, wherein the first posture and the second posture are postures related to a hand-raising operation of the mobile terminal. For example, the first posture and the second posture can be a hand-raising operation in any posture, such as sitting, lying down, standing, walking, or climbing stairs. The posture of the mobile terminal is confirmed based on the information detected by the posture detection unit.
[0030] Step S103: Based on the posture, confirm the target posture classification model corresponding to the posture.
[0031] In some embodiments, the target pose classification model is a model obtained through training. How to train and obtain the target pose classification model will be described in detail below with reference to Figure 2.
[0032] In some embodiments, the target pose classification model can correspond to the pose of the mobile terminal. After determining the pose of the mobile terminal, the target pose classification model corresponding to that pose is confirmed. For example, when the pose of the mobile terminal is determined to be a first pose, the target pose classification model corresponding to that pose is confirmed as the first pose model; when the pose of the mobile terminal is determined to be a second pose, the target pose classification model corresponding to that pose is confirmed as the second pose model. Here, the first pose and the second pose are different, and the first pose model and the second pose model are trained based on the target gravitational acceleration feature values under the corresponding poses.
[0033] In some embodiments, the first posture can be raising a hand while moving, and the second posture can be raising a hand while stationary. For the first and second postures of the mobile terminal, there are target posture classification models corresponding to moving and stationary states, respectively. For example, the first posture can be raising a hand while sitting, and the second posture can be raising a hand while standing. For the first and second postures of the mobile terminal, there are target posture classification models corresponding to raising a hand while sitting and raising a hand while standing, respectively. Furthermore, the first posture can be raising a hand while walking, and the second posture can be raising a hand while going up or down stairs. For the first and second postures of the mobile terminal, there are target posture classification models corresponding to raising a hand while walking and raising a hand while going up or down stairs, respectively, and so on.
[0034] After the mobile terminal's posture is determined, the target posture classification model corresponding to that posture is identified. For example, if the mobile terminal is determined to be in motion, the target posture classification model corresponding to the moving state is identified; if the posture is determined to be raising a hand while sitting, the target posture classification model corresponding to raising a hand while sitting is identified, and so on.
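One simple way to express the posture-to-model correspondence is a lookup table, sketched below. The posture names, the `Model` type alias, and `select_model` are assumptions made for illustration; the disclosure does not prescribe any particular data structure.

```python
from typing import Callable, Dict, Sequence

# A model is represented here as any callable that maps features to a label.
Model = Callable[[Sequence[float]], str]


def select_model(posture: str, models: Dict[str, Model]) -> Model:
    """Return the classification model registered for the given posture."""
    try:
        return models[posture]
    except KeyError:
        raise KeyError(
            f"no classification model registered for posture {posture!r}"
        ) from None
```

A registry such as `{"moving": moving_model, "stationary": stationary_model}` would then correspond to the first and second posture models described above.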
[0035] Step S104: Confirm the hand-raising recognition result based on the target posture classification model.
[0036] In some embodiments, the gravity acceleration information is processed according to the target posture classification model corresponding to the current posture of the mobile terminal to confirm the hand-raising recognition result. The hand-raising recognition result includes confirming that the current posture belongs to a hand-raising operation, such as hand-raising while moving or while stationary, or hand-raising while sitting, lying down, standing, walking, or going up and down stairs.
[0037] In some embodiments, if it is confirmed that the current posture is a hand-raising operation, the mobile terminal can be instructed to perform the corresponding operation or start the corresponding function, such as turning on the screen, playing music, etc. This disclosure does not make any limitation in this regard.
[0038] In some embodiments, the target pose classification model includes not only pose-specific models, such as those corresponding to raising an arm while sitting or lying down, but also a pose-insensitive target pose classification model, referred to as the overall target pose classification model. During the training process, the overall target pose classification model is trained based on the target gravitational acceleration feature values under various poses. The overall target pose classification model can handle various poses and confirm the arm-raising recognition result.
[0039] When the hand-raising recognition result is confirmed using a target pose classification model corresponding to the pose, it can generally be confirmed whether the hand-raising operation was performed or not. However, in some cases, the hand-raising recognition result cannot be confirmed using a target pose classification model corresponding to the pose. In such cases, the hand-raising recognition result can be confirmed using an overall target pose classification model.
[0040] Confirming the hand-raise recognition result through the target posture classification model corresponding to the posture can more quickly confirm whether it is a hand-raise operation. If the hand-raise recognition result cannot be confirmed by the target posture classification model corresponding to the posture, the overall target posture classification model can correct the misjudgment of posture in the process of confirming the mobile terminal's posture or make up for the deficiencies of the target posture classification model corresponding to the posture in the process of confirming the hand-raise recognition result. While improving the recognition accuracy, it can also play a backup role.
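The backup role of the overall model can be sketched as a two-stage call. In this sketch, a classifier returns `True` (hand-raise), `False` (not a hand-raise), or `None` when it cannot confirm a result; that three-valued convention and the function name are assumptions for illustration only.

```python
from typing import Callable, Optional, Sequence

Classifier = Callable[[Sequence[float]], Optional[bool]]


def recognize_with_fallback(features: Sequence[float],
                            posture_model: Classifier,
                            overall_model: Classifier) -> Optional[bool]:
    """Try the posture-specific model first, then the overall model."""
    result = posture_model(features)
    if result is None:
        # The posture-specific model could not confirm a result,
        # so the posture-insensitive overall model is consulted.
        result = overall_model(features)
    return result
```

The posture-specific model is tried first because, according to the text, it confirms the result more quickly; the overall model is consulted only when needed.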
[0041] In some embodiments, the target pose classification model may be a neural network model, or a model based on a classification algorithm such as KNN (K-Nearest Neighbors) or DTW (Dynamic Time Warping). During recognition, models based on classification algorithms typically decide according to the distance to the target category, while neural network models typically decide according to probability values. Because the decision always rests on a distance or a probability value, a pose of the mobile terminal may be forcibly assigned to a known category even when the distance is not small enough or the probability value is not high enough, which can lead to unreasonable classifications.
[0042] In some embodiments, preset threshold conditions can be set to handle such situations. For example, if the target pose classification model is a classification algorithm model, the preset threshold condition is that the distance is less than a preset distance; if the target pose classification model is a neural network model, the preset threshold condition is that the probability value is greater than a preset probability value. The judgment criterion value, such as the distance or probability value, is obtained during pose classification with the target pose classification model. If the judgment criterion value does not meet the preset threshold condition, the current pose is classified neither as a hand-raising operation nor as a non-hand-raising operation, but as an unknown pose.
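The threshold condition can be sketched as a small post-processing step on the classifier output. The function name, the `"distance"` and `"probability"` tags, and the `"unknown"` label are hypothetical choices for this illustration.

```python
def apply_threshold(label: str, score: float,
                    score_kind: str, limit: float) -> str:
    """Keep the predicted label only if the judgment value passes the threshold.

    score_kind "distance":    accept when score is less than limit.
    score_kind "probability": accept when score is greater than limit.
    """
    if score_kind == "distance":
        accepted = score < limit
    elif score_kind == "probability":
        accepted = score > limit
    else:
        raise ValueError(f"unsupported score kind: {score_kind!r}")
    return label if accepted else "unknown"
```

Rejected inputs are labeled as unknown rather than being forced into the hand-raise or non-hand-raise category.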
[0043] By using the above methods, we can not only improve the accuracy of posture classification and avoid misclassification, but also enhance the robustness of the system and provide a better user experience for mobile terminal users.
[0044] In other embodiments, when the target posture classification model identifies a non-raise-hand operation, the mobile terminal may not perform any operation or processing, such as not turning on the screen. When the target posture classification model identifies a raise-hand operation, the user may not yet have entered a static position, and the raise-hand action has just begun or is not yet halfway complete. If an operation or processing (such as turning on the screen) is performed directly based on a certain posture, incorrect operation or processing instructions may occur.
[0045] To address this, this disclosure establishes a process for determining when a raised hand is about to come to a standstill. In one embodiment, a preset number w is set for different raised hand behaviors before they are about to enter a standstill state. When a certain raised hand posture is continuously determined to be within a certain category more than w times, it can be considered that the user is currently performing a continuous raised hand action, and the raised hand has been in progress for a period of time and is about to enter a standstill state, thus confirming that the current posture belongs to a raised hand operation. The raised hand posture categories include raising the hand while sitting, lying down, standing, walking, and climbing stairs, etc. The preset number of times can be a value set by technicians based on experience, or any suitable value. Users can also adjust it according to their needs; this disclosure does not impose any restrictions on this.
[0046] In a specific embodiment, the preset number of times is 5. If, from the first moment to the fifth moment, the current gesture is continuously determined to belong to a certain raised-hand gesture classification, it is confirmed that the current gesture belongs to a raised-hand operation. After this confirmation, the mobile terminal can be instructed to perform an operation or processing, such as turning on the screen.
[0047] By setting the judgment process for the raised-hand tendency to be stationary, the accuracy of recognizing a raised hand can be further improved, and the mobile terminal can be instructed to perform corresponding operations or processing at a more appropriate time.
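The consecutive-count check can be sketched as a small stateful helper. The class name `RaiseConfirmer` is hypothetical, and the sketch assumes that confirmation happens once the same category has been seen `required_count` times in a row (matching the example with 5 moments); `None` stands for a non-hand-raise or unknown result.

```python
from typing import Optional


class RaiseConfirmer:
    """Confirm a hand-raise after enough consecutive identical classifications."""

    def __init__(self, required_count: int) -> None:
        if required_count <= 0:
            raise ValueError("required_count must be positive")
        self.required_count = required_count
        self._last_label: Optional[str] = None
        self._streak = 0

    def update(self, label: Optional[str]) -> bool:
        """Feed one classification result; return True once confirmed."""
        if label is not None and label == self._last_label:
            self._streak += 1
        else:
            # A different category (or no hand-raise) restarts the count.
            self._last_label = label
            self._streak = 1 if label is not None else 0
        return self._streak >= self.required_count
```

With `RaiseConfirmer(5)`, the fifth consecutive `"sitting"` result returns `True`, at which point an action such as turning on the screen could be issued.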
[0048] In some embodiments, corresponding preset range intervals are set for the raised-hand gesture classification. For example, for raising the hand while sitting, the preset range intervals of the gravitational acceleration information of the X, Y, and Z axes include: -9.8 < X < 9.8, 3 < Y < 9.8, 0 < Z < 9.8; for raising the hand while standing, walking, or running, the preset range intervals of the gravitational acceleration information of the X, Y, and Z axes include: -9.8 < X < 9.8, 0 < Y < 6, 0 < Z < 9.8; for raising the hand while lying down, the preset range intervals of the gravitational acceleration information of the X, Y, and Z axes include: -9.8 < X < 9.8, 0 < Y < 6, -9.8 < Z < 0, and so on. It can be understood that the above-mentioned preset ranges are only examples, and those skilled in the art can set the preset range intervals according to experience, and these all fall within the scope covered by the present disclosure.
[0049] To ensure the accuracy of raised-hand recognition while also ensuring that the mobile terminal is in a suitable position (for example, with the screen of the mobile phone facing the user rather than away from the user), in an exemplary embodiment, when the number of times the current gesture is determined to belong to the raised-hand gesture classification reaches the preset number of times, it is further determined whether the gravitational acceleration information corresponding to the current gesture falls within the preset range interval corresponding to the raised-hand gesture classification. When it does, it is confirmed that the current gesture belongs to a raised-hand operation.
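The range check can be sketched with a table of per-category bounds. The numeric bounds below are the example values given in the text; the category keys and the function name are hypothetical.

```python
from typing import Dict, Tuple

Bounds = Tuple[Tuple[float, float], Tuple[float, float], Tuple[float, float]]

_UPRIGHT: Bounds = ((-9.8, 9.8), (0.0, 6.0), (0.0, 9.8))

# (X range, Y range, Z range) per raised-hand category, example values only.
PRESET_RANGES: Dict[str, Bounds] = {
    "sitting": ((-9.8, 9.8), (3.0, 9.8), (0.0, 9.8)),
    "standing": _UPRIGHT,
    "walking": _UPRIGHT,
    "running": _UPRIGHT,
    "lying": ((-9.8, 9.8), (0.0, 6.0), (-9.8, 0.0)),
}


def within_preset_range(category: str, x: float, y: float, z: float) -> bool:
    """Return True when all three components lie strictly inside the bounds."""
    bounds = PRESET_RANGES[category]
    return all(low < value < high
               for value, (low, high) in zip((x, y, z), bounds))
```

In a full pipeline this check would run only after the consecutive-count condition has been met, as the paragraph above describes.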
[0050] Next, the present disclosure introduces the process of training the target posture classification model. Figure 2 is a flowchart of training the target posture classification model according to an embodiment of the present disclosure. As shown in Figure 2, the process of training the target posture classification model may generally include: data collection, data processing, and posture classification training. Data processing may include determining the standard input length of the data set and truncating the data according to the standard input length.
[0051] Figure 3 is a schematic flowchart of a method for training the target posture classification model according to an embodiment of the present disclosure. As shown in Figure 3, the process includes the following steps.
[0052] Step S301: Collect target gravitational acceleration feature values of mobile terminals under various postures to obtain samples of multiple gravitational acceleration components under various postures.
[0053] In practical applications, mobile terminals exhibit diverse postures. Therefore, during the model training phase, a variety of hand-raising scenarios are collected as the training set, encompassing various states of real-world user interaction, to ensure the model matches the user's multi-posture hand-raising behavior.
[0054] In some embodiments, time-series values of the user lifting the mobile terminal and reaching a stable state in various postures such as sitting, lying down, standing, walking, and going up and down stairs are collected to form the target gravitational acceleration feature value of the mobile terminal, including gravitational acceleration components in the X-axis, Y-axis and Z-axis directions, and samples of multiple gravitational acceleration components under various postures are obtained.
[0055] In some embodiments, when collecting hand-raising action data for each posture, it is necessary to continuously collect steady-state data for a preset time, such as 3 to 5 seconds, throughout the entire process from the static / continuous stable motion state to the hand-raising state and back to the static / continuous stable motion state. This ensures that the complete hand-raising and stopping posture can be fully included after subsequent data truncation. The target gravitational acceleration feature value includes the gravitational acceleration components of the mobile terminal in the X, Y, and Z axis directions for a preset time. In addition, since each user's behavioral habits are different, it is necessary to collect hand-raising behaviors from as many different users as possible to ensure the diversity of training data. In a specific embodiment, the collected data used for training is shown in Figure 4, where data0, data1, and data2 represent the components of the target gravitational acceleration feature value in the X, Y, and Z axis directions, respectively.
[0056] In some exemplary embodiments, to eliminate interference caused by sensor noise and minor fluctuations in human movement during the acquisition process, the raw sample data of the acquired gravitational acceleration component can be smoothed. In one specific embodiment, a low-pass filter can be used to remove these minor fluctuations, highlighting the overall trend of the data and thus better reflecting the user's actual lifting motion characteristics. As shown in Figure 5, it can be seen that some minor fluctuation values are eliminated after smoothing.
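As an illustration of the smoothing step, the sketch below uses an exponential moving average, which is one simple first-order low-pass filter. The disclosure does not specify which low-pass filter is used, so the filter choice, the function name, and the default `alpha` are assumptions.

```python
from typing import List, Sequence


def low_pass(values: Sequence[float], alpha: float = 0.2) -> List[float]:
    """Smooth a sequence with an exponential moving average.

    Smaller alpha gives stronger smoothing; alpha = 1 returns the input.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for value in values:
        if not smoothed:
            smoothed.append(float(value))  # start from the first sample
        else:
            smoothed.append(alpha * value + (1.0 - alpha) * smoothed[-1])
    return smoothed
```

Applied separately to the X, Y, and Z components, this damps small sample-to-sample fluctuations while keeping the overall trend of the lifting motion.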
[0057] Step S302: Determine the standard input length of the samples of multiple gravitational acceleration components.
[0058] After collecting samples of multiple gravitational acceleration components, in order to ensure that the samples can completely capture the entire process of the user raising their hand and coming to a stop, and to ensure the consistency of the input data length during model training, it is necessary to determine the standard input length of the samples of multiple gravitational acceleration components.
[0059] In some embodiments, in the sample data of gravitational acceleration components corresponding to sitting, standing, walking, and running postures, the first-order difference value of the gravitational acceleration component along the Y-axis is calculated and binarized. The difference result is used to determine whether the hand is raised or stationary. After analysis, it was found that when the mobile terminal is raised while sitting, standing, walking, running, or lying down, if the absolute value of the first-order difference value of the gravitational acceleration component along the Y-axis of the mobile terminal is greater than a preset threshold, it can be confirmed that the mobile terminal is in motion; while if the absolute value of the first-order difference value of the gravitational acceleration component along the Y-axis of the mobile terminal is not greater than the preset threshold, it can be confirmed that the mobile terminal is stationary.
[0060] In some embodiments, a corresponding threshold can be set for each posture (e.g., raising the arm while sitting, standing, walking, running, lying down, etc.). Specifically, for the postures of raising the arm while sitting, standing, walking, and running, the first-order difference value of the gravitational acceleration component in the Y-axis direction for each sample can be obtained. The number of data points where the first-order difference value of the gravitational acceleration component in the Y-axis direction for each sample is continuously greater than the corresponding threshold is recorded to determine the lifting time length corresponding to each sample. For the posture of raising the arm while lying down, the sample variation of the gravitational acceleration component in the Z-axis direction is most significant. The first-order difference value of the gravitational acceleration component in the Z-axis direction for each sample can be obtained. The number of data points where the first-order difference value of the gravitational acceleration component in the Z-axis direction for each sample is continuously greater than the corresponding threshold is recorded to determine the lifting time length corresponding to each sample.
[0061] To determine the standard input length of the samples of multiple gravitational acceleration components, the lifting time length corresponding to each sample is first determined from the first-order difference values of that sample, yielding multiple lifting time lengths; the maximum of these lifting time lengths is then taken as the standard input length.
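The computation of the standard input length can be sketched as follows. The sketch takes the lifting time length of a sample to be the longest run of consecutive first-order differences whose absolute value exceeds the threshold; the use of the longest run, the absolute value, and the function names are assumptions for illustration.

```python
from typing import Sequence


def lifting_length(values: Sequence[float], threshold: float) -> int:
    """Longest run of consecutive |first-order difference| > threshold."""
    longest = current = 0
    for a, b in zip(values, values[1:]):
        if abs(b - a) > threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # the run is broken by a near-stationary step
    return longest


def standard_input_length(samples: Sequence[Sequence[float]],
                          threshold: float) -> int:
    """Maximum lifting length over all samples (one axis per sample)."""
    return max(lifting_length(sample, threshold) for sample in samples)
```

For lying-down samples the Z-axis component would be passed in instead of the Y-axis component, with its own threshold, as described above.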
[0062] Step S303: Extract the gravitational acceleration components of standard input length from the samples of multiple gravitational acceleration components, and use them as lifting posture data.
[0063] After the standard input length of the samples of multiple gravitational acceleration components is determined, gravitational acceleration components of that standard input length are extracted from the samples to serve as the lifting posture data.
[0064] In some embodiments, all sample data needs to be truncated to ensure that the time series length of each input sample is the same, and the length of the truncated sample is the standard input length. This ensures that the training data includes the complete hand-raising action, and also facilitates subsequent model training and prediction.
[0065] During sample truncation, the Pruned Exact Linear Time (PELT) algorithm, a dynamic-programming-based breakpoint (change-point) detection algorithm, can be used with the mean squared error as the cost function to determine the indices of the points of maximum variation in the samples of the multiple gravitational acceleration components. Using these indices as references, gravitational acceleration components of the standard input length are then truncated from the corresponding samples according to a preset truncation method. For example, for a determined point of maximum variation, m data points can be extracted before this point in the original sample sequence and m-1 data points after it, where m can be half the standard input length. Together with the point itself, each truncated sample sequence then has a length of 2m. Those skilled in the art will understand that other suitable truncation methods can also be used, and this disclosure does not limit them in any way.
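The truncation step can be sketched in Python as follows. This is not PELT: for brevity the sketch substitutes an exhaustive search for a single breakpoint that minimizes a squared-error cost, and the function names, the one-axis input, and the window-shifting behavior near the ends of the sequence are all assumptions made for illustration.

```python
def sse(segment):
    """Sum of squared deviations from the segment mean (squared-error cost)."""
    if not segment:
        return 0.0
    mean = sum(segment) / len(segment)
    return sum((v - mean) ** 2 for v in segment)


def best_single_breakpoint(series):
    """Index k minimizing cost(series[:k]) + cost(series[k:]).

    Simplified stand-in for PELT, which finds multiple breakpoints.
    """
    return min(range(1, len(series)),
               key=lambda k: sse(series[:k]) + sse(series[k:]))


def truncate_around(series, index, m):
    """Return 2*m points: m before `index`, the point itself, and m-1 after.

    If the window would run past either end it is shifted inward (assumption).
    """
    if len(series) < 2 * m:
        raise ValueError("series shorter than the standard input length")
    start = max(0, min(index - m, len(series) - 2 * m))
    return series[start:start + 2 * m]


if __name__ == "__main__":
    y = [0.0] * 6 + [1.0] * 6
    k = best_single_breakpoint(y)
    print(k, truncate_around(y, k, m=2))
```

Centering the window on the breakpoint keeps the transition itself inside every truncated sample, so that all training inputs share one length.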
[0066] Figure 6 is a visualization example of the truncation point of a sample of the gravitational acceleration components. The green dashed line marks the preferred truncation point, which lies roughly in the middle of the hand-raising action. Data of a fixed length equal to the standard input length can be extracted from the sample around this point as the raised posture data.
[0067] Step S304: Based on the raised posture data and the corresponding raised hand posture classification, a classification algorithm is used to train the posture classification to obtain the target posture classification model.
[0068] In some embodiments, after capturing the raised posture data, the target posture classification model can be obtained by using a classification algorithm to train the posture classification based on the raised posture data and the corresponding raised hand posture classification.
[0069] In other embodiments, gravitational acceleration components of the standard input length can also be extracted from the portions of the samples of the multiple gravitational acceleration components other than the raised posture data, and used as non-raised posture data.
[0070] After extracting both raised and unraised posture data, a classification algorithm is used to train the posture classification model based on the raised posture data and the corresponding raised posture classification, as well as the unraised posture data and the corresponding unraised posture classification.
[0071] In some embodiments, a classification algorithm is used for posture classification training. When only hand-raising posture data is used, the training set x is the extracted hand-raising posture data, and the corresponding label set y can be {0: hand-raising while sitting, 1: hand-raising while standing, 2: hand-raising while walking, 3: hand-raising while running, 4: hand-raising while lying down, 5: hand-raising while going up or down stairs}. When non-hand-raising posture data is also used, the training set x is the extracted hand-raising posture data together with the non-hand-raising posture data, and the corresponding label set y can be {0: hand-raising while sitting, 1: hand-raising while standing, 2: hand-raising while walking, 3: hand-raising while running, 4: hand-raising while lying down, 5: hand-raising while going up or down stairs, 6: no hand-raising}.
[0072] In some embodiments, the user's hand raising can be categorized as moving or stationary hand raising, and x and y are used as inputs for classification training to obtain target posture classification models corresponding to moving and stationary hand raising, respectively. In some embodiments, x and y can be used as inputs for classification training separately for each posture of the mobile terminal (including hand raising while sitting, lying down, standing, walking, climbing stairs, etc.) to obtain multiple target posture classification models, one for each posture. In some embodiments, x and y are used as inputs for classification training to obtain a single target posture classification model that does not distinguish between postures; this is called the overall target posture classification model.
[0073] In some embodiments, KNN (K-Nearest Neighbors) combined with DTW (Dynamic Time Warping) can be used to train and model the preprocessed data x and y. This method has the following advantages for hand-raise classification and recognition:
[0074] (1) The KNN algorithm is simple and easy to understand, and it performs well in modeling small sample data. It can adapt well to this kind of hand-raising posture data which is not very diverse.
[0075] (2) The DTW distance metric can effectively avoid the problems caused by the non-fixed position of the lifting action. Unlike the Euclidean distance or Manhattan distance metric, DTW can flexibly perform time calibration on the sequence to find the optimal alignment path, thus more accurately reflecting the similarity between two action sequences.
[0076] The KNN combined with DTW approach makes full use of limited training samples to learn an effective model for recognizing different lifting postures. This method is highly suitable for scenarios developed in DSP embedded environments, as it is simple to implement and can quickly establish accurate action recognition capabilities.
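The combination of KNN with a DTW distance can be illustrated with a minimal pure-Python sketch. The function names, the Euclidean point-wise cost, the choice of k, and the toy data are assumptions for illustration only; the labels 1 and 6 follow the example label set given above.

```python
from collections import Counter


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of (x, y, z) tuples."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two 3-axis points (assumed local cost)
            d = sum((p - q) ** 2 for p, q in zip(a[i - 1], b[j - 1])) ** 0.5
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def knn_predict(train_x, train_y, query, k=1):
    """Majority label among the k training sequences closest to `query` under DTW."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: dtw_distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    rise = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0), (0.0, 3.0, 0.0)]
    flat = [(0.0, 0.0, 0.0)] * 4
    # The same rise, delayed by one time step
    shifted = [(0.0, 0.0, 0.0)] + rise
    print(knn_predict([rise, flat], [1, 6], shifted, k=1))
```

Because DTW allows one point to align with several points in the other sequence, the delayed rise in the usage example still matches the training rise at zero cost, which is the time-shift tolerance described in item (2) above.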
[0077] In other embodiments, if applied in other environments, other time series classification algorithms may be adopted, such as neural network classification models based on CNN (Convolutional Neural Network), LSTM (Long Short-Term Memory Network), Transformer, etc.
[0078] After obtaining the target pose classification model through pose classification training, the model file can be converted into a file such as tflite and embedded into the mobile terminal system. During the use of the mobile terminal, gravity acceleration information can be continuously monitored. When the gravity acceleration information meets preset conditions, the pose of the mobile terminal is confirmed, the corresponding tflite file of the target pose classification model is called, and the hand-raise recognition result is confirmed based on the target pose classification model.
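One form of the preset condition set out in this disclosure compares a second run of consecutive Y-axis values against an interval established from a first run. The sketch below illustrates that check in Python; the function name `meets_preset_condition` is hypothetical, and taking the interval as the minimum and maximum of the first run is an assumption, since the way the interval is determined is not fixed here.

```python
def meets_preset_condition(y_values, first_count, second_count):
    """Check whether recent Y-axis gravity components leave a baseline interval.

    The first `first_count` values define the interval as [min, max]
    (an assumption). The condition is met only if all of the next
    `second_count` values fall outside that interval.
    """
    if len(y_values) < first_count + second_count:
        return False                      # not enough data collected yet
    baseline = y_values[:first_count]
    low, high = min(baseline), max(baseline)
    following = y_values[first_count:first_count + second_count]
    return all(v < low or v > high for v in following)


if __name__ == "__main__":
    moving = [0.10, 0.20, 0.15, 0.50, 0.60, 0.70]
    resting = [0.10, 0.20, 0.15, 0.50, 0.18, 0.70]
    print(meets_preset_condition(moving, 3, 3))
    print(meets_preset_condition(resting, 3, 3))
```

Gating the model call on a cheap check of this kind means the classification model is only invoked when the readings suggest a possible hand raise.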
[0079] According to the hand-raise recognition scheme provided in this disclosure, hand-raise recognition that fuses a posture classification algorithm model with rules can identify hand raises in various user postures, improving user experience. At the same time, the rules set for calling the target posture classification model ensure that the mobile terminal is instructed to initiate processing or operation only when needed, reducing power consumption. In obtaining the target posture classification model, this disclosure collects as many hand-raise scenarios as possible as the training set, covering the various states of real-world use. In subsequent implementation, this ensures that the target posture classification model can match the user's hand-raise operations in multiple postures, guaranteeing recognition accuracy. Furthermore, by setting a judgment process for when the hand raise tends to be still, the accuracy of hand-raise recognition can be further improved, so that the mobile terminal is instructed to perform the corresponding operations or processing at a more appropriate time. In addition, the introduction of an "unknown" category judgment mechanism not only improves the accuracy of posture classification and avoids misclassification but also enhances the robustness of the system, providing a better user experience for mobile terminal users.
[0080] In the above embodiments, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions in other embodiments.
[0081] It should be noted that, for the sake of simplicity, the foregoing method embodiments are all described as a series of actions. However, those skilled in the art should understand that this disclosure is not limited to the described order of actions, as some steps may be performed in other orders or simultaneously according to this disclosure. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are exemplary embodiments, and the actions and modules involved are not necessarily essential to this disclosure.
[0082] In the several embodiments provided in this disclosure, it should be understood that the disclosed apparatus can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical or take other forms.
[0083] Referring to Figure 7, Figure 7 provides an electronic device including a processor, a memory, and a sensor. The sensor is used to measure the gravitational acceleration information of a mobile terminal. The memory stores computer instructions or one or more programs, which, when executed by the processor, cause the processor to execute the computer instructions to implement the method and refined scheme shown in Figure 1.
[0084] It should be understood that the above-described device embodiments are merely illustrative, and the device disclosed in this invention can be implemented in other ways. For example, the division of units / modules described in the above embodiments is only a logical functional division, and there may be other division methods in actual implementation. For example, multiple units, modules, or components may be combined, integrated into another system, or some features may be ignored or not executed.
[0085] Furthermore, unless otherwise specified, the functional units / modules in the various embodiments of the present invention can be integrated into one unit / module, or each unit / module can exist physically separately, or two or more units / modules can be integrated together. The integrated units / modules described above can be implemented in hardware or as software program modules.
[0086] If the integrated unit / module is implemented in hardware, the hardware can be digital circuits, analog circuits, etc. The physical implementation of the hardware structure includes, but is not limited to, transistors, memristors, etc. Unless otherwise specified, the processor or chip can be any suitable hardware processor, such as a CPU, GPU, FPGA, DSP, or ASIC. Unless otherwise specified, the on-chip cache, off-chip memory, and storage can be any suitable storage medium, such as resistive random access memory (RRAM), dynamic random access memory (DRAM), static random access memory (SRAM), enhanced dynamic random access memory (EDRAM), high-bandwidth memory (HBM), hybrid memory cube (HMC), etc.
[0087] If the integrated unit / module is implemented as a software program module and sold or used as an independent product, it can be stored in a computer-readable memory. Based on this understanding, the technical solution of this disclosure, in essence, or the part that contributes over the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a memory and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this disclosure. The aforementioned memory includes various media capable of storing program code, such as USB flash drives, read-only memory (ROM), random access memory (RAM), portable hard drives, magnetic disks, or optical disks.
[0088] This disclosure also provides a computer-readable storage medium storing one or more computer programs that, when executed by a plurality of processors, cause the processors to perform the method and refined scheme shown in Figure 1.
[0089] This disclosure also provides a computer program product, comprising a computer program that, when run on a computer, causes the computer to execute the hand-raising recognition method of any of the above embodiments.
[0090] References to features, advantages, or similar language in this specification do not imply that all features and advantages achievable with this solution should be, or are, included in any single implementation thereof. Rather, references to features and advantages are understood to mean that a particular feature, advantage, or characteristic described in connection with an embodiment is included in at least one embodiment of this solution. Therefore, discussions of features, advantages, and similar language throughout this specification may, but do not necessarily, refer to the same embodiments.
[0091] Furthermore, the features, advantages, and characteristics described herein can be combined in any suitable manner in one or more embodiments. Based on the description herein, those skilled in the art will recognize that this solution can be implemented without one or more specific features or advantages of a particular embodiment. In other instances, additional features and advantages can be appreciated in specific embodiments not presented in all embodiments of this solution.
[0092] The embodiments of this disclosure have been described in detail above. Specific examples have been used to illustrate the principles and implementation methods of this disclosure. The descriptions of the embodiments above are only for the purpose of helping to understand the methods and core ideas of this disclosure. Furthermore, any changes or modifications made by those skilled in the art based on the ideas of this disclosure, and on the specific implementation methods and application scope of this disclosure, are all within the scope of protection of this disclosure. Therefore, the content of this specification should not be construed as a limitation of this disclosure.
Claims
1. A method for hand-raise recognition, comprising: obtaining gravitational acceleration information of a mobile terminal; if the gravitational acceleration information meets a preset condition, confirming a posture of the mobile terminal; determining, based on the posture, a target posture classification model corresponding to the posture; and confirming a hand-raising recognition result based on the target posture classification model.
2. The method of claim 1, further comprising: obtaining the target posture classification model through training, wherein the training of the target posture classification model comprises: collecting target gravitational acceleration feature values of the mobile terminal under various postures to obtain samples of multiple gravitational acceleration components under the various postures, the target gravitational acceleration feature values including the gravitational acceleration components of the mobile terminal in the X-axis, Y-axis, and Z-axis directions over a preset duration; determining a standard input length of the samples of the multiple gravitational acceleration components; extracting gravitational acceleration components of the standard input length from the samples of the multiple gravitational acceleration components to serve as raised posture data; and performing posture classification training using a classification algorithm, based on the raised posture data and the corresponding raised-hand posture classification, to obtain the target posture classification model.
3. The method of claim 2, further comprising: extracting, from the portions of the samples of the multiple gravitational acceleration components other than the raised posture data, gravitational acceleration components of the standard input length to serve as non-raised posture data; wherein obtaining the target posture classification model comprises: performing posture classification training using a classification algorithm, based on the raised posture data and the corresponding raised-hand posture classification as well as the non-raised posture data and the corresponding non-raised-hand posture classification, to obtain the target posture classification model.
4. The method of claim 2, wherein determining the standard input length of the samples of the multiple gravitational acceleration components comprises: determining, based on the first-order difference values of the samples of the multiple gravitational acceleration components, the lift time length corresponding to each sample; and determining the maximum value among the lift time lengths corresponding to the samples as the standard input length.
5. The method of claim 2, wherein extracting the gravitational acceleration components of the standard input length from the samples of the multiple gravitational acceleration components comprises: determining, using a breakpoint detection algorithm, the indices of multiple points of maximum variation from the samples of the multiple gravitational acceleration components; and truncating, based on the indices of the multiple points of maximum variation and according to a preset truncation method, the gravitational acceleration components of the standard input length from the corresponding samples of the multiple gravitational acceleration components.
6. The method according to any one of claims 1 to 5, wherein confirming the posture of the mobile terminal comprises: determining the posture of the mobile terminal based on the absolute value of the first-order difference of the gravitational acceleration component in the Y-axis direction in the gravitational acceleration information; or confirming the posture of the mobile terminal based on information detected by an attitude detection unit of the mobile terminal.
7. The method as described in any one of claims 1 to 5, wherein determining, based on the posture, the target posture classification model corresponding to the posture comprises: when the posture of the mobile terminal is determined to be a first posture, confirming the target posture classification model corresponding to the posture as a first posture model; and when the posture of the mobile terminal is determined to be a second posture, confirming the target posture classification model corresponding to the posture as a second posture model; wherein the first posture is different from the second posture, and the first posture model and the second posture model are trained based on the target gravitational acceleration feature values under the corresponding postures.
8. The method as described in any one of claims 1 to 5, wherein the target posture classification model includes an overall target posture classification model, and confirming the hand-raising recognition result based on the target posture classification model comprises: if the hand-raising recognition result cannot be confirmed by the target posture classification model corresponding to the posture, confirming the hand-raising recognition result based on the overall target posture classification model, wherein the overall target posture classification model is trained based on the target gravitational acceleration feature values under multiple postures.
9. The method according to any one of claims 1 to 5, wherein confirming that the gravitational acceleration information meets the preset condition comprises: collecting a first preset number of consecutive gravitational acceleration components in the Y-axis direction, and determining an interval of the gravitational acceleration components; and if the values of a second preset number of consecutive gravitational acceleration components in the Y-axis direction following the first preset number of gravitational acceleration components are all outside the interval, confirming that the gravitational acceleration information meets the preset condition.
10. The method according to any one of claims 1 to 5, wherein confirming the hand-raising recognition result based on the target posture classification model comprises: obtaining a judgment criterion value produced while the target posture classification model classifies the current posture; and if the judgment criterion value does not meet a preset threshold condition, classifying the current posture as an unknown posture.
11. The method according to any one of claims 1 to 5, wherein confirming the hand-raising recognition result based on the target posture classification model comprises: if the number of times the current posture is determined to belong to a hand-raising posture category reaches a preset number, confirming that the current posture is a hand-raising operation; or if the number of times the current posture is determined to belong to the hand-raising posture category reaches the preset number, determining whether the gravitational acceleration information corresponding to the current posture falls within a preset range interval corresponding to the hand-raising posture category, and if so, confirming that the current posture is a hand-raising operation.
12. An electronic device comprising a memory storing one or more programs and one or more processors, the one or more processors being electrically coupled to the memory and configured to execute the one or more programs to perform the method as claimed in any one of claims 1 to 11.
13. A computer program product, wherein the computer program product includes a computer-readable storage medium storing a computer program operable to cause a computer to perform the method as described in any one of claims 1 to 11.