Smart watch interaction method based on gesture actions

By using a smartwatch interaction method based on facial images and gestures, and utilizing eye and facial features to determine the direction of gaze, combined with preset judgment rules and a mapping database, the problem of complex smartwatch interaction commands and misrecognition is solved, achieving an efficient and secure user interaction experience.

CN121979397B · Active · Publication Date: 2026-06-19 · CHONGQING ZHOUHAI INTELLIGENT TECH CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: CHONGQING ZHOUHAI INTELLIGENT TECH CO LTD
Filing Date: 2026-04-02
Publication Date: 2026-06-19

Application Information

Patent Timeline

02 Apr 2026

Application

19 Jun 2026

Publication

CN121979397B

IPC: G06F3/01; G06F16/242; G06V10/82; G06F9/4401; G06V40/16; G06V40/18


Similar Technology Patents

Smart phone control method based on eye fixation and facial micro-action
CN116684526A
A network learning auxiliary method and system
CN111178189B
Intelligent sight interaction system for emotional communication
CN120276595A
Eye movement tracking method based on coupling cascade regression
CN114973389A
Sight interaction method and device based on single target
CN114779925A


AI Technical Summary

⚠Technical Problem

Existing smartwatch interaction methods suffer from complex interaction commands, are easily confused with everyday behaviors leading to misidentification, and are difficult to operate on small-screen devices, resulting in a poor user experience.

⚗Method used

By collecting users' facial images and gestures, and using eye and facial features to determine the user's gaze direction and operational intentions, combined with preset judgment rules and a mapping relationship database, the association and structured integration of gaze and gestures are realized to form operation gestures, thereby improving the accuracy and convenience of interaction.

🎯Benefits of technology

It improves the accuracy and convenience of smartwatch operation, reduces the error rate, conforms to users' natural interaction habits, adapts to individual differences among different users, and ensures the stability and security of interaction.

✦ Generated by Eureka AI based on patent content.

Smart Images

Figure CN121979397B_ABST

Patent Text Reader

Abstract

This invention relates to the field of smartwatches, specifically disclosing a gesture-based smartwatch interaction method. The method involves cropping an eye image from a facial image and separating the pupil region based on the grayscale distribution of the eye image; calculating the pupil center coordinates; constructing an eye coordinate system with the midpoint of the line connecting the inner corners of the eyes as the origin, the X-axis pointing from the inner corner to the outer corner, and the Y-axis pointing from the midpoint of the lower eyelid to the midpoint of the upper eyelid; mapping the pupil center coordinates to the eye coordinate system and calculating the pupil's offset relative to the origin; determining the gaze direction based on the offset and an offset threshold; determining the operation direction of the gesture based on the gaze direction; and associating and structurally integrating the operation direction and the gesture according to a differentiated mapping rule to form an operation gesture containing the operation type and direction; and obtaining the corresponding operation command based on a database of operation gesture and interaction command mapping relationships. This approach improves the user interaction experience.


Description

Technical Field

[0001] This invention relates to the field of electronic smartwatch technology, and in particular to a smartwatch interaction method based on gestures.

Background Technology

[0002] With the rapid development of wearable device technology, smartwatches and other touchscreen wearable devices have become an important medium for human-computer interaction due to their portability. However, existing touchscreen wearable devices are limited by screen size, which can easily cause fingers to obscure the displayed content during touch operation, and they also require extremely high precision in touch positioning, making operation difficult.

[0003] In response, Chinese patent application CN105094675A offers an optimized solution. It first acquires the user's hand gestures on an operating plane of the smartwatch (an enlarged plane that is not on the same horizontal plane as the smartwatch's display plane), then acquires and responds to the corresponding machine operation commands. Because the operating plane is much larger than the touch interface of the touchscreen wearable device, interacting on it avoids the hand occlusion caused by the small screen and improves the accuracy of finger movements. However, the operating plane and the screen face the same direction relative to the user; that is, the user's gaze needs to pass through the operating plane to see the content on the screen, or the operating plane can be projected onto a wall using projection technology. This operating method can therefore easily obstruct the user's line of sight, or require the user to hold one hand steady in front of the body while the other hand reaches across the operating plane to operate, resulting in a poor user experience.

[0004] In this regard, to enhance the user's interactive experience when operating a smartwatch, Chinese patent application CN115509358A discloses a gesture interaction method. This method involves recognizing a user's hand, which is wearing a wearable device, making a first finger movement to play a virtual musical instrument. Based on the first finger movement, the method plays the sound corresponding to the first playing unit of the virtual instrument, displays the first effect of the first playing unit being played, and identifies and prompts the user's playing errors by comparing the user's playing pitch and rhythm with standard pitch and rhythm. This achieves interactive performance with the virtual musical instrument, simulating a real music playing scenario and enhancing the user's interactive experience.

[0005] The applicant intends to apply the aforementioned technology to the daily interaction of smartwatches. However, after in-depth research, it was found that due to the complexity of the overall operation logic of smartwatches, which typically requires diverse commands such as launching applications, switching functions, adjusting parameters, and selecting modes, the currently common gesture interaction methods suffer from drawbacks such as complex interaction commands and easy confusion with daily actions, leading to misrecognition. Based on the above technical solutions, currently only finger and wrist movements related to virtual instrument playing or other single commands can be recognized. In actual use, if the above technology is still used for gesture control, the interaction effect will be poor.

[0006] Therefore, there is an urgent need for a smartwatch interaction method that can improve the user experience.

Summary of the Invention

[0007] This invention provides a smartwatch interaction method based on gestures, which can improve the user interaction experience.

[0008] To solve the above-mentioned technical problems, this application provides the following technical solution:

[0009] The gesture-based smartwatch interaction method includes the following steps:

[0010] Step 1: Keep the screen off and in standby mode. After receiving the user's preset wake-up gesture, the smartwatch enters the preset wake-up mode; or enters the preset wake-up mode after the preset conditions are met; after entering the preset wake-up mode, it detects its own status to obtain the status mode information.

[0011] Step 2: After clarifying the state mode information, collect the user's input gestures, and then determine whether the gestures match the wake-up gestures according to preset judgment rules:

[0012] If a match is found, proceed to step 4;

[0013] If there is no match, the user's facial image is captured, and eye features and expression features are extracted from the facial image. It is then determined whether the eye features meet the preset gaze conditions and whether the expression features meet the preset non-misoperation conditions: if yes, proceed to step 3; if no, return to step 1.

[0014] Step 3: Determine the user's gaze direction based on the facial image: crop the eye image from the facial image, and separate the pupil region based on the grayscale distribution of the eye image; calculate the pupil center coordinates $(x_p, y_p)$; identify the inner corner of the eye, the outer corner of the eye, the midpoint of the upper eyelid, and the midpoint of the lower eyelid; construct an eye coordinate system with the midpoint of the line connecting the inner corners of both eyes as the origin $(x_0, y_0)$, the direction from the inner corner to the outer corner as the X-axis, and the direction from the midpoint of the lower eyelid to the midpoint of the upper eyelid as the Y-axis; map the pupil center coordinates $(x_p, y_p)$ to the eye coordinate system and calculate the offset $(\Delta x, \Delta y)$ of the pupil relative to the origin; and determine the gaze direction based on the offset $(\Delta x, \Delta y)$ and the offset thresholds $(T_x, T_y)$;

[0015] The direction of the operation gesture is determined based on the direction of the gaze. The operation direction and operation gesture are associated and structurally integrated according to the differential mapping rules to form an operation gesture that includes operation type and operation direction.

[0016] Step 4: Based on the mapping database between operation gestures and interaction commands, obtain the operation commands corresponding to the operation gestures;

[0017] Step 5: Respond to the operation command and execute the corresponding interactive action.
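The branching in steps 1 to 5 can be sketched as a small dispatcher. This is only an illustrative outline, not the patented implementation: the function name `handle_input` and every injected callable (`matches_wake`, `intent_confirmed`, `build_operation_gesture`, `lookup_command`, `execute`) are hypothetical placeholders for the stages the steps describe.

```python
def handle_input(gesture, status_mode, *, matches_wake, intent_confirmed,
                 build_operation_gesture, lookup_command, execute):
    """Route one collected gesture through steps 2 to 5.

    All callables are hypothetical stand-ins supplied by the caller.
    Returns the executed command, or None when the watch returns to standby.
    """
    if matches_wake(status_mode, gesture):
        # Step 2, match branch: skip gaze analysis and go straight to step 4.
        command = lookup_command(status_mode, gesture)
    elif intent_confirmed():
        # Step 2, no match but gaze and expression checks passed: step 3.
        operation_gesture = build_operation_gesture(gesture)
        command = lookup_command(status_mode, operation_gesture)  # step 4
    else:
        # Checks failed: return to step 1 (standby).
        return None
    if command is None:
        return None
    execute(command)  # step 5
    return command
```

Passing the stages in as callables keeps the control flow visible on its own; a real device would presumably wire these to its sensor and camera pipelines.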

[0018] The basic principle and beneficial effects of the solution are as follows: Through progressive processing—cropping the eye area from the facial image, separating the pupil using grayscale distribution, and locating key eye reference points—interference factors such as other facial areas and ambient light are gradually eliminated. This ensures the accuracy of the pupil center coordinates and the positioning of reference points such as the inner and outer corners of the eyes, avoiding coordinate shifts caused by blurred boundaries between the pupil and the sclera. A coordinate system is constructed using the midpoint of the line connecting the inner corners of both eyes as the origin, conforming to the physiological structure of the eye. This custom eye coordinate system adapts to individual differences, rather than using a fixed screen coordinate system. It can adapt to differences in eye size and wearing position among different users, making the calculation of pupil offset more closely match the user's actual gaze changes. This avoids misjudgment of direction due to differences in individual physiological characteristics, improving the universality and accuracy of gaze direction determination. Compared to vague probabilistic judgments, the offset threshold transforms pupil offset into a clear gaze direction. The offset threshold can be calibrated according to the user's eye characteristics, reducing interference from slight pupil tremors and unconscious eye movements, ensuring the stability of gaze direction determination.

[0019] This solution establishes differentiated mapping rules that match operation direction to operation type, aligning with user habits and avoiding redundant mapping logic. For example, a swipe gesture combined with gaze direction directly yields "look left + swipe = swipe left," eliminating the need to specify the direction manually. At the same time, the structured integration of operation type and direction forms operation gestures, upgrading interaction commands from single actions to composite commands of action and direction. For instance, "look right + click" corresponds to "click the right side of the smartwatch screen," and "look up + swipe" corresponds to "swipe up to switch interfaces." This makes command semantics clearer and avoids the ambiguity of single gestures, such as the additional intent determination needed when a swipe gesture lacks direction, improving the recognition efficiency and clarity of operation commands. Furthermore, since users typically focus their gaze on a target area and then coordinate with hand movements, associating gaze with gesture matches natural human interaction habits. Users do not need to learn complex gesture combinations; precise interaction is achieved solely through gaze aiming and basic gestures, which reduces cognitive load and the operational threshold compared with interactions that require memorizing multiple complex gestures. Because smartwatches have small screens, finger operation easily obstructs the display and precise clicks are difficult; this solution eliminates the need for direct finger contact with the screen, completely avoiding obstruction. In addition, the dual confirmation mechanism of gaze and gesture lowers the requirement for touch position accuracy and makes small-screen devices easier to operate, which is especially suitable for quick interaction scenarios on small-screen devices such as smartwatches.

[0020] Meanwhile, by detecting the smartwatch's own state, a unique wake-up gesture is matched for different states. This strongly binds the operation gesture to the smartwatch's current functional requirements, enabling dynamic interaction where the state determines the gesture and the gesture adapts to the function. This avoids ambiguity of the same gesture in different scenarios and improves the accuracy of command recognition. For high-frequency scenarios, dedicated wake-up gestures are preset. Users do not need to memorize complex gesture combinations; they only need to perform a simple gesture corresponding to the device state to trigger the command. This simplifies the interaction chain, conforms to the user's natural association habits between scenarios and actions, lowers the operation threshold, and is especially suitable for the quick interaction needs of small-screen smartwatches. Pre-set judgment rules determine whether the operation gesture matches the wake-up gesture corresponding to the device state, directly filtering gestures unrelated to the current state, avoiding irrelevant actions from accidentally triggering commands, reducing unnecessary interaction interference, and improving operational reliability.

[0021] For cases where the wake-up gesture does not match, facial images are captured to extract eye and facial expression features for a dual determination of whether it is a mis-launch: eye features meet preset gaze conditions to ensure the user has a clear interaction intention, and facial expression features meet preset non-misoperation conditions to exclude the possibility of accidental touch. This effectively eliminates accidental operations caused by unconscious gestures in daily activities, especially avoiding operational risks in critical scenarios such as accidental call connections and accidental application launches, ensuring interaction security. When the operation gesture matches the wake-up gesture, the operation command is executed directly without additional verification steps, enabling rapid interaction in high-frequency scenarios and meeting users' needs for quick operations. When the operation gesture does not match the wake-up gesture, facial feature verification confirms the user's intention, providing users with the flexibility to trigger interaction even with non-preset wake-up gestures, while strict verification avoids accidental operations, achieving a balance between convenience and rigor.

[0022] In summary, this solution optimizes the interaction process across the entire chain, from data collection and logical mapping to scene adaptation, through gaze recognition and structured gesture-gaze association. It reduces environmental dependence, improves operational accuracy, and has a feedback mechanism, thereby enhancing the user interaction experience.

[0023] Further, in step 3, the eye image is cropped from the facial image, and the pupil region is separated based on the grayscale distribution of the eye image. This includes: converting the acquired facial image to grayscale to obtain a grayscale image; performing noise reduction processing on the grayscale image using a Gaussian filtering algorithm; detecting key eye points in the grayscale image based on the MTCNN neural network, wherein the key eye points include the inner corner of the eye, the outer corner of the eye, the upper eyelid edge point, and the lower eyelid edge point; determining the boundary coordinates of the eye region based on the key eye points; cropping the eye image according to the boundary coordinates; and calculating the grayscale histogram of the eye image. A bimodal threshold is set for the grayscale histogram, where the bimodal threshold is the grayscale value corresponding to the valley between the two peaks in the grayscale histogram. An adaptive threshold segmentation algorithm is used, with the bimodal threshold as a benchmark, and the segmentation threshold is adjusted in combination with the grayscale mean of the eye image. Regions in the eye image with grayscale values lower than the adjusted segmentation threshold are identified as pupil candidate regions. Morphological closing operations are performed on the pupil candidate regions to fill the tiny holes in the candidate regions. Then, the contours of the candidate regions are extracted using a contour detection algorithm, and the regions corresponding to the contours with a roundness greater than a preset roundness threshold are selected as the pupil regions.
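The thresholding and roundness steps above can be illustrated with a minimal pure-Python sketch. It assumes the histogram has already been smoothed so that its two tallest local maxima are the pupil and background peaks. The blend weight `alpha` is a hypothetical stand-in for the adjustment by the grayscale mean, whose exact form the text does not give, and roundness is taken as the common measure 4πA/P², which is also an assumption.

```python
import math

def bimodal_valley(hist):
    """Gray level at the lowest point between the two tallest histogram peaks."""
    peaks = [i for i in range(1, len(hist) - 1)
             if hist[i] > hist[i - 1] and hist[i] >= hist[i + 1]]
    if len(peaks) < 2:
        raise ValueError("histogram is not bimodal")
    tallest = sorted(peaks, key=lambda i: hist[i], reverse=True)[:2]
    lo, hi = sorted(tallest)
    return min(range(lo, hi + 1), key=lambda i: hist[i])

def adjusted_threshold(valley, mean_gray, alpha=0.2):
    """Blend the valley with the mean gray level (alpha is a hypothetical weight)."""
    return (1.0 - alpha) * valley + alpha * mean_gray

def pupil_candidates(gray, threshold):
    """Binary mask: True where a pixel is darker than the threshold."""
    return [[value < threshold for value in row] for row in gray]

def roundness(area, perimeter):
    """4*pi*A / P**2: 1.0 for a perfect circle, smaller for irregular contours."""
    return 4.0 * math.pi * area / (perimeter ** 2)
```

In a full pipeline the mask would go through morphological closing and contour extraction (for example with an image-processing library) before `roundness` is compared against the preset roundness threshold.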

[0024] The beneficial effects are as follows: First, Gaussian filtering is used to denoise the grayscale facial image, effectively removing ambient light interference and image sensor noise, thus preventing noise from affecting the accuracy of eye key point detection. Then, the MTCNN neural network is used to accurately locate eye key points, ensuring the accuracy of eye region cropping and eliminating interference from other facial areas. By adjusting the segmentation threshold using a bimodal threshold based on the grayscale histogram and the local grayscale mean, the system can adapt to different users' eye characteristics and ambient light variations, ensuring accurate differentiation of the pupil from the sclera, eyelids, and other areas even in complex scenes, avoiding missed or false detections of the pupil region due to fixed thresholds. Morphological closing operations are used to fill in tiny holes within the pupil candidate region, eliminating image imperfections after segmentation. Finally, roundness filtering is used to remove irregular areas other than the pupil, improving the purity and integrity of the pupil region and ensuring the accuracy of subsequent pupil center coordinate calculations.

[0025] Furthermore, in step 3, calculating the pupil center coordinates $(x_p, y_p)$ and identifying the inner corner of the eye, the outer corner of the eye, the midpoint of the upper eyelid, and the midpoint of the lower eyelid includes: calculating the center coordinates of the pupil area using the centroid method, with the following formula:

[0026] $$x_p = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} x_{ij}\, g_{ij}}{\sum_{i=1}^{m}\sum_{j=1}^{n} g_{ij}}, \qquad y_p = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} y_{ij}\, g_{ij}}{\sum_{i=1}^{m}\sum_{j=1}^{n} g_{ij}}$$

[0027] where $(x_{ij}, y_{ij})$ are the coordinates of the pixels within the pupil area, $g_{ij}$ is the grayscale value of the pixel, and $m$ and $n$ are the number of rows and columns of the pupil region, respectively. Keypoint detection is performed on the cropped eye image, and the coordinates of these keypoints are output. These keypoints include the inner corner feature points, outer corner feature points, upper eyelid edge points, and lower eyelid edge points. The average coordinates of the detected inner and outer corner feature points are calculated to obtain the inner corner coordinates of both eyes, $(x_{in}^{L}, y_{in}^{L})$ and $(x_{in}^{R}, y_{in}^{R})$, and the outer corner coordinates of both eyes, $(x_{out}^{L}, y_{out}^{L})$ and $(x_{out}^{R}, y_{out}^{R})$. Fitting the points along the upper eyelid edge yields the upper eyelid contour curve, and the coordinates of its midpoint are taken as the midpoint of the upper eyelid, $(x_u, y_u)$. Fitting the points along the lower eyelid edge yields the lower eyelid contour curve, and the coordinates of its midpoint are taken as the midpoint of the lower eyelid, $(x_d, y_d)$.
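A grayscale-weighted centroid of this kind can be sketched as follows. The sketch assumes the pupil region is passed as an m-by-n grid of grayscale values in which pixels outside the pupil have been set to 0, with the column index used as x and the row index as y. The function name `weighted_centroid` is illustrative.

```python
def weighted_centroid(region):
    """Grayscale-weighted centroid of an m-by-n grid of gray values.

    Assumes non-pupil pixels are 0 so they do not contribute.
    Returns (x_p, y_p) in column/row pixel units.
    """
    total = sum(g for row in region for g in row)
    if total == 0:
        raise ValueError("region has no weighted pixels")
    x_p = sum(j * g for row in region for j, g in enumerate(row)) / total
    y_p = sum(i * g for i, row in enumerate(region) for g in row) / total
    return x_p, y_p
```

For example, `weighted_centroid([[1, 3]])` weights the right-hand pixel three times as heavily as the left-hand one, so the centroid lands at x = 0.75 rather than at the geometric center 0.5.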

[0028] The beneficial effects are as follows: Using the centroid method based on pixel grayscale values to calculate the pupil center coordinates, compared to the geometric center method, fully considers the differences in grayscale distribution within the pupil area, avoiding center offset caused by irregular pupil edges and uneven lighting, thus ensuring the accuracy of the pupil center coordinates. Detecting key eye points can accurately capture subtle features of the inner and outer corners of the eyes and eyelid edges, avoiding missed or false detections. By using curve fitting to the eyelid edge points and then taking the midpoint, rather than directly taking a single feature point, it can adapt to different users' eyelid contour shapes, such as single eyelids, double eyelids, and differences in eyelid thickness, avoiding midpoint positioning errors caused by deviations in a single feature point, ensuring that the coordinates of the upper and lower eyelid midpoints truly reflect the geometric center of the eye contour.

[0029] Furthermore, in step 3, mapping the pupil center coordinates $(x_p, y_p)$ to the eye coordinate system and calculating the offset $(\Delta x, \Delta y)$ of the pupil relative to the origin includes: based on the inner canthus coordinates of both eyes, $(x_{in}^{L}, y_{in}^{L})$ and $(x_{in}^{R}, y_{in}^{R})$, calculating the origin $(x_0, y_0)$ of the eye coordinate system, with the following formula:

[0030] $$x_0 = \frac{x_{in}^{L} + x_{in}^{R}}{2}, \qquad y_0 = \frac{y_{in}^{L} + y_{in}^{R}}{2}$$

[0031] Based on the pupil center coordinates $(x_p, y_p)$, the lateral offset $\Delta x$ and the vertical offset $\Delta y$ of the pupil relative to the origin are calculated, with the following formula:

[0032] $$\Delta x = x_p - x_0, \qquad \Delta y = y_p - y_0$$

[0033] The calculated $\Delta x$ and $\Delta y$ are smoothed using a sliding window filtering algorithm, and the average of the offsets over the first preset number of consecutive frames is taken as the final offset result.
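The origin, offset, and smoothing steps can be sketched in a few lines. The sketch assumes that all coordinates are already expressed in the eye coordinate system's orientation (Y pointing from the lower eyelid toward the upper eyelid). The class name `OffsetSmoother` and the default window of 5 frames are hypothetical.

```python
from collections import deque

def eye_origin(inner_left, inner_right):
    """Midpoint of the line connecting the two inner eye corners."""
    return ((inner_left[0] + inner_right[0]) / 2,
            (inner_left[1] + inner_right[1]) / 2)

def pupil_offset(pupil, origin):
    """Direct coordinate difference between pupil center and origin."""
    return (pupil[0] - origin[0], pupil[1] - origin[1])

class OffsetSmoother:
    """Sliding-window mean over the most recent `window` frame offsets."""

    def __init__(self, window=5):  # window size is a hypothetical default
        self.frames = deque(maxlen=window)

    def update(self, offset):
        self.frames.append(offset)
        count = len(self.frames)
        mean_dx = sum(dx for dx, _ in self.frames) / count
        mean_dy = sum(dy for _, dy in self.frames) / count
        return (mean_dx, mean_dy)
```

Using `deque(maxlen=...)` drops the oldest frame automatically, so each update costs time proportional to the window size only.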

[0034] The beneficial effects are as follows: By calculating the lateral and vertical offsets through direct coordinate difference, the positional change of the pupil relative to the center of the eye is quantified, providing a clear numerical basis for determining the direction of gaze. Compared with vague qualitative judgments, this makes the determination of gaze direction more objective and accurate. Applying sliding window filtering to the offset data of consecutive frames effectively suppresses fluctuations in the offset caused by factors such as slight pupil jitter and instantaneous changes in illumination, outputting a smooth and stable final offset result and avoiding the influence of accidental deviations in single-frame data on the determination of gaze direction.

[0035] Furthermore, in step 3, determining the gaze direction based on the offset $(\Delta x, \Delta y)$ and the offset thresholds $(T_x, T_y)$ includes: calculating the horizontal distance $W$ between the inner and outer corners of both eyes:

[0036] $$W = \left| x_{out} - x_{in} \right|$$

[0037] Calculating the vertical distance $H$ between the midpoint of the upper eyelid and the midpoint of the lower eyelid:

[0038] $$H = \left| y_u - y_d \right|$$

[0039] With the horizontal spacing $W$ and the vertical spacing $H$ as the baseline, the offset thresholds are set adaptively as $T_x = k_x W$ and $T_y = k_y H$, where $k_x$ and $k_y$ are coefficients. The gaze direction is then determined based on the offset $(\Delta x, \Delta y)$:

[0040] If $\Delta x < -T_x$ is satisfied for the second preset number of consecutive frames, the gaze is determined to be to the left;

[0041] If $\Delta x > T_x$ is satisfied for the second preset number of consecutive frames, the gaze is determined to be to the right;

[0042] If $\Delta y < -T_y$ is satisfied for the second preset number of consecutive frames, the gaze is determined to be downward;

[0043] If $\Delta y > T_y$ is satisfied for the second preset number of consecutive frames, the gaze is determined to be upward;

[0044] If $|\Delta x| \le T_x$ and $|\Delta y| \le T_y$ are satisfied for the second preset number of consecutive frames, the gaze is determined to be facing the screen of the smartwatch;

[0045] If a change in gaze direction is detected, the second consecutive preset number of frames is restarted, and the gaze direction determination result is updated after the second consecutive preset number of frames is reached.
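The threshold comparison and the consecutive-frame rule can be sketched as below. Several points are assumptions rather than statements from the text: negative horizontal offset is treated as "left" and negative vertical offset as "down"; when both axes exceed their thresholds, the axis with the larger offset relative to its threshold wins; and the coefficient values and frame count are placeholders.

```python
def adaptive_thresholds(width, height, k_x=0.15, k_y=0.2):
    """Thresholds proportional to eye width and height (coefficients are placeholders)."""
    return (k_x * width, k_y * height)

def classify(dx, dy, t_x, t_y):
    """Map one smoothed offset to a direction label (sign convention assumed)."""
    if t_x <= 0 or t_y <= 0:
        raise ValueError("thresholds must be positive")
    if abs(dx) <= t_x and abs(dy) <= t_y:
        return "screen"
    # Tie-break (assumption): pick the axis that exceeds its threshold by more.
    if abs(dx) * t_y >= abs(dy) * t_x:
        return "left" if dx < 0 else "right"
    return "down" if dy < 0 else "up"

class GazeDebouncer:
    """Report a direction only after it holds for `required` consecutive frames."""

    def __init__(self, required=3):
        self.required = required
        self.candidate = None
        self.count = 0
        self.stable = None

    def update(self, direction):
        if direction != self.candidate:
            # Direction changed: restart the consecutive-frame count.
            self.candidate = direction
            self.count = 0
        self.count += 1
        if self.count >= self.required:
            self.stable = direction
        return self.stable
```

Until a new direction has persisted for the required number of frames, `update` keeps returning the previous stable result, which mirrors the restart rule in paragraph [0045].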

[0046] The beneficial effects are as follows: Dynamically setting the offset thresholds based on the horizontal spacing and vertical spacing of the eye adapts to the different eye physiological characteristics of various users, avoiding the insufficient adaptability of fixed thresholds. Simultaneously, the offset threshold is strongly correlated with eye size, reducing the impact of ambient light and slight wear position shifts on the judgment results, improving adaptability in complex scenarios. The requirement that the judgment condition be met for the second preset number of consecutive frames before outputting the gaze direction effectively filters out offset fluctuations caused by accidental factors such as slight pupil tremors, blinking, and unconscious eye movements, avoiding misjudgments of gaze direction due to accidental deviations in a single frame and ensuring the stability and reliability of the judgment results. By clearly defining the gaze direction corresponding to different offset ranges and the re-accumulation rules when switching directions, a unified execution standard for gaze direction judgment is established, avoiding inconsistencies caused by ambiguous judgment logic. This also improves the efficiency and accuracy of gaze direction judgment, ensuring the precise generation of operation gestures and enhancing the overall smoothness of the interaction.

[0047] Furthermore, in step 3, the operation direction of the gesture is determined based on the gaze direction. The operation direction and gesture are then associated and structurally integrated according to differentiated mapping rules to form an operation gesture containing both operation type and operation direction. This includes: a preset differentiated mapping rule library, categorized by operation type, with each operation type corresponding to a specific mapping logic; extraction of the operation gesture's type identifier and action parameters, including duration and amplitude; matching the corresponding differentiated mapping rule from the mapping rule library based on the type identifier, and determining the operation direction in conjunction with the gaze direction; constructing a structured data model of the operation gesture, including three core fields: operation type, operation direction, and action parameters; and filling the corresponding fields of the structured data model with the operation gesture's type identifier, operation direction, and action parameters to generate the operation gesture.
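One way to picture the rule library and the structured data model is the following sketch. The three fields follow the text (operation type, operation direction, action parameters), while the concrete rule functions, the label "center", and all names are hypothetical illustrations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OperationGesture:
    """Structured model: operation type, operation direction, action parameters."""
    operation_type: str
    operation_direction: str
    duration_ms: int
    amplitude: float

def swipe_rule(gaze):
    # "look left + swipe = swipe left": the gaze supplies the swipe direction.
    if gaze == "screen":
        raise ValueError("a swipe needs a directional gaze")
    return gaze

def click_rule(gaze):
    # "look right + click" targets the right side; facing the screen is
    # mapped to the center here (hypothetical choice).
    return "center" if gaze == "screen" else gaze

# Rule library keyed by operation type; new types can be added without
# touching existing rules.
MAPPING_RULES = {"swipe": swipe_rule, "click": click_rule}

def build_operation_gesture(type_id, gaze, duration_ms, amplitude):
    rule = MAPPING_RULES[type_id]
    return OperationGesture(type_id, rule(gaze), duration_ms, amplitude)
```

Because the dataclass is frozen, instances are hashable and can serve directly as keys in a later lookup stage.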

[0048] The beneficial effects are as follows: Dedicated mapping logic is designed for different operation types, avoiding ambiguity caused by a one-size-fits-all approach; deep adaptation to interaction scenarios ensures that the operation direction of each gesture conforms to user habits, reducing the error rate. By constructing a structured model containing "operation type + operation direction + action parameters," the data format of operation gestures is unified and semantically clear, avoiding ambiguity in instruction parsing caused by fragmented data; core information in structured data can be quickly read, efficiently matching corresponding interaction instructions, shortening the link from gesture recognition to instruction execution, and improving the smoothness of interaction. The preset mapping rule library supports on-demand expansion; corresponding mapping rules can be added according to new operation types or new interaction scenarios without refactoring the overall logic, improving the adaptability of subsequent function upgrades.

[0049] Furthermore, in step 4, based on the mapping relationship database between operation gestures and interaction commands, the operation commands corresponding to the operation gestures are obtained, including: constructing an interaction command mapping relationship database, wherein the database contains a basic gesture subset and a precise gesture subset; the basic gesture subset stores the correspondence between operation gestures and wake-up gestures, with wake-up mode as the category index; the precise gesture subset stores the correspondence between operation gestures and operation commands, with operation type, operation direction and action parameters as the joint index;

[0050] If the wake-up mode is triggered, extract the smartwatch's status mode and the operation type of the gesture. Use the status mode + operation type as the search key to query the basic gesture subset and obtain the corresponding basic operation instructions.

[0051] If a gesture is triggered, extract the gesture type, gesture direction, and action parameters from the gesture structured data model, and use these three as the joint search key to query the precise gesture subset and obtain the corresponding precise operation command.
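The two-subset lookup can be sketched with plain dictionaries. All entries, the status-mode and gesture names, and the bucketing of duration into "short" and "long" (so that a continuous action parameter can take part in a joint key) are hypothetical.

```python
# Basic gesture subset: (status mode, operation type) -> basic command.
BASIC_SUBSET = {
    ("incoming_call", "wrist_flip"): "answer_call",
    ("music_playing", "double_tap"): "pause_music",
}

# Precise gesture subset: (operation type, direction, parameter bucket) -> command.
PRECISE_SUBSET = {
    ("swipe", "left", "short"): "previous_page",
    ("swipe", "up", "short"): "switch_interface",
    ("click", "right", "short"): "click_right_side",
    ("click", "right", "long"): "open_context_menu",
}

def duration_bucket(duration_ms, long_press_ms=500):
    """Discretize duration so it can be part of a joint key (assumed scheme)."""
    return "long" if duration_ms >= long_press_ms else "short"

def lookup_basic(status_mode, operation_type):
    return BASIC_SUBSET.get((status_mode, operation_type))

def lookup_precise(operation_type, direction, duration_ms):
    key = (operation_type, direction, duration_bucket(duration_ms))
    return PRECISE_SUBSET.get(key)
```

Tuple keys give a direct hash lookup in either subset, which matches the stated goal of avoiding a full database traversal.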

[0052] The beneficial effects are as follows: Dividing the mapping relationship into a basic gesture subset and a precise gesture subset, corresponding to the two scenarios of rapid response and precise interaction respectively, avoids command confusion caused by a single database. Simultaneously, through the refined search key design of "status + gesture" or "operation type + operation direction + action parameters," it ensures a high degree of alignment between command matching and user intent, avoiding command mismatches caused by single-dimensional mapping. Based on clear search keys and categorized query logic, the corresponding operation command can be quickly located, avoiding response delays caused by full database traversal and improving the interactive experience.

[0053] Furthermore, in step 2, determining whether the operation gesture matches the wake-up gesture according to preset judgment rules includes: constructing a mapping table between operation gestures and wake-up gestures, determining the wake-up gestures and gesture feature parameters corresponding to different state modes; querying the wake-up gestures and feature parameter thresholds corresponding to the state modes from the mapping table; collecting real-time feature parameters of the operation gestures, and comparing the real-time feature parameters with the queried preset feature parameter thresholds: if all real-time feature parameters are within the preset threshold range, the operation gesture is determined to match the wake-up gesture; if any real-time feature parameter exceeds the preset threshold range, it is determined to be a mismatch.
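The all-parameters-within-range rule can be sketched as follows. The table contents, feature names, and numeric ranges are hypothetical; only the matching logic (every real-time parameter must fall inside its preset range) follows the text.

```python
# Mapping table: status mode -> expected wake-up gesture and parameter ranges.
WAKE_TABLE = {
    "incoming_call": {
        "gesture": "wrist_flip",
        "ranges": {"angle_deg": (60.0, 120.0), "duration_ms": (100.0, 800.0)},
    },
    "music_playing": {
        "gesture": "double_tap",
        "ranges": {"interval_ms": (80.0, 400.0)},
    },
}

def matches_wake_gesture(status_mode, gesture, features):
    """True only if the gesture is the one preset for this status mode and
    every real-time feature lies within its preset range."""
    entry = WAKE_TABLE.get(status_mode)
    if entry is None or entry["gesture"] != gesture:
        return False
    for name, (low, high) in entry["ranges"].items():
        value = features.get(name)
        if value is None or not (low <= value <= high):
            return False
    return True
```

A missing feature is treated as a mismatch here, which is a conservative design choice rather than something the text specifies.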

[0054] The beneficial effects are as follows: Quantifying the characteristics of wake-up gestures replaces vague qualitative judgments, giving gesture matching a clear quantitative standard, effectively distinguishing between intentional gestures and unconscious everyday gestures, and reducing the false matching rate. A mapping table enables precise adaptation of gesture standards based on state, ensuring scene consistency and avoiding cross-scene gesture confusion. Pre-set matching rules based on meeting all parameters quickly compare real-time parameters with preset thresholds, shortening the time from gesture recognition to process branch transitions, improving interaction smoothness, and especially meeting the needs of rapid response in high-frequency scenarios.

[0055] Furthermore, in step 2, eye features and expression features are extracted from the facial image, and it is determined whether the eye features meet preset fixation conditions and whether the expression features meet preset non-misoperation conditions, including:

[0056] Extract eye features, and calculate pupil fixation duration, eyelid opening and closing degree, and eyeball rotation frequency based on eye key points;

[0057] Extract expression features: use the Dlib facial feature point detection algorithm to extract facial key points, and calculate eyebrow offset, mouth corner upturn angle, and lip closure degree based on the facial key points;

[0058] The preset fixation conditions are: pupil fixation duration ≥ preset fixation duration, eyelid opening and closing degree ≥ preset opening and closing degree, and eyeball rotation frequency ≤ preset rotation frequency.

[0059] The preset non-misoperation conditions are: eyebrow offset ≤ preset offset, mouth corner upturn angle ≥ preset upturn angle, and lip closure degree ≥ preset closure degree.

[0060] If all eye features meet the preset fixation conditions and all facial expression features meet the preset non-misoperation conditions, the conditions are deemed met; if any eye feature parameter does not meet the preset fixation conditions, or any facial expression feature parameter does not meet the preset non-misoperation conditions, the conditions are deemed not met.

[0061] The beneficial effects are as follows: Two core features are extracted simultaneously: eye features and expression features. Eye features reflect the user's gaze intention, while expression features confirm that the operation is deliberate. Joint verification of the two effectively distinguishes operations with clear intent from those triggered by unconscious gestures. Gaze conditions and non-misoperation conditions are expressed as specific numerical thresholds, which avoids misjudgments or omissions caused by subjective judgment and improves the executability and consistency of the verification rules. The strict dual-condition judgment effectively filters out misoperations caused by unconscious gestures in daily activities.

Description of the Drawings

[0062] Figure 1 is a flowchart of an embodiment of a gesture-based smartwatch interaction method.

Detailed Description

[0063] The following detailed description illustrates the specific implementation method:

[0064] This invention provides a smartwatch interaction method based on gestures. As shown in Figure 1, the specific implementation process is as follows:

[0065] Step 1: While in standby mode with the screen off, the smartwatch enters a preset wake-up mode upon receiving a preset wake-up gesture from the user; or it enters a preset wake-up mode after meeting preset conditions. Once in preset wake-up mode, the smartwatch detects its own status to obtain status mode information. For example, if the user makes a preset wake-up gesture of rotating their wrist twice while the screen is off, the smartwatch, through its inertial measurement unit, detects that the wrist rotation angle is ≥180° and the action threshold is met twice consecutively, and then enters the preset wake-up mode (such as the quick function preview mode). If no wake-up gesture is detected, but a device search command is received from the phone (meeting preset conditions), it will also automatically enter wake-up mode and vibrate to alert the user. After entering wake-up mode, the smartwatch quickly detects its own status, such as a 35% battery level, normal Bluetooth connection to the phone, and no missed calls or notifications, ultimately outputting a status mode message of low battery + connected + no pending tasks, providing a basis for subsequent interactions.

[0066] Step 2: After clarifying the status mode information, the device collects the user's hand gestures. For example, after the smartwatch clarifies the status mode information as "fully charged + Bluetooth connected + no pending notifications," it automatically activates the gesture collection function. Through the collaborative work of an electromyography (EMG) sensor and an inertial measurement unit (IMU), the device captures the user's hand movements in real time. If the user makes a gesture of quickly tapping three times with their index finger, the device detects that the characteristic waveform of the surface EMG signal matches the tapping gesture, and the wrist displacement distance is ≤0.5cm with no obvious rotation, thus determining it as a valid gesture. If the user makes a left-right swiping motion with their palm, the device, through the IMU, captures a horizontal wrist displacement ≥2cm and a rotation angle ≤10°, and combined with the EMG signal to confirm no additional finger movement, accurately collecting the swiping gesture.
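As a non-limiting illustration, the two example gestures above can be distinguished with a few threshold checks. The following Python sketch assumes that an upstream component has already reduced the EMG and IMU signals to the fields of a hypothetical `MotionSample` record; the field names, the gesture labels, and the 5° limit standing in for "no obvious rotation" are assumptions, while the 0.5 cm, 2 cm, and 10° values come from the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: "no obvious rotation" is modeled as at most 5 degrees.
NO_ROTATION_DEG = 5.0


@dataclass(frozen=True)
class MotionSample:
    """Hypothetical per-gesture summary of the EMG and IMU signals."""
    wrist_displacement_cm: float  # wrist displacement measured by the IMU
    rotation_deg: float           # wrist rotation angle measured by the IMU
    emg_tap_count: int            # taps recognized in the EMG waveform
    extra_finger_motion: bool     # EMG indicates additional finger movement


def classify_gesture(sample: MotionSample) -> Optional[str]:
    """Return 'triple_tap', 'swipe', or None for an unrecognized motion."""
    if (sample.emg_tap_count == 3
            and sample.wrist_displacement_cm <= 0.5
            and sample.rotation_deg <= NO_ROTATION_DEG):
        return "triple_tap"
    if (sample.wrist_displacement_cm >= 2.0
            and sample.rotation_deg <= 10.0
            and not sample.extra_finger_motion):
        return "swipe"
    return None
```

A motion that satisfies neither rule returns `None`, which corresponds to not collecting a valid gesture.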

[0067] Then, based on preset judgment rules, it is determined whether the operation gesture matches the wake-up gesture, including: constructing a mapping table between operation gestures and wake-up gestures, determining the wake-up gestures and gesture feature parameters corresponding to different state modes; querying the wake-up gestures and feature parameter thresholds corresponding to the state modes from the mapping table; collecting real-time feature parameters of the operation gestures, and comparing the real-time feature parameters with the queried preset feature parameter thresholds: if all real-time feature parameters are within the preset threshold range, the operation gesture is determined to match the wake-up gesture; if any real-time feature parameter exceeds the preset threshold range, it is determined to be a mismatch. For example, first construct a mapping table: the wake-up gesture corresponding to the state of sufficient battery + no to-dos is to rotate the wrist twice, with feature parameters of rotation angle ≥180° and interval ≤1s; the wake-up gesture corresponding to the state of low battery + notification is to raise and hold, with parameters of raising angle ≥45° and duration ≥1.5s. After the smartwatch confirms that the state is sufficient battery + no to-dos, it queries the table for the corresponding wake-up gesture and parameter thresholds. When a user makes a wrist-flipping motion, the device collects real-time parameters. If the rotation angle is 195° and the interval between two flips is 0.8s, both are within the preset threshold, and a match is determined. If the user flips the wrist at an angle of 150° (below the threshold), a mismatch is determined.
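As a non-limiting illustration, the table lookup and the all-parameters-within-threshold rule can be sketched as follows. The table contents reproduce the two example entries above; the dictionary layout, the parameter names, and the function name `matches_wake_gesture` are assumptions made for this sketch.

```python
from typing import Dict, Optional, Tuple

# (lower bound, upper bound); None means the side is unbounded.
Bounds = Tuple[Optional[float], Optional[float]]

# State mode -> (wake-up gesture name, feature parameter thresholds).
WAKE_TABLE: Dict[str, Tuple[str, Dict[str, Bounds]]] = {
    "sufficient battery + no to-dos": (
        "rotate wrist twice",
        {"rotation_deg": (180.0, None), "interval_s": (None, 1.0)},
    ),
    "low battery + notification": (
        "raise and hold",
        {"raise_deg": (45.0, None), "duration_s": (1.5, None)},
    ),
}


def matches_wake_gesture(state_mode: str, params: Dict[str, float]) -> bool:
    """True only if every real-time parameter lies within its threshold range."""
    entry = WAKE_TABLE.get(state_mode)
    if entry is None:
        return False
    _, thresholds = entry
    for name, (lower, upper) in thresholds.items():
        value = params.get(name)
        if value is None:
            return False  # a missing measurement is treated as a mismatch
        if lower is not None and value < lower:
            return False
        if upper is not None and value > upper:
            return False
    return True
```

With the example values, a 195° rotation with a 0.8 s interval matches, while a 150° rotation does not.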

[0068] If a match is found, proceed to step 4;

[0069] If there is no match, the user's facial image is collected, and eye features and expression features are extracted from the facial image. It is determined whether the eye features meet the preset gaze conditions and whether the expression features meet the preset non-misoperation conditions: extract eye features, and calculate pupil gaze duration, eyelid opening and closing degree and eyeball rotation frequency based on key eye points.

[0070] Extract expression features: use the Dlib facial feature point detection algorithm to extract facial key points, and calculate eyebrow offset, mouth corner upturn angle, and lip closure degree based on the facial key points;

[0071] The preset fixation conditions are: pupil fixation duration ≥ preset fixation duration, eyelid opening and closing degree ≥ preset opening and closing degree, and eyeball rotation frequency ≤ preset rotation frequency.

[0072] The preset non-misoperation conditions are: eyebrow offset ≤ preset offset, mouth corner upturn angle ≥ preset upturn angle, and lip closure degree ≥ preset closure degree.

[0073] If all eye features meet the preset fixation conditions and all facial expression features meet the preset non-misoperation conditions, the conditions are deemed met; if any eye feature parameter does not meet the preset fixation conditions, or any facial expression feature parameter does not meet the preset non-misoperation conditions, the conditions are deemed not met.
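As a non-limiting illustration, the joint judgment of the fixation conditions and the non-misoperation conditions can be sketched as below. The record types, field names, and units are assumptions; the specification does not give numerical preset values, so they are passed in through a hypothetical `Presets` record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EyeFeatures:
    fixation_s: float       # pupil fixation duration
    eyelid_opening: float   # eyelid opening and closing degree
    rotation_hz: float      # eyeball rotation frequency


@dataclass(frozen=True)
class ExpressionFeatures:
    eyebrow_offset: float
    mouth_corner_deg: float  # mouth corner upturn angle
    lip_closure: float       # lip closure degree


@dataclass(frozen=True)
class Presets:
    """Hypothetical container for the preset thresholds."""
    min_fixation_s: float
    min_eyelid_opening: float
    max_rotation_hz: float
    max_eyebrow_offset: float
    min_mouth_corner_deg: float
    min_lip_closure: float


def conditions_met(eye: EyeFeatures, expr: ExpressionFeatures, p: Presets) -> bool:
    """All fixation conditions AND all non-misoperation conditions must hold."""
    fixation_ok = (eye.fixation_s >= p.min_fixation_s
                   and eye.eyelid_opening >= p.min_eyelid_opening
                   and eye.rotation_hz <= p.max_rotation_hz)
    non_misoperation_ok = (expr.eyebrow_offset <= p.max_eyebrow_offset
                           and expr.mouth_corner_deg >= p.min_mouth_corner_deg
                           and expr.lip_closure >= p.min_lip_closure)
    return fixation_ok and non_misoperation_ok
```

A single failing parameter on either side makes the function return `False`, mirroring the rule that any unmet parameter means the conditions are not met.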

[0074] If yes, proceed to step 3; otherwise, return to step 1.

[0075] Step 3: Determine the user's gaze direction based on the facial image.

[0076] The eye image is cropped from the facial image, and the pupil region is separated based on the grayscale distribution of the eye image, as follows. The acquired facial image is first converted to grayscale to obtain a grayscale image. A Gaussian filtering algorithm is used to denoise the grayscale image, removing ambient light interference and image sensor noise so that noise does not affect the accuracy of eye key point detection. Eye key points in the grayscale image are then detected using an MTCNN neural network. These key points include the inner corner of the eye, the outer corner of the eye, the upper eyelid edge points, and the lower eyelid edge points, which ensures accurate cropping of the eye region and eliminates interference from other facial areas. The boundary coordinates of the eye region are determined based on the eye key points, and the eye image is cropped according to these boundary coordinates.

[0077] The grayscale histogram of the eye image is calculated, and a bimodal threshold is determined. The bimodal threshold is the grayscale value corresponding to the valley between the two peaks in the grayscale histogram. By adjusting the segmentation threshold by combining the bimodal threshold of the grayscale histogram with the local grayscale mean, the segmentation threshold can be adapted to the eye characteristics of different users and changes in ambient light. This ensures that the pupil can be accurately distinguished from the white of the eye, eyelids, and other areas even in complex scenes, avoiding missed or false detections of the pupil area due to a fixed threshold. An adaptive threshold segmentation algorithm is employed, using a bimodal threshold as a baseline and adjusting the segmentation threshold based on the mean grayscale value of the eye image. Regions in the eye image with grayscale values lower than the adjusted segmentation threshold are identified as pupil candidate regions. Morphological closing operations are performed on the pupil candidate regions to fill in any tiny holes. Then, a contour detection algorithm is used to extract the contours of the candidate regions, and regions corresponding to contours with a roundness greater than a preset roundness threshold are selected as pupil regions. Morphological closing operations are used to fill in the tiny holes within the pupil candidate regions, eliminating image defects after segmentation. Finally, roundness filtering is used to remove irregular regions that are not pupils, improving the purity and integrity of the pupil regions and ensuring the accuracy of subsequent pupil center coordinate calculations.
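As a non-limiting illustration, two pieces of this paragraph can be sketched in plain Python: locating the valley between the two dominant histogram peaks, and scoring the roundness of a candidate contour. The peak-picking strategy, the `min_separation` parameter, and the roundness definition 4πA/P² (a common circularity measure that equals 1 for a perfect circle) are assumptions; the specification does not state which formulas are used, and the mean-based threshold adjustment is omitted because its formula is not given.

```python
import math
from typing import Sequence


def bimodal_valley(hist: Sequence[int], min_separation: int = 20) -> int:
    """Return the gray level at the valley between the two dominant peaks."""
    last = len(hist) - 1
    # Local maxima (plateau points count as maxima).
    peaks = [
        i for i, count in enumerate(hist)
        if count > 0
        and (i == 0 or count >= hist[i - 1])
        and (i == last or count >= hist[i + 1])
    ]
    if not peaks:
        raise ValueError("histogram is empty")
    peaks.sort(key=lambda i: hist[i], reverse=True)
    first = peaks[0]
    # Second peak: the tallest remaining maximum far enough from the first.
    second = next((p for p in peaks[1:] if abs(p - first) >= min_separation), None)
    if second is None:
        raise ValueError("histogram is not bimodal")
    lo, hi = sorted((first, second))
    return min(range(lo, hi + 1), key=lambda i: hist[i])


def roundness(area: float, perimeter: float) -> float:
    """Circularity 4*pi*A / P**2 (assumed definition of 'roundness')."""
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4.0 * math.pi * area / perimeter ** 2
```

Candidate regions whose `roundness` exceeds the preset roundness threshold would be kept as pupil regions; the contour area and perimeter are assumed to come from a separate contour detection step.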

[0078] Calculate the pupil center coordinates $(x_p, y_p)$, and identify the inner corner of the eye, the outer corner of the eye, the midpoint of the upper eyelid, and the midpoint of the lower eyelid. The center coordinates of the pupil region are calculated using the centroid method, as shown in the following formula:

[0079] $$x_p = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} x_{ij}\, g_{ij}}{\sum_{i=1}^{m}\sum_{j=1}^{n} g_{ij}}, \qquad y_p = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} y_{ij}\, g_{ij}}{\sum_{i=1}^{m}\sum_{j=1}^{n} g_{ij}}$$

[0080] where $(x_{ij}, y_{ij})$ are the coordinates of the pixel in row $i$ and column $j$ of the pupil region, $g_{ij}$ is the grayscale value of that pixel, and $m$ and $n$ are the numbers of rows and columns of the pupil region, respectively. Compared with the geometric center method, this method takes the grayscale distribution within the pupil region into account, avoiding center shift caused by irregular pupil edges and uneven illumination and thus ensuring the accuracy of the pupil center coordinates. Key point detection is performed on the cropped eye image, and the coordinates of the key points are output. These key points include two inner corner feature points, two outer corner feature points, three upper eyelid edge points, and three lower eyelid edge points. Detecting eye key points accurately captures the subtle features of the inner corner, outer corner, and eyelid edges, avoiding missed or false detections. The detected inner and outer corner feature points are averaged to obtain the inner-corner coordinates of both eyes, $(x_{in}^{L}, y_{in}^{L})$ and $(x_{in}^{R}, y_{in}^{R})$, and the outer-corner coordinates of both eyes, $(x_{out}^{L}, y_{out}^{L})$ and $(x_{out}^{R}, y_{out}^{R})$. A curve is fitted to the three upper eyelid edge points to obtain the upper eyelid contour curve, and the coordinates of its midpoint are taken as the upper eyelid midpoint $(x_{up}, y_{up})$. A curve is fitted to the three lower eyelid edge points to obtain the lower eyelid contour curve, and the coordinates of its midpoint are taken as the lower eyelid midpoint $(x_{low}, y_{low})$. Obtaining the midpoint by curve fitting of the eyelid edge points, rather than directly taking a single feature point, adapts to the different eyelid contour shapes of users, such as single eyelids, double eyelids, and differences in eyelid thickness. It avoids midpoint positioning errors caused by the deviation of a single feature point and ensures that the upper and lower eyelid midpoints truly reflect the geometric center of the eye contour.
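As a non-limiting illustration, a grayscale-weighted centroid over a small pupil region can be computed as follows. The function name, the row-major list-of-rows layout, and the use of column and row indices as x and y coordinates are assumptions made for this sketch.

```python
from typing import Sequence, Tuple


def weighted_centroid(region: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Grayscale-weighted centroid of a region given as rows of gray values.

    x is the column index and y is the row index of each pixel.
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(region):
        for x, gray in enumerate(row):
            total += gray
            sum_x += x * gray
            sum_y += y * gray
    if total == 0:
        raise ValueError("region has zero total gray value")
    return sum_x / total, sum_y / total
```

For a uniform 2x2 region the result coincides with the geometric center (0.5, 0.5), while a non-uniform region shifts the center toward the pixels with larger gray values.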

[0081] Construct an eye coordinate system with the midpoint of the line connecting the inner corners of the two eyes as the origin $(x_0, y_0)$, the direction from the inner corner of the eye to the outer corner as the X-axis, and the direction from the midpoint of the lower eyelid to the midpoint of the upper eyelid as the Y-axis. Map the pupil center coordinates $(x_p, y_p)$ into the eye coordinate system, and calculate the offset $(\Delta x, \Delta y)$ of the pupil relative to the origin. Based on the inner-canthus coordinates of both eyes, $(x_{in}^{L}, y_{in}^{L})$ and $(x_{in}^{R}, y_{in}^{R})$, the origin $(x_0, y_0)$ of the eye coordinate system is calculated as follows:

[0082] $$x_0 = \frac{x_{in}^{L} + x_{in}^{R}}{2}, \qquad y_0 = \frac{y_{in}^{L} + y_{in}^{R}}{2}$$

[0083] Based on the pupil center coordinates $(x_p, y_p)$, the lateral offset $\Delta x$ and the vertical offset $\Delta y$ of the pupil relative to the origin are calculated as follows:

[0084] $$\Delta x = x_p - x_0, \qquad \Delta y = y_p - y_0$$

[0085] The calculated $\Delta x$ and $\Delta y$ are smoothed using a sliding window filtering algorithm: the mean of the offsets over the first consecutive preset number of frames is taken as the final offset result, where the first consecutive preset number of frames is 10. Applying sliding window filtering to the offset data of consecutive frames suppresses fluctuations caused by factors such as slight pupil tremors and instantaneous changes in illumination.
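As a non-limiting illustration, the sliding window smoothing over 10 consecutive frames can be sketched with a fixed-length queue. The class name `OffsetSmoother` is hypothetical, and returning the mean of the frames received so far before the window is full is an assumption, since the specification does not describe the start-up behavior.

```python
from collections import deque
from typing import Deque, Tuple


class OffsetSmoother:
    """Sliding-window mean of per-frame pupil offsets."""

    def __init__(self, window: int = 10) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self._dx: Deque[float] = deque(maxlen=window)
        self._dy: Deque[float] = deque(maxlen=window)

    def update(self, dx: float, dy: float) -> Tuple[float, float]:
        """Add one frame's offset and return the smoothed offset."""
        # deque(maxlen=...) drops the oldest frame automatically when full.
        self._dx.append(dx)
        self._dy.append(dy)
        return (sum(self._dx) / len(self._dx),
                sum(self._dy) / len(self._dy))
```

Calling `update` once per frame yields the smoothed offset that is later compared against the offset thresholds.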

[0086] Determine the gaze direction based on the offset $(\Delta x, \Delta y)$ and the offset thresholds $(T_x, T_y)$. Calculate the horizontal distance $D_h$ between the inner and outer corners of the eye:

[0087] $$D_h = |x_{out} - x_{in}|$$

[0088] Calculate the vertical distance $D_v$ between the midpoint of the upper eyelid and the midpoint of the lower eyelid:

[0089] $$D_v = |y_{up} - y_{low}|$$

[0090] where $x_{in}$ and $x_{out}$ are the X coordinates of the inner and outer eye corners, and $y_{up}$ and $y_{low}$ are the Y coordinates of the upper and lower eyelid midpoints. Taking the horizontal distance $D_h$ and the vertical distance $D_v$ as the baseline, the offset thresholds $T_x$ and $T_y$ are set adaptively:

[0091] $$T_x = k_1 D_h, \qquad T_y = k_2 D_v$$

[0092] where $k_1$ and $k_2$ are coefficients; $k_1$ ranges from 0.3 to 0.4, and $k_2$ ranges from 0.25 to 0.35. Based on the offset (Δx, Δy), the gaze direction determination rules are set as follows, where the second consecutive preset number of frames is 15:

[0093] If $\Delta x < -T_x$ is satisfied for the second consecutive preset number of frames, the gaze is determined to be to the left;

[0094] If $\Delta x > T_x$ is satisfied for the second consecutive preset number of frames, the gaze is determined to be to the right;

[0095] If $\Delta y < -T_y$ is satisfied for the second consecutive preset number of frames, the gaze is determined to be downward;

[0096] If $\Delta y > T_y$ is satisfied for the second consecutive preset number of frames, the gaze is determined to be upward;

[0097] If $|\Delta x| \le T_x$ and $|\Delta y| \le T_y$ are satisfied for the second consecutive preset number of frames, the line of sight is determined to be facing the screen of the smartwatch;

[0098] If a change in gaze direction is detected, the count toward the second consecutive preset number of frames is restarted, and the gaze direction determination result is updated only after that number of frames is reached. Requiring the determination condition to hold for the second consecutive preset number of frames before outputting the gaze direction filters out offset fluctuations caused by accidental factors such as slight pupil tremors, blinking, and unconscious eye movements. This avoids misjudgments of gaze direction caused by an accidental deviation in a single frame and ensures the stability and reliability of the determination result. Clearly defining the gaze direction corresponding to each offset range, together with the re-accumulation rule applied when the direction changes, establishes a unified execution standard for gaze direction determination and avoids inconsistencies caused by ambiguous determination logic. This also improves the efficiency and accuracy of gaze direction determination, so that operation gestures are generated precisely.
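As a non-limiting illustration, the adaptive thresholds, the per-frame direction rule, and the consecutive-frame requirement can be sketched as follows. The sign convention (negative lateral offset means left, positive vertical offset means up), the precedence of the horizontal check when both thresholds are exceeded, the default coefficients 0.35 and 0.30 (chosen inside the stated ranges), and all names are assumptions made for this sketch.

```python
from typing import Optional, Tuple


def adaptive_thresholds(d_h: float, d_v: float,
                        k1: float = 0.35, k2: float = 0.30) -> Tuple[float, float]:
    """Offset thresholds scaled by the eye's horizontal and vertical extent."""
    return k1 * d_h, k2 * d_v


def classify_offset(dx: float, dy: float, tx: float, ty: float) -> str:
    """Per-frame gaze direction from a smoothed offset (assumed sign convention)."""
    if abs(dx) <= tx and abs(dy) <= ty:
        return "facing"
    if dx < -tx:
        return "left"
    if dx > tx:
        return "right"
    if dy < -ty:
        return "down"
    return "up"


class GazeDebouncer:
    """Confirms a direction only after it persists for N consecutive frames."""

    def __init__(self, required_frames: int = 15) -> None:
        self.required = required_frames
        self.candidate: Optional[str] = None
        self.count = 0
        self.confirmed: Optional[str] = None

    def update(self, direction: str) -> Optional[str]:
        if direction == self.candidate:
            self.count += 1
        else:
            # Direction changed: restart the consecutive-frame count.
            self.candidate = direction
            self.count = 1
        if self.count >= self.required:
            self.confirmed = self.candidate
        return self.confirmed
```

The debouncer keeps reporting the previously confirmed direction until a new direction has been observed for the full number of consecutive frames.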

[0099] The direction of the operation gesture is determined based on the direction of the gaze. The operation direction and the operation gesture are then associated and structurally integrated according to differentiated mapping rules to form an operation gesture that includes the operation type and direction. A pre-set differentiated mapping rule library is used, categorized by operation type, with each operation type corresponding to its own mapping logic. For example:

[0100] Click-type gestures: The direction of the operation corresponds to the area coordinates on the smartwatch screen. Looking left maps to the left area, looking right maps to the right area, looking up maps to the top area, looking down maps to the bottom area, and looking directly at the screen maps to the currently focused area.

[0101] Tapping gestures: The direction of operation corresponds to the intensity level of the interaction command. Looking left / right maps to level 1 intensity, looking up / down maps to level 2 intensity, and looking straight ahead maps to the default intensity.

[0102] Typing gestures: The direction of operation corresponds to the virtual keyboard partition. Looking left switches to the left partition, looking right switches to the right partition, looking up switches to the top partition, looking down switches to the bottom partition, and looking straight ahead keeps the current partition active.

[0103] Swipe gestures: The direction of operation directly corresponds to the direction of the gaze. Looking left maps to swiping left, looking right maps to swiping right, looking up maps to swiping up, looking down maps to swiping down, and looking directly at the screen maps to swiping along the finger's movement trajectory.

[0104] The process involves extracting the type identifier and action parameters of the gesture, including duration and amplitude. Based on the type identifier, a corresponding differential mapping rule is matched from a mapping rule base, and the operation direction is determined by combining the gaze direction. A structured data model of the gesture is constructed, comprising three core fields: operation type, operation direction (the matched direction result), and action parameters (duration and amplitude). The type identifier, operation direction, and action parameters are then filled into the corresponding fields of the structured data model to generate the gesture. Constructing a structured model containing "operation type + operation direction + action parameters" ensures a unified data format and clear semantics for the gestures, avoiding ambiguity in command parsing caused by fragmented data. It allows for rapid reading of core information from structured data, efficient matching of corresponding interaction commands, shortens the link from gesture recognition to command execution, and improves the smoothness of the interaction.
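As a non-limiting illustration, the differentiated mapping rule library and the three-field structured data model can be sketched as follows. The rule contents follow the four gesture types listed above; the dictionary layout, the string labels, and the names `RULES`, `OperationGesture`, and `build_operation_gesture` are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict

# Operation type -> (gaze direction -> operation direction).
RULES: Dict[str, Dict[str, str]] = {
    "click": {"left": "left area", "right": "right area", "up": "top area",
              "down": "bottom area", "facing": "currently focused area"},
    "tap": {"left": "level 1 intensity", "right": "level 1 intensity",
            "up": "level 2 intensity", "down": "level 2 intensity",
            "facing": "default intensity"},
    "type": {"left": "left partition", "right": "right partition",
             "up": "top partition", "down": "bottom partition",
             "facing": "current partition"},
    "swipe": {"left": "swipe left", "right": "swipe right", "up": "swipe up",
              "down": "swipe down", "facing": "follow finger trajectory"},
}


@dataclass(frozen=True)
class OperationGesture:
    """Structured model: operation type + operation direction + action parameters."""
    operation_type: str
    operation_direction: str
    duration_s: float
    amplitude_cm: float


def build_operation_gesture(type_id: str, gaze_direction: str,
                            duration_s: float, amplitude_cm: float) -> OperationGesture:
    """Match the rule for the gesture type and fill the structured model."""
    rule = RULES[type_id]                      # KeyError for an unknown type
    direction = rule[gaze_direction]           # KeyError for an unknown gaze
    return OperationGesture(type_id, direction, duration_s, amplitude_cm)
```

For example, `build_operation_gesture("swipe", "left", 0.3, 2.5)` produces a record whose operation direction is `"swipe left"`, ready to be used as a lookup key in step 4.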

[0105] Step 4: Based on the operation gesture and interaction command mapping database, obtain the operation commands corresponding to the operation gestures. Construct an interaction command mapping database, which includes a subset of basic gestures and a subset of precise gestures;

[0106] The basic gesture subset stores the correspondence between operation gestures and wake-up gestures, with wake-up mode as the category index; for example, in "Quick Preview Mode", "flipping the wrist" corresponds to the "wake up the screen" command;

[0107] The precise gesture subset stores the correspondence between operation gestures and operation commands, using operation type, operation direction, and action parameters as a joint index; for example, "slide + left + amplitude ≥ 2cm" corresponds to the "swipe left on the interface" command;

[0108] If the wake-up mode is triggered, extract the smartwatch's status mode and operation type of the gesture. Use the status mode + operation type as the search key to query the basic gesture subset and obtain the corresponding basic operation command. For example, if the watch triggers the wake-up mode and the status is "fully charged" and the operation gesture is "flip wrist", use "fully charged + flip wrist" to search the basic subset and obtain the "wake up screen" command.

[0109] If the triggered action is a gesture, extract the action type, action direction, and action parameters from the structured data model of the gesture. Use these three as the joint search key to query the precise gesture subset and obtain the corresponding precise operation command. For example, extract "slide + left + amplitude 2.5cm", and use the joint search to obtain the precise operation command "swipe left on the interface".
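As a non-limiting illustration, the two subsets and their search keys can be sketched as keyed lookups, which avoids traversing the whole database. The entries reproduce the examples above; the data layout, the treatment of the action parameter as a minimum amplitude, and the function names are assumptions made for this sketch.

```python
from typing import Dict, List, Optional, Tuple

# Basic subset: (status mode, operation type) -> basic operation command.
BASIC_SUBSET: Dict[Tuple[str, str], str] = {
    ("fully charged", "flip wrist"): "wake up screen",
}

# Precise subset: (operation type, operation direction) ->
# list of (minimum amplitude in cm, precise operation command).
PRECISE_SUBSET: Dict[Tuple[str, str], List[Tuple[float, str]]] = {
    ("slide", "left"): [(2.0, "swipe left on the interface")],
}


def lookup_basic(status_mode: str, operation_type: str) -> Optional[str]:
    """Query the basic gesture subset with 'status mode + operation type'."""
    return BASIC_SUBSET.get((status_mode, operation_type))


def lookup_precise(operation_type: str, direction: str,
                   amplitude_cm: float) -> Optional[str]:
    """Query the precise subset with type + direction + action parameter."""
    for min_amplitude, command in PRECISE_SUBSET.get((operation_type, direction), []):
        if amplitude_cm >= min_amplitude:
            return command
    return None
```

With these entries, "slide + left + amplitude 2.5cm" resolves to the "swipe left on the interface" command, while an amplitude of 1.0 cm returns no command.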

[0110] The mapping relationship is divided into a basic gesture subset and a precise gesture subset, corresponding to two scenarios: rapid response and precise interaction, respectively, to avoid command confusion caused by a single database. At the same time, through the design of refined search keys such as "state + gesture" or "operation type + operation direction + action parameters", it is ensured that the command matching is highly consistent with the user's intent, avoiding command mismatch caused by single-dimensional mapping.

[0111] Step 5: Respond to the operation command and execute the corresponding interactive action.

[0112] This solution optimizes the interaction process across the entire chain, from data collection and logical mapping to scene adaptation, through gaze recognition and structured gesture-gaze association. It reduces environmental dependence, improves operational accuracy, and has a feedback mechanism, thereby enhancing the user interaction experience.

[0113] The above are merely embodiments of the present invention. The invention is not limited to the fields covered by these embodiments. Commonly known structures and characteristics in the solutions are not described in detail here. Those skilled in the art are aware of all common technical knowledge in the field prior to the application date or priority date, are able to access all existing technologies in that field, and have the ability to apply conventional experimental methods prior to that date. Those skilled in the art can, under the guidance of this application, improve and implement this solution in combination with their own capabilities. Some typical known structures or methods should not be obstacles for those skilled in the art to implement this application. It should be noted that those skilled in the art can make several modifications and improvements without departing from the structure of the present invention. These should also be considered within the scope of protection of the present invention, and will not affect the effectiveness of the implementation of the present invention or the practicality of the patent. The scope of protection claimed in this application should be determined by the content of its claims, and the specific embodiments described in the specification can be used to interpret the content of the claims.

Claims

1. A smartwatch interaction method based on gesture actions, characterized in that it includes the following steps:

Step 1: In standby mode with the screen off, the smartwatch enters a preset wake-up mode after receiving the user's preset wake-up gesture, or enters the preset wake-up mode after preset conditions are met; after entering the preset wake-up mode, the smartwatch detects its own state to obtain state mode information;

Step 2: After the state mode information is determined, collect the user's operation gesture, and determine whether the operation gesture matches the wake-up gesture according to preset judgment rules: if it matches, proceed to step 4; if it does not match, capture the user's facial image, extract eye features and expression features from the facial image, and determine whether the eye features meet the preset gaze conditions and whether the expression features meet the preset non-misoperation conditions: if yes, proceed to step 3; if no, return to step 1;

Step 3: Determine the user's gaze direction based on the facial image: crop the eye image from the facial image, and separate the pupil region based on the grayscale distribution of the eye image; calculate the pupil center coordinates $(x_p, y_p)$, and identify the inner corner of the eye, the outer corner of the eye, the midpoint of the upper eyelid, and the midpoint of the lower eyelid; construct an eye coordinate system with the midpoint of the line connecting the inner corners of the eyes as the origin $(x_0, y_0)$, the direction from the inner corner of the eye to the outer corner as the X-axis, and the direction from the midpoint of the lower eyelid to the midpoint of the upper eyelid as the Y-axis; map the pupil center coordinates $(x_p, y_p)$ into the eye coordinate system, and calculate the offset $(\Delta x, \Delta y)$ of the pupil relative to the origin; determine the gaze direction based on the offset $(\Delta x, \Delta y)$ and the offset thresholds $(T_x, T_y)$; determine the operation direction of the operation gesture based on the gaze direction, and associate and structurally integrate the operation direction and the operation gesture according to differentiated mapping rules to form an operation gesture that includes the operation type and the operation direction; the differentiated mapping rules establish a correspondence between gaze direction and operation direction for different types of interactive activation gestures, including: for click-type gestures, the gaze direction is mapped to an area of the virtual operation plane; for swipe gestures, the gaze direction is directly mapped to the swipe direction; for typing gestures, the gaze direction is mapped to a virtual keyboard area; the structured integration involves filling the type identifier of the operation gesture, the operation direction determined by the gaze direction, and the gesture feature parameters into a preset structured data model to form the target gesture action;

Step 4: Based on the mapping database between operation gestures and interaction commands, obtain the operation command corresponding to the operation gesture;

Step 5: Respond to the operation command and execute the corresponding interactive action.

2. The smartwatch interaction method based on gesture actions according to claim 1, characterized in that, in step 3, cropping the eye image from the facial image and separating the pupil region based on the grayscale distribution of the eye image includes: converting the acquired facial image to grayscale to obtain a grayscale image; performing noise reduction on the grayscale image using a Gaussian filtering algorithm; detecting eye key points in the grayscale image based on the MTCNN neural network, wherein the eye key points include the inner corner of the eye, the outer corner of the eye, the upper eyelid edge points, and the lower eyelid edge points; determining the boundary coordinates of the eye region based on the eye key points; cropping the eye image according to the boundary coordinates; calculating the grayscale histogram of the eye image and determining a bimodal threshold in the grayscale histogram, where the bimodal threshold is the gray value corresponding to the valley between the two peaks in the grayscale histogram; using an adaptive threshold segmentation algorithm, with the bimodal threshold as a benchmark, adjusting the segmentation threshold in combination with the grayscale mean of the eye image; identifying regions in the eye image with gray values lower than the adjusted segmentation threshold as pupil candidate regions; performing morphological closing operations on the pupil candidate regions to fill the tiny holes in the candidate regions; and extracting the contours of the candidate regions using a contour detection algorithm, and selecting the regions corresponding to contours with a roundness greater than a preset roundness threshold as the pupil regions.

3. The smartwatch interaction method based on gesture actions according to claim 2, characterized in that, in step 3, the pupil center coordinates $(x_p, y_p)$ are calculated as $x_p = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} x_{ij}\, g_{ij}}{\sum_{i=1}^{m}\sum_{j=1}^{n} g_{ij}}$ and $y_p = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n} y_{ij}\, g_{ij}}{\sum_{i=1}^{m}\sum_{j=1}^{n} g_{ij}}$, where $(x_{ij}, y_{ij})$ are the coordinates of the pixel in row $i$ and column $j$ of the pupil region, $g_{ij}$ is the grayscale value of that pixel, and $m$ and $n$ are the numbers of rows and columns of the pupil region, respectively; key point detection is performed on the cropped eye image, and the coordinates of the key points are output, wherein the key points include the inner corner feature points, outer corner feature points, upper eyelid edge points, and lower eyelid edge points; the detected inner and outer corner feature points are averaged to obtain the inner-corner coordinates of both eyes, $(x_{in}^{L}, y_{in}^{L})$ and $(x_{in}^{R}, y_{in}^{R})$, and the outer-corner coordinates of both eyes, $(x_{out}^{L}, y_{out}^{L})$ and $(x_{out}^{R}, y_{out}^{R})$; a curve is fitted to the upper eyelid edge points to obtain the upper eyelid contour curve, and the coordinates of its midpoint are taken as the midpoint of the upper eyelid; a curve is fitted to the lower eyelid edge points to obtain the lower eyelid contour curve, and the coordinates of its midpoint are taken as the midpoint of the lower eyelid.

4. The smartwatch interaction method based on gesture actions according to claim 3, characterized in that, in step 3, mapping the pupil center coordinates $(x_p, y_p)$ into the eye coordinate system and calculating the offset $(\Delta x, \Delta y)$ of the pupil relative to the origin includes: based on the inner-canthus coordinates of both eyes, $(x_{in}^{L}, y_{in}^{L})$ and $(x_{in}^{R}, y_{in}^{R})$, calculating the origin $(x_0, y_0)$ of the eye coordinate system as $x_0 = \frac{x_{in}^{L} + x_{in}^{R}}{2}$ and $y_0 = \frac{y_{in}^{L} + y_{in}^{R}}{2}$; based on the pupil center coordinates $(x_p, y_p)$, calculating the lateral offset $\Delta x = x_p - x_0$ and the vertical offset $\Delta y = y_p - y_0$ of the pupil relative to the origin; and smoothing the calculated $\Delta x$ and $\Delta y$ using a sliding window filtering algorithm, taking the mean of the offsets over the first consecutive preset number of frames as the final offset result.

5. The smartwatch interaction method based on gestures according to claim 4, characterized in that, In claim 3, based on the offset ( ) and offset threshold ( Determining the direction of vision includes: calculating the horizontal distance between the inner and outer corners of both eyes. , Calculate the vertical distance between the midpoint of the upper eyelid and the midpoint of the lower eyelid. , with horizontal spacing and vertical spacing adaptively set offset threshold with , wherein , is a coefficient; according to the offset (Δx, Δy), set the line-of-sight direction determination rule: If the second continuous preset frame number meets , determine to look left; If the second consecutive preset frame number is satisfied The decision is to look to the right; If the second continuous preset frame number meets , it is determined to look down; If the second continuous preset frame number meets , it is determined to look up. If the second continuous preset frame number meets and , it is determined that the line of sight is directly opposite the smart watch plane; If a change in gaze direction is detected, the second consecutive preset number of frames is restarted, and the gaze direction determination result is updated after the second consecutive preset number of frames is reached. 6.The gesture-action-based smart watch interaction method of claim 5, wherein, In step 3, the operation direction of the gesture is determined based on the gaze direction. The operation direction and gesture are then associated and structurally integrated according to differentiated mapping rules to form an operation gesture containing both operation type and direction. 
This includes: presetting a differentiated mapping rule library categorized by operation type, with each type corresponding to a specific mapping logic; extracting the type identifier and action parameters of the operation gesture, the action parameters including duration and amplitude; matching the corresponding differentiated mapping rule from the mapping rule library based on the type identifier, and determining the operation direction in conjunction with the gaze direction; constructing a structured data model of the operation gesture with three core fields: operation type, operation direction, and action parameters; and filling the corresponding fields of the structured data model with the type identifier, operation direction, and action parameters to generate the operation gesture.

7. The gesture-action based smart watch interaction method of claim 6, wherein, in step 4, obtaining the operation command corresponding to the operation gesture based on the operation gesture and the interaction command mapping relationship database includes: constructing the interaction command mapping relationship database, which contains a basic gesture subset and a precise gesture subset; the basic gesture subset stores the correspondence between operation gestures and wake-up gestures, with wake-up mode as the category index; the precise gesture subset stores the correspondence between operation gestures and operation commands, with operation type, operation direction, and action parameters as the joint index. If the wake-up mode is triggered, the status mode of the smart watch and the operation type of the gesture are extracted, and status mode + operation type is used as the search key to query the basic gesture subset and obtain the corresponding basic operation instruction.
If a gesture is triggered, the gesture type, gesture direction, and action parameters are extracted from the structured data model of the gesture, and these three are used as the joint search key to query the precise gesture subset and obtain the corresponding precise operation command.

8. The gesture-action based smart watch interaction method of claim 7, wherein, in step 2, determining according to the preset judgment rules that the operation gesture matches the wake-up gesture includes: constructing a mapping table between operation gestures and wake-up gestures, which defines the wake-up gestures and gesture feature parameters corresponding to different state modes; querying the mapping table for the wake-up gesture and feature parameter thresholds corresponding to the state mode; and collecting the real-time feature parameters of the operation gesture and comparing them with the queried preset feature parameter thresholds: if all real-time feature parameters are within the preset threshold range, the operation gesture is determined to match the wake-up gesture; if any real-time feature parameter exceeds the preset threshold range, it is determined to be a mismatch.

9. The gesture-action based smart watch interaction method of claim 8, wherein, in step 2, extracting eye features and expression features from the facial image and determining whether the eye features meet the preset fixation conditions and whether the expression features meet the preset non-misoperation conditions includes: extracting eye features, and calculating the pupil fixation duration, eyelid opening and closing degree, and eye movement frequency based on the eye keypoints; and extracting expression features, using the Dlib facial landmark detection algorithm to extract facial keypoints, and calculating the eyebrow offset, the upturn angle of the corner of the mouth, and the lip closure degree based on the facial keypoints.
The preset fixation conditions are: pupil fixation duration ≥ preset fixation duration, eyelid opening and closing degree ≥ preset opening and closing degree, and eye movement frequency ≤ preset movement frequency. The preset non-misoperation conditions are: eyebrow offset ≤ preset offset, upturn angle of the corner of the mouth ≥ preset upturn angle, and lip closure degree ≥ preset closure degree. If all eye features meet the preset fixation conditions and all expression features meet the preset non-misoperation conditions, the conditions are deemed met; if any eye feature parameter does not meet the preset fixation conditions, or any expression feature parameter does not meet the preset non-misoperation conditions, the conditions are deemed not met.
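
The all-or-nothing checks in claims 8 and 9 can be sketched in Python as follows. The direction of each inequality follows the claim text; everything else is assumed for illustration: the state mode and gesture names, the parameter names, and every numeric preset are hypothetical, since the patent text gives no concrete values.

```python
import operator

# Hypothetical mapping table: state mode -> (wake-up gesture, parameter ranges).
WAKE_TABLE = {
    "screen_off": ("raise_wrist", {"angle_deg": (30.0, 90.0),
                                   "duration_s": (0.2, 1.5)}),
}


def matches_wake_gesture(state_mode, realtime):
    """Claim 8: match only if every real-time parameter is inside its range."""
    _, ranges = WAKE_TABLE[state_mode]
    return all(lo <= realtime[name] <= hi for name, (lo, hi) in ranges.items())


# Claim 9: (comparison, hypothetical preset) for each feature parameter.
FIXATION = {
    "fixation_s": (operator.ge, 0.8),        # duration >= preset
    "eyelid_opening": (operator.ge, 0.6),    # opening degree >= preset
    "eye_movement_hz": (operator.le, 2.0),   # movement frequency <= preset
}
NON_MISOPERATION = {
    "eyebrow_offset": (operator.le, 3.0),    # offset <= preset
    "mouth_corner_deg": (operator.ge, 0.0),  # upturn angle >= preset
    "lip_closure": (operator.ge, 0.7),       # closure degree >= preset
}


def all_met(features, rules):
    """True if every feature satisfies its comparison against the preset."""
    return all(cmp(features[name], preset)
               for name, (cmp, preset) in rules.items())


def conditions_met(eye, expression):
    """Both the fixation and the non-misoperation conditions must hold."""
    return all_met(eye, FIXATION) and all_met(expression, NON_MISOPERATION)
```

Keeping the comparisons in tables rather than hard-coding them makes it straightforward to vary presets per state mode, which is one possible reading of how the claimed mapping table would be used.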

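The gaze direction rule of claim 5 and the structured gesture lookup of claims 6 and 7 can be sketched together in Python. This is an illustration under stated assumptions, not the claimed formulas: thresholds are assumed to be a coefficient times the eye spacing, image coordinates are assumed (negative Δx means left, positive Δy means down), ties between axes go to the larger relative excess, and action parameters are bucketed so they can serve as part of a joint key. The rule library, command names, and coefficients are all hypothetical.

```python
from dataclasses import dataclass


def thresholds(h_dist, v_dist, kx=0.15, ky=0.20):
    """Adaptive thresholds, assumed to be coefficient * eye spacing."""
    return kx * h_dist, ky * v_dist


def classify(dx, dy, tx, ty):
    """Map one frame's offset to a direction (assumed sign convention)."""
    if tx <= 0 or ty <= 0:
        raise ValueError("thresholds must be positive")
    if abs(dx) <= tx and abs(dy) <= ty:
        return "center"
    # Compare |dx|/tx with |dy|/ty without dividing.
    if abs(dx) * ty >= abs(dy) * tx:
        return "left" if dx < 0 else "right"
    return "up" if dy < 0 else "down"


class GazeDebouncer:
    """Report a direction only after `frames` consecutive identical frames."""

    def __init__(self, frames=3):
        self.frames = frames
        self.candidate = None
        self.count = 0
        self.stable = None

    def update(self, direction):
        if direction != self.candidate:   # a change restarts the count
            self.candidate = direction
            self.count = 0
        self.count += 1
        if self.count >= self.frames:
            self.stable = direction
        return self.stable


@dataclass(frozen=True)
class OperationGesture:
    op_type: str     # operation type identifier
    direction: str   # operation direction derived from the gaze direction
    action: tuple    # bucketed (duration, amplitude)


# Hypothetical rule library: one direction-mapping rule per operation type.
DIRECTION_RULES = {
    "swipe": lambda gaze: gaze,
    "rotate": lambda gaze: {"left": "ccw", "right": "cw"}.get(gaze, "none"),
}

# Hypothetical precise gesture subset, keyed by the joint index.
PRECISE = {
    ("swipe", "left", ("short", "small")): "previous_page",
    ("rotate", "cw", ("long", "large")): "volume_up",
}


def build_gesture(op_type, gaze, duration_s, amplitude):
    """Fill the three fields of the structured gesture model."""
    action = ("long" if duration_s >= 0.5 else "short",
              "large" if amplitude >= 0.5 else "small")
    return OperationGesture(op_type, DIRECTION_RULES[op_type](gaze), action)


def precise_command(g):
    """Query the precise subset with (type, direction, action) as the key."""
    return PRECISE.get((g.op_type, g.direction, g.action))
```

In use, each frame's smoothed offset would pass through `classify` and then `GazeDebouncer.update`, and the stable direction would be handed to `build_gesture` when a gesture is detected.
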
Citation Information

Patent Citations

Man-machine interaction method and touch screen wearable device
CN105094675A
Wearable virtual music interaction device and calculation method thereof
CN115509358A
Human-computer interaction method based on eye movement control
CN108595008A
Vehicle voice assistant awakening method and device and electronic equipment
CN116935490A
