A smart interaction method and system based on programmable gestures and context awareness
By constructing programmable gesture and context-aware interaction methods, the problem of rigid interaction logic in smart glove systems has been solved, achieving a flexible and intelligent interactive experience. Users can customize the interaction logic and switch it dynamically according to the context, improving the adaptability and efficiency of the interactive device.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Application (China)
- Current Assignee / Owner
- BEIJING HAOWANG TECHNOLOGY CO LTD
- Filing Date
- 2026-03-02
- Publication Date
- 2026-06-30
AI Technical Summary
The interaction logic of existing smart glove systems is fixed and cannot be customized to user habits or application needs. Furthermore, a single global gesture mapping cannot deliver an intelligent experience across multiple scenarios. For example, a "swipe" gesture made while pointing at a presentation should trigger "page turning", while the same gesture made while pointing at a video player should trigger "fast forward/rewind"; this is difficult to achieve with existing technologies.
We construct a programmable gesture and context-aware interaction method. By abstracting the underlying interaction signals into standardized elements, we provide a visual programming interface and script configuration interface, allowing users to define custom interaction logic and switch dynamically based on context.
It makes interaction logic flexible and context-aware, allowing users to freely define their own interaction schemes for different software, tasks, and even personal preferences. Through the visual programming interface or script configuration interface, users or developers can bind the system's interaction device to the control interfaces of general-purpose or custom target devices, achieving a "what you point at is what you get" experience in which the system provides the most suitable interaction method.
Figure CN122308600A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the fields of human-computer interaction, wearable computing, and computer software. Specifically, it relates to an intelligent interaction method and system that enables user-defined, composable, and context-dependent dynamic binding of interaction logic on a smart glove system with high-precision spatial pointing and multimodal gesture recognition capabilities.
Background Technology
[0002] With the development of natural human-computer interaction technology, spatial interaction solutions based on smart wearable devices (such as data gloves) have gradually become a research and application hotspot. For example, in the inventor's existing technical solution "A Spatial Interaction Method for Smart Gloves Based on Multi-Sensor Fusion," a high-precision and robust "fixed-point-scrolling" interaction paradigm is achieved by integrating laser visual positioning, inertial measurement, and multi-finger tactile sensing. This solution maps the contact and sliding actions of the thumb and middle finger to operations such as cursor control, content scrolling, and object scaling, providing users with a stable and efficient basic interactive experience.
[0003] However, existing solutions of this kind generally suffer from fixed interaction logic and insufficient scalability. The correspondence between gestures and commands is usually preset at the factory, and users cannot personalize or extend the functionality according to their habits, the specific needs of different applications, or different interaction scenarios. When users want to define dedicated gestures for rotating and scaling models in new 3D modeling software, or quick gestures for turning lights on and off or adjusting color temperature in smart home control, they often need a firmware upgrade from the manufacturer or changes to the underlying code by developers, a cumbersome process with a very high barrier to entry. In addition, a single global gesture mapping cannot deliver a "what you point at is what you get" experience. For example, a "swipe" gesture made while pointing at a presentation should trigger "page turning", while the same gesture made while pointing at a video player should trigger "fast forward / rewind", which is difficult to achieve under fixed logic.
[0004] Therefore, there is an urgent need for a solution that can build an open, flexible, and programmable interaction logic definition and management layer on the basis of existing high-performance spatial interaction hardware systems, thereby transforming dedicated interaction devices into a general-purpose interaction platform.
Summary of the Invention
[0005] To overcome the shortcomings of existing technologies, the present invention aims to provide an intelligent interaction method and system based on programmable gestures and context awareness. This invention does not intend to replace or improve underlying gesture recognition and spatial positioning algorithms, but rather to provide a higher-level, configurable framework for interpreting and mapping interaction logic for underlying interaction systems such as "a spatial interaction method for intelligent gloves based on multi-sensor fusion." Its core idea is to abstract the fixed, atomic interaction events and data streams output by the underlying system into standardized, programmable "interaction elements"; allow users or developers to freely combine these elements and bind them to the control interface of any target application or device; and simultaneously enable the entire interaction logic to dynamically switch according to the user's current interaction focus (such as the application pointed to by the cursor), achieving intelligent context-aware interaction.
[0006] To achieve the above objectives, the present invention adopts the following technical solutions.
Scheme 1: An intelligent interaction method based on programmable gestures and context awareness.
This method is applied to a system comprising a smart glove, a processing unit, and an interactive interface. It is characterized in that the method operates on top of an underlying interaction system, which outputs the spatial pointing data, discrete gesture events, and continuous motion parameters generated by the smart glove and processed through multi-sensor fusion, and includes the following steps:
S1: Interaction element abstraction step.
[0007] The native interaction signals output by the underlying interaction system are abstracted and encapsulated into standardized interaction elements that can be invoked by upper-layer logic. These interaction elements include at least two types:
Event-type elements: These correspond to discrete gesture actions with clear start and end states recognized by the underlying interaction system. Examples include the click start and click end events triggered by a short contact between the thumb and middle finger, the long press start and long press end events, and specific compound gestures (such as a triple contact sequence).
[0008] Parameter-type elements: These correspond to the data streams continuously output by the underlying interaction system, representing continuous motion or state. Examples, each valid in a specific interaction state, include the two-dimensional sliding vector of the thumb on the thumb-middle finger contact plane, the angular velocity of the wrist rotating around a specific axis, and the coordinate offset of the smart glove cursor relative to an anchor point on the screen.
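The two element types described in S1 can be sketched as small data classes. This is only an illustrative sketch: the class names, the element names such as "click_start", and the dictionary format assumed for raw signals are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union
import time


@dataclass(frozen=True)
class EventElement:
    """Discrete gesture event with a clear start or end (hypothetical class)."""
    name: str          # e.g. "click_start" or "long_press_end" (assumed names)
    timestamp: float


@dataclass(frozen=True)
class ParameterElement:
    """One sample of a continuous data stream (hypothetical class)."""
    name: str                    # e.g. "slide_vector" (assumed name)
    values: Tuple[float, ...]    # e.g. (dx, dy)
    timestamp: float


def wrap_raw_signal(raw: dict, now: Optional[float] = None) -> Union[EventElement, ParameterElement]:
    """Wrap one raw signal into a standardized element.

    The raw format {"kind": ..., "name": ..., "values": ...} is an assumption
    made for this sketch, not the underlying system's real output format.
    """
    ts = time.time() if now is None else now
    if raw["kind"] == "event":
        return EventElement(raw["name"], ts)
    if raw["kind"] == "data":
        return ParameterElement(raw["name"], tuple(float(v) for v in raw["values"]), ts)
    raise ValueError(f"unknown signal kind: {raw['kind']!r}")
```

Keeping both types immutable and timestamped lets upper-layer logic compare and order elements without touching the raw sensor output.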
[0009] S2: Programmable interaction logic definition step.
[0010] Provide a visual programming interface or script configuration interface that allows users or developers to perform the following operations:
Select one or more interaction elements from the interaction element library.
[0011] Define the combination relationships and triggering logic between the selected interaction elements through logical operators (such as sequence, parallelism, conditional judgment, and loop) to construct custom composite interaction logic.
[0012] Map the final output of the custom composite interaction logic to the application programming interface (API) of the target software application, to system-level simulated input events, or to the control protocol of the target hardware device. For parameter-type elements, mapping rules such as data scaling, dead zone, and response curve can be configured.
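The mapping rules named for parameter-type elements (data scaling, dead zone, response curve) can be illustrated with one small function. This is a sketch under assumptions: the function name, the power-law form of the response curve, and the order in which the rules are applied are illustrative choices, not details given in the application.

```python
import math


def map_parameter(value: float, scale: float = 1.0,
                  dead_zone: float = 0.0, exponent: float = 1.0) -> float:
    """Apply a dead zone, then a power response curve, then scaling.

    All parameter names are hypothetical. Inputs whose magnitude is inside
    the dead zone are suppressed to avoid jitter from small hand tremors.
    """
    magnitude = abs(value)
    if magnitude <= dead_zone:
        return 0.0
    # Measure from the dead-zone edge so the output starts from zero there.
    shaped = (magnitude - dead_zone) ** exponent
    # Restore the sign of the original input.
    return math.copysign(shaped * scale, value)
```

An exponent above 1 gives fine control for small movements and faster response for large ones; an exponent of 1 keeps the mapping linear.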
[0013] S3: Context awareness and binding step.
[0014] The system maintains a context mapping rule base. The context is determined by the real-time spatial pointing information provided by the underlying interaction system, specifically the application window, control, or area identifier covered by the cursor in the display interface. When the system detects a change in the interaction context, it automatically loads and activates one or more sets of custom composite interaction logic associated with the current context from the context mapping rule base, while suspending the interaction logic associated with the previous context.
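The activate-and-suspend behavior of S3 can be sketched as a small state holder. This is a minimal illustration only: the class name, the use of plain strings as context identifiers, and the representation of the rule base as a dictionary are assumptions made for the sketch.

```python
from typing import Dict, List, Optional


class ContextSwitcher:
    """Tracks the current context and the logic sets active for it (hypothetical)."""

    def __init__(self, rule_base: Dict[str, List[str]]) -> None:
        # rule_base maps a context identifier to the names of its logic sets.
        self.rule_base = rule_base
        self.current_context: Optional[str] = None
        self.active: List[str] = []

    def update(self, context_id: Optional[str]) -> bool:
        """Call with the context under the cursor; returns True on a switch."""
        if context_id == self.current_context:
            return False
        # Replacing the active list suspends the logic of the previous context
        # and activates whatever is bound to the new one (possibly nothing).
        self.current_context = context_id
        if context_id is None:
            self.active = []
        else:
            self.active = list(self.rule_base.get(context_id, []))
        return True
```

For example, with a rule base of `{"presentation": ["page_turn"], "video": ["seek"]}`, moving the cursor from the presentation to the video player replaces `["page_turn"]` with `["seek"]`, so the same swipe is interpreted differently.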
[0015] S4: Instruction interpretation and distribution step.
[0016] During runtime, the system receives native interaction signals from the underlying interaction system in real time and interprets them according to the custom composite interaction logic in the active state. Finally, it generates control instructions that conform to the target object interface specification and distributes and executes them through the corresponding communication channels.
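The runtime behavior of S4 can be sketched as a lookup from incoming signals to instructions, each sent over a named channel. The class, the (channel, payload) instruction format, and the channel name "keyboard" are hypothetical; a real system would send simulated input events or API calls rather than appending to a list.

```python
from typing import Callable, Dict, Tuple


class Dispatcher:
    """Interprets signals against the active logic and sends instructions (hypothetical)."""

    def __init__(self) -> None:
        self.channels: Dict[str, Callable[[str], None]] = {}
        # Maps a signal name to a (channel, payload) instruction.
        self.bindings: Dict[str, Tuple[str, str]] = {}

    def register_channel(self, name: str, send: Callable[[str], None]) -> None:
        self.channels[name] = send

    def activate(self, bindings: Dict[str, Tuple[str, str]]) -> None:
        """Replace the active logic, for example after a context switch."""
        self.bindings = dict(bindings)

    def handle(self, signal_name: str) -> bool:
        """Dispatch one signal; returns False if the active logic ignores it."""
        binding = self.bindings.get(signal_name)
        if binding is None:
            return False
        channel, payload = binding
        self.channels[channel](payload)
        return True


# Usage: a list stands in for a real keyboard-simulation channel.
sent = []
dispatcher = Dispatcher()
dispatcher.register_channel("keyboard", sent.append)
dispatcher.activate({"slide_left": ("keyboard", "PageUp")})
dispatcher.handle("slide_left")
```

Separating channels from bindings means the same logic can target a different output mechanism without being redefined.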
[0017] Preferably, the interaction elements also include system status elements, which reflect the operating status of the underlying interaction system itself, such as the current interaction mode (precise pointing mode / spatial control mode), the working status of the laser emitter, and the device connection status, so that custom interaction logic can branch conditionally on the system status.
[0018] Preferably, the method further includes an interaction logic learning step: the system records a series of continuous operations performed by the user through the smart glove, automatically analyzes the operation sequence, attempts to decompose and match it into existing interaction element combinations, and after user confirmation or correction, saves it as a new custom composite interaction logic.
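One simple way to "decompose and match" a recorded operation sequence is greedy longest-match segmentation against the known element combinations. The application does not specify the analysis algorithm, so the sketch below is an assumed stand-in; the function name and the signal names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def decompose(recorded: List[str],
              known: Dict[str, Tuple[str, ...]]) -> Optional[List[str]]:
    """Split a recorded signal sequence into known combinations.

    Uses greedy longest-match, which is an assumption of this sketch. Returns
    the matched combination names in order, or None if some part of the
    recording matches nothing (the user would then be asked to correct it).
    """
    # Try longer patterns first so "double click" wins over two single clicks.
    patterns = sorted(known.items(), key=lambda kv: len(kv[1]), reverse=True)
    result: List[str] = []
    i = 0
    while i < len(recorded):
        for name, pattern in patterns:
            if pattern and tuple(recorded[i:i + len(pattern)]) == pattern:
                result.append(name)
                i += len(pattern)
                break
        else:
            return None
    return result
```

Greedy matching can fail on inputs that a backtracking search would segment successfully, which is one reason the user confirmation or correction step described above remains useful.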
[0019] Preferably, the method further includes an interaction scheme sharing step: allowing users to export their defined custom composite interaction logic and its context binding relationships as a standardized configuration file. This configuration file can be imported and used by other users' systems through file sharing, network download, or cloud library synchronization, achieving rapid dissemination and reuse of efficient interaction schemes.
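The export and import of a standardized configuration file can be sketched with JSON. The field names ("format_version", "name", "logic", "context_bindings") and the version check are assumptions for illustration; the application does not define the file schema.

```python
import json

FORMAT_VERSION = 1  # hypothetical schema version


def export_scheme(name: str, logic: dict, context_bindings: dict) -> str:
    """Serialize one interaction scheme and its context bindings to JSON text."""
    return json.dumps(
        {
            "format_version": FORMAT_VERSION,
            "name": name,
            "logic": logic,
            "context_bindings": context_bindings,
        },
        indent=2,
        sort_keys=True,
    )


def import_scheme(text: str) -> dict:
    """Parse and validate a scheme exported by export_scheme."""
    data = json.loads(text)
    if data.get("format_version") != FORMAT_VERSION:
        raise ValueError("unsupported scheme format version")
    missing = {"name", "logic", "context_bindings"} - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data
```

Validating on import matters because shared files come from other users' systems and may have been produced by a different version of the tool.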
[0020] Scheme 2: An electronic device.
The device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the method described in Scheme 1.
[0021] Scheme 3: An intelligent interaction system, comprising:
A smart glove, used to collect data on the user's hand movements and finger contact;
A positioning device, used to track the orientation of the smart glove in space;
A processing device, configured to:
a) Run the underlying interaction process described in "A Spatial Interaction Method for Smart Gloves Based on Multi-Sensor Fusion" to generate spatial pointing data, discrete gesture events, and continuous motion parameters;
b) Run the method described in Scheme 1 to provide a programmable, context-aware interaction control layer.
Beneficial effects
[0022] Compared with the prior art, the present invention has the following significant advantages:
High flexibility, breaking away from functional rigidity: A fully open programmable layer is built on top of the underlying high-performance interaction hardware. Users can freely define their own interaction schemes for different software, different tasks, and even personal preferences, so that one set of hardware adapts to countless scenarios.
[0023] Significantly lowers the barriers to development and use: Application developers do not need to understand complex sensor fusion algorithms to add deeply customized gesture support to their applications through the provided visualization tools. End users can also assemble interactive methods that conform to their intuition, just like assembling building blocks, with low learning and migration costs.
[0024] Achieving intelligent contextual interaction: By dynamically binding interaction logic to the precise spatial pointing focus, a "what you point at is what you get" interactive experience is achieved. The same physical gesture can produce the most semantically appropriate operation in different contexts, greatly enhancing the naturalness and efficiency of the interaction.
[0025] Building an open ecosystem and enhancing platform value: Through the shareable and reusable mechanism of interaction schemes, a library of high-quality interaction schemes contributed by the developer and user communities can be fostered, enabling the practical value and application ecosystem of the entire system to expand rapidly, evolving from a single interaction device into an open interaction platform.
Attached Figure Description
[0026] Figure 1 is a schematic diagram of the system architecture of an embodiment of the present invention, illustrating the relationship between the underlying interaction system and the upper programmable interaction layer.
[0027] Figure 2 is a schematic diagram of a visual programming interface provided in an embodiment of the present invention, illustrating the configuration process of interaction elements, logical combinations, and instruction mapping.
[0028] Figure 3 is a flowchart illustrating the context awareness and dynamic switching of gesture sets according to an embodiment of the present invention.
[0029] Figure 4 is a schematic diagram illustrating an application scenario of an embodiment of the present invention, showing different interaction schemes defined for 3D modeling software and presentation software.
Detailed Implementation
[0030] To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments.
[0031] See Figure 1. The system architecture of this invention is built on a reliable underlying interaction system. This underlying interaction system is preferably implemented by "A Spatial Interaction Method for Smart Gloves Based on Multi-Sensor Fusion", which stably outputs high-precision cursor coordinates P(x,y), discrete gesture events (such as EVENT_CLICK), and continuous parameter streams (such as the sliding vector V(dx,dy)) using sensors on the smart glove (such as a laser emitter, an IMU, and a linear sensor array) together with external positioning devices.
[0032] The core of this invention, the programmable interaction layer, runs as software middleware or a standalone application. It comprises the following main modules:
Interaction element abstraction module: This module receives events and data from the underlying system in real time and encapsulates them into unified "interaction element" objects carrying timestamps and data, storing them in the element library. For example, EVENT_CLICK is encapsulated as the element Element_Click(), and the sliding data stream V is encapsulated as the element Element_SlideVector().
[0033] Visual programming module: As shown in Figure 2, this module provides a graphical interface. The left panel lists all available interaction elements (such as click, long press start, slide vector, and wrist angular velocity) in icon form. Users can drag these icons to the central canvas area. The canvas provides logic control blocks (such as sequence, loop, and conditional statements), and users define the execution logic between elements using the connection tool. The right panel is a mapping configuration panel, where users bind the final output of the logic flow to keyboard keys (such as Ctrl+C), system commands, or specific software APIs.
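A "sequence" logic control block of the kind placed on the canvas can be sketched as a small matcher that fires when events arrive in a given order within a time limit. The class name, the `max_gap` parameter, and the reset policy are assumptions for this sketch; the application does not describe how logic blocks are executed.

```python
from typing import Sequence


class SequenceLogic:
    """Fires when the listed events arrive in order, each within max_gap
    seconds of the previous one (hypothetical logic block)."""

    def __init__(self, steps: Sequence[str], max_gap: float) -> None:
        if not steps:
            raise ValueError("a sequence needs at least one step")
        self.steps = list(steps)
        self.max_gap = max_gap
        self._index = 0
        self._last_time = 0.0

    def feed(self, event_name: str, timestamp: float) -> bool:
        """Feed one event; returns True when the whole sequence completes."""
        # Restart if the user paused too long in the middle of the sequence.
        if self._index > 0 and timestamp - self._last_time > self.max_gap:
            self._index = 0
        if event_name == self.steps[self._index]:
            self._index += 1
            self._last_time = timestamp
            if self._index == len(self.steps):
                self._index = 0
                return True
        else:
            # A wrong event resets progress, but may itself start a new attempt.
            self._index = 1 if event_name == self.steps[0] else 0
            self._last_time = timestamp
        return False
```

A triple contact gesture, for instance, could be expressed as `SequenceLogic(["click", "click", "click"], max_gap=0.5)`, with the 0.5 second limit chosen arbitrarily here.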
[0034] Context-aware engine: As shown in Figure 3, the engine continuously monitors the current cursor coordinates P(x,y) provided by the underlying system and queries the corresponding application window handle and control ID at those coordinates through operating system interfaces (such as the Windows API). Internally, the engine maintains a configuration file (in a format such as XML or JSON) that defines the file paths of the "interaction schemes" corresponding to different "contexts" (such as "Blender.exe - 3D view window" and "PowerPoint.exe - presentation window"). When the cursor moves from one window to another, the engine automatically unloads the old interaction scheme and loads and activates the new one.
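The configuration file that maps contexts to interaction scheme paths might look like the JSON below, with a lookup keyed on the process and window. The field names and the scheme file paths are hypothetical, and the operating system query that yields the process and window under the cursor is left out because it is platform-specific.

```python
import json
from typing import Dict, Optional, Tuple

# Hypothetical configuration; the context labels follow the examples above,
# while the field names and scheme paths are invented for this sketch.
CONFIG_TEXT = """
{
  "contexts": [
    {"process": "Blender.exe", "window": "3D view window",
     "scheme": "schemes/3d_modeling.json"},
    {"process": "PowerPoint.exe", "window": "presentation window",
     "scheme": "schemes/presentation_control.json"}
  ]
}
"""


def load_context_map(text: str) -> Dict[Tuple[str, str], str]:
    """Build a (process, window) -> scheme path lookup from JSON text."""
    entries = json.loads(text)["contexts"]
    # Process names are compared case-insensitively, as on Windows.
    return {(e["process"].lower(), e["window"]): e["scheme"] for e in entries}


def scheme_for(context_map: Dict[Tuple[str, str], str],
               process: str, window: str) -> Optional[str]:
    """Return the scheme path for a context, or None if nothing is bound."""
    return context_map.get((process.lower(), window))
```

Returning `None` for an unbound context lets the engine fall back to a default scheme or to no custom logic at all.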
[0035] Instruction interpreter and dispatcher: This is the core of the runtime. It loads the currently active interaction scheme (in practice an executable script or logic graph generated by the visual programming module) and subscribes to the underlying system's event and data streams. When the input stream matches the logic defined in the scheme, the dispatcher generates the final instruction (such as simulating key press messages, calling DLL functions, or sending network packets) according to preset rules and sends it to the target application.
Example
[0036] A 3D artist wanted to customize a set of gestures for Blender (Figure 4a). He opened the visual programming interface and defined the following:
Gesture "Rotate View": Element_Click() serves as the start trigger. After triggering, the Element_WristAngularVelocity(Yaw) data is continuously mapped, through a scaling factor, to Blender's view rotation command bpy.ops.view3d.rotate().
[0037] Gesture "Adjust Brush Size": Element_Click() + Element_SlideVector(Y). Configured so that, while the click state is active, the dy value of the thumb's up-and-down slide is mapped to an incremental change in the size property of Blender's brush tool.
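The "Adjust Brush Size" gesture combines an event-type element that gates a parameter-type element. A minimal sketch of that gating follows. It assumes the click state lasts from a "click_start" event to a "click_end" event, which is one reading of the description, and it only tracks the size value; writing the value into Blender is deliberately omitted. All names and limits are hypothetical.

```python
class GatedIncrement:
    """Accumulates scaled slide input into a value only while the gate is held."""

    def __init__(self, initial: float, scale: float,
                 minimum: float, maximum: float) -> None:
        self.value = initial
        self.scale = scale
        self.minimum = minimum
        self.maximum = maximum
        self.held = False

    def on_event(self, name: str) -> None:
        # Assumed event names: the gate opens on click start, closes on click end.
        if name == "click_start":
            self.held = True
        elif name == "click_end":
            self.held = False

    def on_slide(self, dy: float) -> float:
        """Apply one slide sample and return the current (clamped) value."""
        if self.held:
            updated = self.value + dy * self.scale
            self.value = min(self.maximum, max(self.minimum, updated))
        return self.value


# Usage: slides are ignored until the click starts.
brush = GatedIncrement(initial=50.0, scale=2.0, minimum=1.0, maximum=500.0)
brush.on_slide(5.0)            # gate closed, value unchanged
brush.on_event("click_start")
brush.on_slide(5.0)            # gate open, value increases by 10
```

Clamping keeps a long swipe from pushing the property outside the range the target tool accepts.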
[0038] He then customized another set of gestures for PowerPoint presentations (Figure 4b):
Gesture "Page Turning": Element_SlideVector(X). Configured so that when the cursor is in the PowerPoint presentation window, left and right swipe gestures are directly mapped to the Page Up / Page Down keys on the keyboard.
[0039] He saved the two schemes separately and configured context mappings: when the cursor was in the Blender window, the "3D Modeling" scheme was activated; when the cursor was in the PowerPoint window, the "Presentation Control" scheme was activated. After that, he didn't need to manually switch between them; the system automatically provided the most suitable interaction method based on the software he was pointing to.
[0040] Furthermore, the artist can export his carefully tuned "Blender Advanced Modeling Gesture Package" and share it with colleagues. Colleagues can simply import the configuration file to get the same efficient operating experience immediately, which demonstrates the convenience of the invention's sharing mechanism.
[0041] In summary, this invention, built upon a powerful and stable underlying spatial interaction technology, constructs a flexible, intelligent, and scalable interaction definition and management framework. It transforms interactive devices from fixed-function "tools" into "platforms" adaptable to countless scenarios, greatly enhancing the practicality and longevity of the technology.
[0042] The above specific embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Those skilled in the art can make several modifications or equivalent substitutions without departing from the spirit and scope of the present invention. These modifications or equivalent substitutions should also be considered to fall within the protection scope of the present invention.
Claims
1. A smart interaction method based on programmable gestures and context awareness, characterized in that the method operates on an interaction system with spatial pointing and gesture recognition capabilities, and includes the following steps:
Gesture element abstraction step: abstracting the underlying interaction data and events output by the interaction system into standardized interaction elements that can be combined and mapped;
Interaction logic programming step: through a configuration interface, allowing users to combine multiple interaction elements through logical relationships to form custom interaction logic, and to map the output of the logic to the control interface of the target controlled object;
Context awareness and activation step: based on the spatial pointing information provided by the interaction system in real time, determining the current interaction context and automatically activating at least one of the custom interaction logics bound to the current interaction context;
Runtime interpretation and instruction distribution step: based on the activated custom interaction logic, interpreting the underlying interaction data and events generated in real time by the interaction system, generating corresponding control instructions, and distributing them to the target controlled object.
2. The method as described in claim 1, characterized in that the interaction elements include: event-type elements, which correspond to discrete gesture-triggered events recognized by the interaction system; and parameter-type elements, which correspond to the data streams continuously output by the interaction system, representing continuous motion or state.
3. The method as described in claim 2, characterized in that the interaction system is a smart glove spatial interaction system based on multi-sensor fusion; the event-type elements include at least click events and long press events defined by the change in the contact state between the thumb and middle finger; the parameter-type elements include at least the two-dimensional sliding vector generated by the thumb sliding on the contact surface, the angular velocity of the wrist rotation, and the offset of the spatial pointing coordinates relative to an anchor point in a specific interaction state.
4. The method as described in claim 1, characterized in that the configuration interface is a visual graphical programming interface which, by providing a visual list of interaction elements, a logical relationship connection tool, and a parameter mapping configuration panel, enables users to construct the custom interaction logic by dragging, connecting lines, and configuring parameters.
5. The method as described in claim 1, characterized in that the method also includes an interaction logic learning step: recording a series of continuous operations performed by the user through the interaction system, automatically analyzing and attempting to parse them into a combination of the interaction elements, and saving them as new custom interaction logic after user confirmation.
6. The method as described in claim 1, characterized in that the spatial pointing information is calculated by the interaction system by fusing positioning data and inertial measurement data, and the interaction context is the application, application window, or interface control corresponding to the spatial pointing information.
7. The method as described in claim 1, characterized in that the method also includes an interaction scheme sharing step: standardizing and encapsulating configuration information containing at least one custom interaction logic and its context binding relationship to support the export, import, distribution, and reuse of the configuration information among users.
8. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that when the processor executes the program, it implements the method as described in any one of claims 1 to 7.
9. An intelligent interaction system, characterized in that it includes: an interaction device used to collect user input actions; and a processing device configured to: a) run the underlying interaction process to generate spatial pointing data, discrete gesture events, and continuous motion parameters; b) run the method as described in any one of claims 1 to 7 to provide a programmable, context-aware interaction control layer.
10. The intelligent interaction system as described in claim 9, characterized in that the interaction device is a smart glove, which integrates at least a positioning unit for providing pointing information, an inertial measurement unit for detecting hand movements, and a sensing unit for detecting finger contact states; the underlying interaction process is a smart glove spatial interaction method based on multi-sensor fusion.