Power mode switching in gaze-and-gesture tracking applications
The controller device dynamically switches between EOG and camera-based eye tracking, and EMG and camera-based gesture recognition to balance energy efficiency and reliability in augmented reality devices, optimizing power usage and user interface performance.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
- Filing Date
- 2024-12-19
- Publication Date
- 2026-06-25
AI Technical Summary
Existing user interface systems for augmented reality devices face challenges in balancing energy efficiency and operational reliability, particularly in integrating eye tracking and gesture recognition technologies like electrooculography (EOG) and electromyography (EMG), which suffer from lower resolution, calibration requirements, and applicability issues.
A controller device switches between different combinations of EOG-based and camera-based eye tracking, and EMG-based and camera-based gesture recognition to optimize power consumption and performance, activating additional modes when higher accuracy or resolution is needed.
This approach enables efficient power management by minimizing camera usage, reducing energy consumption, and ensuring reliable user interface operation in augmented reality devices.
Abstract
Description
[0001] POWER MODE SWITCHING IN GAZE-AND-GESTURE TRACKING APPLICATIONS
[0002] TECHNICAL FIELD
[0003] Embodiments presented herein relate to a method, a controller device, a computer program, and a computer program product for switching power modes in a gaze-and-gesture tracking application.
[0004] BACKGROUND
[0005] Recent advancements in user interface (UI) technology have introduced new means for users to interact with electronic devices. One example involves integration of gesture recognition and eye-tracking technologies, allowing users to control applications without traditional input devices, such as hand controllers, touchscreens, or keyboards. Such systems operate by detecting eye gaze and gestures, enabling intuitive control through a combination of visual confirmations and motion-based inputs.
[0006] Currently, implementations of eye tracking in virtual reality (VR) systems predominantly rely on optical or camera-based technologies, commonly referred to as videooculography (VOG). VOG provides high-resolution tracking with high accuracy but is associated with significant energy consumption.
[0007] Some mixed-reality headsets, and other types of user devices, are based on combining two or more sensor technologies for gesture recognition and eye-tracking. Such user devices often incorporate high-performance displays, processors, and multiple sensors, including externally facing cameras for gesture recognition and optical sensors for eye tracking. While such systems can accommodate the high energy demands of these components due to their size and external power sources, they are not suited for smaller devices, such as augmented reality (AR) glasses, which require more energy-efficient solutions. AR glasses typically operate in low-power modes and are designed for intermittent use, making the implementation of energy-intensive UIs impractical.
[0008] An alternative approach to camera-based eye tracking is electrooculography (EOG), offering markedly lower power requirements. However, EOG has inherent limitations, including lower resolution and a dependency on calibration that is specific to individual users and environmental conditions. Electromyography (EMG) represents another technological approach that can support gesture recognition. EMG sensors capture the electrical activity generated by skeletal muscles, enabling the detection of finger movements and gestures. EMG sensors can be placed on various parts of the body, such as the wrist or forearm, to monitor muscle activation. However, despite its energy efficiency compared to camera-based gesture recognition, EMG-based systems face challenges related to accuracy, calibration, and potential retraining requirements, especially in scenarios where the sensors are frequently removed and reapplied.
[0009] Existing solutions for low-power eye tracking, such as those combining EOG and VOG technologies, have sought to balance the trade-offs between energy efficiency and accuracy. Similarly, EMG-based systems have been explored as a lower-power alternative to camera-based gesture recognition. However, standalone implementations of these technologies are often limited by their individual disadvantages, including calibration requirements and restricted applicability across various usage scenarios.
[0010] Hence, a challenge remains in developing UI systems that combine the above-mentioned technologies to deliver both energy efficiency and operational reliability.
[0011] SUMMARY
[0012] An object of embodiments herein is to address the above challenges.
[0013] A particular object is to enable selection between different combinations of eye tracking and gesture recognition technologies.
[0014] According to a first aspect there is presented a controller device for switching power modes in a gaze-and-gesture tracking application that combines eye tracking for object selection and gesture recognition for action selection. The gaze-and-gesture tracking application is to be run in an AR device. The AR device comprises a display. The controller device comprises processing circuitry. The processing circuitry is configured to cause the controller device to activate a first UI input mode for the gaze-and-gesture tracking application. The first UI input mode is based on a first combination of EOG-based or camera-based eye tracking and EMG-based or camera-based gesture tracking. The processing circuitry is configured to cause the controller device to, responsive to detecting that eye tracking and / or gesture recognition performance of the first UI input mode fails to fulfil a performance criterion for the gaze-and-gesture tracking application, activate a second UI input mode for the gaze-and-gesture tracking application. The second UI input mode is based on a second combination of EOG-based or camera-based eye tracking and EMG-based or camera-based gesture tracking, different from the first combination.
[0015] According to a second aspect there is presented a system. The system comprises the controller device according to the first aspect, and electrodes and sensors to be placed on a user and configured to perform the EOG-based eye tracking and the EMG-based gesture recognition.
[0016] According to a third aspect there is presented a method for switching power modes in a gaze-and-gesture tracking application that combines eye tracking for object selection and gesture recognition for action selection. The gaze-and-gesture tracking application is to be run in an AR device. The AR device comprises a display. The method is performed by a controller device. The method comprises activating a first UI input mode for the gaze-and-gesture tracking application. The first UI input mode is based on a first combination of EOG-based or camera-based eye tracking and EMG-based or camera-based gesture tracking. The method comprises, responsive to detecting that eye tracking and / or gesture recognition performance of the first UI input mode fails to fulfil a performance criterion for the gaze-and-gesture tracking application, activating a second UI input mode for the gaze-and-gesture tracking application. The second UI input mode is based on a second combination of EOG-based or camera-based eye tracking and EMG-based or camera-based gesture tracking, different from the first combination.
According to a fourth aspect there is presented a computer program for switching power modes in a gaze-and-gesture tracking application that combines eye tracking for object selection and gesture recognition for action selection. The gaze-and-gesture tracking application is to be run in an AR device. The AR device comprises a display. The computer program comprises computer code which, when run on processing circuitry of a controller device, causes the controller device to perform actions. One action comprises the controller device activating a first UI input mode for the gaze-and-gesture tracking application. The first UI input mode is based on a first combination of EOG-based or camera-based eye tracking and EMG-based or camera-based gesture tracking. One action comprises the controller device, responsive to detecting that eye tracking and / or gesture recognition performance of the first UI input mode fails to fulfil a performance criterion for the gaze-and-gesture tracking application, activating a second UI input mode for the gaze-and-gesture tracking application. The second UI input mode is based on a second combination of EOG-based or camera-based eye tracking and EMG-based or camera-based gesture tracking, different from the first combination.
[0017] According to a fifth aspect there is presented a computer program product comprising a computer program according to the fourth aspect and a computer readable storage medium on which the computer program is stored. The computer readable storage medium could be a non-transitory computer readable storage medium.
[0018] Advantageously, these aspects enable efficient switching between different combinations of eye tracking and gesture recognition technologies.
[0019] Advantageously, these aspects enable camera-based eye tracking and gesture recognition to be used only when necessary from an eye tracking and gesture recognition performance perspective.
[0020] Advantageously, these aspects enable an eye tracking and gesture recognition combination to be selected that yields substantial power consumption benefits compared to traditional head-mounted displays.
[0021] Other objectives, features and advantages of the enclosed embodiments will be apparent from the following detailed disclosure, from the attached dependent claims as well as from the drawings.
[0022] Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a / an / the element, apparatus, component, means, module, step, etc." are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, module, step, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
BRIEF DESCRIPTION OF THE DRAWINGS
[0023] The inventive concept is now described, by way of example, with reference to the accompanying drawings, in which:
[0024] Fig. 1 is a schematic diagram illustrating a gaze-and-gesture tracking system according to embodiments;
[0025] Figs. 2 and 3 are block diagrams of devices according to embodiments;
[0026] Fig. 4 shows different power modes in a gaze-and-gesture tracking application according to embodiments;
[0027] Figs. 5, 6, and 7 are flowcharts of methods according to embodiments;
[0028] Fig. 8 is a schematic diagram showing structural units of a controller device according to an embodiment; and
[0029] Fig. 9 shows one example of a computer program product comprising computer readable storage medium according to an embodiment.
[0030] DETAILED DESCRIPTION
[0031] The inventive concept will now be described more fully hereinafter with reference to the accompanying drawings, in which certain embodiments of the inventive concept are shown. This inventive concept may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided by way of example so that this disclosure will be thorough and complete, and will fully convey the scope of the inventive concept to those skilled in the art. Like numbers refer to like elements throughout the description. Any step or feature illustrated by dashed lines should be regarded as optional.
[0032] As disclosed above, challenges remain in developing UI systems that combine the above-mentioned technologies to deliver both energy efficiency and operational reliability.
[0033] Fig. 1 is a schematic diagram illustrating AR devices 110, represented by a pair of smart glasses, or AR glasses, for gaze-and-gesture tracking according to embodiments. The gaze-and-gesture tracking might be performed for a gaze-and-gesture tracking application pertaining to gaming, navigation, or tracking user behaviour in an AR environment. Further, the eye-tracking application might be performed for an application pertaining to an online meeting, an online learning event, etc. In these examples, there may be different UI input modes for the gaze-and-gesture tracking application. Gaze-tracking may be performed using either electrooculography-based eye-tracking or camera-based eye-tracking. The electrooculography-based eye-tracking is based on signals obtained from electrodes 120. In more detail, Fig. 1 schematically illustrates different types of implementations of electrooculography-based eye-tracking, and particularly the placements of the electrodes 120. In Fig. 1(a) six electrodes 120 (denoted HR (horizontal right), VL (vertical lower), HL (horizontal left), VU (vertical upper), REF (reference), GND (ground)) are fixed to the skin of a user 150 wearing the AR device 110. In Fig. 1(b) the electrodes (as represented by single electrode 120) are part of the AR device 110 itself. In Fig. 1(c) the electrodes (as represented by single electrode 120) are placed on an in-ear headset 140. In Fig. 1(c) the electrodes might be integrated with audio and microphone circuitry in the in-ear headset 140. The camera-based eye-tracking is based on signals obtained from an inward-facing camera 130 (i.e., a camera arranged to face the user as the user wears the AR device 110). Further, gesture tracking may be performed using either electromyography-based gesture tracking or camera-based gesture tracking. 
The camera-based gesture tracking is performed using one or more front-facing cameras 130 (i.e., a camera arranged to face away from the user as the user wears the AR device 110). One or more biosensors 170 can be put on the fingers, on the forearm, or on the wrist of the user in order to detect finger movements and gestures. In general terms, the biosensor may be arranged to detect different types of biosignals, such as electroencephalography (EEG) signals, EMG signals, electrooculography (EOG) signals, electrocardiogram (ECG) signals, galvanic skin response (GSR) signals, blood volume pulse (BVP) signals, etc. These biosignals will hereinafter be exemplified by EMG signals. For example, EMG-based gesture tracking can be performed based on recordings of the electrical activity produced by skeletal muscles of the user. The biosensor 170 may then be configured to detect the electric potential generated by muscle cells when these cells are electrically or neurologically activated. In the illustrative example of Fig. 1, a biosensor 170 is provided in a smart wearable 160 to be worn by the user. The wearable electronic device 160 may be a smartwatch, or other type of device, such as a smart wristband. The user might wear two such wearable electronic devices 160, e.g., one per wrist, or only one. In case of multiple wearable electronic devices 160 being worn by the user, these wearable electronic devices 160 do not necessarily have to be of the same type. The system 100 further comprises a controller device. In some examples, the controller device is integrated with the AR device 110 or the wearable electronic device 160. In other examples, the controller device is provided in a user equipment. The controller device is configured to control which UI input mode is used for the gaze-and-gesture tracking application, as will be further disclosed hereinafter.
[0034] Whereas both EOG-based eye-tracking and EMG-based gesture recognition technologies are known per se, the herein disclosed embodiments are based on using them jointly and / or in combination with either camera-based eye-tracking or camera-based gesture recognition for enabling low power consumption in AR devices 110.
[0035] Block diagrams of an AR device 210, a wearable electronic device 220, and EOG device 230 will be described next with reference to Fig. 2.
[0036] The AR device 210 comprises at least one camera 210a for eye-tracking (inwards-facing camera(s) plus infrared light-emitting diodes) and at least one camera 210b for gesture recognition (outwards / downwards facing cameras). Further, the AR device 210 comprises a display 210c for rendering a visual user interface on which various selectable objects, or icons, can be rendered in an AR environment. The general operation of the AR device 210 is controlled by a controller 210d, comprising processing circuitry, power and system control, local communication within the AR device 210, etc. Instructions and other types of data can be stored in a memory 210e. A communication interface 210f is provided for communication to external devices, such as a controller device, wearable electronic devices, EOG devices, wireless headphones, and the like. The communication interface 210f might implement any, or any combination of, a Bluetooth interface, an IEEE 802.11 interface, a third-generation partnership project (3GPP) side-link interface, or any other local communication interface, or even a cellular communication interface. The AR device 210 may comprise further modules, such as an EOG-based system for low-power low-accuracy eye-tracking.
[0037] The wearable electronic device 220 comprises at least one first sensor 220a for gesture recognition based on EMG signals, thus configured for sensing movements of muscles etc. of the arm and hand. The wearable electronic device 220 comprises at least one second sensor 220b for force, angular rate, and / or body orientation of the user, such as an inertial measurement unit (IMU). Further, the wearable electronic device 220 comprises a feedback module 220c for providing feedback (tactile, audible, or visual) to a user. The general operation of the wearable electronic device 220 is controlled by a controller 220d, comprising processing circuitry, power and system control, local communication within the wearable electronic device 220, etc. Instructions and other types of data can be stored in a memory 220e. A communication interface 220f is provided for communication to external devices, such as a controller device, AR devices, EOG devices, and the like. The communication interface 220f might implement any, or any combination of, a Bluetooth interface, an IEEE 802.11 interface, a 3GPP side-link interface, or any other local communication interface, or even a cellular communication interface.
[0038] The EOG device 230 comprises at least one first sensor 230a, or electrode, arranged to be fixed to the skin of a user for eye tracking based on EOG signals. Further, the EOG device 230 comprises a feedback module 230b for providing feedback (tactile, audible, or visual) to a user. The general operation of the EOG device 230 is controlled by a controller 230c, comprising processing circuitry, power and system control, local communication within the EOG device 230, etc. Instructions and other types of data can be stored in a memory 230d. A communication interface 230e is provided for communication to external devices, such as a controller device, AR devices, wearable electronic devices, and the like. The communication interface 230e might implement any, or any combination of, a Bluetooth interface, an IEEE 802.11 interface, a 3GPP side-link interface, or any other local communication interface, or even a cellular communication interface.
[0039] As disclosed above, the controller device is configured to control which UI input mode is used for the gaze-and-gesture tracking application. This is illustrated in the block diagram 300 of Fig. 3. The block diagram 300 may be implemented by the controller device. The block diagram 300 comprises a UI control module 310 configured to control which UI input mode is to be used for the gaze-and-gesture tracking application. Since different combinations of eye-tracking and gesture tracking will be used for different UI input modes, the block diagram 300 comprises a gaze and gesture control module 320 configured to provide instructions for activation, deactivation, calibration, and other types of instructions to the AR device, the wearable electronic device, and the EOG device. For this purpose, the block diagram 300 comprises a camera (CAM) control and calibration module 330a that interfaces the AR device, an EMG control and calibration module 330b that interfaces the wearable electronic device, and an EOG control and calibration module 330c that interfaces the EOG device.
[0040] As will be disclosed in more detail hereinafter, there are disclosed methods for switching power modes in a gaze-and-gesture tracking application that combines eye tracking for object selection and gesture recognition for action selection, making use of a combination of EOG and camera-based gaze tracking as well as EMG and camera-based gesture recognition, as well as other combinations. In this way, most of the time, the cameras can be in a sleep mode, which will minimize the power consumption of the AR device. This is made possible through switching between the camera-based techniques and EOG as well as EMG, and the adaptation of the UI in order to maximize their usage. Reference is here made to Fig. 4, which shows a scheme 400 comprising different power modes 410:460 that can be used in a gaze-and-gesture tracking application. Bi-directional arrows indicate some possible switching between the different power modes 410:460. Generally, the power consumption increases from left to right. I.e., the only EOG power mode 410 and / or the only EMG power mode 420 require(s) the least amount of power and the CAM+CAM power mode 460 requires the most power.
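The six power modes of Fig. 4 can be written down as combinations of an eye-tracking source and a gesture-recognition source. The following is a minimal illustrative sketch only: the class and field names are hypothetical, and counting awake cameras is an assumed, crude proxy for power draw (the description states only that power consumption generally increases from left to right).

```python
from enum import Enum
from typing import NamedTuple, Optional


class Eye(Enum):
    EOG = "EOG"
    CAM = "CAM"


class Gesture(Enum):
    EMG = "EMG"
    CAM = "CAM"


class PowerMode(NamedTuple):
    ref: int                    # reference numeral used in Fig. 4
    eye: Optional[Eye]          # None: eye tracking not active in this mode
    gesture: Optional[Gesture]  # None: gesture recognition not active

    def cameras_awake(self) -> int:
        """Number of camera subsystems that must leave sleep mode."""
        return int(self.eye is Eye.CAM) + int(self.gesture is Gesture.CAM)


# Mode names follow the description; "CAM+EOG" pairs EOG-based eye tracking
# with camera-based gesture recognition, "CAM+EMG" the other way round.
MODES = {
    "ONLY_EOG": PowerMode(410, Eye.EOG, None),
    "ONLY_EMG": PowerMode(420, None, Gesture.EMG),
    "EMG+EOG": PowerMode(430, Eye.EOG, Gesture.EMG),
    "CAM+EOG": PowerMode(440, Eye.EOG, Gesture.CAM),
    "CAM+EMG": PowerMode(450, Eye.CAM, Gesture.EMG),
    "CAM+CAM": PowerMode(460, Eye.CAM, Gesture.CAM),
}

if __name__ == "__main__":
    for name, mode in MODES.items():
        print(mode.ref, name, "cameras awake:", mode.cameras_awake())
```

In this sketch the modes 410 to 430 keep both camera subsystems asleep, which corresponds to the stated goal of keeping the cameras in sleep mode most of the time.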
[0041] After start-up of the gaze-and-gesture tracking application, the first power mode to be used is the only EOG power mode 410 in which only EOG-based eye tracking is activated. This is sufficient for basic operation of the AR device until more advanced or higher resolution input or control is required. This is a power mode which the system software (e.g. home screen or basic control setup of the AR environment) and applications have access to, aware of the constraints. This allows maximizing the usage in low-power UI input mode to minimize power consumption.
[0042] In the only EOG power mode 410 icons as displayed on a visual UI of the AR device are separated based on the resolution and accuracy of the EOG-based eye tracking.
[0043] The only EMG power mode 420 is restricted to perform gesture recognition based on one or more EMG sensors placed on one or more wrists of the user, whereas in the EMG+EOG power mode 430 and / or the CAM+EMG power mode 450 it could be possible to perform gesture recognition based on one or more EMG sensors placed on either a single wrist or both wrists of the user. Hence, in the only EMG power mode 420 the UI should be adapted to avoid the need for dual-hand gestures by avoiding those options in menus etc. Further, since the EMG-based gesture recognition may be run in the background (e.g., for calibration purposes) when camera-based gesture recognition is used, a combination of EMG-based and camera-based gesture recognition may be used in the CAM+EOG power mode 440 and in the CAM+CAM power mode 460. In the EMG+EOG power mode 430 only gestures that are accurately recognizable by the EMG-based gesture recognition are used. This can be secured by EMG calibration and feedback to the user about which gestures are possible and guiding the user how these gestures can be better performed (if needed).
[0044] Since EOG has lower resolution and larger inaccuracy (more margin needed), then multiple objects might fall into the detected gaze. If this occurs, not only one, but several icons may be visually marked, e.g. by gracefully indicating a region rather than a single icon on the visual UI of the AR device, and an “activate” gesture would not function unless more accurate selection has been performed. Instead of enabling the “activate” gesture, e.g. “pinch” gestures or “adjust” gestures could be enabled, allowing the user with simple EMG-detectable gestures to fine-tune the selection before any “activate” gesture can be accepted. In many cases, icons and other objects have been spread out sufficiently so that in-region finetuning with EMG gestures should not be needed.
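The region-based selection described above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the circular gaze margin, the `Icon` type, and the function names are hypothetical, and the gesture labels simply reuse the "activate", "pinch", and "adjust" examples from the text.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Set


@dataclass(frozen=True)
class Icon:
    name: str
    x: float
    y: float


def icons_in_gaze(icons: List[Icon], gaze_x: float, gaze_y: float,
                  margin: float) -> List[Icon]:
    """Return icons within `margin` of the estimated gaze point.

    A larger margin models the lower resolution of EOG-based eye tracking.
    """
    return [i for i in icons if hypot(i.x - gaze_x, i.y - gaze_y) <= margin]


def enabled_gestures(candidates: List[Icon]) -> Set[str]:
    """Choose which gestures the UI accepts for the current candidates."""
    if len(candidates) == 1:
        return {"activate"}          # unambiguous selection
    if len(candidates) > 1:
        return {"pinch", "adjust"}   # fine-tune the region first
    return set()                     # nothing under the gaze


if __name__ == "__main__":
    icons = [Icon("mail", 0.0, 0.0), Icon("maps", 1.0, 0.0), Icon("music", 10.0, 0.0)]
    wide = icons_in_gaze(icons, 0.4, 0.0, margin=1.0)
    narrow = icons_in_gaze(icons, 0.4, 0.0, margin=0.5)
    print([i.name for i in wide], enabled_gestures(wide))
    print([i.name for i in narrow], enabled_gestures(narrow))
```

With icons spread further apart than the margin, only one candidate falls into the gaze region and no fine-tuning step is needed, matching the last sentence of the paragraph above.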
[0045] When the gaze-and-gesture tracking application requires a more advanced UI, e.g. more advanced gestures or higher-resolution gaze detection, either the CAM+EOG power mode 440 or the CAM+EMG power mode 450 will be enabled. This occurs automatically without any need for activation from the user. The gesture-based and gaze-based camera systems are enabled independently, depending on the respective needs, in order to minimize their energy consumption. In this way, either the CAM+EOG power mode 440 or the CAM+EMG power mode 450 can be enabled, depending on if either camera-based eye tracking or camera-based gesture recognition is required by the gaze-and-gesture tracking application.
[0046] Further, when the gaze-and-gesture tracking application requires an even more advanced UI, e.g. even more advanced gestures or even higher-resolution gaze detection, the CAM+CAM power mode 460 will be enabled.
[0047] In some examples, whenever camera-based eye tracking and / or camera-based gesture recognition is activated, also EOG and EMG technologies are active in parallel for training purposes, such as finetuning and calibration. This is possible due to the comparatively low power consumption of both EOG-based eye tracking and EMG-based gesture tracking.
[0048] Fig. 5 is a flowchart illustrating embodiments of methods for switching power modes in a gaze-and-gesture tracking application that combines eye tracking for object selection and gesture recognition for action selection. The methods are performed by the controller device 800. The methods are advantageously provided as computer programs. The gaze-and-gesture tracking application is to be run in an AR device 110. The AR device 110 comprises a display.
[0049] S102: The controller device activates a first UI input mode for the gaze-and-gesture tracking application. The first UI input mode is based on a first combination of EOG-based or camera-based eye tracking and EMG-based or camera-based gesture tracking.
S104: The controller device, responsive to detecting that eye tracking and / or gesture recognition performance of the first UI input mode fails to fulfil a performance criterion for the gaze-and-gesture tracking application, activates a second UI input mode for the gaze-and-gesture tracking application. The second UI input mode is based on a second combination of EOG-based or camera-based eye tracking and EMG-based or camera-based gesture tracking, different from the first combination.
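The two steps above amount to a small state machine. A minimal sketch, assuming each UI input mode is represented as an (eye tracking, gesture recognition) tuple and that the outcome of the performance check is supplied by the caller; the class and method names are hypothetical and not taken from the disclosure.

```python
from typing import Optional, Tuple

Mode = Tuple[str, str]  # (eye-tracking source, gesture-recognition source)


class ModeController:
    """Illustrative controller covering only steps S102 and S104."""

    def __init__(self, first_mode: Mode, second_mode: Mode) -> None:
        if first_mode == second_mode:
            raise ValueError("second combination must differ from the first")
        self.first_mode = first_mode
        self.second_mode = second_mode
        self.active: Optional[Mode] = None

    def s102_activate_first(self) -> Mode:
        """S102: activate the first UI input mode."""
        self.active = self.first_mode
        return self.active

    def s104_on_performance(self, criterion_fulfilled: bool) -> Optional[Mode]:
        """S104: switch to the second mode if the criterion is not fulfilled."""
        if self.active == self.first_mode and not criterion_fulfilled:
            self.active = self.second_mode
        return self.active


if __name__ == "__main__":
    ctrl = ModeController(("EOG", "EMG"), ("CAM", "CAM"))
    print(ctrl.s102_activate_first())        # low-power combination
    print(ctrl.s104_on_performance(True))    # criterion met: stay
    print(ctrl.s104_on_performance(False))   # criterion failed: switch
```

How the performance criterion itself is evaluated is left outside this sketch on purpose, since the description lists several alternative criteria.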
[0050] Embodiments relating to further details of switching power modes in a gaze-and-gesture tracking application as performed by the controller device will now be disclosed with continued reference to Fig. 5.
[0052] Different power modes have been disclosed with reference to Fig. 4. In general terms, each of these power modes can be associated with a UI input mode. Focus will here be on four of the power modes, namely EMG+EOG 430, CAM+EOG 440, CAM+EMG 450, and CAM+CAM 460.
[0053] In a first example, a switch is made from EMG+EOG 430 directly to CAM+CAM 460. Thus, in a first embodiment, the first UI input mode is based on EOG-based eye tracking and EMG-based gesture recognition, and the second UI input mode is based on camera-based eye tracking and camera-based gesture recognition.
[0054] In a second example, a switch is made from CAM+EMG 450 to CAM+EOG 440. Thus, in a second embodiment, the first UI input mode is based on camera-based eye tracking and EMG-based gesture recognition, and the second UI input mode is based on EOG-based eye tracking and camera-based gesture recognition.
[0055] In a third example, a switch is made from EMG+EOG 430 to CAM+EOG 440. Thus, in a third embodiment, the first UI input mode is based on EOG-based eye tracking and EMG-based gesture recognition, and the second UI input mode is based on EOG-based eye tracking and camera-based gesture recognition.
[0056] In a fourth example, a switch is made from EMG+EOG 430 to CAM+EMG 450. Thus, in a fourth embodiment, the first UI input mode is based on EOG-based eye tracking and EMG-based gesture recognition, and the second UI input mode is based on camera-based eye tracking and EMG-based gesture recognition.
[0057] In a fifth example, a switch is made from CAM+EOG 440 to CAM+CAM 460. Thus, in a fifth embodiment, the first UI input mode is based on EOG-based eye tracking and camera-based gesture recognition, and the second UI input mode is based on camera-based eye tracking and camera-based gesture recognition.
[0058] In a sixth example, a switch is made from CAM+EMG 450 to CAM+CAM 460. Thus, in a sixth embodiment, the first UI input mode is based on camera-based eye tracking and EMG-based gesture recognition, and the second UI input mode is based on camera-based eye tracking and camera-based gesture recognition.
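The six example switches can be tabulated to see which camera subsystem has to wake up or may return to sleep in each case. This is a hedged sketch: the dictionary layout and the `camera_changes` helper are hypothetical, and only the mode compositions and the six switches themselves come from the examples above.

```python
from typing import Dict, List, Tuple

# (eye tracking, gesture recognition) keyed by the Fig. 4 reference numeral
MODE: Dict[int, Tuple[str, str]] = {
    430: ("EOG", "EMG"),  # EMG+EOG
    440: ("EOG", "CAM"),  # CAM+EOG
    450: ("CAM", "EMG"),  # CAM+EMG
    460: ("CAM", "CAM"),  # CAM+CAM
}

# First to sixth example, in the order given in the description
EXAMPLE_SWITCHES: List[Tuple[int, int]] = [
    (430, 460), (450, 440), (430, 440), (430, 450), (440, 460), (450, 460),
]


def camera_changes(src: int, dst: int) -> Dict[str, str]:
    """Return which camera subsystems wake up or go to sleep on a switch."""
    changes: Dict[str, str] = {}
    for role, before, after in zip(("eye", "gesture"), MODE[src], MODE[dst]):
        if before != "CAM" and after == "CAM":
            changes[role] = "wake"
        elif before == "CAM" and after != "CAM":
            changes[role] = "sleep"
    return changes


if __name__ == "__main__":
    for src, dst in EXAMPLE_SWITCHES:
        print(f"{src} -> {dst}: {camera_changes(src, dst)}")
```

The second example is the only one of the six in which one camera subsystem is put to sleep while the other is woken up; the remaining five only add camera-based sensing.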
[0059] Generally, based on the above, there could be different properties that characterize the different UI input modes.
[0060] Some properties concern the required resolution of the UI and the amount of available gestures. In this respect, in some aspects the resolution is higher and / or there are more gestures in the second UI input mode than in the first UI input mode. Hence, in some embodiments, the second UI input mode is associated with higher resolution gaze detection requirement, and is associated with a greater set of gestures, than the first UI input mode. For example, there may be more and / or closer placed icons, or objects, in the second UI input mode compared to the first UI input mode. That is, in some embodiments, the second UI input mode is associated with the higher resolution gaze detection requirement by comprising more selectable objects and / or selectable objects that are closer together than the first UI input mode.
[0061] As already disclosed, any camera-based system will be enabled when the gaze-and-gesture tracking application requires a more advanced UI. In particular, in some embodiments, the performance criterion fails to be fulfilled when the gaze-and-gesture tracking application has a requirement for a more advanced user interface than provided by the first UI input mode. This more advanced user interface may be associated with higher resolution gaze detection requirements than the first UI input mode and / or is associated with a greater set of gestures than the first UI input mode. In other embodiments, the performance criterion fails to be fulfilled when the number of erroneous eye tracking and gesture recognition results produced using the first UI input mode exceeds some threshold value.
[0062] In yet other embodiments, the performance criterion fails to be fulfilled when the user is performing a gesture (e.g., with their hands in their pockets or with their hands behind their back) that cannot be captured by camera-based gesture recognition.
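One of the criteria above, the error-count threshold, could be tracked as sketched below. The sliding window over the most recent results is an assumption added for illustration (the description only says the number of erroneous results exceeds "some threshold value"), and the class name is hypothetical.

```python
from collections import deque
from typing import Deque


class ErrorCountCriterion:
    """Fulfilled while recent erroneous results do not exceed a threshold."""

    def __init__(self, threshold: int, window: int) -> None:
        self.threshold = threshold
        # Only the `window` most recent results are kept (assumed design).
        self.results: Deque[bool] = deque(maxlen=window)

    def record(self, erroneous: bool) -> None:
        """Store whether the latest tracking/recognition result was wrong."""
        self.results.append(erroneous)

    def fulfilled(self) -> bool:
        """True while the error count is at or below the threshold."""
        return sum(self.results) <= self.threshold


if __name__ == "__main__":
    criterion = ErrorCountCriterion(threshold=2, window=5)
    for outcome in (True, True, True):
        criterion.record(outcome)
    print(criterion.fulfilled())  # three errors exceed the threshold of two
```

A controller could poll `fulfilled()` after each recorded result and trigger the switch to the second UI input mode once it returns false.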
[0063] As already disclosed, upon starting the gaze-and-gesture tracking application, EMG-based gesture recognition is activated. Therefore, in some embodiments, the controller device is configured to, upon start-up of the gaze-and-gesture tracking application, perform (optional) steps S102-2 and S102-4.
[0064] S 102-2: The controller device activates the EOG-based eye tracking and the EMG-based gesture recognition for the first UI input mode.
[0065] S 102-4: The controller device maintains the camera-based eye tracking and gesture recognition in a sleep mode.
[0066] Further, when camera-based eye tracking and gesture recognition is used, a switch back to a low-power UI input mode can be made once the performance of the EOG-based eye tracking and the EMG-based gesture recognition is sufficient, in order to reduce power consumption. Therefore, in some embodiments, the controller device is configured to, when the second UI input mode has been entered, perform steps SI 04-2 and SI 04-6.
[0067] S 104-2: The controller device activates background use of the EOG-based eye tracking and the EMG- based gesture recognition in parallel with the camera-based eye tracking and gesture recognition.
[0068] S 104-6: The controller device, responsive to detecting that performance of the EOG-based eye tracking and the EMG-based gesture recognition fulfil an accuracy criterion and / or when a critical low power condition is detected, deactivates the camera-based eye tracking and gesture recognition for the second UI input mode of the gaze-and-gesture tracking application.
[0069] In some examples, the accuracy criterion is fulfilled when accuracy of the EOG-based eye tracking and the EMG-based gesture recognition is at most within a threshold from the camera-based eye tracking and gesture recognition. Further, in some examples, the critical low power condition is fulfilled when a power supply for the gaze-and-gesture tracking application falls below a power supply threshold level, as in step S102-6.
[0070] S102-6: The controller device activates the EOG-based eye tracking and the EMG-based gesture recognition for the first UI input mode of the gaze-and-gesture tracking application.
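The decision to leave the second UI input mode can be sketched as follows. This is a minimal illustration under stated assumptions: accuracy is taken to be a single number per modality pair, "within a threshold" is read as the camera-based accuracy exceeding the EOG/EMG accuracy by no more than the threshold, and the power supply level is a fraction between 0 and 1. The function name and all parameter names are hypothetical.

```python
def should_return_to_first_mode(
    low_power_accuracy: float,   # accuracy of EOG-based eye tracking and EMG-based gesture recognition
    camera_accuracy: float,      # accuracy of camera-based eye tracking and gesture recognition
    accuracy_threshold: float,   # maximum tolerated accuracy gap (assumed value, not from the text)
    supply_level: float,         # current power supply level, 0.0 to 1.0 (assumed scale)
    supply_threshold: float,     # power supply threshold level
) -> bool:
    """Return True when the camera-based tracking may be deactivated."""
    # Accuracy criterion: the low-power modalities are at most a threshold worse than the camera.
    accuracy_ok = (camera_accuracy - low_power_accuracy) <= accuracy_threshold
    # Critical low power condition: supply has fallen below the threshold level.
    critical_low_power = supply_level < supply_threshold
    # The text joins the two conditions with "and / or", so either one is sufficient here.
    return accuracy_ok or critical_low_power
```

When the function returns True, a controller following this sketch would deactivate the camera-based tracking and activate the EOG-based eye tracking and the EMG-based gesture recognition for the first UI input mode.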
[0071] As previously disclosed, whenever camera-based eye tracking and / or camera-based gesture recognition is activated, EOG and EMG technologies are also active in parallel for training purposes, such as fine-tuning and calibration. Hence, in some embodiments, the controller device is configured to, upon having activated the camera-based eye tracking and gesture recognition, perform (optional) step S104-4.
[0072] S104-4: The controller device initiates calibration of the EOG-based eye tracking and / or the EMG-based gesture recognition.
[0073] One particular embodiment for selecting power mode for a gaze-and-gesture tracking application based on at least some of the above disclosed embodiments will now be disclosed in detail with reference to the flowchart of Fig. 6. One aim of this embodiment is to handle power restrictions by keeping any camera-based technology in sleep mode as long as possible. EMG-based gesture tracking is assumed to be always available. It may not always be able to recognize all gestures, but it is assumed to recognize at least one or several specific start-up gestures, or trigger gestures.
[0074] S201: The gaze-and-gesture tracking application is started. All camera-based technologies remain in sleep mode and EMG-based gesture recognition is activated only for specific start-up gestures, or trigger gestures.
[0075] S202: The EOG-based eye tracking is activated.
S203: The EOG-based eye tracking is calibrated, if not already calibrated for the user.
[0076] S204: It is checked whether the EOG-based eye tracking performance fulfils a performance criterion for the gaze-and-gesture tracking application. If no, step S205 is entered. If yes, step S206 is entered.
[0077] S205: The camera-based eye tracking is activated. The EOG-based eye tracking is calibrated.
[0078] S206: It is checked whether the EMG-based gesture recognition performance fulfils a performance criterion for the gaze-and-gesture tracking application. If yes, step S204 is entered again after some time interval. If no, step S207 is entered.
[0079] S207: It is checked whether the user’s hand, or hands, is / are in the field of view of the camera used for camera-based gesture recognition. If yes, step S209 is entered. If no, step S208 is entered.
[0080] S208: Either feedback is provided to the user that the hands are not visible for the camera-based gesture recognition or the UI input mode is adapted to only consider gesture recognition of a limited set of gestures.
[0081] S209: The camera-based gesture recognition is activated. The EMG-based gesture recognition is calibrated. Step S204 can then be entered again after some time interval.
[0082] S210: It is checked whether the EMG-based gesture recognition performance fulfils a performance criterion for the UI input mode that only considers gesture recognition of the limited set of gestures. If yes, step S204 is entered again after some time interval. If no, step S207 is entered.
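One pass through the checks of steps S204 to S210 can be sketched as below. The sketch is illustrative only and rests on several assumptions that the text does not confirm: that step S206 follows step S205, that the camera-based gesture recognition is activated (S209) when the hands are in the camera's field of view while S208 is reached when they are not (as the wording of S208 suggests), and that step S210 directly follows step S208. The class TrackerState, its fields, and the function run_one_pass are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrackerState:
    """Hypothetical controller state after S201 to S203 (cameras asleep)."""
    camera_eye_tracking: bool = False
    camera_gesture_recognition: bool = False
    limited_gesture_set: bool = False
    needs_hand_recheck: bool = False
    visited: List[str] = field(default_factory=list)


def run_one_pass(state: TrackerState, eog_ok: bool, emg_ok: bool,
                 hands_in_view: bool, emg_ok_limited: bool = True) -> TrackerState:
    """Walk once through S204..S210 and record which steps were entered."""
    state.visited.append("S204")
    if not eog_ok:
        state.visited.append("S205")       # wake the camera-based eye tracking
        state.camera_eye_tracking = True
    state.visited.append("S206")
    if emg_ok:
        return state                        # caller re-enters S204 after some interval
    state.visited.append("S207")
    if hands_in_view:
        state.visited.append("S209")       # wake the camera-based gesture recognition
        state.camera_gesture_recognition = True
    else:
        state.visited.append("S208")       # user feedback or limited gesture set
        state.limited_gesture_set = True
        state.visited.append("S210")
        # If S210 fails, the text returns to S207; the sketch only flags this for the caller.
        state.needs_hand_recheck = not emg_ok_limited
    return state
```

In this sketch both cameras stay asleep for as long as the EOG and EMG checks pass, which reflects the stated aim of keeping camera-based technology in sleep mode as long as possible.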
[0083] One particular embodiment for interaction between the wearable electronic device and the AR device based on at least some of the above disclosed embodiments will now be disclosed in detail with reference to the combined flowchart and signaling diagram of Fig. 7.
[0084] S301: The gaze-and-gesture tracking application is started. The EMG-based gesture recognition is activated only for specific start-up gestures, or trigger gestures.
[0085] S302: The camera-based eye tracking and the EOG-based eye tracking are both in sleep mode.
[0086] S303: The EMG-based gesture recognition recognizes a start gesture as performed by the user and notifies the AR device.
[0087] S304: The EOG-based eye tracking is activated. The camera-based eye tracking remains in sleep mode.
[0088] S305: The EMG-based gesture recognition is activated for all gestures.
[0089] S306: The visual UI in the AR device is activated with spread out icons and other objects.
S307: The EMG-based gesture recognition recognizes a gesture as performed by the user.
[0090] S308: The EOG-based eye tracking determines the gaze of the user. By means of the determined gaze, one of the displayed icons or other objects is identified.
[0091] S309: The relevant action, as given by the gesture recognized in step S307, and associated with the icon identified by the eye tracking in step S308 is performed.
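The combination of steps S307 to S309, where the gaze selects the object and the gesture selects the action, can be illustrated with a short sketch. The text does not state how the icon is identified from the gaze; choosing the icon nearest to the gaze point is an assumption made here for illustration. The function names, the coordinate representation, and the gesture and action names are hypothetical.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def identify_icon(gaze: Point, icons: Dict[str, Point]) -> str:
    """S308 (assumed rule): pick the displayed icon closest to the determined gaze point."""
    return min(icons, key=lambda name: math.dist(gaze, icons[name]))


def perform_action(gesture: str, gaze: Point,
                   icons: Dict[str, Point], actions: Dict[str, str]) -> str:
    """S309: apply the action given by the gesture (S307) to the gazed-at icon (S308)."""
    icon = identify_icon(gaze, icons)
    return f"{actions[gesture]}:{icon}"


# Usage with spread out icons, as in step S306 (positions are made-up display coordinates).
icons = {"mail": (0.0, 0.0), "music": (10.0, 0.0)}
actions = {"pinch": "open"}
result = perform_action("pinch", (8.0, 1.0), icons, actions)
```

With widely spaced icons, a coarse gaze estimate still resolves to the intended icon, which is consistent with the low-power mode being used for a spread out UI.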
[0092] Fig. 8 schematically illustrates, in terms of a number of structural units, the components of a controller device 800 according to an embodiment. Processing circuitry 810 is provided using any combination of one or more of a suitable central processing unit (CPU), multiprocessor, microcontroller, digital signal processor (DSP), etc., capable of executing software instructions stored in a computer program product 910 (as in Fig. 9), e.g. in the form of a storage medium 830. The processing circuitry 810 may further be provided as at least one application specific integrated circuit (ASIC), or field programmable gate array (FPGA).
[0093] Particularly, the processing circuitry 810 is configured to cause the controller device 800 to perform a set of operations, or steps, as disclosed above. For example, the storage medium 830 may store the set of operations, and the processing circuitry 810 may be configured to retrieve the set of operations from the storage medium 830 to cause the controller device 800 to perform the set of operations. The set of operations may be provided as a set of executable instructions.
[0094] Thus, the processing circuitry 810 is thereby arranged to execute methods as herein disclosed. The storage medium 830 may also comprise persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid state memory or even remotely mounted memory. The controller device 800 may further comprise a communications (comm.) interface 820 at least configured for communications with other entities, functions, nodes, and devices, such as the AR device, the wearable electronic device, and the EOG device. As such the communications interface 820 may comprise one or more transmitters and receivers, comprising analogue and digital components. The processing circuitry 810 controls the general operation of the controller device 800 e.g. by sending data and control signals to the communications interface 820 and the storage medium 830, by receiving data and reports from the communications interface 820, and by retrieving data and instructions from the storage medium 830. Other components, as well as the related functionality, of the controller device 800 are omitted in order not to obscure the concepts presented herein.
[0095] The controller device 800 may be provided as a standalone device or as a part of at least one further device. Thus, a first portion of the instructions performed by the controller device 800 may be executed in a first device, and a second portion of the instructions performed by the controller device 800 may be executed in a second device; the herein disclosed embodiments are not limited to any particular number of devices on which the instructions performed by the controller device 800 may be executed. Hence, the methods according to the herein disclosed embodiments are suitable to be performed by a controller device 800 residing in a cloud computational environment. Therefore, although a single processing circuitry 810 is illustrated in Fig. 8, the processing circuitry 810 may be distributed among a plurality of devices, or nodes. The same applies to the computer program 920 of Fig. 9.
[0096] Fig. 9 shows one example of a computer program product 910 comprising computer readable storage medium 930. On this computer readable storage medium 930, a computer program 920 can be stored, which computer program 920 can cause the processing circuitry 810 and thereto operatively coupled entities and devices, such as the communications interface 820 and the storage medium 830, to execute methods according to embodiments described herein. The computer program 920 and / or computer program product 910 may thus provide means for performing any steps as herein disclosed.
[0097] In the example of Fig. 9, the computer program product 910 is illustrated as an optical disc, such as a CD (compact disc) or a DVD (digital versatile disc) or a Blu-Ray disc. The computer program product 910 could also be embodied as a memory, such as a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or an electrically erasable programmable read-only memory (EEPROM) and more particularly as a non-volatile storage medium of a device in an external memory such as a USB (Universal Serial Bus) memory or a Flash memory, such as a compact Flash memory. Thus, while the computer program 920 is here schematically shown as a track on the depicted optical disc, the computer program 920 can be stored in any way which is suitable for the computer program product 910.
[0098] The inventive concept has mainly been described above with reference to a few embodiments. However, as is readily appreciated by a person skilled in the art, other embodiments than the ones disclosed above are equally possible within the scope of the inventive concept, as defined by the appended patent claims.
Claims
1. A controller device (800) for switching power modes in a gaze-and-gesture tracking application that combines eye tracking for object selection and gesture recognition for action selection, wherein the gaze-and-gesture tracking application is to be run in an augmented reality, AR, device (110), wherein the AR device (110) comprises a display, the controller device (800) comprising processing circuitry (810) configured to: activate a first user interface, UI, input mode for the gaze-and-gesture tracking application, wherein the first UI input mode is based on a first combination of electrooculography, EOG, based or camera-based eye tracking and electromyography, EMG, based or camera-based gesture tracking; and responsive to detecting that eye tracking and / or gesture recognition performance of the first UI input mode fails to fulfil a performance criterion for the gaze-and-gesture tracking application: activate a second UI input mode for the gaze-and-gesture tracking application, wherein the second UI input mode is based on a second combination of EOG-based or camera-based eye tracking and EMG-based or camera-based gesture tracking, different from the first combination.
2. The controller device (800) according to claim 1, wherein the first UI input mode is based on EOG-based eye tracking and EMG-based gesture recognition, and the second UI input mode is based on camera-based eye tracking and camera-based gesture recognition.
3. The controller device (800) according to claim 1, wherein the first UI input mode is based on camera-based eye tracking and EMG-based gesture recognition, and the second UI input mode is based on EOG-based eye tracking and camera-based gesture recognition.
4. The controller device (800) according to any preceding claim, wherein the second UI input mode is associated with higher resolution gaze detection requirement, and is associated with a greater set of gestures, than the first UI input mode.
5. The controller device (800) according to claim 4, wherein the second UI input mode is associated with the higher resolution gaze detection requirement by comprising more selectable objects and / or selectable objects that are closer together than the first UI input mode.
6. The controller device (800) according to any preceding claim, wherein the performance criterion fails to be fulfilled when the gaze-and-gesture tracking application has a requirement for a more advanced user interface than provided by the first UI input mode.
7. The controller device (800) according to any preceding claim, wherein the processing circuitry (810) further is configured to, upon start-up of the gaze-and-gesture tracking application: activate the EOG-based eye tracking and the EMG-based gesture recognition for the first UI input mode; and maintain the camera-based eye tracking and gesture recognition in a sleep mode.
8. The controller device (800) according to any preceding claim, wherein the processing circuitry (810) further is configured to, in the second UI input mode: activate background use of the EOG-based eye tracking and the EMG-based gesture recognition in parallel with the camera-based eye tracking and gesture recognition; and responsive to detecting that performance of the EOG-based eye tracking and the EMG-based gesture recognition fulfil an accuracy criterion and / or when a critical low power condition is detected: deactivate the camera-based eye tracking and gesture recognition for the second UI input mode of the gaze-and-gesture tracking application; and activate the EOG-based eye tracking and the EMG-based gesture recognition for the first UI input mode of the gaze-and-gesture tracking application.
9. The controller device (800) according to claim 8, wherein the accuracy criterion is fulfilled when accuracy of the EOG-based eye tracking and the EMG-based gesture recognition is at most within a threshold from the camera-based eye tracking and gesture recognition.
10. The controller device (800) according to claim 8, wherein the critical low power condition is fulfilled when a power supply for the gaze-and-gesture tracking application falls below a power supply threshold level.
11. The controller device (800) according to claim 6, wherein said more advanced user interface is associated with higher resolution gaze detection requirements than the first UI input mode and / or is associated with a greater set of gestures than the first UI input mode.
12. The controller device (800) according to any preceding claim, wherein the performance criterion fails to be fulfilled when the number of erroneous eye tracking and gesture recognition results produced using the first UI input mode exceeds a threshold value.
13. The controller device (800) according to any preceding claim, wherein the processing circuitry (810) further is configured to, upon having activated the camera-based eye tracking and gesture recognition: initiate calibration of the EOG-based eye tracking and / or the EMG-based gesture recognition.
14. A system, the system comprising the controller device (800) according to any preceding claim, and electrodes (120) and sensors (170) to be placed on a user (150) and configured to perform the EOG-based eye tracking and the EMG-based gesture recognition.
15. The system according to claim 14, wherein the system further comprises a camera (130) configured to perform the camera-based eye tracking and gesture recognition.
16. A method for switching power modes in a gaze-and-gesture tracking application that combines eye tracking for object selection and gesture recognition for action selection, wherein the gaze-and-gesture tracking application is to be run in an augmented reality, AR, device (110), wherein the AR device (110) comprises a display, the method being performed by a controller device (800), the method comprising: activating (S102) a first user interface, UI, input mode for the gaze-and-gesture tracking application, wherein the first UI input mode is based on a first combination of electrooculography, EOG, based or camera-based eye tracking and electromyography, EMG, based or camera-based gesture tracking; and responsive to detecting that eye tracking and / or gesture recognition performance of the first UI input mode fails to fulfil a performance criterion for the gaze-and-gesture tracking application: activating (S104) a second UI input mode for the gaze-and-gesture tracking application, wherein the second UI input mode is based on a second combination of EOG-based or camera-based eye tracking and EMG-based or camera-based gesture tracking, different from the first combination.
17. A computer program (920) for switching power modes in a gaze-and-gesture tracking application that combines eye tracking for object selection and gesture recognition for action selection, wherein the gaze-and-gesture tracking application is to be run in an augmented reality, AR, device (110), wherein the AR device (110) comprises a display, the computer program comprising computer code which, when run on processing circuitry (810) of a controller device (800), causes the controller device (800) to: activate (S102) a first user interface, UI, input mode for the gaze-and-gesture tracking application, wherein the first UI input mode is based on a first combination of electrooculography, EOG, based or camera-based eye tracking and electromyography, EMG, based or camera-based gesture tracking; and responsive to detecting that eye tracking and / or gesture recognition performance of the first UI input mode fails to fulfil a performance criterion for the gaze-and-gesture tracking application: activate (S104) a second UI input mode for the gaze-and-gesture tracking application, wherein the second UI input mode is based on a second combination of EOG-based or camera-based eye tracking and EMG-based or camera-based gesture tracking, different from the first combination.
18. A computer program product (910) comprising a computer program (920) according to claim 17, and a computer readable storage medium (930) on which the computer program is stored.