Dynamic sensor selection for visual-inertial odometry systems
By adjusting the sensor output in wearable devices, the problems of low efficiency and low accuracy caused by sensor power consumption were solved, achieving efficient and accurate environmental tracking.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SNAP INC
- Filing Date
- 2021-06-08
- Publication Date
- 2026-06-16
AI Technical Summary
When wearable mobile devices are located in the environment, the power consumption of sensors leads to low system efficiency and low tracking accuracy.
By adjusting the output of multiple sensors, such as turning sensors on/off, changing the sampling rate, adjusting the resolution, or adjusting the sensor quality, the system's ability to track motion in the environment can be improved while maintaining high efficiency without reducing tracking accuracy.
This approach improves system efficiency and maintains tracking accuracy while enhancing the system's motion tracking capabilities, thus avoiding excessive sensor power consumption.
Figure CN115812189B_ABST
Description
[0001] Cross-references to related applications
[0002] This application claims priority to U.S. Provisional Patent Application No. 63/045,583, filed June 29, 2020, and U.S. Patent Application No. 17/122,688, filed December 15, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
[0003] The examples described herein relate to the fields of augmented reality (AR) and wearable mobile devices, such as eyewear devices. More specifically, but not by way of limitation, this disclosure describes augmented reality guidance for users in their environment.
Background
[0004] Wearable mobile devices use various sensors to determine their location within the physical environment. Sensors consume power.
Brief Description of the Drawings
[0005] The features of the various specific embodiments disclosed will be readily understood from the following detailed description with reference to the accompanying drawings. Each feature is represented by a reference numeral in the specification and several views of the drawings. When multiple similar features exist, a single reference numeral can be assigned to each similar feature, using a lowercase letter to indicate the specific feature.
[0006] Unless otherwise stated, the various features shown in the figures are not drawn to scale. The dimensions of individual features may be enlarged or reduced for clarity. The several figures depict one or more specific embodiments and are presented by way of example only and should not be construed as limiting. The drawings include the following figures:
[0007] Figure 1A is a side view (right) of an exemplary hardware configuration of an eyewear device suitable for use in a visual-inertial tracking system;
[0008] Figure 1B is a partial cross-sectional perspective view of the right corner of the eyewear device of Figure 1A, depicting the right visible light camera and circuit board;
[0009] Figure 1C is a side view (left) of an exemplary hardware configuration of the eyewear device of Figure 1A, showing the left visible light camera;
[0010] Figure 1D is a partial cross-sectional perspective view of the left corner of the eyewear device of Figure 1C, depicting the left visible light camera and circuit board;
[0011] Figures 2A and 2B are rear views of exemplary hardware configurations of an eyewear device used in an augmented reality generation system;
[0012] Figure 3 is a graphical depiction of a 3D scene, the left raw image captured by the left visible light camera, and the right raw image captured by the right visible light camera;
[0013] Figure 4 is a functional block diagram of an exemplary visual-inertial odometry system that includes wearable devices (e.g., eyewear devices) and server systems connected via various networks;
[0014] Figure 5 is a graphical representation of an exemplary hardware configuration of a mobile device used in the augmented reality generation system of Figure 4;
[0015] Figure 6 is a schematic illustration of a user in an exemplary environment, used to describe simultaneous localization and mapping;
[0016] Figure 7 is a flowchart outlining the steps of an exemplary method for determining the location of an eyewear device within its environment using multiple sensors;
[0017] Figures 8A, 8B, 8C, 8D, 8E, and 8F are flowcharts outlining the steps of exemplary methods for determining the location of an eyewear device within its environment using multiple sensors.
Detailed Description
[0018] Various implementations and details are described with reference to examples, including a visual-inertial tracking method for an eyewear device with multiple sensors. The eyewear device monitors the sensors of a visual-inertial odometry system (VIOS), which provide input for determining the device's location within its environment. The eyewear device determines the state of the VIOS based on input from one or more of the sensors and adjusts the sensors based on the determined state (e.g., by turning sensors on/off, changing the sampling rate, adjusting the resolution, adjusting sensor quality, or a combination thereof). The eyewear device then uses the adjusted sensors to determine its location within the environment.
[0019] Operating each of the sensors at its peak output typically improves the system's ability to track motion in an environment. However, by adjusting the sensors based on, for example, the amount of information contained in the current image and the state of the tracking system, high system efficiency can be achieved without significantly reducing tracking accuracy.
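By way of illustration only, the monitor, determine-state, and adjust sequence described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed method: the state names, the thresholds, the `SensorConfig` fields, the specific rates and resolutions, and the use of a tracked-feature count as a stand-in for "the amount of information contained in the current image" are all hypothetical.

```python
from dataclasses import dataclass, replace
from enum import Enum


class TrackingState(Enum):
    # Hypothetical VIOS states; the description does not enumerate them.
    STABLE = "stable"
    DEGRADED = "degraded"
    LOST = "lost"


@dataclass(frozen=True)
class SensorConfig:
    enabled: bool
    sample_rate_hz: float
    resolution: tuple  # (width, height) in pixels


def classify_state(feature_count: int, rich: int = 150, poor: int = 40) -> TrackingState:
    """Map an image-information proxy (tracked feature count) to a state.

    The thresholds are placeholders chosen only for illustration.
    """
    if feature_count >= rich:
        return TrackingState.STABLE
    if feature_count >= poor:
        return TrackingState.DEGRADED
    return TrackingState.LOST


def adjust_camera(cfg: SensorConfig, state: TrackingState) -> SensorConfig:
    """Return a new camera configuration for the determined state."""
    if state is TrackingState.STABLE:
        # Plenty of information: lower rate and resolution to save power.
        return replace(cfg, enabled=True, sample_rate_hz=15.0, resolution=(320, 240))
    if state is TrackingState.DEGRADED:
        return replace(cfg, enabled=True, sample_rate_hz=30.0, resolution=(640, 480))
    # Tracking lost: run the sensor at its highest assumed setting.
    return replace(cfg, enabled=True, sample_rate_hz=60.0, resolution=(1280, 960))


camera = SensorConfig(enabled=True, sample_rate_hz=30.0, resolution=(640, 480))
state = classify_state(feature_count=200)
camera = adjust_camera(camera, state)
print(state.name, camera.sample_rate_hz, camera.resolution)
```

In this sketch the camera output is reduced only while the information proxy is high, which mirrors the stated goal of saving power without significantly reducing tracking accuracy.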
[0020] The following detailed description includes systems, methods, techniques, instruction sequences, and computer program products illustrating the examples set forth in this disclosure. Numerous details and examples are included to provide a thorough understanding of the disclosed subject matter and its associated teachings. However, those skilled in the art will understand how to apply the teachings without such details. The aspects of the disclosed subject matter are not limited to the specific devices, systems, and methods described, as the associated teachings can be applied or practiced in various ways. The terminology and naming used herein are for descriptive purposes only and are not intended to be limiting. Typically, well-known examples of instructions, protocols, structures, and techniques are not necessarily shown in detail.
[0021] As used herein, the terms “coupled” or “connected” refer to any logical, optical, physical, or electrical connection (including links, etc.) through which electrical or magnetic signals generated or provided by one system element are transmitted to another coupled or connected system element. Unless otherwise stated, coupled or connected elements or devices are not necessarily directly connected to each other and may be separated by intermediate components, elements, or communication media, one or more of which may modify, manipulate, or carry electrical signals. The term “on” means directly supported by an element, or indirectly supported by the element through another element that is integrated into or supported by the element.
[0022] For purposes of illustration and discussion, the orientation of devices and associated components, and any other complete device incorporating a camera or inertial measurement unit, as shown in any of the accompanying drawings, is given by way of example only. In operation, the eyewear device may be oriented in any other direction suitable for the specific application of the eyewear device, such as up, down, sideways, or any other orientation. Furthermore, for the purposes of this document, any directional terms such as front, back, inside, outside, towards, left, right, sideways, longitudinal, up, down, high, low, top, bottom, side, horizontal, vertical, and diagonal are used by way of example only and do not limit the direction or orientation of any camera or inertial measurement unit as constructed or otherwise described herein.
[0023] Other objects, advantages, and novel features of the examples will be set forth in part in the detailed description below, and in part will become apparent to those skilled in the art upon examination of the following description and the accompanying drawings, or may be learned by the generation or operation of the examples. The objects and advantages of this subject matter may be realized and achieved by means of the methods, means, and combinations particularly pointed out in the appended claims.
[0024] Reference is now made in detail to the accompanying drawings and the examples discussed below.
[0025] Figure 1A is a side view (right) of an exemplary hardware configuration of an eyewear device 100 including a touch-sensitive input device or touchpad 181. As shown, the touchpad 181 may have subtle and barely perceptible boundaries; alternatively, the boundaries may be clearly visible or include raised or otherwise tactile edges that provide feedback to the user about the position and boundaries of the touchpad 181. In other embodiments, the eyewear device 100 may include a touchpad on the left side.
[0026] The surface of touchpad 181 is configured to detect finger touches, taps, and gestures (e.g., movement touches) for use with the GUI displayed on the image display of the eye-wearing device, thereby allowing users to navigate and select menu options in an intuitive way, which improves and simplifies the user experience.
[0027] Detection of finger input on touchpad 181 enables several functions. For example, touching anywhere on touchpad 181 can cause the GUI to display or highlight an item on a display screen, which can be projected onto at least one of optical components 180A, 180B. Double-tapping on touchpad 181 selects an item or icon. Sliding or swiping a finger in a specific direction (e.g., from front to back, from back to front, from top to bottom, or from bottom to top) causes an item or icon to slide or scroll in that direction; for example, to move to the next item, icon, video, image, page, or slide. Sliding a finger in another direction causes sliding or scrolling in the opposite direction; for example, to move to the previous item, icon, video, image, page, or slide. Touchpad 181 can be located virtually anywhere on the eyewear device 100.
[0028] In one example, a recognized single-tap finger gesture on touchpad 181 initiates the selection or pressing of a graphical user interface element in an image displayed on the image displays of optical components 180A and 180B. An adjustment to the displayed image based on the recognized finger gesture can be a primary action, such as selecting or submitting the graphical user interface element on the image displays of optical components 180A and 180B for further display or execution.
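By way of illustration only, the mapping from touchpad gestures to GUI actions described above might be sketched as a small dispatcher. The class name `Carousel`, the gesture labels, and the clamping behavior at the ends of the list are assumptions made for this sketch; the description does not name them.

```python
from dataclasses import dataclass


@dataclass
class Carousel:
    """Hypothetical list of GUI items with one highlighted index."""
    items: list
    index: int = 0
    selected: object = None

    def handle(self, gesture: str) -> None:
        # Gesture names are illustrative labels, not identifiers from the source.
        if gesture == "tap":
            pass  # a touch highlights the current item; the index is unchanged
        elif gesture == "double_tap":
            self.selected = self.items[self.index]  # select the highlighted item
        elif gesture == "swipe_forward":
            self.index = min(self.index + 1, len(self.items) - 1)  # next item
        elif gesture == "swipe_backward":
            self.index = max(self.index - 1, 0)  # previous item
        else:
            raise ValueError(f"unknown gesture: {gesture}")


ui = Carousel(items=["photo", "video", "page"])
for gesture in ["swipe_forward", "swipe_forward", "swipe_backward", "double_tap"]:
    ui.handle(gesture)
print(ui.index, ui.selected)  # prints: 1 video
```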
[0029] As shown in the figure, the eye-wearing device 100 includes a right visible light camera 114B. As further described herein, two cameras 114A and 114B capture image information of the scene from two separate viewpoints. The two captured images can be used to project a 3D display onto an image display for viewing using 3D glasses.
[0030] The eyewear device 100 includes a right optical component 180B, which has an image display for presenting images, such as depth images. As shown in Figures 1A and 1B, the eyewear device 100 includes a right visible light camera 114B. The eyewear device 100 may include multiple visible light cameras 114A, 114B forming a passive three-dimensional camera, such as a stereo camera, wherein the right visible light camera 114B is located at the right corner 110B. As shown in Figures 1C to 1D, the eyewear device 100 also includes a left visible light camera 114A.
[0031] Left and right visible light cameras 114A and 114B are sensitive to wavelengths within the visible light range. Each of the visible light cameras 114A and 114B has a different forward-facing field of view, and the fields of view overlap to enable the generation of three-dimensional depth images; for example, the right visible light camera 114B has a right field of view 111B. Typically, a "field of view" is the portion of a scene that is visible through the camera at a specific position and orientation in space. Fields of view 111A and 111B have an overlapping field of view 304 (Figure 3). When a visible light camera captures an image, objects or object features outside the field of view 111A, 111B are not recorded in the raw image (e.g., a photograph or picture). The field of view describes the angular range or extent over which the image sensors of visible light cameras 114A, 114B pick up electromagnetic radiation of a given scene in a captured image of that scene. The field of view can be expressed as the angular size of the view frustum; i.e., the viewing angle. The viewing angle can be measured horizontally, vertically, or diagonally.
[0032] In the examples, visible light cameras 114A and 114B have a field of view between 40° and 110° (e.g., approximately 100°) and a resolution of 480×480 pixels or greater. "Coverage angle" describes the range of angles within which the lens of visible light cameras 114A, 114B or infrared camera 410 (see Figure 2A) can effectively image. Typically, a camera lens produces an image circle large enough to completely cover the camera's film or sensor, possibly with some degree of vignetting (e.g., the image darkens towards the edges compared to the center). If the camera lens's coverage angle does not extend across the sensor, the image circle will be visible, typically with strong vignetting towards the edges, and the effective field of view will be limited to the coverage angle.
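By way of illustration only, the horizontal, vertical, and diagonal viewing angles mentioned above can be computed for an idealized rectilinear (pinhole) lens from the sensor extent d and focal length f as 2·atan(d / 2f). The pinhole model and the sensor and lens dimensions below are assumptions chosen for the sketch; the description does not specify them.

```python
import math


def view_angle_deg(sensor_extent_mm: float, focal_length_mm: float) -> float:
    """Angle of view for a rectilinear (pinhole) lens: 2 * atan(d / 2f)."""
    return math.degrees(2.0 * math.atan(sensor_extent_mm / (2.0 * focal_length_mm)))


# Hypothetical sensor and lens dimensions, chosen only for illustration.
width_mm, height_mm, focal_mm = 3.6, 3.6, 1.8
diagonal_mm = math.hypot(width_mm, height_mm)

horizontal = view_angle_deg(width_mm, focal_mm)
vertical = view_angle_deg(height_mm, focal_mm)
diagonal = view_angle_deg(diagonal_mm, focal_mm)
print(round(horizontal, 1), round(vertical, 1), round(diagonal, 1))
```

With these assumed dimensions the horizontal and vertical angles are both 90°, inside the 40° to 110° range given in the examples, and the diagonal angle is larger because the sensor diagonal is longer than either side.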
[0033] Examples of such visible light cameras 114A and 114B include high-resolution complementary metal-oxide-semiconductor (CMOS) image sensors and digital VGA (video graphics array) cameras with resolutions of 640p (e.g., 640 × 480 pixels, totaling 0.3 megapixels), 720p, or 1080p. Cameras 114A and 114B can be rolling shutter cameras, in which rows of the sensor array are exposed sequentially, or global shutter cameras, in which all rows of the sensor array are exposed simultaneously. Other examples of visible light cameras 114A and 114B can be used, for example, to capture high-definition (HD) still images and store these images at a resolution of 1642 × 1642 pixels (or greater); or to record high-definition video at a high frame rate (e.g., thirty to sixty frames per second or more) and store the recording at a resolution of 1216 × 1216 pixels (or greater).
[0034] The eye-wearing device 100 can capture image sensor data from visible light cameras 114A and 114B, as well as geolocation data digitized by an image processor, for storage in memory. The visible light cameras 114A and 114B capture corresponding left and right raw images in a two-dimensional spatial domain. These raw images include a pixel matrix in a two-dimensional coordinate system, which includes an X-axis for horizontal positioning and a Y-axis for vertical positioning. Each pixel includes color attribute values (e.g., red pixel light value, green pixel light value, or blue pixel light value); and positioning attributes (e.g., X-axis coordinates and Y-axis coordinates).
[0035] In order to capture stereoscopic images for later display as a 3D projection, the image processor 412 (shown in Figure 4) can be coupled to visible light cameras 114A and 114B to receive and store visual image information. Image processor 412 or another processor controls the operation of visible light cameras 114A and 114B to act as stereo cameras simulating human binocular vision and can add a timestamp to each image. The timestamps on each pair of images allow the images to be displayed together as part of a 3D projection. The 3D projection produces an immersive and realistic experience, which is desired in various contexts including virtual reality (VR) and video games.
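By way of illustration only, using timestamps to associate left and right images into pairs might be sketched as follows. The `Frame` type, the microsecond units, and the matching tolerance are assumptions made for this sketch; the description states only that timestamps allow each pair of images to be displayed together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    camera: str        # "left" (camera 114A) or "right" (camera 114B)
    timestamp_us: int  # timestamp added by the processor, in microseconds (assumed unit)


def pair_frames(left, right, tolerance_us=1000):
    """Pair left/right frames whose timestamps differ by at most tolerance_us.

    Both inputs must be sorted by timestamp. Unmatched frames are skipped.
    """
    pairs, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        delta = left[i].timestamp_us - right[j].timestamp_us
        if abs(delta) <= tolerance_us:
            pairs.append((left[i], right[j]))
            i += 1
            j += 1
        elif delta < 0:
            i += 1  # left frame is too old to match this or any later right frame
        else:
            j += 1  # right frame is too old to match this or any later left frame
    return pairs


left = [Frame("left", t) for t in (0, 33_333, 66_666)]
right = [Frame("right", t) for t in (200, 33_400, 99_999)]
pairs = pair_frames(left, right)
print(len(pairs))  # prints: 2
```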
[0036] Figure 1B is a cross-sectional perspective view of the right corner 110B of the eyewear device 100 of Figure 1A, which depicts the right visible light camera 114B and the circuit board of the camera system. Figure 1C is a side view (left) of an exemplary hardware configuration of the eyewear device 100 of Figure 1A, showing the left visible light camera 114A of the camera system. Figure 1D is a cross-sectional perspective view of the left corner portion 110A of the eyewear device of Figure 1C, which depicts the left visible light camera 114A of the three-dimensional camera and the circuit board.
[0037] Except for the connection and coupling located on the left side 170A, the structure and arrangement of the left visible light camera 114A are substantially similar to those of the right visible light camera 114B. As shown in the example of Figure 1B, the eyewear device 100 includes a right visible light camera 114B and a circuit board 140B, which may be a flexible printed circuit board (PCB). A right hinge 126B connects the right corner 110B to the right temple 125B of the eyewear device 100. In some examples, the right visible light camera 114B, the flexible PCB 140B, or other components such as electrical connectors or contacts may be located on the right temple 125B or the right hinge 126B.
[0038] The right corner portion 110B includes a corner body 190 and a corner cover; the corner cover is omitted from the cross-section of Figure 1B. Disposed inside the right corner 110B are various interconnected circuit boards, such as PCBs or flexible PCBs, that include controller circuits for the right visible light camera 114B, a microphone, low-power wireless circuitry (e.g., for short-range wireless network communication via Bluetooth™), and high-speed wireless circuitry (e.g., for wireless LAN communication via Wi-Fi).
[0039] The right visible light camera 114B is coupled to or disposed on the flexible PCB 140B and is covered by a visible light camera cover lens, which is aimed through an opening formed in the frame 105. For example, the right edge 107B of the frame 105, as shown in Figure 2A, connects to the right corner 110B and includes the opening for the visible light camera cover lens. The frame 105 includes a front side configured to face outwards and away from the user's eye. The opening for the visible light camera cover lens is formed on and extends through the front or outer side of the frame 105. In the example, the right visible light camera 114B has an outward-facing field of view 111B (shown in Figure 3) with a line of sight or viewing angle that is correlated with the right eye of the user of the eyewear device 100. The visible light camera cover lens can also be attached to the front side or outward-facing surface of the right corner 110B, in which an opening is formed with an outward-facing coverage angle, but in a different outward direction. Coupling can also be achieved indirectly via an intermediary member.
[0040] As shown in Figure 1B, the flexible PCB 140B is disposed within the right corner portion 110B and coupled to one or more other components housed in the right corner portion 110B. Although shown as being formed on a circuit board in the right corner portion 110B, the right visible light camera 114B may be formed on a circuit board in the left corner portion 110A, temples 125A, 125B, or frame 105.
[0041] Figures 2A and 2B are rear perspective views of exemplary hardware configurations of the eyewear device 100, including two different types of image displays. The eyewear device 100 is sized and shaped to be worn by a user; in this example, it is in the form of eyeglasses. The eyewear device 100 may take other forms and may be combined with other types of frames, such as headbands, headphones, or helmets.
[0042] In the example of eyeglasses, the eyewear device 100 includes a frame 105 comprising a left edge 107A connected to the right edge 107B via a nose bridge 106 adapted to be supported by the user's nose. The left and right edges 107A, 107B include corresponding apertures 175A, 175B that hold corresponding optical elements 180A, 180B, such as lenses and display devices. The term "lens" as used herein is intended to include a sheet of transparent or translucent glass or plastic having curved or flat surfaces that cause light to converge or diverge, or that cause little or no convergence or divergence.
[0043] Although shown as having two optical elements 180A, 180B, the eyewear device 100 may include other arrangements, such as a single optical element (or it may not include any optical elements 180A, 180B), depending on the application of the eyewear device 100 or the intended user. As previously described, the eyewear device 100 includes a left corner portion 110A adjacent to the left side face 170A of the frame 105 and a right corner portion 110B adjacent to the right side face 170B of the frame 105. The corner portions 110A, 110B may be integrated into the corresponding sides 170A, 170B of the frame 105 (as shown) or implemented as separate components attached to the corresponding sides 170A, 170B of the frame 105. Alternatively, the corner portions 110A, 110B may be integrated into the temples (not shown) attached to the frame 105.
[0044] In one example, the image display of optical components 180A and 180B includes an integrated image display. As shown in Figure 2A, each optical component 180A, 180B includes a suitable display matrix 177, such as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or any other such display. Each optical component 180A, 180B also includes one or more optical layers 176, which may include lenses, optical coatings, prisms, mirrors, waveguides, optical strips, and other optical components in any combination. The optical layers 176A, 176B, ..., 176N (shown as 176A-N in Figure 2A) may include a prism of suitable size and construction that includes a first surface for receiving light from the display matrix and a second surface for emitting light toward the user's eye. The prism of the optical layers 176A-N extends over all or part of the apertures 175A, 175B formed in the left and right edges 107A, 107B to allow the user to see the second surface of the prism when looking through the corresponding left and right edges 107A, 107B. The first surface of the prism of the optical layers 176A-N faces upward from the frame 105, and the display matrix 177 overlies the prism such that photons and light emitted by the display matrix 177 strike the first surface. The prism is sized and shaped such that light is refracted within the prism and directed toward the user's eye by the second surface of the prism of the optical layers 176A-N. In this respect, the second surface of the prism of the optical layers 176A-N may be convex to direct light toward the center of the eye. The prism can optionally be sized and shaped to magnify the image projected by the display matrix 177, and light passes through the prism such that the image viewed from the second surface is larger in one or more dimensions than the image emitted from the display matrix 177.
[0045] In one example, optical layers 176A-N may include a transparent LCD layer (keeping the lens open) unless and until a voltage is applied to make the layer opaque (closing or blocking the lens). An image processor 412 on the eyewear device 100 may execute a program to apply voltage to the LCD layer to create an active shutter system, thereby adapting the eyewear device 100 for viewing visual content displayed as a three-dimensional projection. Technologies other than LCDs may be used in the active shutter mode, including other types of reactive layers that respond to voltage or another type of input.
[0046] In another example, the image display device of optical components 180A and 180B includes a projected image display, as shown in Figure 2B. Each optical component 180A, 180B includes a laser projector 150, which is a three-color laser projector using a scanning mirror or galvanometer. During operation, a light source (such as the laser projector 150) is positioned in or on one of the temples 125A, 125B of the eyewear device 100. In this example, optical component 180B includes one or more optical strips 155A, 155B, ... 155N (shown as 155A-N in Figure 2B), which are spaced apart across the width of the lens of each optical component 180A, 180B, or across the depth of the lens between the front and rear surfaces of the lens.
[0047] As photons projected by the laser projector 150 travel through the lens of each optical component 180A, 180B, they encounter optical strips 155A-N. When a particular photon encounters a particular optical strip, it is either redirected to the user's eye or passed to the next optical strip. A combination of modulation of the laser projector 150 and modulation of the optical strips can control a particular photon or beam of light. In the example, the processor controls the optical strips 155A-N by emitting mechanical, acoustic, or electromagnetic signals. Although shown as having two optical components 180A, 180B, the eye-wearing device 100 may include other arrangements, such as single or three optical components, or each optical component 180A, 180B may be arranged in a different configuration, depending on the application of the eye-wearing device 100 or the intended user.
[0048] As further shown in Figures 2A and 2B, the eyewear device 100 includes a left corner portion 110A adjacent to the left side surface 170A of the frame 105 and a right corner portion 110B adjacent to the right side surface 170B of the frame 105. The corner portions 110A and 110B can be integrated into the corresponding sides 170A and 170B of the frame 105 (as shown) or implemented as separate components attached to the corresponding sides 170A and 170B of the frame 105. Alternatively, the corner portions 110A and 110B can be integrated into the temples 125A and 125B attached to the frame 105.
[0049] In another example, the eyewear device 100 shown in Figure 2B may include two projectors, a left projector 150A (not shown) and a right projector 150B (shown as projector 150). The left optical assembly 180A may include a left display matrix 177A (not shown) or left optical strips 155'A, 155'B, ..., 155'N (155', A to N, not shown), configured to interact with light from the left projector 150A. Similarly, the right optical assembly 180B may include a right display matrix 177B (not shown) or right optical strips 155"A, 155"B, ..., 155"N (155", A to N, not shown), configured to interact with light from the right projector 150B. In this example, the eyewear device 100 includes a left display and a right display.
[0050] Figure 3 is a graphical depiction of a 3D scene 306, a left raw image 302A captured by a left visible light camera 114A, and a right raw image 302B captured by a right visible light camera 114B. As shown, the left field of view 111A may overlap with the right field of view 111B. The overlapping field of view 304 represents the portion of the image captured by both cameras 114A and 114B. The term "overlapping" in relation to field of view means that the pixel matrices in the generated raw images overlap by thirty percent (30%) or more. "Substantially overlapping" means that the pixel matrices in the generated raw images, or in the infrared image of the scene, overlap by fifty percent (50%) or more. As described herein, the two raw images 302A and 302B may be processed to include a timestamp that allows the images to be displayed together as part of a 3D projection.
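By way of illustration only, the 30% and 50% thresholds defined above can be applied to an overlap fraction as sketched below. The function names and the simplifying assumption that the two raw images have equal width and differ only by a horizontal shift are hypothetical; the description defines the thresholds but not how the overlap is measured.

```python
def overlap_fraction(width_px: int, shift_px: int) -> float:
    """Fraction of columns shared by two equal-width images offset by shift_px."""
    shared = max(width_px - abs(shift_px), 0)
    return shared / width_px


def classify_overlap(fraction: float) -> str:
    """Apply the 30% ("overlapping") and 50% ("substantially overlapping") thresholds."""
    if fraction >= 0.50:
        return "substantially overlapping"
    if fraction >= 0.30:
        return "overlapping"
    return "not overlapping"


for shift in (256, 400, 500):
    fraction = overlap_fraction(640, shift)
    print(shift, fraction, classify_overlap(fraction))
```

For a 640-pixel-wide image, the assumed shifts of 256, 400, and 500 pixels give overlap fractions of 0.6, 0.375, and 0.21875 respectively.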
[0051] To capture stereoscopic images, as shown in Figure 3, a pair of raw red-green-blue (RGB) images of a real-world scene 306 is captured at a given moment: a left raw image 302A captured by the left camera 114A and a right raw image 302B captured by the right camera 114B. When these raw images 302A and 302B are processed (e.g., by an image processor 412), a depth image is generated. The generated depth image can be viewed on the optical components 180A and 180B of an eyewear device, on another display (e.g., an image display 580 on a mobile device 401), or on a screen.
[0052] The generated depth image is in a three-dimensional spatial domain and may include a vertex matrix in a three-dimensional positional coordinate system, which includes an X-axis for horizontal positioning (e.g., length), a Y-axis for vertical positioning (e.g., height), and a Z-axis for depth (e.g., distance). Each vertex may include color attributes (e.g., red pixel light value, green pixel light value, or blue pixel light value); positional attributes (e.g., X-coordinate, Y-coordinate, and Z-coordinate); texture attributes; reflectance attributes; or combinations thereof. Texture attributes quantify the perceptual texture of the depth image, such as the spatial arrangement of colors or intensities in the vertex regions of the depth image.
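By way of illustration only, one common way to obtain a vertex with X, Y, and Z position attributes from a matched pair of pixels is the rectified pinhole stereo relation Z = f·B / d, where f is the focal length in pixels, B is the camera baseline, and d is the disparity. The description does not state which depth algorithm is used, so this model, the `Vertex` type, and all numeric values below are assumptions for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vertex:
    x: float    # horizontal position (e.g., length), in meters
    y: float    # vertical position (e.g., height), in meters
    z: float    # depth (e.g., distance), in meters
    rgb: tuple  # color attribute carried over from the raw image pixel


def to_vertex(u, v, disparity_px, rgb, focal_px, baseline_m, cx, cy):
    """Back-project one matched pixel into a 3D vertex (rectified stereo assumed)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    z = focal_px * baseline_m / disparity_px
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return Vertex(x, y, z, rgb)


# Hypothetical calibration and measurement values, chosen only for illustration.
vertex = to_vertex(
    u=400, v=240, disparity_px=20.0, rgb=(200, 120, 40),
    focal_px=500.0, baseline_m=0.1, cx=320.0, cy=240.0,
)
print(vertex)
```

With these assumed values the vertex lies about 2.5 m from the cameras and about 0.4 m to the right of the optical axis.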
[0053] In one example, an interactive augmented reality system 400 (Figure 4) includes an eyewear device 100, which includes a frame 105, a left temple 110A extending from the left side 170A of the frame 105, and a right temple 125B extending from the right side 170B of the frame 105. The eyewear device 100 may further include at least two visible light cameras 114A, 114B having overlapping fields of view. In one example, the eyewear device 100 includes a left visible light camera 114A having a left field of view 111A, as shown in Figure 3. The left camera 114A is attached to the frame 105 or the left temple 110A to capture a left raw image 302A from the left side of scene 306. The eyewear device 100 further includes a right visible light camera 114B having a right field of view 111B. The right camera 114B is attached to the frame 105 or the right temple 125B to capture a right raw image 302B from the right side of scene 306.
[0054] Figure 4 is a functional block diagram of an exemplary interactive augmented reality system 400, which includes wearable devices (e.g., eyewear device 100), mobile devices 401, and server systems 498 connected via various networks 495 (such as the Internet). The interactive augmented reality system 400 includes a low-power wireless connection 425 and a high-speed wireless connection 437 between the eyewear device 100 and the mobile device 401.
[0055] As shown in Figure 4 and described herein, the eyewear device 100 includes one or more visible light cameras 114A, 114B that capture still images, video images, or both. Cameras 114A, 114B may have direct memory access (DMA) to high-speed circuitry 430 and function as stereo cameras. Cameras 114A, 114B can be used to capture initial depth images, which can be rendered into three-dimensional (3D) models that are texture-mapped images of a red-green-blue (RGB) imaged scene. Device 100 may also include a depth sensor 213 that uses infrared signals to estimate the location of an object relative to device 100 (e.g., a high-contrast region). In some examples, depth sensor 213 includes one or more infrared emitters 215 and an infrared camera 410.
[0056] The eyewear device 100 further includes two image displays, one for each optical component 180A, 180B (one associated with the left side 170A and one associated with the right side 170B). The eyewear device 100 also includes an image display driver 442, an image processor 412, low-power circuitry 420, and high-speed circuitry 430. The image displays of the optical components 180A, 180B are used to present images, including still images, video images, or both. The image display driver 442 is coupled to the image displays of the optical components 180A, 180B to control the display of the images.
[0057] The eye-wearing device 100 also includes one or more speakers 440 (e.g., one associated with the left side of the eye-wearing device and another associated with the right side of the eye-wearing device). The speakers 440 are incorporated into the frame 105, temple 125, or corner 110 of the eye-wearing device 100. The one or more speakers 440 are driven by an audio processor 443 under the control of a low-power circuit 420, a high-speed circuit 430, or both. The speakers 440 are used to present audio signals, including, for example, a beat track. The audio processor 443 is coupled to the speakers 440 to control the presentation of sound.
[0058] The components shown in Figure 4 for the eye-wearing device 100 are located on one or more circuit boards, such as printed circuit boards (PCBs) or flexible printed circuit boards (FPCs) located in the edges or temples. Alternatively or additionally, the depicted components may be located in the corners, frames, hinges, or nose bridge of the eye-wearing device 100. The left and right visible light cameras 114A, 114B may include digital camera elements, such as complementary metal-oxide-semiconductor (CMOS) image sensors, charge-coupled devices, lenses, or any other corresponding visible or light-capturing elements that can be used to capture data, including still images or videos of scenes with unknown objects.
[0059] As shown in Figure 4, the high-speed circuit 430 includes a high-speed processor 432, a memory 434, and a high-speed wireless circuit 436. In this example, an image display driver 442 is coupled to the high-speed circuit 430 and operated by the high-speed processor 432 to drive the left and right image displays of each optical component 180A, 180B. The high-speed processor 432 can be any processor capable of managing the high-speed communication and operation of any general-purpose computing system required by the eye-wear device 100. The high-speed processor 432 includes the processing resources required to manage high-speed data transmission over a high-speed wireless connection 437 to a wireless local area network (WLAN) using the high-speed wireless circuit 436.
[0060] In some examples, the high-speed processor 432 executes an operating system, such as the LINUX operating system or other such operating system of the eye-wear device 100, and the operating system is stored in memory 434 for execution. Among other duties, the high-speed processor 432, which executes the software architecture of the eye-wear device 100, also manages data transmissions utilizing the high-speed wireless circuit 436. In some examples, the high-speed wireless circuit 436 is configured to implement the Institute of Electrical and Electronics Engineers (IEEE) 802.11 communication standard, also referred to herein as Wi-Fi. In other examples, the high-speed wireless circuit 436 may implement other high-speed communication standards.
[0061] Low-power circuitry 420 includes a low-power processor 422 and low-power wireless circuitry 424. The low-power wireless circuitry 424 and high-speed wireless circuitry 436 of the eye-wear device 100 may include short-range transceivers (Bluetooth™ or Bluetooth Low Energy (BLE)) and wireless wide-area, local-area, or wide-area network transceivers (e.g., cellular or Wi-Fi). Mobile device 401, including the transceivers communicating via low-power wireless connection 425 and high-speed wireless connection 437, can be implemented using details of the architecture of the eye-wearing device 100, as can other elements of network 495.
[0062] Memory 434 includes any storage device capable of storing various data and applications, including camera data generated by the left and right visible light cameras 114A, 114B, the infrared camera 410, the image processor 412, and images generated for display by the image display driver 442 for each optical component 180A, 180B. While memory 434 is shown as integrated with high-speed circuitry 430, in other examples, memory 434 may be a separate, independent component of the eye-wearing device 100. In some such examples, electrical routing lines may provide a connection to memory 434 through a chip that includes the high-speed processor 432, from either the image processor 412 or the low-power processor 422. In other examples, the high-speed processor 432 may manage addressing of memory 434 such that the low-power processor 422 will activate the high-speed processor 432 whenever a read or write operation involving memory 434 is required.
[0063] As shown in Figure 4, the high-speed processor 432 of the eye-wearing device 100 can be coupled to a camera system (visible light cameras 114A, 114B), an image display driver 442, a user input device 491, and a memory 434. As shown in Figure 5, the CPU 540 of the mobile device 401 can be coupled to the camera system 570, the mobile display driver 582, the user input layer 591, and the memory 540A.
[0064] Server system 498 may be one or more computing devices as part of a service or network computing system, including, for example, a processor, memory, and a network communication interface to communicate with eye-wearing device 100 and mobile device 401 via network 495.
[0065] The output components of the eye-worn device 100 include visual elements, such as the left and right image displays associated with each lens or optical component 180A, 180B, as shown in Figures 2A and 2B (e.g., displays such as liquid crystal displays (LCDs), plasma display panels (PDPs), light-emitting diode (LED) displays, projectors, or waveguides). The eye-wearing device 100 may include user-facing indicators (e.g., LEDs, speakers, or vibration actuators) or outward-facing signals (e.g., LEDs, speakers). The image display of each optical component 180A and 180B is driven by an image display driver 442. In some exemplary configurations, the output components of the eye-wearing device 100 further include additional indicators, such as audible elements (e.g., speakers), tactile elements (e.g., actuators, such as vibration motors for generating tactile feedback), and other signal generators. For example, the device 100 may include a user-facing set of indicators and an outward-facing set of signals. The user-facing set of indicators is configured to be seen or otherwise perceived by a user of the device 100. For example, the device 100 may include an LED display positioned so that a user can see it, one or more speakers positioned to generate sounds that a user can hear, or actuators providing tactile feedback that a user can feel. The outward-facing set of signals is configured to be seen or otherwise perceived by an observer near device 100. Similarly, device 100 may include LEDs, speakers, or actuators configured and positioned to be perceived by an observer.
[0066] The input components of the eye-wearing device 100 may include alphanumeric input components (e.g., a touchscreen or touchpad configured to receive alphanumeric input, a photographic optical keyboard, or other alphanumeric-configured elements), point-based input components (e.g., a mouse, touchpad, trackball, joystick, motion sensor, or other pointing instrument), haptic input components (e.g., a push-button switch, a touchscreen or touchpad that senses the position, force, or position and force of a touch or touch gesture, or other haptic-configured elements), and audio input components (e.g., a microphone). The mobile device 401 and server system 498 may include alphanumeric, point-based, haptic, audio, and other input components.
[0067] In some examples, the eye-wearing device 100 includes motion-sensing components referred to as an inertial measurement unit (IMU) 472. These motion-sensing components can be microelectromechanical systems (MEMS) with micro-moving parts, typically small enough to be part of a microchip. In some exemplary configurations, the IMU 472 includes an accelerometer, a gyroscope, and a magnetometer. The accelerometer senses the linear acceleration (including acceleration due to gravity) of the device 100 relative to three orthogonal axes (x, y, z). The gyroscope senses the angular velocity of the device 100 about three rotational axes (pitch, roll, yaw). Together, the accelerometer and gyroscope provide positioning, orientation, and motion data about the device relative to six axes (x, y, z, pitch, roll, yaw). If a magnetometer is present, it senses the heading of the device 100 relative to the magnetic north pole. The positioning of device 100 can be determined by position sensors such as GPS unit 473, one or more transceivers for generating relative positioning coordinates, altitude sensors or barometers, and other orientation sensors. Such positioning system coordinates can also be received from mobile device 401 via low-power wireless circuit 424 or high-speed wireless circuit 436 through wireless connections 425 and 437.
[0068] IMU 472 may include, or cooperate with, a digital motion processor or program that acquires raw data from the components and calculates multiple useful values regarding the positioning, orientation, and motion of device 100. For example, acceleration data acquired from the accelerometer may be integrated to obtain velocity relative to each axis (x, y, z), and integrated again to obtain the position of device 100 (represented in linear coordinates x, y, and z). Angular velocity data from the gyroscope may be integrated to obtain the angular position of device 100 (represented in spherical coordinates). The program used to calculate these useful values may be stored in memory 434 and executed by the high-speed processor 432 of the eye-wearing device 100.
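The double integration described above can be illustrated with a minimal sketch. It assumes a fixed sample interval, acceleration samples that are already gravity-compensated and expressed in a fixed frame, and simple rectangular (Euler) integration; the names integrate and dead_reckon are hypothetical and do not come from this description.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def integrate(samples: Sequence[Vec3], dt: float,
              initial: Vec3 = (0.0, 0.0, 0.0)) -> List[Vec3]:
    """Cumulative rectangular (Euler) integration, applied per axis."""
    x, y, z = initial
    out: List[Vec3] = []
    for sx, sy, sz in samples:
        x += sx * dt
        y += sy * dt
        z += sz * dt
        out.append((x, y, z))
    return out


def dead_reckon(accel: Sequence[Vec3], dt: float) -> Tuple[List[Vec3], List[Vec3]]:
    """Integrate acceleration once for velocity and again for position.

    Assumption: accel is gravity-compensated and the device starts at rest
    at the origin. Real IMU data drifts quickly under double integration,
    which is one reason camera data is fused with it.
    """
    velocity = integrate(accel, dt)
    position = integrate(velocity, dt)
    return velocity, position


if __name__ == "__main__":
    # Constant 1 m/s^2 along x for four samples taken 0.5 s apart.
    vel, pos = dead_reckon([(1.0, 0.0, 0.0)] * 4, dt=0.5)
    print(vel[-1], pos[-1])
```

The same integrate helper could be applied to gyroscope angular velocity samples to accumulate angles, under the same fixed-interval assumption.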
[0069] The eye-worn device 100 may optionally include additional peripheral sensors, such as biometric sensors, characteristic sensors, or display elements integrated with the eye-worn device 100. For example, peripheral device elements may include any I/O components, including output components, motion components, positioning components, or any other such components described herein. For example, biometric sensors may include components that detect expressions (e.g., hand gestures, facial expressions, vocal expressions, body posture, or eye tracking), measure biosignals (e.g., blood pressure, heart rate, body temperature, sweating, or brain waves), or identify a person (e.g., identification based on voice, retina, facial features, fingerprints, or electrophysiological signals such as electroencephalogram data).
[0070] Mobile device 401 may be a smartphone, tablet, laptop, access point, or any other such device capable of connecting to eye-wearing device 100 using both low-power wireless connection 425 and high-speed wireless connection 437. Mobile device 401 connects to server system 498 and network 495. Network 495 may include any combination of wired and wireless connections.
[0071] The interactive augmented reality system 400 shown in Figure 4 includes a computing device, such as a mobile device 401, coupled via a network to an eye-wearing device 100. The interactive augmented reality system 400 includes a memory for storing instructions and a processor for executing the instructions. The processor 432 executes the instructions of the interactive augmented reality system 400 to configure the eye-wearing device 100 to cooperate with the mobile device 401. The interactive augmented reality system 400 may utilize the memory 434 of the eye-wearing device 100 or the memory elements 540A, 540B, 540C of the mobile device 401 (Figure 5). Furthermore, the interactive augmented reality system 400 may utilize the processor elements 432, 422 of the eye-wearing device 100 or the central processing unit (CPU) 540 of the mobile device 401 (Figure 5). In addition, the interactive augmented reality system 400 can further utilize the memory and processor elements of the server system 498. In this respect, the memory and processing capabilities of the interactive augmented reality system 400 can be shared or distributed across the eye-wearing device 100, the mobile device 401, and the server system 498.
[0072] Memory 434 includes a song file 482 and virtual objects 484. The song file 482 includes a rhythm (e.g., a beat track) and optional note sequences and note values. A note is a symbol representing a specific pitch or other musical sound. Note values include the duration of a played note relative to a rhythm and may include other characteristics such as loudness, emphasis, articulation, and phrasing relative to other notes. In some implementations, the rhythm includes default values for a user interface through which the user can select a specific rhythm to use during song playback. Virtual objects 484 include image data for identifying objects or features in images captured by camera 114. These objects may be physical features, such as known paintings or physical markers used to locate the eye-wearing device 100 within an environment.
[0073] The memory 434 also includes a positioning detection program 460, a marker registration program 462, a positioning program 464, a virtual object rendering program 466, a physics engine 468, and a prediction engine 470, each executed by the processor 432. The positioning detection program 460 configures the processor 432 to determine positioning (location and orientation) within the environment, for example, using the positioning program 464. The marker registration program 462 configures the processor 432 to register markers within the environment. Markers can be predefined physical markers with known locations within the environment, or specific locations specified by the processor 432 relative to the environment in which the eyewear device 100 is operating, or relative to the eyewear device itself. The positioning program 464 configures the processor 432 to obtain positioning data for determining the positioning of the eyewear device 100, virtual objects rendered by the eyewear device, or combinations thereof. The positioning data can be derived from a series of images, the IMU 472, the GPS unit 473, or combinations thereof. The virtual object rendering program 466 configures the processor 432 to render virtual images for display by image display 180 under the control of image display driver 442 and image processor 412. The physics engine 468 configures the processor 432 to apply physical laws (such as gravity and friction) to the virtual world, for example, between virtual game objects. The prediction engine 470 configures the processor 432 to predict the expected movement of an object (such as eye-wearing device 100) based on its current heading, input from sensors (such as IMU 472), images of the environment, or a combination thereof.
[0074] Figure 5 is a high-level functional block diagram of an exemplary mobile device 401. Mobile device 401 includes flash memory 540A storing programs to be executed by CPU 540 to run all or a subset of the functions described herein.
[0075] The mobile device 401 may include a camera 570, which includes at least two visible light cameras (first and second visible light cameras with overlapping fields of view) or at least one visible light camera with substantially overlapping fields of view and a depth sensor. Flash memory 540A may further include a plurality of images or videos generated via camera 570.
[0076] As shown in Figure 5, the mobile device 401 includes an image display 580, a mobile display driver 582 that controls the image display 580, and a display controller 584. In the example of Figure 5, the image display 580 includes a user input layer 591 (e.g., a touchscreen) that is overlaid on top of the screen used by the image display 580 or otherwise integrated into the screen.
[0077] Examples of usable touchscreen mobile devices include (but are not limited to) smartphones, personal digital assistants (PDAs), tablets, laptops, or other portable devices. However, the structure and operation of touchscreen devices are provided by way of example; the subject matter described herein is not intended to be limited thereto. For the purposes of this discussion, Figure 5 therefore provides a block diagram illustration of an exemplary mobile device 401 with a user interface, the user interface including a touchscreen input layer 591 for receiving input (touch via hand, stylus, or other tool, multi-touch or gesture, etc.) and an image display 580 for displaying content.
[0078] As shown in Figure 5, mobile device 401 includes at least one digital transceiver (XCVR) 510, shown as a WWAN XCVR, for digital wireless communication via a wide-area wireless mobile communication network. Mobile device 401 also includes additional digital or analog transceivers, such as a short-range transceiver (XCVR) 520 for short-range network communication via NFC, VLC, DECT, ZigBee, Bluetooth™, or Wi-Fi. For example, the short-range XCVR 520 can take the form of any available bidirectional wireless local area network (WLAN) transceiver that is compatible with one or more standard communication protocols implemented in a wireless local area network (e.g., one of the Wi-Fi standards compliant with IEEE 802.11).
[0079] To generate location coordinates for locating mobile device 401, mobile device 401 may include a Global Positioning System (GPS) receiver. Alternatively or additionally, mobile device 401 may utilize either or both of the short-range XCVR 520 and the WWAN XCVR 510 to generate location coordinates. For example, positioning systems based on cellular networks, Wi-Fi, or Bluetooth™ can generate very accurate location coordinates, especially when used in combination. These location coordinates can be transmitted to the eye-wearing device via one or more network connections through the XCVRs 510, 520.
[0080] Transceivers 510 and 520 (i.e., network communication interfaces) conform to one or more of the various digital wireless communication standards utilized by modern mobile networks. Examples of WWAN transceivers 510 include (but are not limited to) transceivers configured to operate according to Code Division Multiple Access (CDMA) and 3rd Generation Partnership Project (3GPP) network technologies, including, for example, but not limited to, 3GPP Type 2 (or 3GPP2) and LTE, sometimes referred to as "4G". For example, transceivers 510 and 520 provide bidirectional wireless communication of information including digitized audio signals, still images and video signals, web page information for display and web-related input, and various types of mobile messaging communications to / from mobile device 401.
[0081] Mobile device 401 further includes a microprocessor used as a central processing unit (CPU), shown as CPU 540 in Figure 4. A processor is a circuit whose elements are constructed and arranged to perform one or more processing functions (typically various data processing functions). Although discrete logic components can be used, these examples utilize components that form a programmable CPU. A microprocessor, for example, includes one or more integrated circuit (IC) chips that incorporate electronic components that perform the functions of the CPU. For example, the CPU 540 can be based on any known or available microprocessor architecture, such as Reduced Instruction Set Computing (RISC) using the ARM architecture, as is commonly used today in mobile devices and other portable electronic devices. Of course, other arrangements of the processor circuitry can be used to form the CPU 540 or processor hardware in smartphones, laptops, and tablets.
[0082] By configuring the mobile device 401 to perform various operations, for example, according to instructions or programs executable by the CPU 540, the CPU 540 acts as a programmable host controller for the mobile device 401. Such operations may include, for example, various general operations of the mobile device, as well as operations related to programs for applications on the mobile device. Although the processor can be configured using hardwired logic, a typical processor in a mobile device is a general-purpose processing circuit configured by executing programs.
[0083] Mobile device 401 includes a memory or storage system for storing programs and data. In this example, the memory system may include, as needed, flash memory 540A, random access memory (RAM) 540B, and other memory components 540C. RAM 540B serves as a short-term storage device for instructions and data processed by CPU 540, for example, as working data processing memory. Flash memory 540A typically provides long-term storage.
[0084] Therefore, in the example of mobile device 401, flash memory 540A is used to store programs or instructions executed by CPU 540. Depending on the type of device, mobile device 401 stores and runs a mobile operating system, through which specific applications are executed. Examples of mobile operating systems include Google Android, Apple iOS (for iPhone or iPad devices), Windows Mobile, Amazon FireOS, RIM BlackBerry OS, etc.
[0085] The processor 432 within the eye-wearing device 100 constructs a map of the environment surrounding the eye-wearing device 100, determines the position of the eye-wearing device within the mapped environment, and determines the relative position of the eye-wearing device with respect to one or more objects (e.g., high-contrast areas) within the mapped environment. In one example, the processor 432 constructs the map and uses a visual inertial odometry (VIO) algorithm applied to data received from one or more sensors to determine position and orientation information. In augmented reality scenarios, the VIO algorithm is used to determine the localization and orientation of the device 100 in real time by analyzing associated camera images obtained from the device's environment. Mathematical solutions can be approximated using various statistical methods, such as particle filtering, Kalman filtering, extended Kalman filtering (EKF), covariance intersection, nonlinear optimization, and machine learning.
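As one illustration of the statistical methods listed above, the following is a minimal sketch of a scalar (one-dimensional) Kalman filter, in which a motion prediction (for example, from an IMU) is corrected by a position measurement (for example, from a camera). It is a textbook simplification and not the estimator of this description; the function names and noise values are hypothetical.

```python
from typing import Tuple


def kalman_predict(x: float, p: float, u: float, q: float) -> Tuple[float, float]:
    """Propagate state x by motion u; variance p grows by process noise q."""
    return x + u, p + q


def kalman_update(x: float, p: float, z: float, r: float) -> Tuple[float, float]:
    """Correct state x with measurement z of variance r."""
    k = p / (p + r)            # Kalman gain: how much to trust the measurement
    x_new = x + k * (z - x)    # blend prediction and measurement
    p_new = (1.0 - k) * p      # uncertainty shrinks after the update
    return x_new, p_new


if __name__ == "__main__":
    x, p = 0.0, 1.0                               # initial estimate and variance
    x, p = kalman_update(x, p, z=1.0, r=1.0)      # camera-like measurement
    x, p = kalman_predict(x, p, u=1.0, q=0.5)     # IMU-like motion step
    print(x, p)
```

A practical visual-inertial estimator tracks a multi-dimensional state (position, orientation, velocity, sensor biases) and typically uses an extended Kalman filter or nonlinear optimization, but the predict and update roles are the same.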
[0086] Sensor data includes images received from one or more cameras (e.g., cameras 114A, 114B), motion data from the IMU 472, data from a depth sensor, distances received from a laser rangefinder, positioning information received from GPS unit 473, a combination of two or more such sensor data, or data from other sensors that provide data for determining positioning information.
[0087] Figure 6 depicts an exemplary environment 600, together with elements used for natural feature tracking (NFT; for example, tracking applications using SLAM algorithms). A user 602 of the eye-wearing device 100 is present within the exemplary physical environment 600 (which, in Figure 6, is an interior room). The processor 432 of the eye-wear device 100 uses the captured images to determine its location relative to one or more high-contrast regions (shown as object 604) within the environment 600, constructs a map of the environment 600 using the coordinate system (x, y, z) of the environment 600, and determines its position within the coordinate system. Additionally, the processor 432 determines the head pose (roll, pitch, and yaw) of the eye-wear device 100 within the environment by using two or more location points (e.g., three location points 606a, 606b, and 606c) associated with a single high-contrast region 604a, or by using one or more location points 606 associated with two or more high-contrast regions 604a, 604b, and 604c. In one example, the processor 432 of the eye-wear device 100 positions virtual objects 408 (such as the key shown in Figure 6) within the environment 600 for augmented reality viewing via image display 180.
[0088] Figure 7 is a flowchart 700 depicting a method for visual inertial tracking of a wearable device (e.g., an eye-worn device). Although these steps are described herein based on an eye-worn device 100, other specific embodiments of the described steps will be understood by those skilled in the art based on the description herein for other types of devices. Additionally, it is conceivable that one or more steps shown in Figure 7 and described herein may be omitted, performed simultaneously or sequentially, performed in a different order than those shown and described, or performed in combination with additional steps.
[0089] At block 702, the eye-wearing device 100 captures one or more input images of the physical environment 600 in the vicinity of the eye-wearing device 100. The processor 432 can continuously receive input images from the visible light camera 114 and store these images in memory 434 for processing. Additionally, the eye-wearing device 100 can capture information from other sensors (e.g., location information from the GPS unit 473, orientation information from the IMU 472, or distance information from a laser distance sensor).
[0090] At block 704, the eye-wearing device 100 compares objects in the captured image with objects stored in an image library to identify matches. In some implementations, the processor 432 stores the captured image in memory 434. The image library of known objects is stored in a virtual object database 484.
[0091] In one example, processor 432 is programmed to identify predefined specific objects (e.g., a specific photograph 604a hanging at a known location on a wall, a window 604b on another wall, or an object such as a safe 604c located on the floor). Other sensor data, such as GPS data, can be used to narrow down the number of known objects used in the comparison (e.g., images associated only with rooms identified via GPS coordinates). In another example, processor 432 is programmed to identify predefined general objects (e.g., one or more trees in a park).
[0092] At block 706, the eye-wearing device 100 determines its position relative to an object. The processor 432 determines its position relative to an object by comparing and processing the distances between two or more points in the captured image (e.g., between two or more location points on an object 604 or between location points 606 on each of two objects 604) with known distances between corresponding points in the identified object. If the distances between points in the captured image are greater than the distances between points in the identified object, the eye-wearing device 100 is closer to the identified object than was the imager that captured the image including the identified object. Conversely, if the distances between points in the captured image are less than the distances between points in the identified object, the eye-wearing device 100 is farther from the identified object than was the imager that captured the image including the identified object. By processing relative distances, the processor 432 is able to determine its position relative to the object. Alternatively or additionally, other sensor information (such as laser distance sensor information) may be used to determine the position relative to the object.
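The distance comparison described above can be sketched as follows. The sketch assumes an ideal pinhole camera, the same focal length for the captured and reference images, a roughly frontal view of the object, and a known range at which the reference image was taken; under those assumptions the apparent separation of two points is inversely proportional to range. The function names and the reference_range_m parameter are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def pixel_distance(p: Point, q: Point) -> float:
    """Euclidean distance between two image points, in pixels."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def estimate_range(captured: Tuple[Point, Point],
                   reference: Tuple[Point, Point],
                   reference_range_m: float) -> float:
    """Estimate range to an object from the apparent separation of two points.

    A larger separation in the captured image than in the reference image
    yields a range shorter than reference_range_m (the device is closer),
    and a smaller separation yields a longer range.
    """
    s_cap = pixel_distance(*captured)
    s_ref = pixel_distance(*reference)
    if s_cap == 0.0:
        raise ValueError("captured points coincide; range is undefined")
    return reference_range_m * s_ref / s_cap


if __name__ == "__main__":
    # Points appear twice as far apart as in a reference taken at 2 m.
    print(estimate_range(((0, 0), (200, 0)), ((0, 0), (100, 0)), 2.0))
```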
[0093] At block 708, the eye-wearing device 100 constructs a map of the environment 600 surrounding the eye-wearing device 100 and determines its position within the environment. In one example, if the identified object (block 704) has a predefined coordinate system (x, y, z), the processor 432 of the eye-wearing device 100 uses that predefined coordinate system to construct the map and determines its position within that coordinate system based on its determined location relative to the identified object (block 706). In another example, the eye-wearing device uses an image of a permanent or semi-permanent object 604 within the environment (e.g., a tree or park bench in a park) to construct the map. According to this example, the eye-wearing device 100 may define a coordinate system (x′, y′, z′) for the environment.
[0094] At block 710, the eye-wearing device 100 determines the head pose (roll, pitch, and yaw) of the eye-wearing device 100 within the environment. The processor 432 determines the head pose using two or more location points on one or more objects 604 (e.g., three location points 606a, 606b, and 606c) or by using one or more location points 606 on two or more objects 604. Using conventional image processing algorithms, the processor 432 determines roll, pitch, and yaw by comparing the angle and length of a line extending between the location points in the captured image with the angle and length of the corresponding line in a known image.
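As a partial illustration of the line comparison described above, the following sketch estimates only the in-plane rotation (roll) from the angle between a line in the captured image and the corresponding line in a known image. Pitch and yaw additionally depend on foreshortening of line lengths and are not covered here. The function names are hypothetical, and the sign of the result depends on the image axis convention, which is assumed here to be a standard mathematical x-right, y-up frame.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def line_angle(p: Point, q: Point) -> float:
    """Angle of the line from p to q, in radians, measured from the x axis."""
    return math.atan2(q[1] - p[1], q[0] - p[0])


def estimate_roll_deg(captured: Tuple[Point, Point],
                      reference: Tuple[Point, Point]) -> float:
    """Roll estimate: rotation of the captured line relative to the reference."""
    delta = line_angle(*captured) - line_angle(*reference)
    # Wrap into [-pi, pi) so a small rotation is never reported as ~360 degrees.
    delta = (delta + math.pi) % (2.0 * math.pi) - math.pi
    return math.degrees(delta)


if __name__ == "__main__":
    # A line that is horizontal in the known image appears tilted in the capture.
    print(estimate_roll_deg(((0, 0), (1, 1)), ((0, 0), (1, 0))))
```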
[0095] At block 712, the eye-wearing device 100 presents a visual image to the user. Processor 432 uses image processor 412 and image display driver 442 to present the image to the user on image display 180. The processor develops and presents the visual image via the image display 180 in response to the position of the eye-wearing device 100 within environment 600.
[0096] At block 714, as the user moves within environment 600, the steps described above with reference to blocks 706-712 are repeated to update the positioning of the eye-wearing device 100 and the content viewed by the user 602.
[0097] Referring again to Figure 6, in this example, the method for implementing the augmented reality virtual guidance application described herein includes virtual markers (e.g., virtual marker 610a) associated with physical objects (e.g., painting 604a) and virtual markers associated with virtual objects (e.g., key 608). In one example, the eye-wearing device 100 uses the markers associated with physical objects to determine its location in the environment and uses the markers associated with virtual objects to generate overlay images that present the associated virtual object 608 in the environment 600 at the location of the virtual markers on the display of the eye-wearing device 100. For example, markers are registered at locations in the environment to track and update the positions of users, devices, and objects (virtual and physical) in the mapped environment. Sometimes markers are registered with high-contrast physical objects (such as a relatively dark object 604a mounted on a light-colored wall) to aid cameras and other sensors in the task of marker detection. Markers can be pre-specified or can be specified by the eye-wearing device 100 upon entering the environment. The markers are also registered at locations in the environment to render virtual images at those locations in the mapped environment.
[0098] A marker may be encoded with information or otherwise linked to information. The marker may include location information, physical codes (such as barcodes or QR codes, either visible to or hidden from the user), or a combination thereof. A set of data associated with the marker is stored in the memory 434 of the eye-wearing device 100. This set of data includes information about the marker 610a, the marker's positioning (location and orientation), one or more virtual objects, or a combination thereof. The marker positioning may include the three-dimensional coordinates of one or more marker landmarks 616a, such as the corners of the roughly rectangular marker 610a shown in Figure 6. Marker positioning can be represented relative to real-world geographic coordinates, the marker coordinate system, the positioning of the eye-wearing device 100, or other coordinate systems. One or more virtual objects associated with marker 610a can include any of a variety of materials, including still images, videos, audio, haptic feedback, executable applications, interactive user interfaces and experiences, and combinations or sequences of such materials. In this context, any type of content that can be stored in memory and retrieved upon encountering marker 610a or associated with the specified marker can be classified as a virtual object. For example, the key 608 shown in Figure 6 is a virtual object displayed as a 2D or 3D still image at the marker location.
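The set of data associated with a marker, as described above, might be organized along the following lines. This is a sketch only: the class names, field names, and the string identifiers used for markers and virtual objects are hypothetical, and the description does not specify a storage layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class MarkerRecord:
    """Data kept for one registered marker (all field names are illustrative)."""
    marker_id: str
    landmarks: List[Vec3]                 # e.g., 3D corners of a rectangular marker
    frame: str = "environment"            # coordinate system the landmarks use
    physical_code: Optional[str] = None   # e.g., decoded barcode or QR payload
    virtual_objects: List[str] = field(default_factory=list)


class MarkerRegistry:
    """In-memory store that returns the virtual objects linked to a marker."""

    def __init__(self) -> None:
        self._records: Dict[str, MarkerRecord] = {}

    def register(self, record: MarkerRecord) -> None:
        self._records[record.marker_id] = record

    def objects_for(self, marker_id: str) -> List[str]:
        record = self._records.get(marker_id)
        return list(record.virtual_objects) if record else []


if __name__ == "__main__":
    registry = MarkerRegistry()
    corners = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.5, 0.0), (0.0, 0.5, 0.0)]
    registry.register(MarkerRecord("610a", corners, virtual_objects=["key"]))
    print(registry.objects_for("610a"))
```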
[0099] In one example, marker 610a can be registered in memory to be located at physical object 604a (e.g., Figure 6 The marker is located near and associated with the framed artwork shown. In another example, the marker can be registered in memory for a specific location relative to the eye-wearing device 100.
[0100] Figures 8A to 8E show flowcharts 800, 810, 820, 830, and 840, which outline the steps of an exemplary method for visual inertial tracking. Although these steps are described herein with reference to an eye-worn device 100, those skilled in the art will understand from the description herein that the described steps are applicable to other implementations for other types of mobile devices. Furthermore, it is contemplated that one or more of the steps shown in Figures 8A to 8E and described herein may be omitted, performed simultaneously or sequentially, performed in a different order than shown and described, or performed in combination with additional steps.
[0101] Figure 8A is a flowchart 800 illustrating a method for visual inertial tracking using an eye-worn device 100. At block 802, processor 432 monitors a plurality of sensors of a visual inertial odometry system (VIOS). Each of these sensors provides input for determining the location of the eye-worn device within its environment. The sensors include one or more cameras (e.g., visible light, depth, infrared, etc.), an inertial measurement unit (IMU) 472, a radar system, and a GPS 473. In one example, the plurality of sensors includes the IMU 472 and a first camera. In another example, where the plurality of sensors includes an IMU and a first camera, the first camera is a first visible light camera 114A. In yet another example, where the first camera is the first visible light camera 114A, the plurality of sensors further includes a second camera 114B, a first depth camera, a second depth camera, another IMU, a radar system, and a GPS.
[0102] At block 804, processor 432 determines the state of the visual-inertial odometry system of the eye-worn device 100. The state can be based on the current rate of motion of the eye-worn device 100, the environment in which the eye-worn device 100 operates (e.g., indoor vs. outdoor, sparse vs. crowded, etc.), other environmental characteristics, or combinations thereof. The state indicates the minimum sensor requirements needed to provide suitable tracking results. In one example, the states include low, medium, and high settings. A low setting can be associated with relatively low sensor requirements (e.g., only one available sensor, at the lowest available sampling rate, is needed to determine location when moving at a low rate in a known indoor environment); a medium setting can be associated with medium sensor requirements (e.g., three available sensors, at a medium range of sampling rates, are needed to determine location when moving at a fast rate in a known indoor environment); and a high setting can be associated with relatively high sensor requirements (e.g., all available sensors, at the highest available sampling rate, are needed to determine location when moving at a fast rate in an unknown outdoor environment). The eye-worn device 100 can be configured with more or fewer settings (e.g., at least two), which can be stored in a lookup table in memory 434. The lookup table can include the level, the applicable sensors, and the applicable sampling rates.
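As one possible illustration of the lookup table described above, the following Python sketch maps each setting to a set of sensors and a sampling rate. The sensor names, the specific sampling rates, and the function name are assumptions chosen for illustration; the disclosure specifies only that the low setting uses one sensor at the lowest rate, the medium setting uses three sensors at a medium rate, and the high setting uses all available sensors at the highest rate.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical sensor inventory; the names are illustrative only.
ALL_SENSORS: Tuple[str, ...] = ("camera_1", "camera_2", "imu", "depth_camera", "gps")


@dataclass(frozen=True)
class SensorSetting:
    sensors: Tuple[str, ...]  # sensors required at this level
    sampling_hz: float        # applicable sampling rate (assumed values)


# One row per level: level -> applicable sensors and sampling rate.
LOOKUP_TABLE: Dict[str, SensorSetting] = {
    "low": SensorSetting(("camera_1",), 5.0),
    "medium": SensorSetting(("camera_1", "camera_2", "imu"), 30.0),
    "high": SensorSetting(ALL_SENSORS, 60.0),
}


def settings_for_state(state: str) -> SensorSetting:
    """Retrieve the sensor requirements stored for a determined state."""
    return LOOKUP_TABLE[state]
```

A table of this kind keeps the policy (which sensors, at what rate) separate from the logic that determines the state, so levels can be added or removed without changing that logic.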
[0103] In one example, at least one of the first or second cameras 114 captures an image, and the processor 432 identifies the physical environment of the eye-worn device by comparing objects in the image with known objects associated with a particular environment, in order to determine the state of the visual inertial odometry system based on the identified physical environment (see Figure 8B). In another example, the state of the visual inertial odometry system is based on comparing the motion rate with a predefined threshold, where processor 432 determines the motion rate of the eye-worn device using the IMU and compares the motion rate with the predefined threshold (see Figure 8D). In one example, the threshold is a value calculated from the inputs of two or more sensors.
[0104] At block 806, processor 432 adjusts the plurality of sensors based on the state. In one example, adjusting the plurality of sensors includes processor 432 selecting a subset of the plurality of sensors. Processor 432 may turn off any remaining unselected sensors. Additionally or alternatively, processor 432 may adjust the plurality of sensors by changing the sampling rate of one or more of the sensors (or a subset thereof). In one example, processor 432 identifies the desired sensors (and their sampling rates) for a given state by retrieving sensor information (and the associated sampling rates) from a lookup table stored in memory 434, based on the determined state of the VIOS.
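The adjustment at block 806 can be sketched as follows, purely as an illustration. The `Sensor` class, the sensor names, and the `adjust_sensors` function are hypothetical; the sketch assumes that the selected subset and its sampling rate have already been obtained for the determined state.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Sensor:
    """Minimal stand-in for a controllable sensor (hypothetical)."""
    name: str
    enabled: bool = True
    sampling_hz: float = 60.0


def adjust_sensors(sensors: Dict[str, Sensor],
                   selected: Iterable[str],
                   sampling_hz: float) -> None:
    """Keep the selected subset on at the given rate; turn the rest off."""
    keep = set(selected)
    for name, sensor in sensors.items():
        if name in keep:
            sensor.enabled = True
            sensor.sampling_hz = sampling_hz
        else:
            sensor.enabled = False  # unselected sensors are turned off


# Example: keep the IMU and first camera, turn off the second camera.
sensors = {n: Sensor(n) for n in ("imu", "camera_1", "camera_2")}
adjust_sensors(sensors, selected=("imu", "camera_1"), sampling_hz=30.0)
```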
[0105] At block 808, processor 432 uses the adjusted plurality of sensors to determine the location of the eye-worn device within its environment. In one example, processor 432 determines the location of the eye-worn device 100 by applying a visual-inertial tracking algorithm to input received from the selected subset of sensors. Alternatively or additionally, processor 432 determines the location of the eye-worn device 100 by applying a visual-inertial tracking algorithm to input received at the adjusted sampling rate from one or more of the plurality of sensors (or a subset thereof).
[0106] In Figure 8B, flowchart 810 depicts an example of steps for determining the state of a visual inertial tracking system. At block 812, the eye-worn device 100 captures one or more input images of the physical environment 600 in the vicinity of the eye-worn device 100. Processor 432 may continuously receive input images from visible light camera 114 and store these images in memory 434 for processing. At block 814, processor 432 identifies the physical environment of the eye-worn device (e.g., indoor or outdoor). At block 816, processor 432 determines the state of the visual inertial odometry system based on the identified physical environment 600.
[0107] In Figure 8C, flowchart 820 depicts an example of steps for adjusting the plurality of sensors, where the plurality of sensors includes at least a first camera and a second camera. At block 822, processor 432 compares the identified physical environment with a plurality of known environments. At decision block 824, if new information exists in the physical environment, processor 432 proceeds to block 826 to adjust one or more sensors, for example, by activating the second camera, adjusting its resolution, etc. If no new information exists in the physical environment, processor 432 proceeds to deactivate or adjust the one or more sensors to reduce power consumption.
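A minimal sketch of the decision at blocks 822 to 826 follows, under stated assumptions. The disclosure does not specify how environments are represented or how new information is detected; here each environment is assumed, for illustration only, to be a set of feature labels, and "new information" is assumed to mean that no known environment already contains every observed feature. All names are hypothetical.

```python
from typing import Dict, Set


def has_new_information(observed: Set[str],
                        known_environments: Dict[str, Set[str]]) -> bool:
    """Assumed test: true unless some known environment covers all observed features."""
    return not any(observed <= features for features in known_environments.values())


def second_camera_action(observed: Set[str],
                         known_environments: Dict[str, Set[str]]) -> str:
    """Decision block 824: activate on new information, otherwise save power."""
    if has_new_information(observed, known_environments):
        return "activate"    # block 826: e.g. turn on the second camera
    return "deactivate"      # no new information: reduce power consumption


# Illustrative data only.
KNOWN_ENVIRONMENTS = {"gallery": {"painting_604a", "doorway", "bench"}}
```

For example, `second_camera_action({"painting_604a", "bench"}, KNOWN_ENVIRONMENTS)` yields `"deactivate"`, because every observed feature is already present in the known gallery environment.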
[0108] In Figure 8D, flowchart 830 depicts an example of steps for determining the state of the visual inertial odometry system (VIOS). At block 832, processor 432 determines the values of parameters relevant to determining the state of the VIOS. By way of example, the parameters include motion rate, system uncertainty estimate, number of tracking points, number of observations, quality of tracking points, tracking accuracy, etc. For example, at block 832, processor 432 determines the motion rate of the eye-worn device 100 using the IMU 472. At block 834, processor 432 compares the parameter value (e.g., motion rate) to a predefined threshold. At block 836, processor 432 determines the state of the visual inertial odometry system based on the comparison of the parameter value (e.g., motion rate) to the predefined threshold. In one example, to evaluate the state of the VIOS, a second value of the parameter (e.g., uncertainty estimate) is obtained and compared to a first value. The variance between the first and second values is calculated, and the calculated variance is compared to a predefined threshold to determine the state of the VIOS. In addition to the system uncertainty estimate, the parameters include the number of tracking points, number of observations, quality of tracking points, tracking accuracy, etc.
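The two comparisons described for Figure 8D can be illustrated with the following sketch. Several points are assumptions rather than statements of the disclosure: the state labels "low" and "high", the choice that a value at or above the threshold yields the higher state, and the interpretation of the "variance between the first and second values" as their absolute difference. The function names are hypothetical.

```python
def state_from_threshold(value: float, threshold: float) -> str:
    """Blocks 834 and 836: compare one parameter value with a predefined threshold.

    Assumption: at or above the threshold maps to the "high" state.
    """
    return "high" if value >= threshold else "low"


def state_from_change(first: float, second: float, threshold: float) -> str:
    """Compare the change between two readings of a parameter with a threshold.

    Assumption: the "variance" between two values is taken to be their
    absolute difference.
    """
    change = abs(second - first)
    return "high" if change > threshold else "low"
```

For instance, with an uncertainty estimate that moves from 0.05 to 0.30 and a threshold of 0.1, `state_from_change` returns `"high"`, since the change of 0.25 exceeds the threshold.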
[0109] In Figure 8E, flowchart 840 depicts an example of steps for adjusting the plurality of sensors so that the location of the eye-worn device in the environment is determined using one or more of the sensors. Figure 8E illustrates the steps for adjusting one or more onboard cameras. At block 842, processor 432 compares the motion rate to a threshold. At decision block 844, if the motion rate is less than the predefined threshold, processing continues and processor 432 shuts down the second camera. Alternatively, if the motion rate is equal to or greater than the predefined threshold, processing returns to block 842, where processor 432 again compares the motion rate to the threshold. In other examples, sensor parameters other than the power state can be adjusted, for example, resolution, frame rate, or quality as adjusted by power modes (e.g., a low power mode and a high power mode).
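The loop of blocks 842 and 844 can be sketched as below. This is an illustrative simplification: the function name is hypothetical, motion rates are assumed to arrive as a sequence of samples, and the second camera is assumed to remain off once it has been shut down, because Figure 8E as described does not address turning it back on.

```python
from typing import Iterable, List


def second_camera_states(motion_rates: Iterable[float], threshold: float) -> List[bool]:
    """Return, per motion-rate sample, whether the second camera is still on."""
    camera_on = True
    states: List[bool] = []
    for rate in motion_rates:
        # Block 842: compare the motion rate to the threshold.
        if camera_on and rate < threshold:
            camera_on = False  # Block 844, "less than" branch: shut down camera 2
        # Otherwise processing returns to block 842 with the next sample.
        states.append(camera_on)
    return states
```

With samples `[2.0, 1.5, 0.4, 3.0]` and a threshold of `1.0`, the camera stays on for the first two samples and is shut down at the third.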
[0110] In Figure 8F, flowchart 850 depicts an example of steps for determining the state of the VIOS using one or more parameter values. At block 852, processor 432 determines the values of parameters relevant to determining the state of the VIOS. By way of example, the parameters include motion rate, system uncertainty estimate, number of tracking points, number of observations, quality of tracking points, tracking accuracy, etc. For example, at block 852, processor 432 determines the motion rate or uncertainty parameter value of the eye-worn device 100 using the IMU 472. At block 854, processor 432 determines the state of the visual inertial odometry system by mapping the determined parameter values (e.g., motion rate or uncertainty) to one of a plurality of VIOS state configuration options. By way of example, the state configuration options include a low power level and a high power level. For example, when the motion parameter value and the uncertainty parameter value are low, the VIOS is placed at the lower power level. Conversely, when the motion parameter value and the uncertainty parameter value are high, the VIOS is placed at the higher power level. Table 1 shows an example of how the determined parameter values can be mapped to one of the VIOS state configuration options.
[0111] Table 1
[0112]
[0113] Table 1 shows a matrix depicting the rate of motion change of the device sensed by sensors and an estimate of the system's uncertainty. The rate of change is expressed as low or high, where, as in this example, a low rate of change can be specified as 0 to 1 feet per second, while a high rate of change can be greater than 1 foot per second. Uncertainty can be defined as a reference range of 0 to 1, where, as in the example in Table 1, values of 0.1 or less are considered low, while values greater than 0.1 are considered high. The corresponding state configurations are shown. An example of a low power level is operating a camera at approximately five frames per second. An example of a medium power level could be operating a camera at 60 frames per second with the IMU sensor activated. An example of a high power level could include operating all sensors at their maximum rates.
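As a hedged illustration of the mapping that Table 1 describes, the following sketch classifies the rate of change and the uncertainty using the example thresholds given above (1 foot per second and 0.1) and returns a power level. The low/low and high/high outcomes follow the description; the outcome for the two mixed cases is not recoverable from the text here and is assumed, for illustration only, to be the medium power level. The function and constant names are hypothetical.

```python
MOTION_THRESHOLD_FT_PER_S = 1.0  # low: 0 to 1 ft/s; high: greater than 1 ft/s
UNCERTAINTY_THRESHOLD = 0.1      # low: 0.1 or less; high: greater than 0.1


def power_level(motion_ft_per_s: float, uncertainty: float) -> str:
    """Map a (rate of change, uncertainty) pair to a VIOS power level."""
    motion_high = motion_ft_per_s > MOTION_THRESHOLD_FT_PER_S
    uncertainty_high = uncertainty > UNCERTAINTY_THRESHOLD
    if motion_high and uncertainty_high:
        return "high"    # e.g. all sensors at their maximum rates
    if not motion_high and not uncertainty_high:
        return "low"     # e.g. one camera at about five frames per second
    # ASSUMPTION: mixed cases map to the medium level
    # (e.g. a camera at 60 frames per second with the IMU activated).
    return "medium"
```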
[0114] As described herein, any of the functions described herein can be embodied in one or more computer software applications or sets of programming instructions. According to some examples, a “function,” “application,” “instruction,” or “program” is a program that executes the functions defined in it. Various programming languages can be used to create one or more of the applications, structured in a variety of ways, such as object-oriented programming languages (e.g., Objective-C, Java, or C++) or procedural programming languages (e.g., C or assembly language). In a particular example, a third-party application (e.g., an application developed using the ANDROID™ or iOS™ software development kit (SDK) by an entity other than the vendor of the particular platform) may include mobile software running on a mobile operating system such as iOS™, ANDROID™, or another mobile operating system. In this example, the third-party application can invoke API calls provided by the operating system to facilitate the functionality described herein.
[0115] Accordingly, a machine-readable medium can take many forms of tangible storage media. Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer devices that can be used to implement the client devices, media gateways, transcoders, etc., shown in the figures. Volatile storage media include dynamic memory, such as the main memory of such a computer platform. Tangible transmission media include coaxial cables, copper wire, and fiber optics, including the wires that form a bus within a computer system. Carrier-wave transmission media can take the form of electrical or electromagnetic signals, or acoustic or light waves, such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media therefore include, for example: floppy disks, flexible disks, hard disks, magnetic tape, any other magnetic media, CD-ROMs, DVDs or DVD-ROMs, any other optical media, punch cards, paper tape, any other physical storage media with patterns of holes, RAM, PROMs and EPROMs, FLASH-EPROMs, any other memory chips or cartridges, carrier waves transporting data or instructions, cables or links transporting such carrier waves, or any other medium from which a computer can read program code or data. Many of these forms of computer-readable media can be involved in carrying one or more sequences of one or more instructions to a processor for execution.
[0116] Except as stated immediately above, nothing that has been stated or illustrated is intended, or should be interpreted, to cause a dedication of any component, step, feature, object, benefit, advantage, or equivalent to the public, regardless of whether it is recited in the claims.
[0117] It should be understood that, unless otherwise specified herein, the terms and expressions used herein have the general meaning consistent with those in the corresponding fields of investigation and research. Relational terms such as “first” and “second” are used only to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between these entities or actions. The terms “comprising,” “including,” “containing,” “having,” or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that includes or comprises a list of elements or steps includes not only those elements or steps, but may also include other elements or steps not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element prefixed with “a” or “an” does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes that element.
[0118] Unless otherwise stated, any and all measurements, values, ratings, positions, quantities, dimensions, and other specifications set forth in this specification, including those in the appended claims, are approximate, not precise. Such quantities are intended to have a reasonable range consistent with the functions they relate to and the conventions in the fields to which they pertain. For example, unless otherwise expressly stated, parameter values, etc., can vary from said quantities by up to ±10%.
[0119] Furthermore, as can be seen in the foregoing detailed description, various features have been grouped together in various examples for the purpose of streamlining this disclosure. This method of disclosure should not be construed as reflecting an intention that the claimed examples require more features than are expressly recited in each claim. Rather, as the following claims reflect, the claimed subject matter lies in fewer than all of the features of any single disclosed example. Therefore, the following claims are hereby incorporated into the detailed description, with each claim standing on its own as separately claimed subject matter.
[0120] While the foregoing has described what are considered to be the best mode and other examples, it should be understood that various modifications may be made therein, that the subject matter disclosed herein can be implemented in various forms and examples, and that it can be applied in numerous applications, only some of which have been described herein. The appended claims are intended to claim any and all modifications and variations that fall within the true scope of the inventive concept.
Claims
1. A method for visual inertial tracking using an eye-worn device, the method comprising: monitoring a plurality of sensors of a visual inertial odometry system (VIOS), the plurality of sensors including a first camera and a second camera, wherein each of the plurality of sensors provides input for determining the location of the eye-worn device in an environment; determining a state of the VIOS; adjusting the plurality of sensors based on the determined state, wherein the adjusting includes selecting a subset of the plurality of sensors that includes the first camera and turning off the remaining sensors, which include the second camera; and determining the location of the eye-worn device within the environment using the adjusted plurality of sensors.
2. The method of claim 1, wherein the adjusting further comprises: selecting, based on the determined state, a sampling rate for one of the sensors in the subset of sensors; and sampling the one of the sensors in the subset of sensors at the selected sampling rate; wherein determining the location of the eye-worn device includes using the one of the sensors in the subset of sensors at the selected sampling rate to determine the location of the eye-worn device within the environment.
3. The method of claim 1, wherein the adjusting includes: selecting, based on the determined state, a sampling rate for one of the plurality of sensors; and sampling the one of the plurality of sensors at the selected sampling rate; wherein determining the location of the eye-worn device includes using the one of the plurality of sensors at the selected sampling rate to determine the location of the eye-worn device within the environment.
4. The method of claim 1, wherein the plurality of sensors includes an inertial measurement unit (IMU).
5. The method of claim 4, wherein the first camera is a first visible light camera, and wherein the plurality of sensors further comprises one or more of a second visible light camera, a first depth camera, a second depth camera, another IMU, a radar system, or GPS.
6. The method of claim 4, wherein determining the state of the VIOS comprises: capturing an image using the first camera; identifying the physical environment of the eye-worn device; comparing the identified physical environment with a previous physical environment; identifying new information in the identified physical environment; and determining the state of the VIOS based on the new information in the identified physical environment.
7. The method of claim 6, wherein the adjusting comprises: adjusting at least one of the plurality of sensors in response to the new information, wherein the adjusting includes changing the rate, resolution, or quality of the sensor, or turning the sensor on or off.
8. The method of claim 4, wherein determining the state of the VIOS comprises: determining at least one of a motion parameter value or an uncertainty parameter value of the eye-worn device; and mapping the at least one of the determined motion parameter value or the uncertainty parameter value to one of a plurality of VIOS state configuration options.
9. The method of claim 8, wherein the VIOS state configuration options include at least a low power level and a high power level, and wherein the adjusting includes: setting the VIOS to the low power level when the motion parameter value and the uncertainty parameter value are low; and setting the VIOS to the high power level when the motion parameter value and the uncertainty parameter value are high.
10. An eye-worn device with visual inertial tracking, the eye-worn device comprising: a visual inertial odometry system (VIOS) comprising a plurality of sensors, the plurality of sensors including a first camera and a second camera, wherein each of the plurality of sensors provides input for determining the location of the eye-worn device in an environment; a processor configured to determine a state of the VIOS, to adjust the plurality of sensors based on the determined state, wherein the adjusting includes selecting a subset of the plurality of sensors that includes the first camera and turning off the remaining sensors, which include the second camera, and to determine the location of the eye-worn device within the environment using the adjusted plurality of sensors; and a frame supporting the VIOS and the processor, the frame being configured to be worn on a user's head, wherein the VIOS is configured to capture images using the first camera and the second camera, and the processor is configured to identify the physical environment in which the eye-worn device is located and to determine the state of the VIOS based on the identified physical environment.
11. The device of claim 10, wherein the plurality of sensors include an inertial measurement unit (IMU).
12. The device of claim 11, wherein the first camera is a first visible light camera, and wherein the plurality of sensors further comprises one or more of a second visible light camera, a first depth camera, a second depth camera, another IMU, a radar system, or GPS.
13. The device of claim 11, wherein the VIOS is configured to capture an image with the first camera, and the processor is configured to identify the physical environment of the eye-worn device and determine the state of the VIOS based on the identified physical environment.
14. The device of claim 13, wherein the processor is configured to: capture an image using the first camera; identify the physical environment of the eye-worn device; compare the identified physical environment with a previous physical environment; identify new information in the identified physical environment; and determine the state of the VIOS based on the new information in the identified physical environment.
15. The device of claim 11, wherein determining the state of the VIOS comprises: determining at least one of a motion parameter value or an uncertainty parameter value of the eye-worn device; and mapping the at least one of the determined motion parameter value or the uncertainty parameter value to one of a plurality of VIOS state configuration options.
16. The device of claim 15, wherein the VIOS state configuration options include at least a low power level and a high power level, and wherein the adjusting includes: setting the VIOS to the low power level when the motion parameter value and the uncertainty parameter value are low; and setting the VIOS to the high power level when the motion parameter value and the uncertainty parameter value are high.
17. A non-transitory computer-readable medium storing program code for visual inertial tracking when executed by an eye-worn device having a plurality of sensors, a processor, and a memory, wherein the program code, when executed, causes an electronic processor to perform the following steps: monitoring the plurality of sensors of a visual inertial odometry system (VIOS), wherein each of the plurality of sensors provides input for determining the location of the eye-worn device within an environment; determining a state of the VIOS; adjusting the plurality of sensors based on the determined state, wherein the adjusting includes selecting a subset of the plurality of sensors and placing the remaining sensors in a low-power mode, wherein the low-power mode includes reducing the frame rate, resolution, quality, or one or more combinations thereof, and wherein determining the location of the eye-worn device includes using the subset of sensors to determine the location within the environment; and determining the location of the eye-worn device within the environment using the adjusted plurality of sensors.
18. The non-transitory computer-readable medium of claim 17, wherein the adjusting step further comprises: selecting, based on the determined state, a sampling rate for one of the sensors in the subset of sensors; and sampling the one of the plurality of sensors at the selected sampling rate; wherein determining the location of the eye-worn device includes using the one of the sensors in the subset of sensors at the selected sampling rate to determine the location of the eye-worn device within the environment.