Compensation for deformation in head-mounted display systems
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- MAGIC LEAP INC
- Filing Date
- 2026-01-14
- Publication Date
- 2026-07-02
AI Technical Summary
Head-mounted display systems prone to deformation due to their lightweight and flexible design cause misalignment of virtual content, leading to distorted images and discomfort for users.
A calibration process determines an image transformation for each eye display to correct for deformation, allowing virtual content to be realigned in real time with the user's perception of the real-world environment.
Independently recalibrating each display compensates for deformation, maintains alignment, and reduces physiological strain, giving an accurate and comfortable viewing experience.
Abstract
Description
Technical Field
[0001] (Technical Field) This specification generally relates to image processing and display alignment calibration for head-mounted displays.
Background Art
[0002] (Background) As head-mounted display systems become lighter, thinner, and more flexible to improve portability, comfort, and aesthetics, these wearable devices also become more prone to deformation. When the system is not structurally stable, the displays move, deform, and become misaligned. These deformations introduce distortions and other errors into the virtual binocular image. When this occurs, a person perceiving the image on the display may become confused or uncomfortable, which results in a poor viewing experience.
Summary of the Invention
Means for Solving the Problems
[0003] (Summary) The innovative aspects of the subject matter described in this specification relate to the calibration of head-mounted display devices used in virtual or augmented reality (VAR) systems. Specifically, a VAR system can be used to display virtual content that augments the user's view of physical reality. Calibration may be required to ensure that the virtual content is displayed properly when one or more display-related components of the VAR system are deformed or do not operate as desired.
[0004] In some cases, a transparent display is used, which allows virtual content in the form of an image to be superimposed on a view of the real-world environment. If one of the displays deforms, the virtual content shifts relative to the user's eyes in accordance with the deformation. If this is not accounted for, the user perceives the image incorrectly in relation to the real-world environment. A person may perceive this as two separate images, or double vision, with one image per eye: instead of seeing the intended single image, the user sees the two images separately with space between them. This can result in an unpleasant viewing experience.
[0005] Research shows that binocular mismatch can cause physiological strain on the human visual system and that people are sensitive to binocular rotational mismatch of virtual images about the pitch, roll, and yaw axes as small as two arcminutes. For example, in some cases, a two-arcminute pitch difference between one eye's display and the other eye's display is sufficient to cause discomfort. Furthermore, distortion in one display can cause the individual color channels of the display (e.g., red, green, and blue channels) to shift relative to each other.
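To give the two-arcminute figure a sense of scale, the short sketch below converts it to radians and to a linear offset at a given viewing distance. The 1 m distance and the function names are illustrative assumptions and do not come from the specification.

```python
import math

ARCMIN_PER_DEGREE = 60.0


def arcmin_to_radians(arcmin: float) -> float:
    """Convert an angle in arcminutes to radians."""
    return math.radians(arcmin / ARCMIN_PER_DEGREE)


def offset_at_distance(arcmin: float, distance_m: float) -> float:
    """Linear offset (metres) subtended by the angle at the given distance."""
    return distance_m * math.tan(arcmin_to_radians(arcmin))


if __name__ == "__main__":
    angle = 2.0      # arcminutes, the sensitivity threshold quoted in the text
    distance = 1.0   # metres; assumed apparent depth of the virtual content
    print(f"{angle} arcmin = {arcmin_to_radians(angle):.6f} rad")
    print(f"offset at {distance} m = {offset_at_distance(angle, distance) * 1000:.3f} mm")
```

Under these assumptions, two arcminutes is about 0.00058 rad, or roughly 0.58 mm of offset for content perceived 1 m away, which illustrates how small a frame deformation can be while still mattering.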
[0006] The systems and methods described herein operate to mitigate or prevent this unpleasant viewing experience. This is achieved by using a calibration process to determine an image transformation for the left-eye display and an image transformation for the right-eye display. The calibration process is configured to display an image on each of the two eye displays (also referred to as eyepieces), one image per eye, observe how the image on each eye display changes in response to deformation of that display, and determine a transformation (e.g., a mapping or lookup table (LUT)) for each display. The transformation associated with the left-eye display is then applied to each subsequent image to be displayed on the left-eye display, and likewise the transformation associated with the right-eye display is applied to each subsequent image to be displayed on the right-eye display. The calibration process is repeated as needed (e.g., when deformation is detected), when triggered (e.g., by an eye blink), or periodically (e.g., every second). In some cases, the transformations for the two eye displays are determined together and/or depend on each other. In other cases, a transformation is determined and applied only to the display associated with one eye.
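The idea of a per-display lookup table can be sketched as follows. This is a minimal illustration, assuming the deformation is a pure whole-pixel shift and that images are small grayscale arrays; the names `build_shift_lut` and `apply_lut` are hypothetical, and a real system would likely use a denser, sub-pixel mapping.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]                       # row-major grayscale image
Lut = Dict[Tuple[int, int], Tuple[int, int]]  # output pixel -> source pixel


def build_shift_lut(width: int, height: int, dx: int, dy: int) -> Lut:
    """Build a LUT that cancels an observed shift of (dx, dy) pixels.

    Assumption: the deformed eyepiece moves content rendered at (x, y) so that
    it appears at (x + dx, y + dy). Rendering each output pixel from the source
    pixel at (x + dx, y + dy) therefore puts content back where it was intended.
    """
    lut: Lut = {}
    for y in range(height):
        for x in range(width):
            sx, sy = x + dx, y + dy
            if 0 <= sx < width and 0 <= sy < height:
                lut[(x, y)] = (sx, sy)
    return lut


def apply_lut(image: Image, lut: Lut, fill: int = 0) -> Image:
    """Resample `image` through `lut`; pixels with no source get `fill`."""
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for (x, y), (sx, sy) in lut.items():
        out[y][x] = image[sy][sx]
    return out


# One LUT per eye display, applied to every subsequent frame for that eye.
luts = {
    "left": build_shift_lut(3, 3, 0, 0),   # left eyepiece assumed undeformed
    "right": build_shift_lut(3, 3, 1, 0),  # right eyepiece assumed shifted 1 px
}
frame = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
corrected = {eye: apply_lut(frame, lut) for eye, lut in luts.items()}
```

Keeping one table per eye mirrors the text's point that each display is corrected with its own transformation; recalibration then amounts to rebuilding the affected table.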
[0007] The calibration techniques are advantageous because they allow each display to be recalibrated independently and require no input from a human operator. Another advantage is that they allow the physical hardware to be lightweight and compact. For example, the head-mounted display can be implemented in a small form factor, such as eyeglasses. The device can therefore be allowed to deform, while any display problems caused by the deformation are corrected in near real time (e.g., within 100 milliseconds).
[0008] Other implementations of this aspect include a corresponding system, a device, and a computer program recorded on a computer storage device, each configured to perform the operation of this method.
[0009] Details of one or more implementations of the subject matter described herein are set forth in the accompanying drawings and the following description. Other features, aspects, and advantages of the subject matter will be evident from the description, drawings, and claims.
The present invention provides, for example, the following:
(Item 1) A computer-implemented method comprising: receiving data of a first target image associated with a non-deformed state of a first eyepiece of a head-mounted display device, wherein the first eyepiece comprises a first projector optically coupled to the first eyepiece; receiving data of a first captured image associated with a deformed state of the first eyepiece of the head-mounted display device, wherein the first captured image represents a transformed state of the first target image and is received by a first imaging sensor optically coupled to the first projector of the first eyepiece, and wherein the first imaging sensor is part of the head-mounted display device; determining a first transformation that maps the first captured image to the first target image; and applying the first transformation to a subsequent image for viewing on the first eyepiece of the head-mounted display device.
(Item 2) The computer-implemented method according to item 1, further comprising transmitting the transformed subsequent image to the first projector optically coupled to the first eyepiece.
(Item 3) The computer-implemented method according to item 1, further comprising transmitting the data of the first target image to the first projector.
(Item 4) The computer-implemented method according to item 1, further comprising capturing the data of the first target image with the first imaging sensor optically coupled to the first projector.
(Item 5) The computer-implemented method according to item 1, further comprising: receiving a temperature measurement from the first imaging sensor; and accounting for the received temperature measurement in the first transformation.
(Item 6) The computer-implemented method according to item 1, further comprising receiving, from one or more sensors, data representing a state of the real-world environment outside the head-mounted display device.
(Item 7) The computer-implemented method according to item 1, wherein the first transformation aligns features of the real-world environment with corresponding features from the first target image.
(Item 8) The computer-implemented method according to item 1, wherein the first transformation aligns each individual color channel of the first captured image with each individual color channel of the first target image.
(Item 9) The computer-implemented method according to item 1, further comprising receiving a trigger signal, wherein the first transformation is determined in response to receiving the trigger signal.
(Item 10) The computer-implemented method according to item 1, wherein the first eyepiece of the head-mounted display device is optically transparent.
(Item 11) The computer-implemented method according to item 1, wherein the first transformation takes into account the relative position of the first imaging sensor with respect to the first eyepiece.
(Item 12) The computer-implemented method according to item 1, further comprising: receiving data of a second target image associated with a non-deformed state of a second eyepiece of the head-mounted display device, wherein the second eyepiece comprises a second projector optically coupled to the second eyepiece; receiving data of a second captured image associated with a deformed state of the second eyepiece, wherein the second captured image represents a transformed state of the second target image and is received by a second imaging sensor optically coupled to the second projector of the second eyepiece, and wherein the second imaging sensor is part of the head-mounted display device; determining a second transformation that maps the second captured image to the second target image; and applying the second transformation to the subsequent image for viewing on the second eyepiece of the head-mounted display device.
(Item 13) The computer-implemented method according to item 12, wherein the first transformation depends on the second transformation.
(Item 14) The computer-implemented method according to item 12, wherein the second transformation aligns the subsequent image for viewing on the second eyepiece with the subsequent image for viewing on the first eyepiece.
(Item 15) The computer-implemented method according to item 12, wherein the first imaging sensor is rigidly connected to the second imaging sensor.
(Item 16) The computer-implemented method according to item 12, wherein the first eyepiece and the second eyepiece are deformable relative to the first and second imaging sensors.
(Item 17) The computer-implemented method according to item 12, wherein the second transformation takes into account the relative position of the second imaging sensor with respect to the second eyepiece.
(Item 18) The computer-implemented method according to item 12, further comprising: receiving a temperature measurement from the second imaging sensor; and accounting for the received temperature measurement in the second transformation.
(Item 19) A head-mounted display device comprising: a wearable frame; a first eyepiece elastically mounted on the wearable frame, the first eyepiece comprising a first projector rigidly and optically coupled to the first eyepiece, the first projector being configured to emit light into the first eyepiece; a first imaging sensor elastically mounted on the wearable frame and configured to capture light emitted from the first eyepiece; and a processor communicatively coupled to the first projector and the first imaging sensor, the processor being configured to perform operations comprising: transmitting data of a first target image to the first projector; receiving, from the first imaging sensor, data of a first captured image representing a transformed state of the first target image; determining a first transformation that maps the first captured image to the first target image; and applying the first transformation to a subsequent image for viewing on the first eyepiece of the head-mounted display device.
(Item 20) The head-mounted display device according to item 19, further comprising one or more sensors mounted on the wearable frame, the one or more sensors being configured to receive data regarding a state of the real-world environment external to the head-mounted display device.
(Item 21) The head-mounted display device according to item 19, wherein the first transformation aligns features of the real-world environment with corresponding features from the first target image.
(Item 22) The head-mounted display device according to item 19, further comprising a trigger sensor configured to transmit a signal to the processor in order to initiate the operations performed by the processor.
(Item 23) The head-mounted display device according to item 19, wherein the first eyepiece is optically transparent.
(Item 24) The head-mounted display device according to item 19, further comprising a temperature sensor configured to measure the temperature of the first imaging sensor, wherein the first transformation is based at least in part on the measured temperature of the first imaging sensor.
(Item 25) The head-mounted display device according to item 19, further comprising: a second eyepiece elastically mounted on the wearable frame, the second eyepiece comprising a second projector rigidly and optically coupled to the second eyepiece, the second projector being configured to emit light into the second eyepiece; and a second imaging sensor elastically mounted on the wearable frame and configured to capture light emitted from the second eyepiece; wherein the processor is further configured to perform operations comprising: transmitting data of a second target image to the second projector; receiving, from the second imaging sensor, data of a second captured image representing a transformed state of the second target image; determining a second transformation that maps the second captured image to the second target image; and applying the second transformation to the subsequent image for viewing on the second eyepiece of the head-mounted display device.
(Item 26) The head-mounted display device according to item 25, wherein the second transformation aligns features of the real-world environment with corresponding features from the second target image.
(Item 27) The head-mounted display device according to item 25, wherein the second transformation depends on the first transformation.
(Item 28) A head-mounted display device comprising: a wearable frame; two or more eyepieces elastically mounted on the wearable frame, each of the two or more eyepieces being optically coupled to a respective projector; and a rigid sensing element elastically mounted on the wearable frame and configured to measure a geometric difference between the two or more eyepieces.
(Item 29) The head-mounted display device according to item 28, further comprising a processor communicatively coupled to each projector and to the rigid sensing element, the processor being configured to perform operations comprising: receiving, from the rigid sensing element, data representing the geometric difference between the two or more eyepieces; and determining an image transformation for each of the two or more eyepieces such that, when the image transformation is applied to a subsequent image projected by each projector, an aligned binocular representation of the subsequent image is produced.
(Item 30) The head-mounted display device according to item 28, wherein the rigid sensing element is an imaging sensor.
(Item 31) The head-mounted display device according to item 28, wherein the rigid sensing element is a position sensing diode.
(Item 32) The head-mounted display device according to item 28, wherein the rigid sensing element is a LiDAR sensor.
Brief Description of the Drawings
[0010] (Brief explanation of the drawing) [Figure 1] Figures 1A-1B depict a wearable frame for a head-mounted display in its non-deformed state.
[0011] [Figure 2] Figures 2A-2B depict a wearable frame for a head-mounted display in a deformed state.
[0012] [Figure 3] Figures 3A-3B illustrate the calibration process for a head-mounted display.
[0013] [Figure 4] Figure 4 is a flowchart of the calibration process for a head-mounted display.
[0014] [Figure 5] Figure 5 is a diagram of the head-mounted display system.
[0015] Similar reference numbers and symbols in the various drawings indicate similar elements.
Modes for Carrying Out the Invention
[0016] (Detailed explanation) Figures 1A and 1B illustrate the head-mounted display device 100 of the VAR system in an undeformed or ideal state. Figure 1A shows a top view of the head-mounted display device 100 with aligned left and right eyepieces (or displays) 70L, 70R. As part of the virtual image generation system, virtual content may be presented to, and perceived by, the left and right eyes through the pair of eyepieces 70L, 70R, respectively.
[0017] Figure 1B illustrates the left and right monocular virtual content 72L, 72R presented to the user's eyes through the two eyepieces 70L, 70R as binocularly aligned virtual content 74. Figure 1B shows the head-mounted display device 100 in its non-deformed state. When the two eyepieces 70L, 70R are in their non-deformed state, the monocular virtual content 72L, 72R combines to produce properly aligned binocular virtual content 74, as shown. This is represented by the perfectly overlapping left and right monocular virtual content 72L, 72R within the virtual content 74.
[0018] The VAR system may operate as an augmented reality or mixed reality system capable of providing images of virtual objects mixed with physical objects within the user's field of view, thereby making the virtual object(s) appear as if they exist within the user's physical environment. It may be desirable to spatially position the various virtual objects relative to the respective physical objects within the user's field of view. The projection subsystems (also called projectors) 108L, 108R of the head-mounted display device 100 project the virtual objects onto the eyepieces 70L, 70R for display. The virtual objects may be referred to as virtual tags, tags, or callouts and may be implemented in a variety of suitable forms.
[0019] Examples of virtual objects may include, but are not limited to, virtual text objects, virtual numeric objects, virtual alphanumeric objects, virtual tag objects, virtual field objects, virtual chart objects, virtual map objects, virtual measurement objects, or virtual visual representations of physical objects. For example, the VAR system may determine when the user is viewing an empty chair in a room and project images representing a person seated in a chair onto each eyepiece 70L, 70R associated with each eye of the head-mounted display device 100, so that the user perceives that a virtual person is seated in an actual chair in the room.
[0020] As shown in Figures 1A and 1B, the two eyepieces 70L and 70R are aligned with each other in an ideal or non-deformed manner. In other words, the alignment of the two eyepieces 70L and 70R has not changed since the manufacture of the head-mounted display device 100. For example, when viewing a person in a chair as described above, each of the two eyepieces 70L and 70R displays monocular virtual content 72L and 72R of the person seated in the chair. The user perceives this combination of virtual content as virtual content 74.
[0021] This is important because, with respect to 3D perception, the user's brain may not properly associate a single monocular image with 3D depth information. However, the user's brain can properly associate two properly aligned monocular images with 3D depth information, creating the illusion that the user is viewing a 3D object located at a certain distance from the user. For example, this could create the illusion that a person is seated in a chair at a certain distance from the user, even though the image of the person is on the eyepieces 70L and 70R, i.e., less than one inch from the user's eyes.
[0022] The head-mounted display device 100 includes a wearable frame 102 that is mounted on the user's head during use. The wearable frame 102 includes left and right temples 302L, 302R that can be positioned over the user's left and right ears, respectively. A nose rest 306 is provided to allow the wearable frame 102 to rest comfortably against the user's nose. The wearable frame 102 is preferably made from injection-molded plastic so that it remains lightweight.
[0023] The wearable frame 102 includes two cantilever arm portions 312 that extend away from the bridge 304. The cantilever arm portions 312 provide elastic mounting for a display subsystem 104, which is intended to be positioned above the user's nose and in front of the eyes, similar to the position of eyeglass lenses. The cantilever arm portions 312 connect to left and right cantilever arms 310L, 310R. Each left and right cantilever arm 310L, 310R includes an attachment arm portion 314 that extends from the respective cantilever arm portion 312 in a plane parallel to the plane of the end user's eyes.
[0024] The left and right eyepieces 70L and 70R are attached to the respective attachment arm portions 314, and the left and right projection subsystems 108L and 108R are attached to the outer ends of the respective attachment arm portions 314. This facilitates the introduction of light beams into the left and right eyepieces 70L and 70R, respectively, so that light rays exit the left and right eyepieces 70L and 70R and the left and right monocular images are displayed as a binocular image to the user wearing the head-mounted display device 100.
[0025] The left and right eyepieces 70L and 70R effectively function as display interfaces when image data is projected onto them. In some implementations, the left and right eyepieces (or displays) 70L and 70R may be "optical see-through" displays through which the user can directly view light from real objects via transparent (or semi-transparent) elements. In this case, the left and right eyepieces 70L and 70R may be fully or partially transparent so that each eyepiece can superimpose light from the projection subsystems 108L and 108R onto the user's view of the real world.
[0026] The display subsystem 104 is configured to present light-based emission patterns to each of the user's eyes. These emission patterns are intended to be comfortably perceived as an extension of physical reality with high-quality 2D or 3D image content.
[0027] The left and right projection subsystems 108L and 108R may project the left and right monocular images onto the left and right eyepieces 70L and 70R, respectively. The eyepieces 70L and 70R may be positioned directly in front of the user's eye so that the monocular images are viewed as binocular images. In some cases, the eyepieces 70L and 70R are less than one inch from the user's eye. In addition, the eyepieces 70L and 70R may be positioned within the user's field of view between the user's eye and the surrounding environment so that direct light from the surrounding environment can pass through the eyepieces 70L and 70R to the user's eye.
[0028] The projection subsystems 108L and 108R may provide scanned light to the eyepieces 70L and 70R, respectively. In some implementations, the projection subsystems 108L and 108R may be implemented as optical fiber scanning-based projection devices, and the eyepieces 70L and 70R may be implemented as waveguide-based displays into which the scanned light from the respective projection subsystems 108L and 108R is injected. The display subsystem 104 may output a series of frames acquired from a frame buffer at various frequencies. In some cases, the display subsystem 104 may output frames at a high frequency to provide the perception of a single coherent scene.
[0029] Each of the projection subsystems 108L and 108R may include a spatial light modulator ("SLM"), such as a liquid crystal on silicon ("LCoS") component, or a microelectromechanical systems ("MEMS") scanning mirror. The left projection subsystem 108L may project light representing virtual content toward the left eyepiece 70L, which in turn in-couples this light and guides it toward diffractive optical elements (DOEs) configured to provide orthogonal pupil expansion (OPE) and/or exit pupil expansion (EPE) functionality. Most of the guided light may exit the eyepiece 70L as the light traverses the DOE(s) (e.g., directed toward the user's left eye), but some of this light may continue toward the out-coupling DOE 190L, where it is coupled out of the eyepiece 70L as light (represented by ray 203) and can be intercepted, at least in part, by the light-sensing assembly 122.
[0030] The right projection subsystem 108R, together with the right eyepiece 70R and its DOE(s) (e.g., out-coupling element 190R, in-coupling element (ICE), OPE, and EPE), may operate in a manner similar to the left projection subsystem 108L. For example, the projection subsystem 108R, the right eyepiece 70R, and its DOE(s) may present virtual content to the user's right eye, and may couple light representing the virtual content out through the out-coupling DOE 190R and direct it to the light-sensing assembly 122.
[0031] As shown in Figure 1A, the light-sensing assembly 122 is located within the bridge 304 of the wearable frame 102. The light-sensing assembly 122 includes separate cameras (imaging sensors) for the left and right eyepieces. The cameras are configured to determine the virtual content displayed on the left and right eyepieces.
[0032] The light-sensing assembly 122 can be sufficiently rigid that the camera associated with the left eyepiece has a fixed position and orientation relative to the position and orientation of the camera associated with the right eyepiece. The light-sensing assembly 122 is preferably made from a rigid material such as aluminum, titanium, or ceramic. The light-sensing assembly 122 is also referred to as a rigid sensing element. Both cameras in the light-sensing assembly 122 are configured to capture the light rays 203 that represent the image displayed on their respective eyepieces 70L, 70R. The light-sensing assembly 122 includes a temperature sensor for monitoring the temperature of each camera. The light-sensing assembly 122 is preferably elastically mounted on the wearable frame 102 using an insulating material such as foam or rubber.
[0033] Although the light-sensing assembly 122 is described as including cameras, in some cases a position sensing diode or a LiDAR sensor may be used instead of or in addition to the cameras. For example, a LiDAR sensor may project a dense point cloud onto each of the respective eyepieces 70L, 70R, which is reflected and analyzed by a processor to determine the relative position of each eyepiece 70L, 70R. Similarly, a position sensing diode may measure the position of a ray 203, and this information may be analyzed by a processor to determine the relative position of each eyepiece 70L, 70R.
[0034] The head-mounted display device 100 and / or VAR system may also include one or more sensors mounted on the wearable frame 102 for detecting the position and movement of the user's head and / or the position and interocular distance of the user's eyes. Such sensors (one or more) may include an image acquisition device (such as a camera), a microphone, an inertial measurement unit, an accelerometer, a compass, a GPS unit, a wireless device, and / or a gyroscope. For example, a blink sensor may indicate when the user blinks, and this information can be used to trigger a calibration process by the VAR system.
[0035] The ends of the left and right cantilever arms 310L and 310R, which are away from the user's nose, include cameras 103L and 103R, respectively. The left camera 103L and the right camera 103R are configured to capture images of the user's environment, for example, objects directly in front of the user.
[0036] Figure 1A illustrates the left and right eyepieces 70L and 70R, which are further away from the user, and the light-sensing assembly 122, which is closer to the user. However, in some cases, the left and right eyepieces 70L and 70R are closer to the user, and the light-sensing assembly 122 is further away from the user.
[0037] Figures 2A and 2B illustrate the head-mounted display device 100 of the VAR system in a deformed or non-ideal state. The right eyepiece 70R is shown bent toward the user. When one or both eyepieces 70L, 70R are in a deformed state, the monocular virtual content 72L, 72R combines to produce binocularly mismatched virtual content 74, as shown in Figure 2B. Figure 2B illustrates this mismatch of the monocular virtual content 72L, 72R in the virtual content 74 due to the pitch of the right eyepiece 70R. The mismatch is represented by the left and right monocular virtual content 72L, 72R not perfectly overlapping in the virtual content 74. Such mismatch between the left and right eyepieces 70L, 70R may result in perceived translational and/or rotational mismatch between the left and right virtual content 72L, 72R.
[0038] The head-mounted display device 100 may be deformed during use, for example, by movement of the head-mounted display device 100 or by accidental contact with the user's hand or other objects in the room. In some cases, the head-mounted display device 100 may be deformed during transport or even during initial assembly. As previously described, a relative movement of only two arcminutes between one eyepiece 70L and the other eyepiece 70R may be sufficient to cause discomfort. This holds whether the eyepieces 70L, 70R are semi-transparent or opaque (for example, as in a virtual reality system). Accounting for this misalignment improves user comfort.
[0039] The systems and methods described allow the displays to be recalibrated to account for this deformation. This can be achieved by capturing the light intended for the left and right eyes and then detecting and correcting relative inconsistencies between the virtual images.
[0040] Figures 1A and 2A illustrate how the light rays 203 are transmitted from the left and right out-coupling DOEs 190L and 190R. In the undeformed state of Figure 1A, the light rays 203 arrive at the respective cameras of the light-sensing assembly 122 substantially simultaneously. This time-of-flight information depends on the position of the left eyepiece 70L relative to the right eyepiece 70R. For example, light arriving at a camera earlier than expected may indicate that the eyepiece is bent away from the user and the out-coupling DOE is closer to the light-sensing assembly 122. Conversely, light arriving at a camera later than expected may indicate that the eyepiece is bent toward the user and the out-coupling DOE is farther from the light-sensing assembly 122.
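The relationship between an arrival-time difference and a change in path length can be sketched as below. This is only an illustration: it assumes propagation at the vacuum speed of light over a straight path, and the function names and the tolerance parameter are hypothetical rather than taken from the specification.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def path_change_m(arrival_delta_s: float) -> float:
    """Change in path length implied by an arrival-time difference.

    A negative delta (light arrived earlier than expected) gives a negative
    change, i.e. a shorter path between the out-coupling DOE and the sensor.
    """
    return SPEED_OF_LIGHT_M_PER_S * arrival_delta_s


def classify_doe_position(arrival_delta_s: float, tolerance_s: float = 0.0) -> str:
    """Label the out-coupling DOE as closer, farther, or nominal."""
    if arrival_delta_s < -tolerance_s:
        return "closer"    # early arrival: DOE nearer the sensing assembly
    if arrival_delta_s > tolerance_s:
        return "farther"   # late arrival: DOE farther from the sensing assembly
    return "nominal"


if __name__ == "__main__":
    delta = -3.3e-12  # seconds; assumed example value
    print(classify_doe_position(delta), f"{path_change_m(delta) * 1000:.2f} mm")
```

Under these assumptions a 1 mm change in path length corresponds to only about 3.3 picoseconds, which suggests why the image-based cues described next may be the more practical signal.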
[0041] Furthermore, the images captured by the light-sensing assembly 122 can be processed to determine information about the position of each respective eyepiece. Light received from the left out-coupling DOE 190L is recorded as the left captured image by the left camera of the light-sensing assembly 122. Light received from the right out-coupling DOE 190R is recorded as the right captured image by the right camera of the light-sensing assembly 122.
[0042] For example, by comparing the left captured image with the left target image representing the undeformed state, the VAR system can determine that the left eyepiece 70L is deformed. The VAR system can also determine the transformation that should be applied to subsequent images to correct for this deformation. The transformation is then applied to each subsequent image for the left eyepiece, and the transformed image is transmitted to the left projection subsystem 108L. The user then regains a comfortable viewing experience.
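One way to determine a transformation that maps a captured image onto its target is to fit a model to matched feature points. The sketch below fits a 2D similarity transform (rotation, uniform scale, translation) by least squares using complex arithmetic. The similarity model, the assumption that matched points are already available, and the function names are all illustrative choices, not the method stated in the specification, which may use a richer mapping such as a lookup table.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def fit_similarity(captured: List[Point], target: List[Point]) -> Tuple[complex, complex]:
    """Least-squares (a, b) with target approximately equal to a * captured + b.

    Points are treated as complex numbers x + iy, so `a` encodes rotation and
    uniform scale, and `b` encodes translation.
    """
    if len(captured) != len(target) or len(captured) < 2:
        raise ValueError("need at least two matched point pairs")
    zc = [complex(x, y) for x, y in captured]
    zt = [complex(x, y) for x, y in target]
    mean_c = sum(zc) / len(zc)
    mean_t = sum(zt) / len(zt)
    denom = sum(abs(c - mean_c) ** 2 for c in zc)
    if denom == 0:
        raise ValueError("captured points must not all coincide")
    a = sum((t - mean_t) * (c - mean_c).conjugate() for c, t in zip(zc, zt)) / denom
    b = mean_t - a * mean_c
    return a, b


def apply_similarity(a: complex, b: complex, point: Point) -> Point:
    """Map one captured-image point into target-image coordinates."""
    z = a * complex(*point) + b
    return (z.real, z.imag)


# Target corners, and the same corners as captured after a 90-degree roll
# plus a translation (assumed example data).
target_pts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
captured_pts = [(2.0, 3.0), (2.0, 4.0), (1.0, 4.0), (1.0, 3.0)]
a, b = fit_similarity(captured_pts, target_pts)
```

Here `abs(a)` gives the fitted scale and `cmath.phase(a)` the fitted rotation; applying `(a, b)` to points of a subsequent image plays the role of the correction described above.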
[0043] The features of the captured image can also show deformation. For example, a rotated image may show the roll of the eyepiece, and a trapezoidal image may show yawing or pitching depending on the length of the trapezoid's sides. A larger image may indicate that the eyepiece is further away from the light-sensing assembly 122, while a smaller image may indicate that the eyepiece is closer to the light-sensing assembly.
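The cues described above can be illustrated with a short Python sketch that measures the outline of a captured image of a square target. The function name `describe_quad`, the corner ordering, and the coordinate convention are assumptions made for illustration. The specification states only that the side lengths of the trapezoid distinguish yaw from pitch, so the sketch reports both side-length ratios without assigning either to a specific axis.

```python
import math

def describe_quad(corners, target_side):
    """Summarize the outline of a captured square target.

    corners: (top-left, top-right, bottom-right, bottom-left) pixel
    coordinates, x to the right and y downward (assumed convention).
    target_side: side length of the undeformed target in pixels.
    """
    tl, tr, br, bl = corners

    def dist(p, q):
        return math.hypot(q[0] - p[0], q[1] - p[1])

    top, bottom = dist(tl, tr), dist(bl, br)
    left, right = dist(tl, bl), dist(tr, br)
    return {
        # In-plane rotation of the top edge, a cue for roll.
        "roll_deg": math.degrees(math.atan2(tr[1] - tl[1], tr[0] - tl[0])),
        # Mean side length relative to the target, a cue for a distance change.
        "scale": (top + bottom + left + right) / (4.0 * target_side),
        # Unequal opposite sides indicate a trapezoid (yaw or pitch cue).
        "left_right_ratio": left / right,
        "top_bottom_ratio": top / bottom,
    }

# Hypothetical measurements: an undistorted square and a trapezoid.
square = describe_quad([(0, 0), (10, 0), (10, 10), (0, 10)], target_side=10)
trapezoid = describe_quad([(0, 0), (10, 2), (10, 8), (0, 10)], target_side=10)
```

A ratio that departs from 1 flags a trapezoidal outline; a scale that departs from 1 flags a change in apparent size.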
[0044] Figures 3A and 3B illustrate the transformation and calibration processes associated with each eyepiece. The left panel of Figure 3A illustrates the mismatched left and right monocular virtual contents 72L, 72R resulting from the pitching of the right eyepiece 70R, as shown in Figure 2B. After the transformation, the right monocular virtual content 72R perfectly overlays the left monocular virtual content 72L (as shown in Figure 3B), even though the right eyepiece 70R is still pitched. Figure 3B illustrates how the transformed image shown on the mismatched frame of Figure 1B has a proper binocular representation after the transformation process.
[0045] Figure 4 is a flowchart of the calibration process 400. Calibration is determined for each eyepiece 70L, 70R. In some cases, calibration is triggered, for example, by the user blinking, or by the user requesting that calibration be performed (for example, through the user interface or via the setting mode) (step 408). Calibration is performed so quickly that the user is unaware that the calibration process is taking place. In some cases, calibration is completed within 100 milliseconds. In other cases, a separate trigger signal is received for each eyepiece.
[0046] In the context of the left (first) eyepiece, the VAR system receives data of the left target image associated with the undeformed state of the left eyepiece (step 410L). The left target image can be retrieved from a database of target images. The left target image may contain geometric features distinguishable by the processor of the VAR system. For example, the left target image may be a checkerboard pattern of squares with a unique image in each square. The left target image may be monochrome or color. A color left target image can be used to indicate the alignment of color channels in the image (e.g., red, green, and blue color channels). The left target image may be adjusted with respect to the left eyepiece 70L and may differ from the right target image with respect to the right eyepiece 70R. However, in some cases, the left target image and the right target image are the same.
[0047] In some cases, the data from the left target image is transmitted to the left projector for display on the left eyepiece (step 412L). The left camera of the light sensing assembly 122 captures light from the output coupled DOE representing the data from the left capture image (step 418L). The data from the left capture image, associated with the deformation state, is received from the left camera of the left eyepiece of the head-mounted display device (step 420L).
[0048] In some cases, the calibration process 400 includes receiving temperature measurements from the left imaging sensor (step 426L). In other cases, data representing the state of the real-world environment outside the head-mounted display device is received from one or more sensors (step 428L).
[0049] A left transformation is determined (step 430L) that maps the left capture image to the left target image. In some cases, the received temperature is taken into consideration in the left transformation. In some cases, the left transformation matches features of the real-world environment with corresponding features from the left target image. In some cases, the left transformation takes into consideration the relative position of the left imaging sensor with respect to the left eyepiece. In some cases, the left transformation is determined when a trigger signal associated with the left eyepiece is received, but the left transformation can also be determined when a trigger signal associated with the right eyepiece, and / or a trigger signal associated with both the left and right eyepieces, is received. In some cases, one or more trigger signals are received.
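One possible way to compute a transformation that maps the captured image to the target image is a least-squares fit over matched feature points. The sketch below assumes an affine model and assumes that feature matching has already produced point pairs; neither assumption comes from the specification, which leaves the form of the transformation open. The names `fit_affine` and `solve3` are hypothetical.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("feature points are degenerate (e.g. collinear)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def fit_affine(captured_pts, target_pts):
    """Least-squares affine map taking captured points to target points.

    Returns ((a, b, tx), (c, d, ty)) such that
    x' = a*x + b*y + tx and y' = c*x + d*y + ty.
    """
    # Accumulate the normal equations (A^T A) p = A^T rhs, rows of A = (x, y, 1).
    ata = [[0.0] * 3 for _ in range(3)]
    atu = [0.0] * 3
    atv = [0.0] * 3
    for (x, y), (u, v) in zip(captured_pts, target_pts):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                ata[i][j] += row[i] * row[j]
            atu[i] += row[i] * u
            atv[i] += row[i] * v
    return tuple(solve3(ata, atu)), tuple(solve3(ata, atv))

# Hypothetical data: the captured features sit 3 px right and 2 px above target.
captured_pts = [(3, -2), (13, -2), (3, 8), (13, 8)]
target_pts = [(0, 0), (10, 0), (0, 10), (10, 10)]
left_transform = fit_affine(captured_pts, target_pts)
```

An affine model cannot represent the trapezoidal distortion produced by yaw or pitch; a projective (homography) fit would be the natural extension in that case.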
[0050] In some cases, the processor determines that the virtual content is located within the upper-left region of the frame, identifies the location and pixel value of the pixels associated with the content detected within the frame, and uses this information to determine the transformation. In other cases, the direction of light entering the image sensor is used to determine the transformation.
[0051] For example, the transformation maps pixels in a display buffer. In this case, the display buffer maps each pixel to a virtual light space, and the transformation maps the pixels to light sources, resulting in a smoothed transformation map. In some cases, it is preferable to have tens of points (e.g., 15-30 points) horizontally and tens of points (e.g., 15-30 points) vertically in the image. This allows the processor to determine where the pixels have moved in the pixel space. For example, if the target image has a cluster of white pixels at location A, and that cluster of white pixels moves to location B, the transformation may simply be a pixel shift (inverse shift) of the entire image from B to A. As another example, if the target image has a cluster of white pixels of size A, and that cluster of white pixels expands to size B, the transformation may simply be a pixel scaling (inverse scaling) of the entire image from size B to size A. Other transformations may follow image processing techniques known in the art.
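The white-cluster examples above can be sketched in a few lines of Python. The sketch assumes grayscale images stored as nested lists and a single bright cluster per image; the function names and the brightness threshold are hypothetical. The inverse shift is the difference between the cluster centroids, and the inverse scale is the square root of the ratio of the cluster areas.

```python
def bright_cluster(image, threshold=128):
    """Return (centroid_x, centroid_y, pixel_count) of pixels >= threshold."""
    sum_x = sum_y = count = 0
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value >= threshold:
                sum_x += c
                sum_y += r
                count += 1
    if count == 0:
        raise ValueError("no bright pixels found")
    return sum_x / count, sum_y / count, count

def inverse_shift_and_scale(target, captured, threshold=128):
    """Shift (dx, dy) and linear scale that move the captured cluster
    (location/size B) back onto the target cluster (location/size A)."""
    tx, ty, t_area = bright_cluster(target, threshold)
    cx, cy, c_area = bright_cluster(captured, threshold)
    shift = (tx - cx, ty - cy)
    scale = (t_area / c_area) ** 0.5  # area ratio -> linear scale factor
    return shift, scale

def block_image(row0, col0, size, n=8):
    """Test image: n x n black frame containing one white square block."""
    return [[255 if row0 <= r < row0 + size and col0 <= c < col0 + size else 0
             for c in range(n)] for r in range(n)]

target = block_image(1, 1, 2)    # cluster at location A
captured = block_image(3, 2, 2)  # same cluster observed at location B
shift, scale = inverse_shift_and_scale(target, captured)
```

Here the cluster moved one pixel right and two pixels down, so the inverse shift is one pixel left and two pixels up, with no change in scale.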
[0052] In some cases, more than one left transformation may be determined. For example, several captured images may be processed to determine several transformations. These may be averaged or filtered to increase the accuracy of the transformations.
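As one illustration of the filtering step, the sketch below combines several transformation estimates with an element-wise median, which tolerates an occasional outlier better than a plain mean. The choice of the median, the matrix layout, and the name `combine_transforms` are assumptions; the specification says only that the transformations may be averaged or filtered.

```python
from statistics import median

def combine_transforms(transforms):
    """Element-wise median of several same-shaped transformation matrices."""
    rows = len(transforms[0])
    cols = len(transforms[0][0])
    return [[median(t[r][c] for t in transforms) for c in range(cols)]
            for r in range(rows)]

# Hypothetical 2x3 affine estimates from three captured images.
estimates = [
    [[1.0, 0.0, -3.0], [0.0, 1.0, 2.0]],
    [[1.0, 0.0, -3.2], [0.0, 1.0, 2.1]],
    [[1.0, 0.0, 9.0], [0.0, 1.0, 2.0]],  # outlier in the x translation
]
combined = combine_transforms(estimates)
```

Element-wise combination is only reasonable when the estimates are close to one another, as is expected for small deformations.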
[0053] The left transformation is applied to a subsequent image for viewing on the left eyepiece of the head-mounted display device (step 440L). In some cases, the transformed subsequent image is transmitted to a left projector optically coupled to the left eyepiece for viewing on the left eyepiece (step 450L). In some cases, process 400 is repeated as desired or as needed.
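Applying a determined transformation to a subsequent image can be sketched as an inverse-mapping warp: for each output pixel, the source pixel is looked up through the inverse of the transformation. The sketch assumes a 2x3 affine transformation, nearest-neighbor sampling, and images stored as nested lists; these are illustrative choices, and the function names are hypothetical.

```python
def invert_affine(m):
    """Invert ((a, b, tx), (c, d, ty)) representing x' = a*x + b*y + tx, etc."""
    (a, b, tx), (c, d, ty) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("transformation is not invertible")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return ((ia, ib, -(ia * tx + ib * ty)), (ic, id_, -(ic * tx + id_ * ty)))

def apply_transform(image, m, fill=0):
    """Warp an image by affine map m using nearest-neighbor inverse mapping."""
    inv = invert_affine(m)
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Find the source pixel that lands on output pixel (x, y).
            sx = round(inv[0][0] * x + inv[0][1] * y + inv[0][2])
            sy = round(inv[1][0] * x + inv[1][1] * y + inv[1][2])
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = image[sy][sx]
    return out

# Hypothetical frame with one lit pixel, shifted one pixel right and down.
frame = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
shifted = apply_transform(frame, ((1, 0, 1), (0, 1, 1)))
```

Inverse mapping avoids holes in the output, because every output pixel is assigned exactly once.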
[0054] In some cases, this process is repeated with respect to the right (second) display. In such cases, data of the right target image, associated with the undeformed state of the right eyepiece of the head-mounted display device, is received (step 410R). In some cases, the right target image is identical to the left target image (step 410) or depends on the left target image in a different way. For example, if the left target image is monochrome, the right target image may also be monochrome. These and other properties of the target images may be communicated to each other by the processor (step 410). For example, left and right transformations may match each individual color channel of the respective captured image with each individual color channel of the respective target image.
[0055] In some cases, data from the right target image is transmitted to the right projector (step 412R). Data from the right capture image is captured by the right imaging sensor, which is optically coupled to the right projector (step 418R). In some cases, data from the right capture image is received, associated with the deformation state of the right eyepiece of the head-mounted display device (step 420R). In some cases, a temperature measurement from the right imaging sensor is received (step 426R).
[0056] A right transformation is determined (step 430R) that maps the right capture image to the right target image. In some cases, the received temperature is taken into consideration in the right transformation. In some cases, the right transformation matches features of the real-world environment with corresponding features from the right target image. In some cases, the right transformation takes into consideration the relative position of the right imaging sensor with respect to the right eyepiece. In some cases, the right transformation is determined when a trigger signal associated with the right eyepiece is received, but the right transformation can also be determined when a trigger signal associated with the left eyepiece, and / or a trigger signal associated with both the left and right eyepieces, is received. In some cases, one or more trigger signals are received.
[0057] In some cases, the right-eye transformation depends on the left-eye transformation (step 430). For example, in some cases, a binocular transformation algorithm determines the preferred transformation with respect to both the left-eye and right-eye transformations. The binocular transformation is determined not only from the left and right displays but also from one or more sensors on the head-mounted display device 100 or VAR system. The sensors are used to measure the real world around the user and to appropriately align the superimposed visual content with the real world.
[0058] For example, consider a case where, even if the left eyepiece is shifted 3 pixels to the left, objects in the real world cannot be properly aligned with either eyepiece. In this case, a left transformation may shift the image, and a right transformation may shift the image, so that both displays are aligned with each other and with their real-world surroundings.
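A minimal sketch of this binocular case follows, assuming pure translations and assuming that the sensors supply, in pixels, where a reference feature should appear relative to its nominal position. The function name and all three inputs are hypothetical. Each eye is shifted by the difference between the world-referenced offset and the offset observed on that eye's display.

```python
def binocular_shifts(left_offset, right_offset, world_offset):
    """Per-eye corrective shifts (dx, dy) in pixels.

    left_offset / right_offset: observed displacement of displayed content
    from its nominal position on each display.
    world_offset: displacement at which the content should appear so that it
    lines up with the real world, as estimated from other sensors (assumed).
    """
    left = (world_offset[0] - left_offset[0], world_offset[1] - left_offset[1])
    right = (world_offset[0] - right_offset[0], world_offset[1] - right_offset[1])
    return left, right

# Left display content is 3 px too far left; the world reference is 1 px right.
left_shift, right_shift = binocular_shifts((-3, 0), (0, 0), (1, 0))
```

After both shifts are applied, content on the two displays lands at the same world-referenced position, so the displays agree with each other and with the surroundings.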
[0059] The right transformation is applied to a subsequent image for viewing on the right eyepiece of the head-mounted display device (step 440R). In some cases, the transformed subsequent image is transmitted to a right projector optically coupled to the right eyepiece for viewing on the right eyepiece (step 450R).
[0060] Figure 5 is a diagram of the VAR system 500. The VAR system 500 includes a control subsystem with various software and hardware components, and includes a head-mounted display device 100 shown in Figures 1A and 2A. The VAR system 500 includes a computer processing unit (CPU) 502 and a graphics processing unit (GPU) 504 for performing the processing tasks of the VAR system 500. Left and right monocular calibration algorithms and left and right thermal models are stored in memory and executed on the processor to assist the calibration process.
[0061] In some cases, the left monocular calibration algorithm includes steps 410L–450L of the calibration process 400, and the right monocular calibration algorithm includes steps 410R–450R of the calibration process 400 (but is not limited to these steps). Other steps of the calibration process 400 can be implemented in the left and right monocular calibration algorithms. The left and right display calibration servers also communicate with the processor. Furthermore, the binocular calibration algorithm, online calibration, ancillary calibration servers, and the combined left and right thermal models communicate with the processor for the calibration process. In some cases, the binocular calibration algorithm includes steps 410 and 430 of the calibration process 400 (but is not limited to these steps). Other steps of the calibration process 400 can be implemented in the binocular calibration algorithm.
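The per-eye sequence of steps can be outlined as a small Python class in which each hardware or image-processing operation is supplied as a callable. The class name `EyeCalibrator`, its fields, and the mapping of fields to step numbers in the comments are assumptions made for illustration; the sketch omits the optional temperature and environment inputs.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class EyeCalibrator:
    """Hypothetical per-eye calibration flow (one instance per eyepiece)."""
    load_target: Callable[[], Any]        # receive target image (step 410L/R)
    project: Callable[[Any], None]        # send image to projector (412, 450)
    capture: Callable[[], Any]            # capture and receive image (418, 420)
    estimate: Callable[[Any, Any], Any]   # captured, target -> transform (430)
    apply: Callable[[Any, Any], Any]      # image, transform -> image (440)
    transform: Any = None

    def calibrate(self):
        """Run one calibration pass and store the resulting transformation."""
        target = self.load_target()
        self.project(target)
        captured = self.capture()
        self.transform = self.estimate(captured, target)
        return self.transform

    def show(self, image):
        """Correct a subsequent image with the stored transform and project it."""
        corrected = image if self.transform is None else self.apply(image, self.transform)
        self.project(corrected)
        return corrected

# Toy usage with integers standing in for images and a scalar offset transform.
projected = []
left = EyeCalibrator(
    load_target=lambda: 10,
    project=projected.append,
    capture=lambda: 13,
    estimate=lambda captured, target: target - captured,
    apply=lambda image, t: image + t,
)
left.calibrate()
shown = left.show(20)
```

Keeping the left and right instances separate mirrors the independent left and right monocular algorithms; a binocular step could adjust both stored transformations afterwards.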
[0062] The VAR system 500 includes sensors 510 for monitoring the real-world environment. These sensors 510 are shown to include user-facing sensors and angle-sensing assemblies, but the sensors described earlier are also included in the VAR system 500. The VAR system 500 also includes a three-dimensional (3D) database 508 for storing 3D scene data. The CPU 502 can control the overall operation of the VAR system 500, while the GPU 504 renders frames from the 3D data stored in the 3D database 508 (e.g., converting a 3D scene to a 2D image) and stores these frames in a frame buffer 506.
[0063] The system also includes left and right frame buffers 506L and 506R, which transmit images to the left and right projection subsystems 108L and 108R, respectively. Specifically, the images are transmitted to the projection system and displayed on the eyepiece of the head-mounted display 100. Once captured by the left and right image sensors of the light sensing assembly 122, the captured images are returned to the processor as part of a calibration process.
[0064] Generally, the control subsystem may include various controllers such as microcontrollers, microprocessors, CPUs, digital signal processors, GPUs, application-specific integrated circuits (ASICs), programmable gate arrays (PGAs), field PGAs (FPGAs), and / or programmable logic controllers (PLCs). The control subsystem may include and / or communicate with one or more processors, such as CPU 502 and GPU 504, which perform the operations described herein, for example, through the execution of executable instructions. Although not shown, one or more integrated circuits may be used to control the loading of one or more frames into and / or from the frame buffer 118, and the operation of the left and right projection subsystems 108L and 108R of the display subsystem 104.
[0065] Specifically, the CPU 502 may receive and process data acquired by the photodetector assembly 122. The CPU 502 may compare data derived from light incident on the photodetector assembly 122 when the wearable frame 102 is in an undeformed state with data derived with respect to light incident on the photodetector assembly 122 when the wearable frame 102 is in a deformed state to determine the relative deformation state of the left and right eyepieces 70L, 70R. In response to detecting a relative deformation state or inconsistency in the virtual image, the VAR system 500 may perform one or more calibration procedures to compensate the virtual or displayed image according to the deformation / inconsistency.
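The comparison between undeformed and deformed data can be sketched as a root-mean-square displacement of matched feature points, with calibration triggered when the error exceeds a tolerance. The feature-point representation, the function names, and the default tolerance of 0.5 pixels are assumptions for illustration, not values from the specification.

```python
import math

def deformation_error(reference_pts, observed_pts):
    """RMS displacement in pixels between matched reference/observed points."""
    squared = [(ox - rx) ** 2 + (oy - ry) ** 2
               for (rx, ry), (ox, oy) in zip(reference_pts, observed_pts)]
    return math.sqrt(sum(squared) / len(squared))

def needs_calibration(reference_pts, observed_pts, tolerance_px=0.5):
    """True when the observed features have drifted beyond the tolerance."""
    return deformation_error(reference_pts, observed_pts) > tolerance_px

# Hypothetical features recorded in the undeformed state and observed later.
reference = [(0, 0), (10, 0)]
observed = [(3, 4), (13, 4)]
error = deformation_error(reference, observed)
```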
[0066] The VAR system 500 and the various techniques disclosed herein may also be employed in applications other than augmented reality and virtual reality subsystems. While some implementations are described in the context of augmented reality subsystems or virtual reality subsystems, the VAR system 500 is not limited to such subsystems.
[0067] In some cases, the processing aspects (CPU 502 and GPU 504) of the VAR system 500 are implemented within a device that is attached to the user's waist or located in the user's pocket; in other cases, the processing aspects are implemented directly within the head-mounted display device 100 shown in Figures 1A and 2A; in other cases, the processing aspects are implemented within a nearby computer and connected to the head-mounted display device 100 via wired or wireless communication; in other cases, at least some processing aspects are implemented on a remote server and connected to the head-mounted display device 100 via wireless communication using WiFi or similar.
[0068] The calibration technique may be performed by one or more processors, such as CPU 502 or GPU 504 (hereinafter simply referred to as processors). Although the calibration process described above is described in the context of a single captured image, in some cases more than one captured image is processed and more than one transformation is determined for each eyepiece. In this case, the transformations can be averaged or filtered to determine the most accurate transformation.
[0069] In some implementations, the VAR system 500 may be calibrated with respect to the colors of the virtual content displayed to the user. For example, if only blue virtual content is displayed, the processor may perform calibration using a blue test pattern. If only red virtual content is displayed, the processor may perform calibration using a red test pattern. If only green virtual content is displayed, the processor may perform calibration using a green test pattern. If virtual content with a combination of red, blue, and green is displayed, the processor may perform calibration using a combination of red, blue, and green calibration frames.
[0070] The various characteristics of the calibration frame (such as intensity) may be configured to match or resemble the characteristics of a representative virtual content frame. For example, if the intensity of the virtual content is determined to be above or equal to a minimum threshold level, the intensity of the calibration frame may be equal to the intensity of the corresponding virtual content. If the intensity of the virtual content is determined to be below a minimum threshold level, the intensity of the calibration frame may be set to the minimum threshold level.
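The color and intensity rules in the two preceding paragraphs can be written compactly. In the sketch below, the pixel format (RGB tuples), the function names, and the presence threshold are hypothetical; the intensity rule itself follows the text directly, since taking the larger of the content intensity and the minimum threshold reproduces both stated cases.

```python
def calibration_channels(content_pixels, threshold=0):
    """Color channels present in the virtual content, as a set of names.

    content_pixels: iterable of (red, green, blue) values.
    """
    names = ("red", "green", "blue")
    present = set()
    for pixel in content_pixels:
        for name, value in zip(names, pixel):
            if value > threshold:
                present.add(name)
    return present

def calibration_intensity(content_intensity, min_threshold):
    """Match the content intensity, but never drop below the minimum level."""
    return max(content_intensity, min_threshold)

# Blue-only content calls for a blue test pattern.
channels = calibration_channels([(0, 0, 200), (0, 0, 10)])
```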
[0071] In some implementations, the image properties of the calibration frame, such as contrast ratio or brightness, may be configured to further reduce the perceptibility of the test frame. In some implementations, the calibration frame may be diluted by hiding the test image behind the edges of the virtual content. The calibration frame may be further camouflaged by using textures and colors similar to the virtual content.
[0072] The systems, methods, and techniques described may be implemented in digital electronic circuitry, computer hardware, firmware, software, or combinations thereof. Apparatus implementing these techniques may include appropriate input and output devices, a computer processor, and computer program products tangibly embodied in machine-readable storage devices for execution by a programmable processor. A process implementing these techniques may be carried out by a programmable processor that executes a program of instructions to perform a desired function by acting on input data and producing appropriate outputs. The techniques may also be implemented using one or more computer programs or non-transitory computer-readable storage media containing instructions executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device.
[0073] Each computer program may be implemented in a high-level procedural or object-oriented programming language, or, if desired, in assembly or machine language, in which case the language may be a compiled or interpreted language. Suitable processors include, for example, both general-purpose and dedicated microprocessors. Generally, the processor receives instructions and data from read-only memory and / or random-access memory. Suitable storage devices for tangibly embodying computer program instructions and data include semiconductor memory devices such as erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and flash memory devices, as well as all forms of non-volatile memory, including, for example, magnetic disks such as internal hard disks and removable disks, magneto-optical disks, and compact disk read-only memory (CD-ROM). Any of the foregoing may be complemented by or incorporated into specially designed ASICs (Application-Specific Integrated Circuits).
[0074] Computer-readable media may be machine-readable storage devices, machine-readable storage substrates, memory devices, material compositions that produce machine-readable propagating signals, or a combination of one or more of these. The term “data processing device” encompasses any device, apparatus, and machine for processing data, including, for example, a programmable processor, a computer, or multiple processors or computers. In addition to hardware, an apparatus may include code that creates an execution environment for the computer program (e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of these). Propagating signals are artificially generated signals (e.g., machine-generated electrical, optical, or electromagnetic signals produced to encode information for transmission to a suitable receiver device).
[0075] Computer programs, also known as programs, software, software applications, scripts, plugins, or code, may be written in any form of programming language, including compiled or interpreted languages, and may be deployed in any form (including as standalone programs or as modules, components, subroutines, or other units suitable for use in a computing environment). Computer programs do not necessarily correspond to files in a file system. A program may be stored in a single file dedicated to that program, in part of a file that holds other programs or data, or in multiple coordinated files. A computer program may run on a single computer, or on multiple computers located in a single facility, or distributed across multiple facilities and interconnected by a communication network.
[0076] The processes and logic flows described herein may be carried out by one or more programmable processors, which execute one or more computer programs to perform actions by acting on input data and generating outputs. The processes and logic flows may also be carried out by special-purpose logic circuitry (e.g., an FPGA (Field-Programmable Gate Array) or an ASIC (Application-Specific Integrated Circuit)), and the device may be implemented as such.
[0077] Suitable processors for executing computer programs include, by way of example, both general-purpose and dedicated microprocessors, and any one or more processors of any type of digital computer. Generally, processors receive instructions and data from read-only memory, random-access memory, or both.
[0078] The elements of a computer may include a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer also includes one or more mass storage devices (e.g., magnetic disks, magneto-optical disks, or optical disks) for storing data, or is operably coupled to them to receive data from them, or to transfer data to them, or both. However, a computer does not have to have such devices. Furthermore, a computer may be embedded in another device (e.g., to name just a few, a tablet computer, a mobile phone, a personal digital assistant (PDA), a mobile audio player, a VAR system). Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks, for example. The processor and memory may be supplemented by or incorporated into special-purpose logic circuitry.
[0079] This specification contains many details, which should be interpreted not as limitations on the scope of this disclosure or the claims, but rather as descriptions of features specific to particular embodiments. Certain features described herein in the context of a separate embodiment may be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment may be implemented separately in multiple embodiments or in any preferred partial combination. Furthermore, features may be described above as acting in a combination, and may even be claimed as such, but one or more features from a claimed combination may, in some cases, be removed from the combination, and the claimed combination may be a partial combination or a variation of a partial combination. For example, mapping operations are described as a series of separate operations, but the various operations may be divided into additional operations, combined into fewer operations, have their execution order changed, or be eliminated, depending on the desired implementation.
[0080] Similarly, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and the described program components and systems may generally be integrated together within a single software product or packaged within multiple software products. For example, while some operations are described as being performed by a processing server, one or more of these operations may be performed by a smart meter or other network component.
[0081] The terms used herein, in particular in the accompanying claims (e.g., in the main body of the accompanying claims), are generally intended to be "non-restrictive" terms (for example, the term "including" should be interpreted as "including, but not limited to," the term "having" should be interpreted as "having at least," and the term "includes" should be interpreted as "including, but not limited to," etc.).
[0082] In addition, if a specific number of an introduced claim recitation is intended, such intent is explicitly recited in the claim; in the absence of such recitation, no such intent is present. For example, as an aid to understanding, the following appended claims may use the introductory phrases “at least one” and “one or more” to introduce claim recitations. However, the use of such phrases should not be interpreted as implying that the introduction of a claim recitation by the indefinite article “a” or “an” limits any specific claim containing such introduced claim recitation to embodiments containing only one such recitation, even when the same claim includes the introductory phrase “one or more” or “at least one” and an indefinite article such as “a” or “an” (for example, “a” and / or “an” should be interpreted as meaning “at least one” or “one or more”). The same applies to the use of definite articles used to introduce claim recitations.
[0083] In addition, even when a specific number of an introduced claim recitation is explicitly recited, a person skilled in the art will recognize that such recitation should be interpreted as meaning at least the recited number (for example, the bare recitation of “two recitations,” without other modifiers, means at least two recitations, or two or more recitations). Furthermore, in those cases where conventions similar to “at least one of A, B, and C, etc.” or “one or more of A, B, and C, etc.” are used, such constructions are generally intended to include A only, B only, C only, A and B together, A and C together, B and C together, or A, B, and C together. The term “and / or” is also intended to be interpreted in this way.
[0084] The use of terms such as "first," "second," and "third" in this specification is not necessarily intended to imply a specific order or number of elements. Generally, terms such as "first," "second," and "third" are used as generic identifiers to distinguish different elements. Unless otherwise stated, these terms should not be understood as implying a specific order. Furthermore, unless otherwise stated, these terms should not be understood as implying a specific number of elements. For example, a first widget may be described as having a first side, and a second widget may be described as having a second side. The use of the term "second side" for a second widget may be intended to distinguish such a side of the second widget from the "first side" of the first widget, and does not imply that the second widget has two sides.
Claims
1. A method comprising: receiving data of a first target image associated with an undeformed state of a first eyepiece of a head-mounted display device; receiving data of a second target image associated with an undeformed state of a second eyepiece of the head-mounted display device; receiving data of a first captured image associated with a deformed state of the first eyepiece; receiving data of a second captured image associated with a deformed state of the second eyepiece; determining a first transformation that maps the first captured image to the first target image; and determining a second transformation that maps the second captured image to the second target image.
2. The method according to claim 1, further comprising receiving a trigger signal for the first eyepiece and the second eyepiece, wherein determining the first and second transformations is in response to receiving the trigger signal.
3. The method according to claim 1, wherein the first target image and the second target image are the same, or the first target image and the second target image are different.
4. The method according to claim 3, wherein the first target image and the second target image are interdependent.
5. The method according to claim 1, wherein the first eyepiece comprises a first projector optically coupled to the first eyepiece, and the second eyepiece comprises a second projector optically coupled to the second eyepiece.
6. The method according to claim 5, further comprising: transmitting the data of the first target image to the first projector; and transmitting the data of the second target image to the second projector.
7. The method according to claim 6, wherein the first captured image is received by a first imaging sensor optically coupled to the first projector, and the second captured image is received by a second imaging sensor optically coupled to the second projector.
8. The method according to claim 7, further comprising: applying the first transformation to a first subsequent image to produce a first transformed subsequent image for viewing on the first eyepiece; and applying the second transformation to a second subsequent image to produce a second transformed subsequent image for viewing on the second eyepiece.
9. The method according to claim 8, further comprising: transmitting the first transformed subsequent image to the first projector; and transmitting the second transformed subsequent image to the second projector.
10. A non-transitory computer-readable medium storing one or more instructions executable by a computer system to perform one or more operations, the one or more operations comprising: receiving data of a first target image associated with an undeformed state of a first eyepiece of a head-mounted display device; receiving data of a second target image associated with an undeformed state of a second eyepiece of the head-mounted display device; receiving data of a first captured image associated with a deformed state of the first eyepiece; receiving data of a second captured image associated with a deformed state of the second eyepiece; determining a first transformation that maps the first captured image to the first target image; and determining a second transformation that maps the second captured image to the second target image.
11. The non-transitory computer-readable medium according to claim 10, wherein the one or more operations further comprise receiving a trigger signal for the first eyepiece and the second eyepiece, and determining the first and second transformations is in response to receiving the trigger signal.
12. The non-transitory computer-readable medium according to claim 10, wherein the first target image and the second target image are the same, or the first target image and the second target image are different.
13. The non-transitory computer-readable medium according to claim 12, wherein the first target image and the second target image are interdependent.
14. The non-transitory computer-readable medium according to claim 10, wherein the first eyepiece comprises a first projector optically coupled to the first eyepiece, and the second eyepiece comprises a second projector optically coupled to the second eyepiece.
15. The non-transitory computer-readable medium according to claim 14, wherein the one or more operations further comprise: transmitting the data of the first target image to the first projector; and transmitting the data of the second target image to the second projector.
16. The non-transitory computer-readable medium according to claim 15, wherein the first captured image is received by a first imaging sensor optically coupled to the first projector, and the second captured image is received by a second imaging sensor optically coupled to the second projector.
17. The non-transitory computer-readable medium according to claim 16, wherein the one or more operations further comprise: applying the first transformation to a first subsequent image to produce a first transformed subsequent image for viewing on the first eyepiece; and applying the second transformation to a second subsequent image to produce a second transformed subsequent image for viewing on the second eyepiece.
18. The non-transitory computer-readable medium according to claim 17, wherein the one or more operations further comprise: transmitting the first transformed subsequent image to the first projector; and transmitting the second transformed subsequent image to the second projector.