Display method and electronic device
By reconstructing the perspective of images captured by cameras in VR devices, especially the perspective of the area where the gaze point is located, the problem of dizziness caused by the difference in position between the camera and the display screen is solved, thus improving the user experience.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HUAWEI TECH CO LTD
- Filing Date
- 2021-09-09
- Publication Date
- 2026-06-16
AI Technical Summary
In VR devices, the difference in position between the camera and the display screen causes the camera's shooting angle to be inconsistent with the human eye's viewing angle, resulting in dizziness and affecting the user experience.
By reconstructing the perspective of images captured by the camera, especially the perspective of the area where the user's gaze point is located, the display position and shape of objects in the image are adjusted to alleviate dizziness.
It effectively alleviates dizziness caused by the difference in position between the camera and the display screen, improves the VR experience, and reduces the probability and degree of image distortion.
Figure CN115793841B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of electronic technology, and more particularly to a display method and electronic device.
Background Technology
[0002] Virtual Reality (VR) technology is a human-computer interaction method created using computer and sensor technologies. VR technology integrates various scientific technologies such as computer graphics, computer simulation, sensor technology, and display technology to create virtual worlds. Users can immerse themselves in these virtual worlds by wearing VR wearable devices (such as VR glasses or VR headsets).
[0003] Objects in a virtual world can be entirely fictional, or they can include 3D models of real objects. This allows users to see both fictional and real objects in the virtual world, creating a more realistic experience. For example, VR wearable devices can be equipped with cameras to capture images of real objects, build 3D models of those objects based on those images, and display them in the virtual world. Take VR glasses as an example. Figure 1 is a schematic diagram of VR glasses. VR glasses include a camera and a display screen. Generally, to keep the design thin and lightweight, the camera is not located where the display screen is; it is usually positioned below the display screen, as shown in Figure 1.
[0004] This arrangement causes the human eye's viewing angle to differ from the camera's viewing angle (or shooting direction). For example, in Figure 1 the camera points downward, while the human eye looks forward. Displaying the image captured by the camera directly to the user's eyes on the display screen can cause discomfort, and prolonged viewing can lead to dizziness and a poor viewing experience.
Summary of the Invention
[0005] The purpose of this application is to provide a display method and electronic device for enhancing the VR experience.
[0006] In a first aspect, a display method is provided, applied to a wearable device, the wearable device including at least one display screen and at least one camera; comprising: displaying a first image to a user through the display screen; wherein at least one of the display position or shape of a first object on the first image is different from that of the first object on a second image, and a second object on the first image and the second object on the second image are identical in both display position and shape; the second image is an image captured by the camera; wherein the first object is located in the area where the user's gaze point is located, and the second object is located in an area outside the area where the user's gaze point is located.
[0007] In this embodiment, the wearable device can reconstruct the perspective of the region where the user's gaze point is located in the second image captured by the camera, without reconstructing the perspective of the other regions. Reconstructing the perspective of the region where the user's gaze point is located can alleviate dizziness (the dizziness caused by the difference between the camera's shooting angle and the human eye's viewing angle due to the different positions of the display screen and the camera) and improve the VR experience. Because only the gaze region is reconstructed, the reconstruction workload is reduced, which in turn reduces the probability or degree of image distortion.
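The gaze-only processing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the reconstruction procedure of this application: the function `reconstruct_gaze_region`, the circular gaze region, and the uniform pixel shift that stands in for perspective reconstruction are all hypothetical.

```python
def reconstruct_gaze_region(image, gaze_center, gaze_radius, shift):
    """Return a copy of `image` in which only pixels inside the gaze region change.

    `image` is a list of rows (the captured "second image"). The result plays
    the role of the displayed "first image". A uniform shift is only a
    placeholder for real perspective reconstruction.
    """
    height, width = len(image), len(image[0])
    cx, cy = gaze_center
    dx, dy = shift
    out = [row[:] for row in image]  # pixels outside the gaze region stay identical
    for y in range(height):
        for x in range(width):
            inside_gaze = (x - cx) ** 2 + (y - cy) ** 2 <= gaze_radius ** 2
            if inside_gaze:
                # Sample from the shifted source position, clamped to the image bounds.
                src_x = min(max(x - dx, 0), width - 1)
                src_y = min(max(y - dy, 0), height - 1)
                out[y][x] = image[src_y][src_x]
    return out


# A 5x5 test image whose pixel value encodes its own position (x + 10 * y).
captured = [[x + 10 * y for x in range(5)] for y in range(5)]
displayed = reconstruct_gaze_region(captured, gaze_center=(2, 2), gaze_radius=1, shift=(1, 0))
```

In this sketch, content inside the gaze region moves one pixel to the right, while every pixel outside it keeps the value it had in the captured image.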
[0008] In one possible design, the displacement offset between the first display position of the first object on the first image and the second display position of the first object on the second image is related to the distance between the camera and the display screen. For example, the greater the distance between the camera and the display screen, the greater the offset between the first display position and the second display position.
[0009] In one possible design, the displacement offset between the first display position and the second display position increases as the distance between the camera and the display screen increases, and decreases as the distance between the camera and the display screen decreases.
[0010] For example, when the distance between the camera and the display screen is a first distance, the displacement offset between the first display position and the second display position is a first displacement offset. When the distance between the camera and the display screen is a second distance, the displacement offset between the first display position and the second display position is a second displacement offset. When the first distance is greater than or equal to the second distance, the first displacement offset is greater than or equal to the second displacement offset. When the first distance is less than the second distance, the first displacement offset is less than the second displacement offset.
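One way to picture this monotonic relationship is a simple pinhole parallax model. The model is an illustrative assumption and is not stated in this application; the function name `displacement_offset_px`, the focal length in pixels, and the object depth are hypothetical.

```python
def displacement_offset_px(camera_display_distance_m: float,
                           object_depth_m: float,
                           focal_length_px: float) -> float:
    """Approximate on-screen shift in pixels under an assumed pinhole parallax model.

    The shift grows linearly with the camera-to-display distance, which is one
    simple way to satisfy "larger distance, larger offset".
    """
    if object_depth_m <= 0:
        raise ValueError("object_depth_m must be positive")
    return focal_length_px * camera_display_distance_m / object_depth_m


# Second distance (2 cm) is smaller than the first distance (5 cm).
second_offset = displacement_offset_px(0.02, object_depth_m=1.0, focal_length_px=800.0)
first_offset = displacement_offset_px(0.05, object_depth_m=1.0, focal_length_px=800.0)
```

With these assumed values, the larger camera-to-display distance yields the larger displacement offset, matching the relationship in the example above.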
[0011] In one possible design, the offset direction between the first display position of the first object on the first image and the second display position of the first object on the second image is related to the positional relationship between the camera and the display screen. For example, if the camera is located on the left side of the display screen, the first object on the second image is shifted to the left to reach the position of the first object on the first image.
[0012] In one possible design, the offset direction between the first display position and the second display position changes as the orientation between the camera and the display screen changes.
[0013] For example, when the camera is located in a first direction of the display screen, the offset direction between the second display position and the first display position is the first direction. When the camera is located in a second direction of the display screen, the offset direction between the second display position and the first display position is the second direction.
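The dependence of the offset direction on the camera's position can be sketched as a unit vector. This is a hedged illustration: the function `offset_direction` and the 2D coordinates (in the plane of the display, x to the right, y upward) are assumptions introduced here, and the application does not prescribe this computation.

```python
import math


def offset_direction(camera_xy, display_xy):
    """Unit vector pointing from the display center toward the camera.

    Assumption: the offset direction follows the direction in which the
    camera lies relative to the display screen.
    """
    dx = camera_xy[0] - display_xy[0]
    dy = camera_xy[1] - display_xy[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return (0.0, 0.0)  # camera coincides with display: no preferred direction
    return (dx / norm, dy / norm)


left_camera = offset_direction((-3.0, 0.0), (0.0, 0.0))   # camera to the left
below_camera = offset_direction((0.0, -2.0), (0.0, 0.0))  # camera below
```

Moving the camera from the left of the display to below it changes the returned direction from leftward to downward, mirroring the first-direction and second-direction cases above.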
[0014] In one possible design, the positional offset of the first object on the first image relative to the first object on the second image is a first offset; the positional offset of the third object on the first image relative to the third object on the second image is a second offset; the third object is located within the region where the user's gaze point is located and is closer to the edge of the region where the gaze point is located than the first object; the second offset is less than the first offset. That is, the first object at the center of the region where the user's gaze point is located has a larger offset than the third object at the edge, thus achieving a smooth transition between the region where the user's gaze point is located and other regions.
[0015] In one possible design, the positional offset of the first object on the first image relative to the first object on the second image is a first offset; the positional offset of the third object on the first image relative to the third object on the second image is a second offset; the third object is located outside the region where the user's gaze point is located, in a region that surrounds the edge of the region where the user's gaze point is located; the second offset is less than the first offset. That is, the first object within the region where the user's gaze point is located has a larger offset than the third object within the outer region (the region that lies outside the region where the user's gaze point is located and surrounds its edge), thus achieving a smooth transition between the edge of the region where the user's gaze point is located and other regions.
[0016] In one possible design, the degree of shape change of the first object in the first image relative to the first object in the second image is greater than the degree of shape change of the third object in the first image relative to the third object in the second image; the third object is located within the region where the user's gaze point is located, and the third object is closer to the edge of the region where the gaze point is located than the first object. That is, within the region where the user's gaze point is located, the degree of shape change of objects decreases from the center to the edge, so that the region where the user's gaze point is located can transition smoothly into other regions.
[0017] In one possible design, the degree of shape change of the first object in the first image relative to the first object in the second image is greater than the degree of shape change of the third object in the first image relative to the third object in the second image; the third object is located outside the region where the user's gaze point is located, and this region surrounds the edge of the region where the gaze point is located. That is, the degree of shape change of the object is smaller from the region where the user's gaze point is located outwards to the peripheral region (the region outside the region where the user's gaze point is located, and this region surrounds the edge of the region where the gaze point is located). This allows for a smooth transition between the edge of the region where the user's gaze point is located and other regions.
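The decreasing offset and shape change from the gaze center outward can be sketched with a falloff weight. The smoothstep curve, the function names `falloff_weight` and `scaled_offset`, and the radii used below are assumptions chosen for illustration; the application only requires that the change becomes smaller toward the edge and in the surrounding region.

```python
def falloff_weight(distance_from_gaze_center: float, outer_radius: float) -> float:
    """Weight in [0, 1]: 1 at the gaze center, 0 at `outer_radius` and beyond.

    `outer_radius` is assumed to be the outer boundary of the band that
    surrounds the gaze region. A smoothstep curve avoids a visible seam.
    """
    if outer_radius <= 0:
        raise ValueError("outer_radius must be positive")
    t = min(max(distance_from_gaze_center / outer_radius, 0.0), 1.0)
    smooth = t * t * (3.0 - 2.0 * t)
    return 1.0 - smooth


def scaled_offset(full_offset_px: float, distance_from_gaze_center: float,
                  outer_radius: float) -> float:
    """Offset applied to an object, attenuated by its distance from the gaze center."""
    return full_offset_px * falloff_weight(distance_from_gaze_center, outer_radius)


# Assumed layout: gaze region radius 100 px, surrounding band out to 150 px.
center_offset = scaled_offset(40.0, 0.0, 150.0)     # first object, at the center
edge_offset = scaled_offset(40.0, 100.0, 150.0)     # third object, at the gaze-region edge
outer_offset = scaled_offset(40.0, 125.0, 150.0)    # third object, in the surrounding band
```

The same weight could scale a shape-change parameter instead of a positional offset; either way the change shrinks continuously until it vanishes outside the surrounding band.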
[0018] In one possible design, the positional offset of the first object in the first image relative to the first object in the second image is a first offset; the positional offset of the third object in the first image relative to the third object in the second image is a second offset; the third object is located within the region where the user's gaze point is located, and the third object is located within a first direction range of the first object, the first direction range including the positional offset direction of the first object in the first image relative to the first object in the second image; the second offset is greater than the first offset. Taking a downward-left offset direction as an example, within the region where the user's gaze point is located, objects toward the lower left have a larger offset, while objects toward the upper right have a smaller offset. Thus, when the content of the region where the user's gaze point is located shifts toward the lower left, the image in the upper-right part of the region can still transition smoothly into other regions.
[0019] In one possible design, the positional offset of the first object in the first image relative to the first object in the second image is a first offset; the positional offset of the third object in the first image relative to the third object in the second image is a second offset; the third object is located outside the region where the user's gaze point is located, in a region that surrounds the edge of the region where the user's gaze point is located; the third object is within a first direction range of the first object, the first direction range including the positional offset direction of the first object in the first image relative to the first object in the second image; the second offset is greater than the first offset. Taking a downward-left offset direction as an example, the offset of objects in the lower-left part of the region where the user's gaze point is located is less than the offset of objects in the lower-left part of the outer region surrounding it. That is, along the lower-left direction, the farther an object is from the region where the user's gaze point is located, the greater its offset, while the image in the upper-right region can still transition smoothly into other regions.
[0020] In one possible design, the first image includes a first pixel, a second pixel, and a third pixel, wherein the first pixel and the second pixel are located in the region where the user's gaze point is located, and the first pixel is closer to the edge of the region where the user's gaze point is located than the second pixel; the third pixel is located in a region outside the region where the user's gaze point is located; and the image information of the first pixel lies between the image information of the second pixel and the image information of the third pixel.
[0021] In other words, the image information of the pixels in the edge region (i.e., the first pixel) of the first region is the intermediate value between the image information of the pixels in the center region (i.e., the second pixel) and the image information of the pixels in the region outside the first region (i.e., the third pixel). In this way, the first region can transition smoothly with other regions. For example, the color, brightness, resolution, etc. of the pixels gradually change from the region outside the first region to the region inside the first region.
[0022] In one possible design, the image information includes at least one of resolution, color, brightness, and color temperature. It should be noted that the image information may also include more information, which is not limited in the embodiments of this application.
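The intermediate-value relationship between the three pixels can be sketched as a linear interpolation. Linear blending, the function name `blend_image_info`, and the brightness values are assumptions for illustration; the application only states that the edge pixel's image information lies between the other two.

```python
def blend_image_info(center_value: float, outside_value: float,
                     edge_weight: float) -> float:
    """Interpolate one image-information channel (for example, brightness).

    `edge_weight` is 0 at the center of the gaze region and 1 at its outer
    boundary, so the result always lies between the two input values.
    """
    if not 0.0 <= edge_weight <= 1.0:
        raise ValueError("edge_weight must be in [0, 1]")
    return (1.0 - edge_weight) * center_value + edge_weight * outside_value


second_pixel_brightness = 200.0  # center of the gaze region
third_pixel_brightness = 100.0   # outside the gaze region
first_pixel_brightness = blend_image_info(second_pixel_brightness,
                                          third_pixel_brightness,
                                          edge_weight=0.5)
```

The same interpolation could be applied per channel to color or color temperature, so that each quantity changes gradually across the boundary of the gaze region.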
[0023] In one possible design, the at least one camera includes a first camera and a second camera, and the at least one display screen includes a first display screen and a second display screen; the first display screen is configured to display an image captured by the first camera; the second display screen is configured to display an image captured by the second camera; when the positions of the first display screen and the first camera are different, at least one of the display position or shape of a first object on the image displayed on the first display screen differs from that of the first object on the image captured by the first camera, and the display position and shape of a second object on the image displayed on the first display screen are the same as those of the second object on the image captured by the first camera; when the positions of the second display screen and the second camera are different, at least one of the display position or shape of a first object on the image displayed on the second display screen differs from that of the first object on the image captured by the second camera, and the display position and shape of a second object on the image displayed on the second display screen are the same as those of the second object on the image captured by the second camera. In other words, the technical solution provided in this application embodiment can be applied to wearable devices including two display screens and two cameras, such as VR glasses.
[0024] In one possible design, the shape of the first object in the first image differs from that in the second image, including: the edge contour of the first object in the second image is smoother than the edge contour of the first object in the first image. Since the first object in the first image undergoes viewpoint reconstruction, its edges may be uneven, while the first object in the second image does not undergo viewpoint reconstruction, so its edges are smooth. Because the first object undergoes viewpoint reconstruction, users will not experience dizziness when viewing it while wearing the wearable device (the dizziness caused by the difference between the camera's shooting angle and the human eye's viewing angle due to the position of the display and the camera), thus enhancing the VR experience.
[0025] In a second aspect, a display method is also provided, applied to a wearable device, the wearable device including at least one display screen, at least one camera, and a processor; the camera is configured to transmit images it captures to the processor, and the images are displayed on the display screen via the processor; the method comprising: displaying a first image to a user via the display screen; wherein at least one of the display position or shape of a first object on the first image is different from that of the first object on a second image, and a second object on the first image and the second object on the second image have the same display position and shape; the second image is an image captured by the camera; wherein the first object is located in the area where the user's gaze point is located, and the second object is located in an area outside the area where the user's gaze point is located.
[0026] In this embodiment, the wearable device can reconstruct the perspective of the area where the user's gaze point is located in the second image captured by the camera, without reconstructing the perspective of the other areas. Reconstructing the perspective of the area where the user's gaze point is located can alleviate dizziness (the dizziness caused by the difference between the camera's shooting angle and the human eye's viewing angle due to the different positions of the display screen and the camera), thus improving the VR experience.
[0027] In one possible design, an additional camera is placed at the location of the primary camera, and the image captured by this additional camera is the same as the image captured by the primary camera. That is, the image observed (by a person or captured by the other camera) at the location of the primary camera is the same as the image captured by the primary camera.
[0028] In one possible design, the displacement offset between the first display position of the first object on the first image and the second display position of the first object on the second image is related to the distance between the camera and the display screen.
[0029] In one possible design, the displacement offset between the first display position and the second display position increases as the distance between the camera and the display screen increases, and decreases as the distance between the camera and the display screen decreases.
[0030] For example, when the distance between the camera and the display screen is a first distance, the displacement offset between the first display position and the second display position is a first displacement offset. When the distance between the camera and the display screen is a second distance, the displacement offset between the first display position and the second display position is a second displacement offset. When the first distance is greater than or equal to the second distance, the first displacement offset is greater than or equal to the second displacement offset. When the first distance is less than the second distance, the first displacement offset is less than the second displacement offset.
[0031] In one possible design, the offset direction between the first display position of the first object on the first image and the second display position of the first object on the second image is related to the positional relationship between the camera and the display screen.
[0032] In one possible design, the offset direction between the first display position and the second display position changes as the orientation between the camera and the display screen changes.
[0033] For example, when the camera is located in a first direction of the display screen, the offset direction between the first display position and the second display position is the first direction. When the camera is located in a second direction of the display screen, the offset direction between the first display position and the second display position is the second direction.
[0034] In one possible design, the positional offset of the first object on the first image relative to the first object on the second image is a first offset; the positional offset of the third object on the first image relative to the third object on the second image is a second offset; the third object is located within the area where the user's gaze point is located and is closer to the edge of the area where the gaze point is located than the first object; the second offset is less than the first offset.
[0035] In one possible design, the positional offset of the first object on the first image relative to the first object on the second image is a first offset; the positional offset of the third object on the first image relative to the third object on the second image is a second offset; the third object is located in a first region, which is outside the region where the user's gaze point is located and surrounds the edge of the region where the user's gaze point is located; the second offset is less than the first offset.
[0036] In one possible design, the degree of morphological change of the first object in the first image relative to the first object in the second image is greater than the degree of morphological change of the third object in the first image relative to the third object in the second image; the third object is located within the region where the user's gaze point is located, and the third object is closer to the edge of the region where the gaze point is located than the first object.
[0037] In one possible design, the degree of morphological change of the first object in the first image relative to the first object in the second image is greater than the degree of morphological change of the third object in the first image relative to the third object in the second image; the third object is located in a first region, which is outside the region where the user's gaze point is located and surrounds the edge of the region where the gaze point is located.
[0038] In one possible design, the positional offset of the first object on the first image relative to the first object on the second image is a first offset; the positional offset of the third object on the first image relative to the third object on the second image is a second offset; the third object is located within the area of the user's gaze point, and the third object is within a first direction range of the first object, the first direction range including the positional offset direction of the first object on the first image relative to the first object on the second image; the second offset is greater than the first offset.
[0039] In one possible design, the positional offset of the first object on the first image relative to the first object on the second image is a first offset; the positional offset of the third object on the first image relative to the third object on the second image is a second offset; the third object is located in a first region, the first region being outside the region where the user's gaze point is located and surrounding the edge of the region where the user's gaze point is located; the third object is located within a first direction range of the first object, the first direction range including the positional offset direction of the first object on the first image relative to the first object on the second image; the second offset is greater than the first offset.
[0040] In one possible design, the first image includes a first pixel, a second pixel, and a third pixel, wherein the first pixel and the second pixel are located in the region where the user's gaze point is located, and the first pixel is closer to the edge of the region where the user's gaze point is located than the second pixel; the third pixel is located in a region outside the region where the user's gaze point is located; and the image information of the first pixel lies between the image information of the second pixel and the image information of the third pixel.
[0041] In other words, the image information of the pixels in the edge region of the first region (i.e., the first pixel) is the intermediate value of the image information of the pixels in the center region (i.e., the second pixel) and the image information of the pixels in the region outside the first region (i.e., the third pixel). In this way, the edge region of the first region can transition smoothly.
[0042] In one possible design, the image information includes at least one of resolution, color, brightness, and color temperature.
[0043] It should be noted that the image information may include more information, which is not limited in the embodiments of this application.
[0044] In one possible design, the at least one camera includes a first camera and a second camera, and the at least one display screen includes a first display screen and a second display screen; the first display screen is configured to display an image captured by the first camera; the second display screen is configured to display an image captured by the second camera; when the positions of the first display screen and the first camera are different, at least one of the display position or shape of a first object on the image displayed on the first display screen differs from that of the first object on the image captured by the first camera, and the display position and shape of a second object on the image displayed on the first display screen are the same as those of the second object on the image captured by the first camera; when the positions of the second display screen and the second camera are different, at least one of the display position or shape of a first object on the image displayed on the second display screen differs from that of the first object on the image captured by the second camera, and the display position and shape of a second object on the image displayed on the second display screen are the same as those of the second object on the image captured by the second camera.
[0045] In other words, the technical solution provided in this application embodiment can be applied to wearable devices that include two displays and two cameras.
[0046] In one possible design, the shape of the first object on the first image is different from that of the first object on the second image, including: the edge contour of the first object on the second image is flatter than the edge contour of the first object on the first image.
[0047] In a third aspect, an electronic device is also provided, comprising:
[0048] a processor, a memory, and one or more programs;
[0049] wherein the one or more programs are stored in the memory, and the one or more programs include instructions that, when executed by the processor, cause the electronic device to perform the steps of the method described in the first or second aspect above.
[0050] In a fourth aspect, a computer-readable storage medium is also provided for storing a computer program that, when run on a computer, causes the computer to perform the method described in the first or second aspect above.
[0051] In a fifth aspect, a computer program product is also provided, comprising a computer program that, when run on a computer, causes the computer to perform the method described in the first or second aspect above.
[0052] A sixth aspect also provides a graphical user interface for an electronic device, the electronic device having a display screen, a memory, and a processor, the processor being configured to execute one or more computer programs stored in the memory, the graphical user interface including a graphical user interface displayed when the electronic device performs the method described in the first or second aspect above.
[0053] In a seventh aspect, embodiments of this application also provide a chip coupled to a memory in an electronic device, used to call a computer program stored in the memory and execute the technical solutions of the first to second aspects of embodiments of this application. In embodiments of this application, "coupling" means that two components are directly or indirectly combined with each other.
[0054] The beneficial effects of the second through seventh aspects are the same as those of the first aspect and are not repeated here.
Attached Figure Description
[0055] Figure 1 is a schematic diagram of VR glasses provided in an embodiment of this application;
[0056] Figure 2A is a schematic diagram of a VR system provided in an embodiment of this application;
[0057] Figure 2B is a schematic diagram of a VR wearable device provided in an embodiment of this application;
[0058] Figure 2C is a schematic diagram of eye tracking provided in an embodiment of this application;
[0059] Figure 3 is a schematic diagram of the structure of the human eye provided in an embodiment of this application;
[0060] Figure 4A is a schematic diagram illustrating how the human eye observes an object with the naked eye, according to an embodiment of this application;
[0061] Figure 4B is a schematic diagram illustrating how a human eye observes an object while wearing VR glasses, according to an embodiment of this application;
[0062] Figure 4C is a schematic diagram illustrating how a human eye observes an object while wearing VR glasses, according to an embodiment of this application;
[0063] Figures 5A to 5B are schematic diagrams illustrating an application scenario provided in an embodiment of this application;
[0064] Figures 6A to 6B are schematic diagrams of a visual reconstruction process provided in an embodiment of this application;
[0065] Figures 7 to 8 are schematic diagrams illustrating visual reconstruction provided in an embodiment of this application;
[0066] Figure 9 is a schematic diagram of a first coordinate system and a second coordinate system provided in an embodiment of this application;
[0067] Figures 10 to 11 are schematic diagrams illustrating the perspective reconstruction of a first region provided in an embodiment of this application;
[0068] Figure 12 is a schematic flowchart illustrating a display method provided in an embodiment of this application;
[0069] Figure 13 is a schematic diagram illustrating an application scenario provided in an embodiment of this application;
[0070] Figure 14 is a schematic diagram of a two-dimensional planar image provided in an embodiment of this application;
[0071] Figure 15 is a schematic diagram of the convergence angle provided in an embodiment of this application;
[0072] Figure 16 is a schematic diagram illustrating the conversion of a two-dimensional planar image into a three-dimensional point cloud according to an embodiment of this application;
[0073] Figure 17 is a schematic diagram of a virtual camera provided in an embodiment of this application;
[0074] Figure 18 is a schematic diagram of an image before reconstruction and an image after reconstruction provided in an embodiment of this application;
[0075] Figure 19 is a schematic diagram of an electronic device provided in an embodiment of this application.
Detailed Implementation
[0076] The following explanations of some terms used in the embodiments of this application are provided to facilitate understanding by those skilled in the art.
[0077] (1) "At least one" in the embodiments of this application means one or more, and "multiple" means two or more. Furthermore, it should be understood that in the description of this application, terms such as "first" and "second" are used only for descriptive purposes and should not be construed as indicating or implying relative importance or order. For example, the first region and the second region do not represent the degree of importance of the two, or their order, but are merely for distinguishing regions. In the embodiments of this application, "and/or" merely describes the relationship between related objects, indicating that three relationships can exist. For example, A and/or B can represent: A existing alone, A and B existing simultaneously, and B existing alone. Additionally, the character "/" in this document generally indicates that the preceding and following related objects have an "or" relationship.
[0078] (2) Virtual Reality (VR) technology is a human-computer interaction method created using computer and sensor technologies. VR technology integrates various scientific technologies such as computer graphics, computer simulation, sensor technology, and display technology to create virtual environments. These virtual environments include computer-generated, real-time dynamic, three-dimensional, realistic images that provide visual perception to the user. In addition to the visual perception generated by computer graphics, there are also auditory, tactile, force, and motion perceptions, and even olfactory and gustatory perceptions, collectively known as multi-sensory perception. Furthermore, a VR system can detect the user's head movements, eye movements, gestures, or other human actions. The computer processes the data corresponding to the user's actions and responds to them in real time, feeding back to the user's five senses, thus forming the virtual environment. For example, when wearing VR wearable devices, users can see a VR game interface and interact with it through gestures, controllers, etc., as if they were actually in the game.
[0079] (3) Augmented Reality (AR) technology refers to superimposing computer-generated virtual objects onto real-world scenes to enhance the real world. In other words, AR technology requires capturing real-world scenes and then adding virtual environments to the real world.
[0080] Therefore, the difference between VR and AR technologies lies in the fact that VR technology creates a completely virtual environment where users see only virtual objects; while AR technology overlays virtual objects onto the real world, including both real and virtual objects. For example, a user wearing transparent glasses can see the surrounding real environment through these glasses, and virtual objects can also be displayed on the glasses, allowing the user to see both real and virtual objects.
[0081] (4) Mixed Reality (MR) technology enhances the realism of the user experience by introducing real-world scene information (also called real scene information) into a virtual environment, thus building an interactive feedback bridge between the virtual environment, the real world, and the user. Specifically, real-world objects are virtualized (for example, a camera scans real-world objects for 3D reconstruction to generate virtual objects), and the virtualized real objects are introduced into the virtual environment, so that users can see real objects in the virtual environment.
[0082] It should be noted that the technical solutions provided in the embodiments of this application can be applied to VR, AR, or MR scenarios, or to other scenarios besides VR, AR, and MR. In short, they are applicable to any scenario in which images must be displayed to users and the shooting perspective differs from the human eye's viewing perspective.
[0083] For ease of understanding, the following text will mainly use VR scenarios as an example.
[0084] For example, refer to Figure 2A, which is a schematic diagram of a VR system according to an embodiment of this application. The VR system includes a VR wearable device 100 and an image processing device 200.
[0085] The image processing device 200 may include a host (e.g., a VR host) or a server (e.g., a VR server). The VR wearable device 100 is connected to the VR host or VR server (wired or wireless connection). The VR host or VR server may be a device with significant computing power. For example, the VR host may be a mobile phone, tablet, laptop, or other device, and the VR server may be a cloud server, etc.
[0086] The VR wearable device 100 can be a head-mounted display (HMD), such as glasses or a helmet. The VR wearable device 100 is equipped with at least one camera and at least one display screen. Figure 2A takes as an example a VR wearable device 100 with two display screens, namely display screen 110 and display screen 112. Display screen 110 displays images to the user's right eye, and display screen 112 displays images to the user's left eye. It should be noted that display screens 110 and 112 are enclosed inside the VR glasses, so the arrows indicating display screens 110 and 112 in Figure 2A are drawn with dashed lines. Display screens 110 and 112 can be two independent display screens or two different display areas on the same display screen; this application does not limit this. Furthermore, Figure 2A takes as an example a VR wearable device 100 equipped with two cameras, namely camera 120 and camera 122, which are used to capture images of the real world. The image captured by camera 120 can be displayed on display screen 110, and the image captured by camera 122 can be displayed on display screen 112. Generally, when a user wears the VR wearable device 100, the eyes are positioned close to the display screens; for example, the right eye is close to display screen 110 to view the image on it, and the left eye is close to display screen 112 to view the image on it. Because the cameras are positioned differently from the display screens (for example, camera 120 is located at the lower right of display screen 110, and camera 122 is located at the lower left of display screen 112), the positions of the cameras differ from the positions of the user's eyes. Therefore, the camera's shooting angle differs from the user's viewing angle. For example, as shown in Figure 2A, the shooting angle of camera 120 differs from the viewing angle of the right eye, and the shooting angle of camera 122 differs from the viewing angle of the left eye. This can cause discomfort to the user, and prolonged exposure can lead to dizziness and a poor user experience.
[0087] In this embodiment, the VR wearable device 100 can send images captured by a camera to the image processing device 200 for processing. For example, the image processing device 200 uses the perspective reconstruction scheme provided in this application to reconstruct the perspective of the image (the specific implementation process will be described later) and sends the reconstructed image to the VR wearable device 100 for display. For example, the VR wearable device 100 sends image 1 captured by camera 120 to the image processing device 200 for perspective reconstruction to obtain image 2, and display screen 110 then displays image 2. The VR wearable device 100 sends image 3 captured by camera 122 to the image processing device 200 for perspective reconstruction to obtain image 4, and display screen 112 then displays image 4. In this way, the user's eyes see the reconstructed images, which can alleviate discomfort (described later). In some embodiments, the VR system in Figure 2A may also omit the image processing device 200. For example, the VR wearable device 100 may have local image processing capabilities (such as the ability to reconstruct the viewpoint of an image), eliminating the need for processing by the image processing device 200 (VR host or VR server). For ease of understanding, the following explanation takes as an example the VR wearable device 100 reconstructing the viewpoint locally, and mainly uses VR glasses as the VR wearable device 100.
[0088] For example, refer to Figure 2B, which is a schematic structural diagram of a VR wearable device 100 according to an embodiment of this application. As shown in Figure 2B, the VR wearable device 100 may include a processor 111, a memory 101, a sensor module 130 (which can be used to acquire the user's posture), a microphone 140, a button 150, an input / output interface 160, a communication module 170, a camera 180, a battery 190, an optical display module 1100, an eye-tracking module 1200, etc.
[0089] It is understood that the structures illustrated in the embodiments of this application do not constitute a specific limitation on the VR wearable device 100. In other embodiments of this application, the VR wearable device 100 may include more or fewer components than illustrated, or combine some components, or split some components, or have different component arrangements. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
[0090] The processor 111 is typically used to control the overall operation of the VR wearable device 100 and may include one or more processing units. For example, the processor 111 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a video processing unit (VPU) controller, memory, a video codec, a digital signal processor (DSP), a baseband processor, and / or a neural network processing unit (NPU), etc. The different processing units may be independent devices or integrated into one or more processors.
[0091] The processor 111 may also include a memory for storing instructions and data. In some embodiments, the memory in the processor 111 is a cache memory. This memory can store instructions or data that the processor 111 has just used or that are used repeatedly. If the processor 111 needs to use the instruction or data again, it can retrieve it directly from the memory. This avoids repeated accesses, reduces the waiting time of the processor 111, and thus improves the efficiency of the system.
[0092] In some embodiments of this application, the processor 111 can be used to control the optical power of the VR wearable device 100. For example, the processor 111 can be used to control the optical power of the optical display module 1100, thereby adjusting the optical power of the wearable device 100. For instance, the processor 111 can adjust the relative positions of the various optical components (such as lenses) in the optical display module 1100, thereby adjusting the optical power of the optical display module 1100. This, in turn, allows the position of the corresponding virtual image plane to be adjusted when the optical display module 1100 images onto the human eye, thus achieving the effect of controlling the optical power of the wearable device 100.
[0093] In some embodiments, the processor 111 may include one or more interfaces. Interfaces may include an inter-integrated circuit (I2C) interface, a universal asynchronous receiver / transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input / output (GPIO) interface, a subscriber identity module (SIM) interface, and / or a universal serial bus (USB) interface, a serial peripheral interface (SPI) interface, etc.
[0094] In some embodiments, the processor 111 can blur objects at different depths of field to varying degrees so that the objects at different depths of field have different levels of clarity.
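The depth-dependent blurring described above could be sketched as follows. This is only a minimal illustration under stated assumptions, not the method of this application: the function name `blur_radius_px`, the gain `gain_px_per_diopter`, and the cap `max_radius_px` are hypothetical, and measuring defocus as the difference of reciprocal distances (in diopters) is an assumed model.

```python
def blur_radius_px(depth_m, focus_depth_m,
                   gain_px_per_diopter=4.0, max_radius_px=12.0):
    """Return a blur radius that grows as an object moves away from the focus depth.

    Assumption: defocus is modeled as the absolute difference of reciprocal
    distances (diopters) and mapped linearly to a pixel radius, then capped.
    All parameter names and default values are hypothetical.
    """
    if depth_m <= 0 or focus_depth_m <= 0:
        raise ValueError("depths must be positive")
    defocus_diopters = abs(1.0 / depth_m - 1.0 / focus_depth_m)
    return min(max_radius_px, gain_px_per_diopter * defocus_diopters)


if __name__ == "__main__":
    # Objects at the focus depth stay sharp; nearer or farther objects blur more.
    for depth in (0.5, 1.0, 2.0, 10.0):
        print(depth, blur_radius_px(depth, focus_depth_m=1.0))
```

An object at the focus depth gets a radius of zero, so it stays sharp, while objects at other depths receive progressively larger radii up to the cap.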
[0095] The I2C interface is a bidirectional synchronous serial bus that includes a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor 111 may include multiple I2C buses.
[0096] The UART interface is a universal serial data bus used for asynchronous communication. This bus can be a bidirectional communication bus. It converts the data to be transmitted between serial and parallel communication. In some embodiments, the UART interface is typically used to connect the processor 111 and the communication module 170. For example, the processor 111 communicates with the Bluetooth module in the communication module 170 via the UART interface to implement Bluetooth functionality.
[0097] The MIPI interface can be used to connect the processor 111 to peripheral devices such as the display screen and camera 180 in the optical display module 1100.
[0098] The GPIO interface is configurable via software. It can be configured as a control signal or a data signal. In some embodiments, the GPIO interface can be used to connect the processor 111 to the camera 180, the display screen in the optical display module 1100, the communication module 170, the sensor module 130, the microphone 140, etc. The GPIO interface can also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, etc. In some embodiments, the camera 180 can capture images including real objects, and the processor 111 can fuse the images captured by the camera with virtual objects, displaying the fused image through the optical display module 1100. In some embodiments, the camera 180 can also capture images including human eyes. The processor 111 performs eye tracking using these images.
[0099] The USB interface conforms to the USB standard specification and can be a Mini USB interface, Micro USB interface, USB Type-C interface, etc. The USB interface can be used to connect a charger to charge the VR wearable device 100, and can also be used for data transfer between the VR wearable device 100 and peripheral devices. It can also be used to connect headphones for audio playback. This interface can also be used to connect other electronic devices, such as mobile phones. The USB interface can be USB 3.0, used for compatibility with high-speed display port (DP) signal transmission, enabling the transmission of high-speed audio and video data.
[0100] It is understood that the interface connection relationships between the modules illustrated in the embodiments of this application are merely illustrative and do not constitute a structural limitation on the wearable device 100. In other embodiments of this application, the wearable device 100 may also employ different interface connection methods or combinations of multiple interface connection methods as described in the above embodiments.
[0101] Additionally, the VR wearable device 100 may include wireless communication functionality; for example, the VR wearable device 100 may receive images from other electronic devices (such as a VR host) for display. The communication module 170 may include a wireless communication module and a mobile communication module. The wireless communication functionality can be implemented using an antenna (not shown), a mobile communication module (not shown), a modem processor (not shown), and a baseband processor (not shown). The antenna is used to transmit and receive electromagnetic wave signals. The VR wearable device 100 may include multiple antennas, each of which can be used to cover one or more communication frequency bands. Different antennas can also be reused to improve antenna utilization. For example, antenna 1 can be reused as a diversity antenna for a wireless local area network. In some other embodiments, the antenna can be used in conjunction with a tuning switch.
[0102] The mobile communication module can provide wireless communication solutions for VR wearable devices 100, including 2G, 3G, 4G, and 5G networks. The mobile communication module may include at least one filter, switch, power amplifier, low-noise amplifier (LNA), etc. The mobile communication module can receive electromagnetic waves via an antenna, filter and amplify the received electromagnetic waves, and transmit them to a modem processor for demodulation. The mobile communication module can also amplify the signal modulated by the modem processor and radiate it as electromagnetic waves via the antenna. In some embodiments, at least some functional modules of the mobile communication module may be housed in the processor 111. In some embodiments, at least some functional modules of the mobile communication module and at least some modules of the processor 111 may be housed in the same device.
[0103] The modem processor may include a modulator and a demodulator. The modulator modulates the low-frequency baseband signal to be transmitted into a mid-to-high frequency signal. The demodulator demodulates the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing. After processing by the baseband processor, the low-frequency baseband signal is transmitted to the application processor. The application processor outputs sound signals through an audio device (not limited to a speaker), or displays images or videos through a display screen in the optical display module 1100. In some embodiments, the modem processor may be a separate device. In other embodiments, the modem processor may be independent of the processor 111 and may be housed in the same device as the mobile communication module or other functional modules.
[0104] The wireless communication module can provide solutions for wireless communication applications on the VR wearable device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technologies. The wireless communication module can be one or more devices integrating at least one communication processing module. The wireless communication module receives electromagnetic waves via an antenna, modulates and filters the electromagnetic wave signal, and sends the processed signal to the processor 111. The wireless communication module can also receive signals to be transmitted from the processor 111, modulate and amplify them, and then convert them into electromagnetic waves for radiation via the antenna.
[0105] In some embodiments, the antenna of the VR wearable device 100 is coupled to the mobile communication module, enabling the VR wearable device 100 to communicate with networks and other devices via wireless communication technologies. These wireless communication technologies may include Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Time-Division Synchronous Code Division Multiple Access (TD-SCDMA), Long Term Evolution (LTE), BT, GNSS, WLAN, NFC, FM, and / or IR technologies. GNSS may include Global Positioning System (GPS), Global Navigation Satellite System (GLONASS), BeiDou Navigation Satellite System (BDS), Quasi-Zenith Satellite System (QZSS), and / or Satellite Based Augmentation Systems (SBAS).
[0106] The VR wearable device 100 implements display functions through a GPU, an optical display module 1100, and an application processor. The GPU is a microprocessor for image processing, connecting the optical display module 1100 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 111 may include one or more GPUs, which execute program instructions to generate or modify display information.
[0107] Memory 101 can be used to store computer executable program code, which includes instructions. Processor 111 executes various functional applications and data processing of VR wearable device 100 by running the instructions stored in memory 101. Memory 101 may include a program storage area and a data storage area. The program storage area may store the operating system, at least one application program required for a function (such as sound playback function, image playback function, etc.), etc. The data storage area may store data created during the use of wearable device 100 (such as audio data, phone book, etc.). In addition, memory 101 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, universal flash storage (UFS), etc.
[0108] The VR wearable device 100 can implement audio functions, such as music playback and recording, through an audio module, a speaker, the microphone 140, a headphone jack, and the application processor. The audio module is used to convert digital audio information into analog audio signals for output, and also to convert analog audio input into digital audio signals. The audio module can also be used for encoding and decoding audio signals. In some embodiments, the audio module can be located in the processor 111, or some functional modules of the audio module can be located in the processor 111. The speaker, also called a "loudspeaker," is used to convert audio electrical signals into sound signals. The user can listen to music or take hands-free calls through the speaker of the wearable device 100.
[0109] Microphone 140, also known as a "mic" or "sound transducer," is used to convert sound signals into electrical signals. VR wearable device 100 may have at least one microphone 140. In some embodiments, VR wearable device 100 may have two microphones 140, which, in addition to collecting sound signals, can also perform noise reduction. In other embodiments, VR wearable device 100 may also have three, four, or more microphones 140, enabling sound signal collection, noise reduction, sound source identification, and directional recording, among other functions.
[0110] The headphone jack is used to connect wired headphones. The headphone jack can be a USB interface, or a 3.5 mm Open Mobile Terminal Platform (OMTP) standard interface, or a CTIA (Cellular Telecommunications Industry Association of the USA) standard interface.
[0111] In some embodiments, the VR wearable device 100 may include one or more buttons 150 that can control the VR wearable device and provide users with access to functions on the VR wearable device 100. The buttons 150 may take the form of buttons, switches, dials, and touch or proximity sensing devices (such as touch sensors). Specifically, for example, a user can turn on the optical display module 1100 of the VR wearable device 100 by pressing a button. Buttons 150 include power buttons, volume buttons, etc. Buttons 150 may be mechanical buttons or touch buttons. The wearable device 100 can receive button input and generate key signal inputs related to user settings and function control of the wearable device 100.
[0112] In some embodiments, the VR wearable device 100 may include an input / output interface 160, which can connect other devices to the VR wearable device 100 via suitable components. Components may include, for example, audio / video jacks, data connectors, etc.
[0113] The optical display module 1100, under the control of the processor 111, presents images to the user. The optical display module 1100 can use one or more optical devices, such as mirrors, transmissive mirrors, or optical waveguides, to convert real-pixel images into near-eye projection virtual images, enabling virtual interactive experiences or a combination of virtual and real interactive experiences. For example, the optical display module 1100 receives image data information sent by the processor 111 and presents the corresponding image to the user.
[0114] In some embodiments, the VR wearable device 100 may further include an eye-tracking module 1200, which tracks the movement of the human eye to determine the gaze point. For example, image processing technology can be used to locate the pupil, obtain the pupil center coordinates, and then calculate the gaze point. In some embodiments, the eye-tracking system can determine the user's gaze point position (or gaze direction) using methods such as video-oculography, the photodiode response method, or the pupil-corneal reflection method, thereby achieving eye tracking.
[0115] In some embodiments, take the pupil-corneal reflection method as an example of determining the user's gaze direction. As shown in Figure 2C, the eye-tracking system may include one or more near-infrared light-emitting diodes (LEDs) and one or more near-infrared cameras. The near-infrared LEDs and near-infrared cameras are not shown in Figure 2B. In different examples, the near-infrared LEDs can be positioned around the eyepiece to illuminate the human eye fully. In some embodiments, the center wavelength of the near-infrared LEDs can be 850 nm or 940 nm. The eye-tracking system can obtain the user's gaze direction as follows: illuminate the human eye with the near-infrared LEDs, capture an image of the eyeball with the near-infrared camera, and then determine the direction of the optical axis of the eyeball from the position of the reflection point of the near-infrared LED on the cornea in the eyeball image (i.e., the image of the LED light spot on the near-infrared camera in Figure 2C) and the center of the pupil (i.e., the image of the pupil center on the near-infrared camera in Figure 2C), thereby obtaining the user's line of sight.
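As a rough illustration of how the corneal reflection point and the pupil center can be combined, the sketch below computes the pupil-glint vector in the eye image and maps it to gaze angles with a linear calibration. This is a simplified stand-in, not the optical-axis computation of this application: the function names, the linear mapping, and the gain and offset values are all hypothetical, and a real system would fit the calibration per user.

```python
def pupil_glint_vector(pupil_center_px, glint_center_px):
    """Vector from the corneal reflection (glint) to the pupil center, in pixels."""
    return (pupil_center_px[0] - glint_center_px[0],
            pupil_center_px[1] - glint_center_px[1])


def gaze_angles_deg(vector_px, gain_deg_per_px=(0.5, 0.5), offset_deg=(0.0, 0.0)):
    """Map a pupil-glint vector to (horizontal, vertical) gaze angles in degrees.

    Assumption: a prior calibration produced a linear gain and offset per axis.
    The default values here are placeholders, not measured data.
    """
    return (gain_deg_per_px[0] * vector_px[0] + offset_deg[0],
            gain_deg_per_px[1] * vector_px[1] + offset_deg[1])


if __name__ == "__main__":
    vec = pupil_glint_vector((320, 240), (310, 245))
    print(vec, gaze_angles_deg(vec))
```

The glint stays nearly fixed in the image as the eye rotates while the pupil center moves, which is why the difference of the two is a convenient input for gaze estimation.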
[0116] It should be noted that in some embodiments of this application, separate eye-tracking systems can be set up for each of the user's eyes to perform eye tracking synchronously or asynchronously. In other embodiments of this application, an eye-tracking system can be set up near only one eye. The eye-tracking system can be used to obtain the gaze direction of the corresponding eye, and based on the relationship between the fixation points of the two eyes (e.g., when a user observes an object through both eyes, the fixation points of the two eyes are generally close or the same), combined with the distance between the user's eyes, the gaze direction or fixation point position of the other eye can be determined.
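The single-eye case above can be sketched geometrically. The example below assumes that both eyes fixate the same point and that the fixation distance along the tracked eye's line of sight is known or estimated separately; the function name `other_eye_gaze`, the coordinate convention (x to the right, y up, z forward, in meters), and the 63 mm eye spacing are hypothetical.

```python
import math


def other_eye_gaze(tracked_eye_pos, gaze_dir, fixation_distance_m, other_eye_pos):
    """Estimate the fixation point and the untracked eye's gaze direction.

    Assumptions: both eyes fixate the same point, and the distance from the
    tracked eye to that point is supplied by the caller.
    """
    norm = math.sqrt(sum(c * c for c in gaze_dir))
    if norm == 0:
        raise ValueError("gaze direction must be non-zero")
    unit = [c / norm for c in gaze_dir]
    # Fixation point lies on the tracked eye's line of sight.
    fixation = [p + fixation_distance_m * u for p, u in zip(tracked_eye_pos, unit)]
    # The other eye looks from its own position toward the same point.
    to_fixation = [f - o for f, o in zip(fixation, other_eye_pos)]
    length = math.sqrt(sum(c * c for c in to_fixation))
    return fixation, [c / length for c in to_fixation]


if __name__ == "__main__":
    half_ipd = 0.0315  # hypothetical: half of a 63 mm interocular distance
    right_eye, left_eye = (half_ipd, 0.0, 0.0), (-half_ipd, 0.0, 0.0)
    # Right eye looks at a point 1 m straight ahead of the midpoint of the eyes.
    direction = (-half_ipd, 0.0, 1.0)
    distance = math.sqrt(half_ipd ** 2 + 1.0)
    print(other_eye_gaze(right_eye, direction, distance, left_eye))
```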
[0117] It is understood that the structure illustrated in the embodiments of this application does not constitute a specific limitation on the VR wearable device 100. In other embodiments of this application, the VR wearable device 100 may include more or fewer components than shown in Figure 2A, combine certain components, split certain components, or arrange the components differently; the embodiments of this application do not limit this.
[0118] In order to clearly explain the technical solution of this application, the mechanism of human vision generation will be briefly explained below.
[0119] Figure 3 is a schematic diagram of the components of the human eye. As shown in Figure 3, the human eye may include the lens, the ciliary muscle, and the retina at the back of the eye. The lens acts as a focusing lens, converging incoming light onto the retina so that objects in a scene form a clear image there. The ciliary muscle adjusts the shape of the lens; by contracting or relaxing, it regulates the refractive power of the lens and thus its focal length. This allows objects at varying distances to be clearly imaged on the retina.
[0120] In the real world, when a user (without VR glasses) views an object, the left and right eyes have different perspectives. The user's brain can determine the depth of an object based on the parallax of the same object between the left and right eyes, so the world seen by the human eye is three-dimensional. Generally, the greater the parallax, the smaller the depth, and vice versa. For example, refer to Figure 4A, which shows an observed object 400 (a triangle, as an example) in the real world. When the human eye observes this observed object 400, the left eye captures image 401, in which the triangle is located at position (A1, B1). The right eye captures image 402, in which the triangle is located at position (A2, B2). The brain can determine the position of the object in the real world from the pixel difference (or parallax) of the same object (such as the triangle) between images 401 and 402. For example, based on the position of the triangle in image 401 (A1, B1) and in image 402 (A2, B2), the brain determines the position of the triangle in the real world as (A3, B3, L1), where L1 is the depth of the triangle, i.e., the distance between the triangle and the user's eyes. In other words, when the user is not wearing VR glasses, the distance between the triangle and the user's eyes perceived by the brain is L1, which equals the actual distance between the triangle and the user's eyes in the real world.
[0121] At this point, the user remains stationary and wears VR glasses, viewing the same observed object 400 (i.e., the triangle) through the VR glasses. However, the position of the observed object 400 seen by the user while wearing VR glasses differs from the position seen without VR glasses.
[0122] For example, refer to Figure 4B and Figure 2A. In some embodiments, camera 120 on the VR glasses is located at the lower right of display screen 110, and camera 122 is located at the lower left of display screen 112. Therefore, the distance between the two cameras is greater than the interocular distance of the human eyes. Simply put, camera 122 is further to the left than the user's left eye, and camera 120 is further to the right than the user's right eye. As shown in Figure 4B, in image 422 captured by camera 122, the triangle is located at (A1', B1'). Because camera 122 is further to the left than the left eye, the triangle in image 422 captured by camera 122 is positioned further to the right than in image 401 captured by the left eye when not wearing VR glasses (see image 401 in Figure 4A); that is, (A1', B1') is to the right of (A1, B1). Continuing with Figure 4B, in image 420 captured by camera 120, the triangle is located at (A2', B2'). Because camera 120 is further to the right than the right eye, the triangle in image 420 captured by camera 120 is positioned further to the left than in image 402 captured by the right eye when not wearing VR glasses (see image 402 in Figure 4A); that is, (A2', B2') is to the left of (A2, B2). Still referring to Figure 4B, assuming image 422 is displayed on display screen 112 and image 420 is displayed on display screen 110, the brain perceives the triangle's position as (A3, B3, L2) based on images 422 and 420. This means that the distance between the triangle perceived by the user and the user's eyes after wearing VR glasses is L2. Since (A1', B1') is to the right of (A1, B1), or closer to the image center than (A1, B1), and (A2', B2') is to the left of (A2, B2), or closer to the image center than (A2, B2), the pixel difference between (A2', B2') and (A1', B1') in Figure 4B is greater than the pixel difference between (A2, B2) and (A1, B1) in Figure 4A. Therefore, the depth L2 of the triangle perceived by the user based on the pixel difference between (A1', B1') and (A2', B2') is less than the depth L1 perceived based on the pixel difference between (A1, B1) and (A2, B2).
[0123] In other words, when a user is in the same position, objects appear closer to the user when wearing VR glasses than when not wearing VR glasses. For example, in the real world, an object appears 1 meter away to the user (without VR glasses), but when the user wears VR glasses, the object appears 0.7 meters away, seemingly closer, which is inconsistent with reality. Furthermore, objects that are already relatively close to the user in the real world appear even closer when wearing VR devices, causing discomfort and a sense of pressure. Over time, this can lead to dizziness and a poor user experience.
[0124] The above example uses a single observed object (i.e., the triangle). The following example uses two observed objects. As shown in Figure 4C, the observed objects are observed object 400 (a triangle) and observed object 401 (a square). For ease of understanding, take the square as being located at infinity, like the sun. Without VR glasses, the left eye sees image 460, and the right eye sees image 470. Since the square is at infinity and the left and right eyes are close to each other, the square is centered in both images 460 and 470. Thus, the brain perceives the real environment based on images 460 and 470. When the user wears VR glasses, the left eye sees image 480 captured by camera 122, and the right eye sees image 490 captured by camera 120. Because camera 122 is positioned to the left of the user's left eye, the distance between the triangle and the square in image 480 observed from the position of camera 122 is greater than the distance between the triangle and the square in image 460 observed from the position of the left eye. Similarly, because camera 120 is positioned to the right of the right eye, the distance between the triangle and the square in image 490 observed from the position of camera 120 is greater than the distance between the triangle and the square in image 470 observed from the position of the right eye. Therefore, when wearing VR glasses, the brain perceives the triangle as being closer to the user based on images 480 and 490, which does not match reality.
[0125] In the above embodiment, the example given is that camera 120 on the VR glasses is located at the lower right of display screen 110, and camera 122 is located at the lower left of display screen 112. It is understood that in other embodiments, camera 120 and camera 122 can be located in other positions. For example, camera 120 may be located above display screen 110, and camera 122 may be located above display screen 112; or the distance between camera 120 and camera 122 may be less than the distance between the two display screens, etc. As long as the camera positions differ from the display screen positions, the distance between the object seen when wearing VR glasses and the viewer's eyes will differ from the distance seen when not wearing VR glasses.
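The 1 meter versus 0.7 meter example above can be reproduced with a simplified model. The sketch below assumes two parallel pinhole cameras whose images are shown unmodified to the eyes at the same focal length, so that the disparity of an object scales with the camera spacing. The function name and the 63 mm and 90 mm spacings are illustrative assumptions and are not taken from this application.

```python
def perceived_distance(real_distance_m, eye_spacing_m, camera_spacing_m):
    """Estimate how far away an object appears when the eyes are shown images
    captured by cameras whose spacing differs from the eye spacing.

    Simplified parallel-camera model (an assumption of this sketch):
    the cameras produce a disparity proportional to camera_spacing / distance,
    and the brain interprets it as eye_spacing / perceived_distance.
    """
    if min(real_distance_m, eye_spacing_m, camera_spacing_m) <= 0:
        raise ValueError("all inputs must be positive")
    return real_distance_m * eye_spacing_m / camera_spacing_m


# Cameras spaced wider than the eyes: the object appears closer than it is.
print(round(perceived_distance(1.0, 0.063, 0.090), 2))
# Cameras spaced narrower than the eyes: the object appears farther away.
print(round(perceived_distance(1.0, 0.063, 0.050), 2))
```

Under this model the perceived distance equals the real distance only when the camera spacing matches the eye spacing, which is consistent with the statement that any difference between camera and display positions changes the apparent distance.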
[0126] To facilitate understanding, the following example illustrates an application scenario in which a user wears VR glasses to play games at home, as shown in Figure 5A. In this application scenario, VR glasses can display a real-world scene to the user, allowing them to see their home environment, such as the sofa and table. In other embodiments, VR glasses can display both real-world scenes and virtual objects. In this case, the user will see their home environment along with virtual objects (e.g., game characters, game interfaces, etc., which are not real-world objects). This allows users to play virtual games in a familiar environment, providing a better experience.
[0127] As shown in Figure 5B, what the user sees when not wearing VR glasses is the real world 501 in Figure 5B (a). What the user sees when wearing VR glasses is the virtual world 502 in Figure 5B (b). As can be seen, all objects in the virtual world 502 are closer to the user, especially objects that are already close to the user in the real world, such as the table. After the user puts on the VR glasses, the table appears even closer to the user, which is inconsistent with reality.
[0128] To address this issue, this application provides a solution called perspective reconstruction, which can be simply understood as perspective adjustment. As mentioned earlier, because the camera's shooting angle differs from the human eye's viewing angle, the scene seen by a user wearing VR glasses differs from the scene seen without VR glasses. Simply put, perspective reconstruction means adjusting the camera's shooting angle to match the human eye's viewing angle. However, physically adjusting the camera's shooting angle is difficult; for example, if the camera is fixed at a certain position on the VR glasses, adjusting its shooting angle requires corresponding hardware / mechanical structures, which is not only costly but also detrimental to making the device thin and light. Therefore, to avoid increasing hardware costs, image post-processing can be used to achieve the same effect. That is, the image captured by the camera is processed so that, when the processed image is displayed on the VR glasses, the difference between the scene seen by the user and the scene seen without VR glasses is reduced. This image processing is called image perspective reconstruction. Simply put, image perspective reconstruction adjusts the display position of pixels in the image captured by the camera so that the object seen by the human eye based on the adjusted image conforms to reality. Taking Figure 4B and Figure 4A as an example, image perspective reconstruction can include adjusting the position of the triangle in image 422 of Figure 4B from (A1', B1') to (A1, B1), and adjusting the position of the triangle in image 420 of Figure 4B from (A2', B2') to (A2, B2). In other words, the images before viewpoint reconstruction are images 422 and 420 in Figure 4B, and the images after viewpoint reconstruction are images 401 and 402 in Figure 4A. In this way, the display screens of the VR glasses can show the reconstructed images (i.e., images 401 and 402), so that the human brain can accurately determine the real position of the object (i.e., the triangle) based on images 401 and 402.
[0129] In one implementation, when reconstructing the viewpoint of an image, the viewpoint can be reconstructed for the entire image (which can be called global viewpoint reconstruction).
[0130] Continuing with the scenario of Figure 5A, and taking global viewpoint reconstruction of an image captured by a single camera on the VR glasses as an example, assume that the image captured by the camera is the image shown in Figure 6A (a). As shown in Figure 6A (b), in some embodiments the image is divided into four regions: region 601, region 602, region 603, and region 604. Assume that regions 602 and 604 are displayed lower after viewpoint reconstruction, and regions 601 and 603 are displayed higher after viewpoint reconstruction. The complete image formed by the four regions after viewpoint reconstruction is shown in Figure 6A (c). As can be seen, objects such as walls, sofas, and tables are deformed (or twisted, misaligned, etc.).
[0131] It should be noted that Figure 6A takes dividing an image into four regions for viewpoint reconstruction as an example. In practice, global viewpoint reconstruction involves finer-grained region division, such as into 9, 16, or more regions, or even reconstructing each individual pixel. Understandably, when viewpoint reconstruction is performed on finer-grained regions or individual pixels, the distortion of objects in the image becomes more severe. For example, as shown in Figure 6B, after global viewpoint reconstruction, the wall surface and the edges of the table are distorted (e.g., appearing as wavy lines). Therefore, the global viewpoint reconstruction solution not only involves a very large processing workload, but also results in severe image distortion, significantly impacting the user experience.
[0132] In some implementations, it's unnecessary to reconstruct the viewpoint of the entire image. For example, viewpoint reconstruction can be performed only on the first region of the image (the image captured by the camera), while the second region (other regions outside the first region) remains unreconstructed. The first region can be the area where the user's gaze point is located, the user's region of interest, the default region, the user-specified region, etc. For ease of understanding, reconstructing the viewpoint of the first region of the image can be called region-based viewpoint reconstruction. Because viewpoint reconstruction is only performed on the first region, and not the second region, the workload is reduced. Furthermore, as mentioned earlier, viewpoint reconstruction may cause image distortion, but since the second region does not need reconstruction, the image within the second region will not be distorted. In other words, the probability or degree of image distortion during region-based viewpoint reconstruction is much lower than that during global viewpoint reconstruction, helping to mitigate the image distortion phenomenon that occurs during global viewpoint reconstruction.
[0133] For example, continuing with the scenario of Figure 5A, suppose the image captured by the VR glasses is the image shown in Figure 7 (a). Assuming the user's gaze point is located within the area enclosed by the dashed line, viewpoint reconstruction is performed only on the area enclosed by the dashed line, not on other areas. Therefore, in the reconstructed image, the display position and / or shape of objects within the dashed area changes, while the display position and / or shape of objects in other areas remains unchanged, as shown in Figure 7 (b). The distortion of objects in this reconstructed image is therefore significantly lower than that in the globally reconstructed image. For example, compare Figure 6B and Figure 7 (b): Figure 6B is an image reconstructed from a global perspective, and Figure 7 (b) is an image reconstructed using the technical solution of this application. It can be seen that other areas (areas outside the area where the gaze point is located) in the image reconstructed using the technical solution of this application are stable. For example, the sofa and the wall are not distorted, which significantly reduces the degree and probability of image distortion.
[0134] It should be noted that, in Figure 7, the area enclosed by the dashed line is taken as the area where the user's gaze point is located. The area enclosed by the dashed line can be the smallest bounding rectangle of the table, or a region greater than or equal to the smallest bounding rectangle of the table. It is understood that it can also be the smallest bounding square, smallest bounding circle, etc., of the table; the shape is not limited. In other embodiments, the area where the user's gaze point is located can also be a portion of the area on the table.
[0135] Figure 7 takes region-based viewpoint reconstruction of an image captured by a single camera as an example. It can be understood that when the VR glasses include two cameras, region-based viewpoint reconstruction can be performed on the image captured by each camera separately.
[0136] For example, as shown in Figure 8, image 622 is captured by camera 122 on the VR glasses, and image 620 is captured by camera 120. The VR glasses can reconstruct the viewpoint of the dashed area on image 622 to obtain image 624. The objects within the dashed area in image 624 have different display positions and / or shapes than the objects within the dashed area in image 622. For example, the table in image 624 is displayed to the left of the table in image 622, and / or the table is deformed to some extent. No viewpoint reconstruction is performed on other areas of image 622 (areas outside the dashed area), so the objects (such as the sofa) in other areas of image 624 have the same display positions and shapes as the objects in other areas of image 622. The VR glasses can also reconstruct the viewpoint of the dashed area on image 620 to obtain image 626. The objects within the dashed area in image 626 have different display positions and / or shapes than the objects within the dashed area in image 620. For example, the table in image 626 is displayed to the right of the table in image 620, and / or the table is deformed to some extent. No viewpoint reconstruction is performed on other areas of image 620 (areas outside the dashed area, such as the area where the sofa is located), so objects in other areas of image 626 have the same display positions and shapes as objects in other areas of image 620.
[0137] Display screen 112 of the VR glasses shows image 624, and display screen 110 shows image 626. Thus, when the user wears the VR glasses, the left eye sees image 624 and the right eye sees image 626. Because the display positions of the table in images 624 and 626 have been adjusted, the user's brain can accurately determine the depth of the table based on the parallax of the table in the two images. After the adjustment, the parallax of the table in the two images is reduced. A smaller parallax corresponds to a larger perceived depth, so the user no longer feels that the table is too close, and the scene the user sees matches reality. Furthermore, it should be noted that because viewpoint reconstruction is performed on the dashed area, the table seen by the user wearing the VR glasses is somewhat distorted. However, since viewpoint reconstruction is not performed on other areas, objects in other areas seen by the user are not distorted. Compared to global viewpoint reconstruction, the degree of distortion is reduced. Moreover, since other areas have not undergone viewpoint reconstruction, the display positions of objects in other areas seen by the user are inaccurate and differ from the real world. However, because other areas are not the user's focus and receive little attention, the inaccurate display positions of objects in those areas have little impact on the user experience, while skipping them reduces the workload and improves efficiency.
[0138] The following explanation, in conjunction with the accompanying drawings, illustrates the implementation principle of the aforementioned perspective reconstruction.
[0139] First, two coordinate systems are defined: the first coordinate system (X1-O1-Y1) and the second coordinate system (X2-O2-Y2). The first coordinate system (X1-O1-Y1) is established based on the display screen. For example, as shown in Figure 9, the first coordinate system has its origin at the center of display screen 112, and the display direction is the Y-axis. It can be understood that the first coordinate system can also be a coordinate system based on the human eye, such as a coordinate system established based on the left eye in Figure 9. Considering that establishing a coordinate system based on the human eye is more difficult than establishing one based on the display screen, and that the display screen's position is close to the human eye's position, the coordinate system established based on the display screen can be regarded, to some extent, as the same as the coordinate system established based on the human eye. The second coordinate system (X2-O2-Y2) is established based on camera 122, as shown in Figure 9; that is, when camera 122 captures an object, the image is formed in the second coordinate system (X2-O2-Y2). Since the image captured by camera 122 and the image displayed on display screen 112 are not in the same coordinate system, the camera's shooting angle differs from the human eye's viewing angle. Therefore, reconstructing the viewpoint of the image captured by camera 122 can be understood as performing a coordinate transformation on that image, that is, transforming it from the second coordinate system to the first coordinate system.
[0140] The transformation from the second coordinate system to the first coordinate system requires an offset, which refers to the difference between the second and first coordinate systems, or the distance between the center of camera 122 and the center of display screen 112. Reconstructing the viewpoint of the image captured by camera 122 includes moving the pixels in the image to their target positions according to the offset. Taking Figure 4B above as an example, if the position of the triangle on image 422 captured by camera 122 is (A1', B1'), then (A1', B1') + offset = (A1, B1). This gives the position (A1, B1) of the triangle in Figure 4A and thereby reconstructs the perspective of the triangle.
[0141] In some embodiments, the offset includes the offset direction and / or offset distance (the offset distance may also be referred to as the displacement offset).
[0142] The offset distance can be the distance between the origin of the second coordinate system and the origin of the first coordinate system. That is, the offset distance is related to the distance between display screen 112 and camera 122. For example, the greater the distance between display screen 112 and camera 122, the greater the distance between the first and second coordinate systems, i.e., the greater the offset distance. In some embodiments, the offset distance increases as the distance between camera 122 and display screen 112 increases, and decreases as the distance between camera 122 and display screen 112 decreases. For example, when the distance between camera 122 and display screen 112 is a first distance, the offset distance is a first displacement offset. When the distance between camera 122 and display screen 112 is a second distance, the offset distance is a second displacement offset. If the first distance is greater than or equal to the second distance, the first displacement offset is greater than or equal to the second displacement offset. If the first distance is less than the second distance, the first displacement offset is less than the second displacement offset. Taking Figure 4B above as an example, if the distance between camera 122 and display screen 112 increases, the displacement between the position (A1', B1') of the triangle on image 422 captured by camera 122 and the position (A1, B1) of the triangle in Figure 4A increases.
[0143] The offset direction can be the direction from the origin of the second coordinate system to the origin of the first coordinate system. That is, the offset direction is related to the positional relationship between the display screen and the camera. In some embodiments, the offset direction changes with the orientation of the camera relative to the display screen. For example, when the camera is located in a first orientation relative to the display screen, the offset direction is a first direction. When the camera is located in a second orientation relative to the display screen, the offset direction is a second direction. For instance, when camera 122 is located to the left of display screen 112, that is, when the second coordinate system is to the left of the first coordinate system, the offset direction is to the left. Taking Figure 4B above as an example, the position (A1', B1') of the triangle in image 422 captured by camera 122 is shifted to the left to obtain the position (A1, B1) of the triangle in Figure 4A. Similarly, if camera 120 is located to the right of display screen 110, the offset direction is to the right. Taking Figure 4B above as an example, the position (A2', B2') of the triangle in image 420 captured by camera 120 is shifted to the right to obtain the position (A2, B2) of the triangle in Figure 4A.
[0144] The first coordinate system, the second coordinate system, and the offset can be stored in the VR glasses beforehand.
[0145] In other embodiments, the offset can vary. In some embodiments, the relative position between the display and the camera can change. For example, the display can move on the VR glasses, and / or the camera can move on the VR glasses. For instance, as the position of the display on the VR glasses is adjusted, and / or the position of the camera is adjusted or its shooting angle changes, the offset between the first coordinate system corresponding to the display and the second coordinate system corresponding to the camera changes accordingly; or, as the distance between the two displays on the VR glasses is adjusted, and / or the distance between the two cameras is adjusted, the corresponding offset changes. For example, the distance between the two displays and / or the distance between the two cameras can be adjusted according to the distance between the user's left and right pupils. This solution can be applied to VR glasses with adjustable display and / or camera positions. Such VR glasses can suit various user groups. For example, when the VR glasses are used by users with wider interpupillary distances, the relative distance between the display and the camera can be adjusted to be larger; when used by users with narrower interpupillary distances, the relative distance between the display and the camera can be adjusted to be smaller, and so on. Therefore, one pair of VR glasses can be used by multiple users, such as a pair of VR glasses shared by a whole family. Regardless of how the display and / or camera positions are adjusted, the offset is adjusted accordingly, and the VR glasses can reconstruct the viewpoint based on the adjusted offset.
[0146] In some embodiments, VR glasses can offset all pixels in the image captured by the camera to the target position according to the offset amount (i.e., global view reconstruction).
[0147] In other embodiments, the VR glasses can first determine a first region on the image, and then offset the pixels within the first region to the target position according to the offset amount. That is, only the pixels within the first region are offset, while the pixels in other regions remain unchanged.
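A minimal sketch of offsetting only the pixels of the first region is given below. It assumes the image is a list of rows, the first region is an axis-aligned rectangle, and the offset is a whole number of pixels. The use of inverse mapping (each destination pixel inside the region samples the source pixel at its position minus the offset) is a choice made for this sketch so that no holes appear; the function and parameter names are hypothetical.

```python
def reconstruct_region(image, region, offset):
    """Shift the content shown inside `region` by `offset`, leaving all
    pixels outside the region unchanged.

    image  : list of rows, each row a list of pixel values
    region : (left, top, right, bottom), right and bottom exclusive
    offset : (dx, dy) in whole pixels; negative dx moves content to the left
    """
    height, width = len(image), len(image[0])
    left, top, right, bottom = region
    dx, dy = offset
    out = [row[:] for row in image]  # copy, so the input is not modified
    for y in range(max(top, 0), min(bottom, height)):
        for x in range(max(left, 0), min(right, width)):
            src_x, src_y = x - dx, y - dy
            # Sample only from valid source positions; otherwise keep the pixel.
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = image[src_y][src_x]
    return out


# 4 x 6 test image whose pixel value encodes its position (10 * row + column).
image = [[10 * y + x for x in range(6)] for y in range(4)]
shifted = reconstruct_region(image, region=(1, 1, 5, 3), offset=(-1, 0))
for row in shifted:
    print(row)
```

Passing a region that covers the whole image gives the global variant described in the preceding paragraph.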
[0148] For example, the first region may be the area where the user's gaze point is located on the image. In some embodiments, the VR glasses include an eye-tracking module, which can locate the user's gaze point. One possible implementation is that when the VR glasses determine that the user's gaze point is located at a point on an object (e.g., the table in Figure 7), the smallest bounding rectangle of that object (e.g., the table) is determined as the first region. It is understood that the smallest bounding rectangle can also be replaced by a smallest bounding square, a smallest bounding circle, etc. In other embodiments, when the VR glasses determine that the user's gaze point is located at a point on an object (such as a table), the first region can be a rectangle centered on that point with a preset side length, or a circle centered on that point with a preset radius, and so on. The preset length, preset radius, etc., can be default settings. In this case, the region where the user's gaze point is located may be a portion of the object. In still other embodiments, the first region can also be the entire region whose depth equals the depth at the user's gaze point.
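Two of the options above can be sketched as follows: the smallest bounding rectangle of the gazed-at object, and a square of preset side length centered on the gaze point. The sketch assumes that the object's pixels are already known (for example from a segmentation step that is not described here) and that coordinates are integer pixel positions; all names are hypothetical.

```python
def bounding_rectangle(object_pixels):
    """Smallest axis-aligned rectangle (left, top, right, bottom) containing
    every (x, y) pixel of the object; right and bottom are exclusive."""
    xs = [x for x, _ in object_pixels]
    ys = [y for _, y in object_pixels]
    return min(xs), min(ys), max(xs) + 1, max(ys) + 1


def gaze_square(gaze_point, side, image_size):
    """Square of roughly `side` pixels centered on the gaze point and
    clipped to the image bounds."""
    gx, gy = gaze_point
    width, height = image_size
    half = side // 2
    left, top = max(gx - half, 0), max(gy - half, 0)
    right, bottom = min(gx + half + 1, width), min(gy + half + 1, height)
    return left, top, right, bottom


print(bounding_rectangle([(3, 4), (7, 5), (5, 9)]))
print(gaze_square((2, 3), side=10, image_size=(100, 50)))
```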
[0149] Alternatively, the first region can also be a region of interest for the user in the image. This region of interest can be the area containing an object of interest to the user in the image. For example, the VR glasses can store objects of interest (e.g., people, animals, etc.). When such an object is detected in the image captured by the camera, the area containing that object is determined to be the first region. The object of interest can be manually stored by the user in the VR glasses, or it can be an object that the user can interact with in the virtual world. The object of interest can also be an object for which the number of interactions and / or the interaction duration recorded by the VR glasses exceeds a preset threshold, etc.
[0150] Alternatively, the first region can be the default region, such as the center region of the image. Considering that users generally focus on the center region of the image first, the first region is the center region by default.
[0151] Alternatively, the first area can also be a user-defined area. For example, a user can set the first area on the VR glasses or on an electronic device (such as a mobile phone) connected to the VR glasses, and so on.
[0152] In other embodiments, the first region can be determined based on different scenarios. For example, in a VR game scenario, if user A participates in the game as a player, the area where user A's game character is located is the first region. Alternatively, if user A is spectating user B's game, the area where user B's game character (i.e., the player being spectated) is located is the first region. As another example, in a VR driving scenario, if user A wears VR glasses and sees themselves driving a virtual vehicle on the road, the first region could be the area where user A's vehicle is located, or the area containing the steering wheel, windshield, etc., of user A's vehicle, or the area where vehicles in front of user A's vehicle are located on the road.
[0153] In summary, the first region is a region on the image captured by the camera. The specific ways to determine the first region include, but are not limited to, the methods mentioned above; other ways are not listed one by one in this application.
[0154] In some embodiments, the offset of all pixels within the first region can be the same. For example, the offset distance of all pixels is the distance between the origin of the first coordinate system and the origin of the second coordinate system, as described above, and the offset direction of all pixels is the direction from the origin of the second coordinate system to the origin of the first coordinate system.
[0155] In other embodiments, the offsets of different pixels within the first region can be different. For example, as shown in Figure 10 (a), the first region 1000 includes a central region 1010 and an edge region 1020 (the region marked with diagonal lines). The area of the edge region 1020 can be a default value, such as a region formed by extending a preset width inward from the edge of the first region. The offset of pixels within the central region 1010 is greater than the offset of pixels within the edge region 1020. For example, if the distance between the origin of the first coordinate system and the origin of the second coordinate system is L, the offset distance of pixels within the central region 1010 is equal to L, while the offset distance of pixels within the edge region 1020 is less than L, such as L / 2, L / 3, etc. In this way, the pixel displacement amplitude at the center of the first region is larger, while the pixel displacement amplitude at the edge is smaller. Because the edge region connects to other regions, a small pixel displacement amplitude at the edge makes the connection with other regions smoother, avoiding obvious misalignment at the edge of the region where the gaze point is located.
[0156] Understandably, when the offset in the central region is large and the offset in the edge region is small, the degree of deformation (i.e., shape change) of objects in the central region is greater, while the degree of deformation of objects in the edge region is smaller. In other words, the degree of deformation of objects gradually decreases from the center to the edge of the first region.
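The center and edge offsets described above can be sketched as a per-pixel offset function. The sketch assumes a rectangular first region, an edge band of preset width measured inward from the region boundary, and a single scale factor for the whole band (L / 2 in this example); the names and the default scale are illustrative assumptions.

```python
def tapered_offset(x, y, region, edge_width, full_offset, edge_scale=0.5):
    """Offset (dx, dy) to apply to the pixel at (x, y).

    Pixels in the central part of `region` get the full offset, pixels in the
    edge band get a reduced offset, and pixels outside the region get none.
    region is (left, top, right, bottom) with right and bottom exclusive.
    """
    left, top, right, bottom = region
    if not (left <= x < right and top <= y < bottom):
        return (0.0, 0.0)
    # Distance, in pixels, from this pixel to the nearest region boundary.
    distance_to_edge = min(x - left, right - 1 - x, y - top, bottom - 1 - y)
    scale = edge_scale if distance_to_edge < edge_width else 1.0
    return (full_offset[0] * scale, full_offset[1] * scale)


region = (10, 10, 30, 30)
print(tapered_offset(20, 20, region, edge_width=3, full_offset=(-6.0, 0.0)))
print(tapered_offset(11, 20, region, edge_width=3, full_offset=(-6.0, 0.0)))
print(tapered_offset(5, 5, region, edge_width=3, full_offset=(-6.0, 0.0)))
```

A scale that falls off gradually with the distance to the boundary, instead of one fixed factor, would give the gradual decrease in deformation from center to edge; that variant is left out to keep the sketch short.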
[0157] In other cases, the offset of pixels in the first region is greater than the offset of pixels in the second region. The second region can be an area outside the first region but surrounding the outer edge of the first region. The area of the second region is not limited; for example, it can be a region formed by a predetermined width extending outward from the outer edge of the first region. Correspondingly, because the offset is larger in the first region and smaller in the second region, the degree of deformation of the object in the first region is greater, while the degree of deformation of the object in the second region is smaller. In other words, the degree of deformation of the object gradually decreases from the first region outward to the second region.
[0158] In other cases, the offsets of different pixels in the edge region 1020 can also be different. For example, as shown in Figure 10 (b), edge region 1020 includes a first edge region 1022 (the diagonal-line portion) and a second edge region 1024 (the black portion). Assuming the offset direction is as shown by the arrow in the figure, i.e., the first region 1000 is offset to the lower left, the first edge region 1022 lies in the offset direction (i.e., at the lower left of the first region 1000), and the second edge region 1024 lies in the opposite direction (i.e., at the upper right of the first region 1000). The pixel offsets within the two edge regions are different. Continuing with Figure 10 (b) and the offset direction indicated by the arrow, the offset of pixels in the second edge region 1024 (black portion) < the offset of pixels in the central region 1010 < the offset of pixels in the first edge region 1022 (diagonal-line portion). In other words, objects in the offset direction (i.e., objects in the first edge region 1022) have a larger offset, while objects in the opposite direction (i.e., objects in the second edge region 1024) have a smaller offset. Thus, when the first region is offset in the offset direction, the edge on the side opposite to the offset direction can transition smoothly into other regions.
[0159] In some embodiments, the first image information of a first pixel within the edge region of the first region in the reconstructed image can be an intermediate value, such as the average value, of second image information and third image information. The second image information is the image information of a second pixel within the central region of the first region, and the third image information is the image information of a third pixel within other regions. For example, as shown in Figure 11, pixel A is located in the edge region 1020 of the first region 1000, pixel B is located in other regions, and pixel C is located in the central region 1010 of the first region 1000. The image information of pixel A can be the average of the image information of pixel B and pixel C, where the image information includes one or more of resolution, color, color temperature, or brightness. Pixel C and pixel B can be pixels close to pixel A. Since the edge region 1020 is a transition region between the first region and other regions, setting the resolution, color, color temperature, brightness, etc., of the pixels in the edge region 1020 to intermediate values achieves a smooth transition between the first region and other regions.
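The intermediate value for an edge pixel can be sketched as a weighted mix of a nearby central-region pixel and a nearby pixel outside the first region. The sketch assumes each pixel is a tuple of numeric channels (for example color components or a brightness value); the function name and the `weight` parameter are hypothetical, and a weight of 0.5 gives the plain average mentioned above.

```python
def blend_edge_pixel(center_pixel, outside_pixel, weight=0.5):
    """Image information for an edge pixel (pixel A), computed from a nearby
    central-region pixel (pixel C) and a nearby outside pixel (pixel B).

    weight is the share taken from the central-region pixel, in [0, 1].
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return tuple(
        c * weight + o * (1.0 - weight)
        for c, o in zip(center_pixel, outside_pixel)
    )


pixel_c = (200, 100, 50)   # central region of the first region
pixel_b = (100, 100, 150)  # other region
print(blend_edge_pixel(pixel_c, pixel_b))
```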
[0160] Understandably, the above takes perspective reconstruction of the image captured by camera 122 in Figure 8 as an example. Perspective reconstruction can also be performed on the image captured by camera 120. The implementation principle is the same, so it will not be repeated.
[0161] In other embodiments, this application provides a display method. This method is applicable to electronic devices including at least one camera and at least one display screen, such as VR glasses, wherein the camera and display screen are in different positions. For example, as shown in Figure 4C, because the camera on the VR glasses is positioned differently from the display screen, the viewing angle of the user's eyes differs from the camera's shooting angle when the user wears the VR glasses. Figure 12 is a flowchart of the display method provided in an embodiment of this application. The method includes:
[0162] S1, the camera captures the second image.
[0163] The camera can be any of the cameras on the VR glasses shown in Figure 4C, such as left camera 122 or right camera 120. Taking left camera 122 as an example, as shown in Figure 4C, within the viewing angle of the left camera, the triangle is located to the left of and in front of the square (because the square is at infinity, like the sun). Since the image captured by left camera 122 is a two-dimensional planar image, the imaging plane of left camera 122 includes both the triangle and the square, with the triangle to the left of the square. Figure 13 is a schematic diagram of the two-dimensional planar image captured by camera 122. In this image, the triangle is to the left of the square. It can be understood that if another camera is placed at the location of camera 122, the image captured by this other camera is the same as the image captured by camera 122. In other words, the image observed (either by a person or by another camera) at the location of camera 122 is the same as the image captured by camera 122.
[0164] S2, determine the first region on the second image.
[0165] There are multiple ways to determine the first region; please refer to the description above, which will not be repeated here. For example, as shown in Figure 13, the first region is the dashed area on the two-dimensional planar image captured by the camera.
[0166] S3, reconstruct the perspective of the first region.
[0167] As mentioned earlier, reconstructing the viewpoint of the first region includes performing coordinate transformation on the image within the first region, that is, transforming it from the coordinate system corresponding to the camera to the coordinate system corresponding to the display screen or the human eye. Since the image captured by the camera is a two-dimensional planar image, one way to implement the coordinate transformation is to convert the two-dimensional planar image captured by the camera into a three-dimensional point cloud. The three-dimensional point cloud can reflect the position (including depth) of various objects in the real environment. Then, a virtual camera is created by simulating the human eye. By capturing the three-dimensional point cloud with the virtual camera, an image seen from the perspective of the human eye can be obtained, thus reconstructing the viewpoint from the camera's location to the human eye's location. Specifically, step S3 includes the following steps for reconstructing the viewpoint of the first region:
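The point-cloud approach can be sketched for a single pixel with a pinhole camera model. The sketch assumes that the depth of the pixel is already known, that the real camera and the virtual camera at the eye share the same focal length, principal point, and orientation, and that the eye position is given in the camera's coordinate system (x to the right, y down, z forward). These assumptions and all names are illustrative and are not stated in this application.

```python
def reproject_pixel(u, v, depth, focal_px, principal_point, eye_position):
    """Map a pixel seen by the real camera to where a virtual camera placed
    at the eye would see the same 3D point.

    u, v            : pixel coordinates in the camera image
    depth           : distance of the point along the camera's z axis
    focal_px        : focal length in pixels (shared by both cameras)
    principal_point : (cx, cy) image center in pixels
    eye_position    : (ex, ey, ez) of the virtual camera in camera coordinates
    """
    cx, cy = principal_point
    # Step 1: back-project the 2D pixel into a 3D point (one point of the cloud).
    x = (u - cx) * depth / focal_px
    y = (v - cy) * depth / focal_px
    z = depth
    # Step 2: express the point relative to the virtual camera at the eye.
    ex, ey, ez = eye_position
    xe, ye, ze = x - ex, y - ey, z - ez
    if ze <= 0:
        raise ValueError("point is not in front of the virtual camera")
    # Step 3: project the point into the virtual camera's image.
    return (focal_px * xe / ze + cx, focal_px * ye / ze + cy)


# Eye located 2 cm to the right of the camera; the same pixel at two depths.
for depth in (1.0, 2.0):
    print(depth, reproject_pixel(320, 240, depth, 500.0, (320, 240), (0.02, 0.0, 0.0)))
```

In this model a nearer point moves farther across the image than a distant one, which is why the depth of each pixel in the first region is needed before the pixels can be repositioned.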
[0168] The first step is to determine the depth information of the pixels within the first region. This determination can be achieved using at least one of methods 1 and 2.
[0169] Method 1: Determine the depth information of a pixel based on the pixel difference between the same pixel in two images captured by the two cameras on the VR glasses. For example, the depth information of the pixel satisfies the following formula:
[0170] $$d = \frac{f \cdot B}{\text{disparity}}$$
[0171] Where f is the focal length of the camera, B is the distance between the two cameras, disparity is the pixel difference between the same pixel in the two images, and d is the depth information of the pixel.
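As a minimal sketch of Method 1, the relation between depth and disparity can be written as a small function. The function name, the pixel and metre units, and the numeric values are illustrative assumptions, not taken from this application; treating zero disparity as infinite depth is also an assumption of the sketch.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Return depth d = f * B / disparity, in the units of the baseline.

    focal_px     : focal length f, expressed in pixels (assumed unit)
    baseline_m   : distance B between the two cameras, in metres (assumed unit)
    disparity_px : pixel difference of the same point between the two images
    """
    if disparity_px < 0:
        raise ValueError("disparity must be non-negative")
    if disparity_px == 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return focal_px * baseline_m / disparity_px


# Hypothetical values: 500 px focal length, 6 cm baseline.
near = depth_from_disparity(500.0, 0.06, 30.0)  # large disparity, near point
far = depth_from_disparity(500.0, 0.06, 3.0)    # small disparity, far point
```

A larger disparity yields a smaller depth, which matches the intuition that nearby objects shift more between the two camera images.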
[0172] Method 2: Determine the depth information of a pixel based on the convergence angle between the user's left and right eyes and the correspondence between convergence angle and depth information.
[0173] As shown in Figure 14(a), in a real environment, the angle formed by the lines of sight of the left and right eyes when observing an object is called the convergence angle θ. It can be understood that the closer the observed object is to the eyes, the larger the convergence angle θ and the smaller the convergence depth; conversely, the farther the observed object is from the eyes, the smaller the convergence angle θ and the greater the convergence depth. As shown in Figure 14(b), when a user wears VR glasses, the objects seen in the virtual environment are all displayed on the screen of the VR glasses. The light emitted from the screen has no depth difference, so after focus adjustment the eyes' focus is fixed on the screen, meaning the convergence angle θ becomes the angle at which the eyes' lines of sight point to the screen. However, the depth at which the user perceives objects in the virtual environment is not the same as the distance between the screen and the user. Therefore, in this embodiment, the user's convergence angle θ is determined after the user puts on the VR glasses. The VR glasses can store a database containing the correspondence between the convergence angle θ and depth information; once the VR glasses determine the convergence angle θ, they determine the corresponding depth information based on this correspondence. This database can be obtained empirically and pre-stored in the VR glasses, or it can be determined through deep learning.
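The lookup in Method 2 can be sketched as an interpolated table. This application does not specify the contents of the database, so the table below is hypothetical: it is filled from the simple geometric relation depth = (IPD / 2) / tan(θ / 2) with an assumed interpupillary distance, and all names (`IPD_M`, `TABLE`, `depth_from_convergence`) are illustrative only.

```python
import math
from bisect import bisect_left

IPD_M = 0.063  # assumed interpupillary distance in metres; not from the source


def geometric_depth(theta_deg: float, ipd_m: float = IPD_M) -> float:
    """Depth of the fixation point for a symmetric convergence angle theta."""
    return (ipd_m / 2.0) / math.tan(math.radians(theta_deg) / 2.0)


# Hypothetical pre-stored table: (convergence angle in degrees, depth in metres),
# sorted by increasing angle, so depth decreases along the table.
TABLE = [(a, geometric_depth(a)) for a in (0.5, 1.0, 2.0, 4.0, 8.0, 16.0)]


def depth_from_convergence(theta_deg: float) -> float:
    """Look up depth by linear interpolation, clamping outside the table."""
    angles = [a for a, _ in TABLE]
    if theta_deg <= angles[0]:
        return TABLE[0][1]
    if theta_deg >= angles[-1]:
        return TABLE[-1][1]
    i = bisect_left(angles, theta_deg)
    (a0, d0), (a1, d1) = TABLE[i - 1], TABLE[i]
    t = (theta_deg - a0) / (a1 - a0)
    return d0 + t * (d1 - d0)
```

A table stored on the device could equally come from calibration data or a learned model; the interpolation step would stay the same.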
[0174] The second step is to determine the 3D point cloud data corresponding to the first region based on the depth information of the pixels in the first region.
[0175] For example, taking the two-dimensional planar image captured by camera 122 in Figure 13 as an example, after the depth information of the pixels in the first region (the dashed-line region) of the two-dimensional planar image is determined, a three-dimensional point cloud of the pixels in the first region can be obtained, as shown in Figure 15. The 3D point cloud corresponding to the first region maps each pixel within that region to its real-world position. For example, in the 3D point cloud in Figure 15, the point cloud corresponding to the triangle is to the left of and in front of the point cloud corresponding to the square. This is because, in the scene observed from the position of camera 122, the triangle is to the left of and in front of the square.
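One common way to turn pixels with known depth into a point cloud is pinhole back-projection. The sketch below assumes a pinhole camera model with intrinsics `fx`, `fy`, `cx`, `cy`, a rectangular first region, and a depth map stored as nested lists; none of these details are stated in this application.

```python
import math


def backproject_region(depth, region, fx, fy, cx, cy):
    """Convert the pixels of the first region into camera-frame 3D points.

    depth  : 2D list, depth[v][u] is the depth of pixel (u, v)
    region : (u0, v0, u1, v1), half-open rectangle marking the first region
    fx, fy : focal lengths in pixels; cx, cy: principal point (assumed pinhole model)
    """
    u0, v0, u1, v1 = region
    points = []
    for v in range(v0, v1):
        for u in range(u0, u1):
            z = depth[v][u]
            # This sketch skips pixels without a usable finite depth; a real
            # pipeline might clamp them to a far plane instead.
            if z is None or not math.isfinite(z) or z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points
```

Only pixels inside `region` are processed, mirroring the idea that the point cloud is built for the first region only.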
[0176] The third step is to create a virtual camera.
[0177] It can be understood that the image acquisition principle of the human eye is similar to the image capture principle of a camera. To simulate the image acquisition process of the human eye, a virtual camera that simulates the human eye is created. For example, the position of the virtual camera is the same as the position of the human eye, and/or the field of view of the virtual camera is the same as the field of view of the human eye.
[0178] For example, the human eye's field of view is generally 110 degrees vertically and 110 degrees horizontally, so the virtual camera's field of view is also set to 110 degrees vertically and 110 degrees horizontally. As another example, the VR glasses can determine the user's eye position and place the virtual camera at that position. There are several ways to determine the eye position. Method 1: First determine the position of the display screen, then add a distance A to that position to estimate the user's eye position. Here, distance A is the distance between the display screen and the user's eye, which can be pre-stored. This method determines the eye position relatively accurately. Method 2: Take the position of the display screen as the user's eye position. This method is simpler, and placing the virtual camera at the display screen can still alleviate the discomfort caused by the difference between the shooting angle and the human eye's viewing angle. For example, as shown in Figure 16, the virtual camera is positioned at the position of the human eye.
[0179] The fourth step is to use the virtual camera to capture an image of the 3D point cloud data corresponding to the first region. The captured image is the image of the first region after viewpoint reconstruction.
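The two ways of placing the virtual camera can be sketched as follows. The coordinate convention (the camera looks along +z and the eye sits behind the display), the pinhole intrinsics derived from the field of view, and all names such as `VirtualCamera` and `eye_relief_m` are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class VirtualCamera:
    position: tuple  # (x, y, z) in the physical camera's coordinate frame
    fx: float        # horizontal focal length in pixels
    fy: float        # vertical focal length in pixels
    cx: float        # principal point, x
    cy: float        # principal point, y


def make_virtual_camera(display_pos, eye_relief_m, width_px, height_px,
                        fov_h_deg=110.0, fov_v_deg=110.0):
    """Create a virtual camera that stands in for the human eye.

    eye_relief_m is the pre-stored distance A between display and eye
    (Method 1). Passing 0.0 places the camera at the display (Method 2).
    """
    # Assumed convention: viewing direction is +z, so the eye is at smaller z.
    eye = (display_pos[0], display_pos[1], display_pos[2] - eye_relief_m)
    # Pinhole relation: half the image width spans half the field of view.
    fx = (width_px / 2.0) / math.tan(math.radians(fov_h_deg) / 2.0)
    fy = (height_px / 2.0) / math.tan(math.radians(fov_v_deg) / 2.0)
    return VirtualCamera(eye, fx, fy, width_px / 2.0, height_px / 2.0)


# Hypothetical numbers: display 3 cm to the right of and 2 cm above the camera,
# eye 1.5 cm behind the display, 1000 x 1000 output image.
cam_method1 = make_virtual_camera((0.03, 0.02, 0.0), 0.015, 1000, 1000)
cam_method2 = make_virtual_camera((0.03, 0.02, 0.0), 0.0, 1000, 1000)
```

Method 2 differs from Method 1 only in the offset, which is why it is the simpler of the two.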
[0180] For example, as shown in Figure 17, the virtual camera corresponding to the left eye captures the 3D point cloud (converted from the 2D image acquired by the left camera 122). Because the virtual camera corresponding to the left eye is positioned to the right of the left camera 122, the distance between the triangle and the square in the image captured by that virtual camera is smaller. For easier comparison, as shown in Figure 18, image 1701 is the two-dimensional planar image captured by camera 122, and image 1702 is the image captured by the virtual camera corresponding to the left eye (i.e., the image after reconstructing the viewpoint of the first region on the two-dimensional planar image captured by camera 122). The distance between the two objects in image 1702 is less than the distance between the two objects in image 1701. Image 1702 is equivalent to the image seen by the person's left eye.
[0181] The above explanation uses camera 122 as an example. The same principle applies to camera 120: the two-dimensional planar image captured by camera 120 is converted into a three-dimensional point cloud, a virtual camera corresponding to the right eye is created, and this virtual camera is used to capture the three-dimensional point cloud. For example, as shown in Figure 17, image 1703 is the two-dimensional planar image captured by camera 120, and image 1704 is the image captured by the virtual camera corresponding to the right eye (i.e., the image after reconstructing the viewpoint of the first region). The distance between the two objects in image 1704 is less than the distance between the two objects in image 1703, because the virtual camera corresponding to the right eye is to the left of camera 120. Image 1704 is therefore equivalent to the image seen by the person's right eye.
[0182] In this application, only the first region is converted into a 3D point cloud; the second region is not. Therefore, the image captured by the virtual camera includes only the first region and not the second region, which reduces the computational workload.
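The effect of capturing the point cloud from a shifted viewpoint can be illustrated with a pinhole projection. The sketch below assumes no rotation between the physical and virtual cameras, and uses hypothetical intrinsics, point positions, and a hypothetical offset (1 cm to the right of and 5 cm behind the physical camera); none of these numbers come from this application.

```python
def project(points, cam_pos, fx, fy, cx, cy):
    """Pinhole projection of 3D points as seen from cam_pos (no rotation)."""
    pixels = []
    for x, y, z in points:
        xr, yr, zr = x - cam_pos[0], y - cam_pos[1], z - cam_pos[2]
        if zr <= 0:
            pixels.append(None)  # point is behind the camera
            continue
        pixels.append((fx * xr / zr + cx, fy * yr / zr + cy))
    return pixels


def separation(pixels):
    """Horizontal distance in pixels between the two projected points."""
    return abs(pixels[1][0] - pixels[0][0])


FX = FY = 500.0
CX, CY = 320.0, 240.0

triangle = (-0.5, 0.0, 1.0)   # near object, to the left
square = (0.0, 0.0, 1000.0)   # very distant object standing in for "infinity"
cloud = [triangle, square]

physical_cam = (0.0, 0.0, 0.0)
virtual_cam = (0.01, 0.0, -0.05)  # hypothetical eye offset

sep_camera = separation(project(cloud, physical_cam, FX, FY, CX, CY))
sep_virtual = separation(project(cloud, virtual_cam, FX, FY, CX, CY))
```

With the offset chosen here, `sep_virtual` comes out smaller than `sep_camera`; how much the separation changes depends on the offset between the two viewpoints and on the depths of the objects.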
[0183] S4, combine the image patch in the second region of the second image with the image patch in the first region after viewpoint reconstruction to form the first image. The second region is the region outside the first region in the second image.
[0184] The viewpoint of the second region is not reconstructed, whereas the image captured by the virtual camera is the first region after viewpoint reconstruction. Therefore, compared with the second image, the first image has a reconstructed viewpoint in the first region and an unchanged viewpoint in the second region.
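Step S4 can be sketched as pasting the reconstructed patch back into a copy of the second image. The rectangular region, the nested-list image layout, and the use of `None` to mark holes left by reprojection (which fall back to the original pixel) are assumptions of this sketch rather than details given in this application.

```python
def compose_first_image(second_image, reconstructed_patch, region):
    """Combine the untouched second region with the reconstructed first region.

    second_image        : 2D list of pixel values captured by the camera
    reconstructed_patch : 2D list covering `region`, produced by the virtual camera
    region              : (u0, v0, u1, v1), half-open rectangle of the first region
    """
    u0, v0, u1, v1 = region
    # Copy row by row so the captured second image is left unmodified.
    first_image = [row[:] for row in second_image]
    for v in range(v0, v1):
        for u in range(u0, u1):
            value = reconstructed_patch[v - v0][u - u0]
            if value is not None:  # assumed hole marker: keep the original pixel
                first_image[v][u] = value
    return first_image
```

Pixels outside `region` are copied unchanged, which corresponds to the second region keeping its original viewpoint.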
[0185] S5, display the first image.
[0186] In some embodiments, the above S2 to S4 can be executed by the processor in the VR glasses. That is, after the camera captures the second image (i.e., S1), it sends the second image to the processor, the processor executes S2 to S4 to obtain the first image, and the processor displays the first image through the display screen.
[0187] Based on the same concept, Figure 19 shows an electronic device 1900 provided in this application. The electronic device 1900 can be the VR wearable device (e.g., VR glasses) described above. As shown in Figure 19, the electronic device 1900 may include: one or more processors 1901; one or more memories 1902; a communication interface 1903; and one or more computer programs 1904. These components can be connected via one or more communication buses 1905. The one or more computer programs 1904 are stored in the memory 1902 and configured to be executed by the one or more processors 1901. The one or more computer programs 1904 include instructions that can be used to perform the relevant steps of the VR wearable device described in the corresponding embodiments above. The communication interface 1903 is used to communicate with other devices; for example, the communication interface may be a transceiver.
[0188] The methods provided in the embodiments of this application above are described from the perspective of an electronic device (e.g., a VR wearable device) as the executing entity. To implement the functions of the methods provided in the embodiments of this application above, the electronic device may include hardware structures and / or software modules, implementing the above functions in the form of hardware structures, software modules, or a combination of hardware structures and software modules. Whether a particular function is implemented in the form of hardware structures, software modules, or a combination of hardware structures and software modules depends on the specific application and design constraints of the technical solution.
[0189] In the above embodiments, the terms "when..." or "after..." can be interpreted, depending on the context, as meaning "if...", "after...", "in response to determining...", or "in response to detecting...". Similarly, the phrases "when..." or "if (the stated condition or event) is detected" can be interpreted, depending on the context, as meaning "if...", "in response to determining...", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)". Furthermore, in the above embodiments, relational terms such as "first" and "second" are used to distinguish one entity from another, without limiting any actual relationship or order between these entities.
[0190] References to "one embodiment" or "some embodiments" as described in this specification mean that one or more embodiments of this application include a specific feature, structure, or characteristic described in connection with that embodiment. Therefore, the phrases "in one embodiment," "in some embodiments," "in other embodiments," "in still other embodiments," etc., appearing in different parts of this specification do not necessarily refer to the same embodiment, but rather mean "one or more, but not all, embodiments," unless otherwise specifically emphasized. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless otherwise specifically emphasized.
[0191] The above embodiments can be implemented, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, they can be implemented, in whole or in part, as a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are produced. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium can be any available medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., a solid-state disk (SSD)). Where there is no conflict, the solutions in the above embodiments can be combined.
[0192] It should be noted that a portion of this patent application contains copyrighted material. The copyright holder retains all rights except for making copies of the contents of patent documents or records from the patent office.
Claims
1. A display method, characterized in that, Applied to a wearable device, the wearable device including at least one display screen and at least one camera; including: The first image is displayed to the user via the screen; The first image is obtained by reconstructing the viewpoint of the second image captured by the camera; Wherein, at least one of the display positions or shapes of the first object on the first image and the first object on the second image are different, and the displacement offset between the first display position of the first object on the first image and the second display position of the first object on the second image is related to the distance between the camera and the display screen, and / or, the offset direction between the first display position of the first object on the first image and the second display position of the first object on the second image is related to the positional relationship between the camera and the display screen, so that the depth information of the first object seen by the user based on the first image when wearing the wearable device is the same as the depth information of the first object seen by the user in the real environment when not wearing the wearable device; The second object in the first image and the second object in the second image have the same display position and shape; The first object is located in the area where the user's gaze point is located, and the second object is located in an area outside the area where the user's gaze point is located.
2. The method according to claim 1, characterized in that, The positional offset of the first object in the first image relative to the first object in the second image is the first offset; The positional offset of the third object in the first image relative to the third object in the second image is the second offset; The third object is located within the area where the user's gaze point is located, and is closer to the edge of the area where the gaze point is located than the first object; The second offset is less than the first offset.
3. The method according to claim 1, characterized in that, The degree of morphological change of the first object in the first image relative to the first object in the second image is greater than the degree of morphological change of the third object in the first image relative to the third object in the second image; The third object is located within the area where the user's gaze point is located, and the third object is closer to the edge of the area where the gaze point is located than the first object.
4. The method according to claim 1, characterized in that, The positional offset of the first object in the first image relative to the first object in the second image is the first offset; The positional offset of the third object in the first image relative to the third object in the second image is the second offset; The third object is located within the area of the user's gaze point, and the third object is within the first direction range of the first object, the first direction range including the position offset direction of the first object on the first image relative to the first object on the second image; The second offset is greater than the first offset.
5. The method according to any one of claims 1-4, characterized in that, The first image includes a first pixel, a second pixel, and a third pixel. The first pixel and the second pixel are located in the region where the user's gaze point is located, and the first pixel is closer to the edge of the region where the user's gaze point is located than the second pixel. The third pixel is located in a region outside the region where the user's gaze point is located. The image information of the first pixel is located between the image information of the second pixel and the image information of the third pixel.
6. The method according to claim 5, characterized in that, The image information includes at least one of the following: resolution, color, brightness, and color temperature.
7. The method according to any one of claims 1-4, characterized in that, The at least one camera includes a first camera and a second camera, and the at least one display screen includes a first display screen and a second display screen; the first display screen is configured to display an image captured by the first camera. The second display screen is configured to display the image from the second camera; When the positions of the first display screen and the first camera are different, the first object in the image displayed on the first display screen is different from the first object in the image captured by the first camera in at least one of the display position or shape, and the second object in the image displayed on the first display screen is the same as the second object in the second image captured by the first camera in both display position and shape; When the positions of the second display screen and the second camera are different, the first object in the image displayed on the second display screen is different from the first object in the image captured by the second camera in at least one of the display positions or shapes, and the second object in the image displayed on the second display screen is the same as the second object in the image captured by the second camera in both display position and shape.
8. The method according to any one of claims 1-4, characterized in that, The first object in the first image has a different shape than the first object in the second image, including: The edge contour of the first object in the second image is flatter than the edge contour of the first object in the first image.
9. A display method, characterized in that, The invention is applied to wearable devices, which include at least one display screen, at least one camera, and a processor. The camera is configured to transmit images it captures to the processor, and the images are displayed on the display screen via the processor, including: The first image is displayed to the user via the screen; The first image is obtained by reconstructing the viewpoint of the second image captured by the camera; Wherein, at least one of the display positions or shapes of the first object on the first image and the first object on the second image are different, and the displacement offset between the first display position of the first object on the first image and the second display position of the first object on the second image is related to the distance between the camera and the display screen, and / or, the offset direction between the first display position of the first object on the first image and the second display position of the first object on the second image is related to the positional relationship between the camera and the display screen, so that the depth information of the first object seen by the user based on the first image when wearing the wearable device is the same as the depth information of the first object seen by the user in the real environment when not wearing the wearable device; The second object in the first image and the second object in the second image have the same display position and shape; The first object is located in the area where the user's gaze point is located, and the second object is located in an area outside the area where the user's gaze point is located.
10. The method according to claim 9, characterized in that, The positional offset of the first object in the first image relative to the first object in the second image is the first offset; The positional offset of the third object in the first image relative to the third object in the second image is the second offset; The third object is located within the area where the user's gaze point is located, and is closer to the edge of the area where the gaze point is located than the first object; The second offset is less than the first offset.
11. The method according to claim 9, characterized in that, The degree of morphological change of the first object in the first image relative to the first object in the second image is greater than the degree of morphological change of the third object in the first image relative to the third object in the second image; The third object is located within the area where the user's gaze point is located, and the third object is closer to the edge of the area where the gaze point is located than the first object.
12. The method according to claim 9, characterized in that, The positional offset of the first object in the first image relative to the first object in the second image is the first offset; The positional offset of the third object in the first image relative to the third object in the second image is the second offset; The third object is located within the area of the user's gaze point, and the third object is within the first direction range of the first object, the first direction range including the position offset direction of the first object on the first image relative to the first object on the second image; The second offset is greater than the first offset.
13. The method according to any one of claims 9-12, characterized in that, The first image includes a first pixel, a second pixel, and a third pixel. The first pixel and the second pixel are located in the region where the user's gaze point is located, and the first pixel is closer to the edge of the region where the user's gaze point is located than the second pixel. The third pixel is located in a region outside the region where the user's gaze point is located. The image information of the first pixel is located between the image information of the second pixel and the image information of the third pixel.
14. The method according to claim 13, characterized in that, The image information includes at least one of the following: resolution, color, brightness, and color temperature.
15. The method according to any one of claims 9-12, characterized in that, The at least one camera includes a first camera and a second camera, and the at least one display screen includes a first display screen and a second display screen; the first display screen is configured to display an image captured by the first camera. The second display screen is configured to display the image from the second camera; When the positions of the first display screen and the first camera are different, the first object in the image displayed on the first display screen is different from the first object in the image captured by the first camera in at least one of the display position or shape, and the second object in the image displayed on the first display screen is the same as the second object in the second image captured by the first camera in both display position and shape; When the positions of the second display screen and the second camera are different, the first object in the image displayed on the second display screen is different from the first object in the image captured by the second camera in at least one of the display positions or shapes, and the second object in the image displayed on the second display screen is the same as the second object in the image captured by the second camera in both display position and shape.
16. An electronic device, characterized in that it comprises: a processor, a memory, and one or more programs; the one or more programs are stored in the memory, and the one or more programs include instructions that, when executed by the processor, cause the electronic device to perform the method as described in any one of claims 1-15.
17. A computer-readable storage medium, characterized in that the computer-readable storage medium is used to store a computer program that, when run on a computer, causes the computer to perform the method as described in any one of claims 1-15.
18. A computer program product, characterized in that it includes a computer program that, when run on a computer, causes the computer to perform the method as described in any one of claims 1-15.