Display method and device, head-mounted display device, and storage medium

By identifying the user's region of interest and adjusting the camera exposure accordingly, the head-mounted display device captures and displays a target real-scene image, which addresses inappropriate image brightness and improves the video see-through display effect and the user experience.

CN115883816B · Active · Publication Date: 2026-06-19 · GEER TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: GEER TECH CO LTD
Filing Date: 2022-11-25
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

In the video see-through (VST) function of existing head-mounted display devices, the image seen by the user's eyes is often too dark or too bright, which degrades the user experience.

Method used

By installing a first camera on a head-mounted display device, the user's region of interest in a real-world scene image is determined, and the target exposure is determined based on the image content of that region. The target real-world scene image is then captured and displayed, while the brightness of the virtual scene image is adjusted to match that of the real-world scene image, thus achieving image fusion display.

Benefits of technology

It solves the problem of inappropriate image brightness, improves the video see-through display effect and user experience of head-mounted display devices, and ensures that the image brightness meets the viewing needs of the human eye.



Abstract

This invention discloses a display method, display apparatus, head-mounted display device, and storage medium, relating to the field of head-mounted display devices. The method is applied to a head-mounted display device equipped with a first camera for capturing images of real-world scenes. First, the target exposure of the first camera is determined based on the image content of the user's region of interest in the real-world scene image. Then, the target real-world scene image captured by the first camera using the target exposure is displayed. This solves the technical problem that images displayed by head-mounted display devices are too dark or too bright within the user's field of view because of inappropriate exposure, and so fail to meet the user's viewing needs. By displaying a real-world scene image of appropriate brightness, the VST (video see-through) display effect of the head-mounted display device is improved, enhancing the user experience.

Description

Technical Field

[0001] This invention relates to the field of head-mounted display technology, and more particularly to a display method, display device, head-mounted display device, and computer-readable storage medium.

Background Technology

[0002] Head-mounted display devices (HMDs) are a mainstream interaction tool. In VST (video see-through) mode they work as follows: after a user puts on the HMD, the camera on the HMD captures images of the real scene, and these images are then displayed on the HMD's screen for the user to view.

[0003] However, the video see-through function of current head-mounted display devices has the following problem: the image seen by the user's eyes is often too dark or too bright, which degrades the user experience.

Summary of the Invention

[0004] The main objective of this invention is to provide a display method, display device, head-mounted display device, and computer-readable storage medium, aiming to solve the technical problem that the brightness of the images displayed by existing head-mounted display devices is either too dim or too bright, failing to meet the viewing needs of the human eye for image brightness.

[0005] To achieve the above objectives, the present invention provides a display method applied to a head-mounted display device, wherein a first camera for capturing images of a real scene is installed on the head-mounted display device, and the display method includes:

[0006] Identify the user's region of interest in a real-world scene image;

[0007] The target exposure of the first camera is determined based on the image content of the region of interest.

[0008] Acquire the target real scene image captured by the first camera based on the target exposure, and display the target real scene image.

[0009] Optionally, the step of determining the user's region of interest in a real-world scene image includes:

[0010] Acquire an image of the user's eyes, and determine the region of interest of the user in a real-world scene image based on the eye image.

[0011] Optionally, the step of determining the user's region of interest in the real-world scene image based on the eye image includes:

[0012] The user's gaze coordinates are calculated based on the eye image, and the region of interest of the user in the real scene image is determined based on the gaze coordinates.

[0013] Optionally, the step of determining the user's region of interest in the real-world scene image based on the gaze coordinates includes:

[0014] Determine the preset angle range corresponding to when the content is clearly visible to the human eye;

[0015] The region of interest in a real scene image is determined based on the coordinates of the gaze point and a preset angle range.

[0016] Optionally, after the step of displaying the target real-world scene image, the method further includes:

[0017] The virtual image brightness of the virtual scene image is adjusted based on the real image brightness of the target real scene image to obtain the target virtual scene image;

[0018] The target virtual scene image and the target real scene image are fused and displayed.

[0019] Optionally, the display method further includes:

[0020] On the acquired real-world scene image, determine the peripheral region corresponding to the region of interest, and determine the peripheral brightness corresponding to the peripheral region;

[0021] The peripheral scene image is obtained by adjusting the real scene image corresponding to the peripheral region to the peripheral brightness;

[0022] Display the target real scene image and the peripheral scene image.

[0023] Optionally, the step of determining the peripheral brightness corresponding to the peripheral region includes:

[0024] Determine the true image brightness of the target real scene image, and determine the peripheral brightness corresponding to the peripheral region based on the true image brightness;

[0025] The real image brightness is greater than the peripheral brightness.

[0026] Furthermore, to achieve the above objectives, the present invention also provides a display device, the display device comprising:

[0027] The region of interest determination module is used to determine the user's region of interest in a real-world scene image;

[0028] A target exposure determination module is used to determine the target exposure of the first camera based on the image content of the region of interest.

[0029] The display module is used to acquire the target real scene image captured by the first camera based on the target exposure, and to display the target real scene image.

[0030] In addition, to achieve the above objectives, the present invention also provides a head-mounted display device, the head-mounted display device comprising: a first camera for acquiring images of a real scene, a second camera for acquiring images of a user's eyes, a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the display method as described in any of the preceding claims.

[0031] In addition, to achieve the above objectives, the present invention also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the display method as described in any of the preceding claims.

[0032] This invention provides a display method, display device, head-mounted display device, and computer-readable storage medium. The display method is applied to a head-mounted display device, on which a first camera for capturing images of a real scene is installed. The display method includes: determining a region of interest (ROI) for a user in the real scene image; determining a target exposure for the first camera based on the image content of the ROI; acquiring a target real scene image captured by the first camera based on the target exposure; and displaying the target real scene image.

[0033] First, the exposure of the first camera that captures the real scene image is determined based on the image content of the user's region of interest in the real scene image. Then, the target real scene image captured by the first camera based on the target exposure is displayed.

[0034] This solves the technical problem that images displayed by head-mounted display devices are too dark or too bright within the user's field of view because of inappropriate exposure, and so fail to meet the user's viewing needs. By displaying real scene images of appropriate brightness, the method improves the VST (video see-through) display effect of the head-mounted display device and enhances the user experience.

Description of the Drawings

[0035] Figure 1 is a schematic diagram of the terminal structure of the hardware operating environment involved in the embodiments of the present invention;

[0036] Figure 2 is a flowchart of an embodiment of the display method of the present invention;

[0037] Figure 3 is a schematic diagram of the transmission area according to an embodiment of the display method of the present invention;

[0038] Figure 4 is a schematic diagram of an embodiment of the display device of the present invention.

[0039] The realization of the objectives, functional features, and advantages of the present invention will be further explained in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed Implementation

[0040] It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

[0041] Referring to Figure 1, Figure 1 is a schematic diagram of the operating device of the hardware operating environment involved in the embodiments of the present invention.

[0042] As shown in Figure 1, the operating device may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable communication between these components. The user interface 1003 may include a display screen and an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface or a wireless interface. The network interface 1004 may optionally include a standard wired interface or a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed random access memory (RAM) or a stable non-volatile memory (NVM), such as a disk drive. Optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.

[0043] Those skilled in the art will understand that the structure shown in Figure 1 does not constitute a limitation on the operating device, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.

[0044] As shown in Figure 1, the memory 1005, which serves as a storage medium, may include an operating system, a data storage module, a network communication module, a user interface module, and a computer program.

[0045] In the operating device shown in Figure 1, the network interface 1004 is mainly used for data communication with other devices; the user interface 1003 is mainly used for data interaction with the user; the processor 1001 and the memory 1005 of the present invention can be installed in the operating device, and the operating device calls the computer program stored in the memory 1005 through the processor 1001 and performs the following operations:

[0046] Identify the user's region of interest in a real-world scene image;

[0047] The target exposure of the first camera is determined based on the image content of the region of interest.

[0048] Acquire the target real scene image captured by the first camera based on the target exposure, and display the target real scene image.

[0049] Furthermore, the processor 1001 can call a computer program stored in the memory 1005 and also perform the following operations:

[0050] The step of determining the user's region of interest in a real-world scene image includes:

[0051] Acquire an image of the user's eyes, and determine the region of interest of the user in a real-world scene image based on the eye image.

[0052] Furthermore, the processor 1001 can call a computer program stored in the memory 1005 and also perform the following operations:

[0053] The step of determining the user's region of interest in a real-world scene image based on the eye image includes:

[0054] The user's gaze coordinates are calculated based on the eye image, and the region of interest of the user in the real scene image is determined based on the gaze coordinates.

[0055] Furthermore, the processor 1001 can call a computer program stored in the memory 1005 and also perform the following operations:

[0056] The step of determining the user's region of interest in a real-world scene image based on the gaze point coordinates includes:

[0057] Determine the preset angle range corresponding to when the content is clearly visible to the human eye;

[0058] The region of interest in a real scene image is determined based on the coordinates of the gaze point and a preset angle range.

[0059] Furthermore, the processor 1001 can call a computer program stored in the memory 1005 and also perform the following operations:

[0060] After the step of displaying the target real-world scene image, the method further includes:

[0061] The virtual image brightness of the virtual scene image is adjusted based on the real image brightness of the target real scene image to obtain the target virtual scene image;

[0062] The target virtual scene image and the target real scene image are fused and displayed.

[0063] Furthermore, the processor 1001 can call a computer program stored in the memory 1005 and also perform the following operations:

[0064] The display method further includes:

[0065] On the acquired real-world scene image, determine the peripheral region corresponding to the region of interest, and determine the peripheral brightness corresponding to the peripheral region;

[0066] The peripheral scene image is obtained by adjusting the real scene image corresponding to the peripheral region to the peripheral brightness;

[0067] Display the target real scene image and the peripheral scene image.

[0068] Furthermore, the processor 1001 can call a computer program stored in the memory 1005 and also perform the following operations:

[0069] The step of determining the peripheral brightness corresponding to the peripheral region includes:

[0070] Determine the true image brightness of the target real scene image, and determine the peripheral brightness corresponding to the peripheral region based on the true image brightness;

[0071] The real image brightness is greater than the peripheral brightness.

[0072] Referring to Figure 2, the present invention provides a display method applied to a head-mounted display device, wherein a first camera for capturing images of a real scene is installed on the head-mounted display device, and the display method includes:

[0073] Step S10: Determine the region of interest for the user in the real-world scene image.

[0074] The first camera, which a head-mounted display device such as AR glasses uses to capture images of the real-world scene, is referred to here simply as the camera (it can be a monochrome camera or an RGB camera). The second camera, which the device uses to capture images of the user's eyes, is referred to as the eye-tracking camera. The real-world scene image is the image of the real world captured by the camera on the head-mounted display device, and the region of interest is the area determined from the coordinates of the user's gaze point as captured by the eye-tracking camera.

[0075] Furthermore, the step of determining the region of interest (ROI) on the acquired real-world scene image also includes: acquiring the user's voice information, parsing the voice information to obtain the main object identified by the user, matching a target subject in the real-world scene image with a similarity greater than a preset threshold based on the main object, and taking the area where the target subject is located as the ROI. Thus, in addition to determining the ROI through the user's gaze point, it also allows the user to actively determine the ROI on the real-world scene image, thereby improving the applicability of the display method and providing users with more optional operations for achieving fused display.

[0076] Further, the step of determining the region of interest (ROI) on the acquired real-scene image includes: recognizing the user's first gesture and determining the user's ROI on the real-scene image based on the first gesture. That is, besides the user's voice information, the ROI can also be determined through a first gesture. The user can pick one or more sub-regions as the ROI on a real-scene image that has been divided according to a preset method, or circle a region as the ROI on an undivided real-scene image. This embodiment does not limit the specific form of the first gesture. The display capabilities of the head-mounted display device are thus fully utilized, giving the user the option of determining the ROI by picking or circling in addition to voice selection.

[0077] Step S20: Determine the target exposure of the first camera based on the image content of the region of interest.

[0078] After the region of interest is determined, the target exposure of the first camera is determined based on the image content of the region of interest. The first camera then captures the target real scene image using the target exposure, and that image is displayed.
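The text does not specify how the image content of the region of interest is turned into an exposure value. As a minimal sketch only, one common approach is to meter the mean luminance of the ROI and scale the exposure time toward a mid-gray target. The function names, the 8-bit grayscale image layout, the `target_luma` value, and the exposure limits below are all illustrative assumptions, not part of the patent.

```python
def roi_mean_luma(image, roi):
    """Mean luminance (0-255) inside roi = (x0, y0, x1, y1), end-exclusive.

    `image` is assumed to be a list of rows of 8-bit luminance values.
    """
    x0, y0, x1, y1 = roi
    values = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values)


def target_exposure(current_ms, image, roi, target_luma=118.0,
                    min_ms=0.1, max_ms=33.0):
    """Scale the exposure time so the ROI mean moves toward target_luma.

    Assumes the sensor response is roughly linear in exposure time, which
    only holds away from clipping. target_luma and the limits are
    hypothetical tuning values.
    """
    mean = max(roi_mean_luma(image, roi), 1.0)  # guard against an all-black ROI
    proposed = current_ms * target_luma / mean
    return min(max(proposed, min_ms), max_ms)


if __name__ == "__main__":
    # Bright surroundings, dark 2x2 region of interest in the middle.
    frame = [[200] * 4 for _ in range(4)]
    for y in (1, 2):
        for x in (1, 2):
            frame[y][x] = 59
    print(target_exposure(10.0, frame, (1, 1, 3, 3)))
```

Metering only the ROI means a dark object the user is looking at lengthens the exposure even when the rest of the frame is bright, which is the behavior the method aims for.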

[0079] Step S30: Obtain the target real scene image captured by the first camera based on the target exposure, and display the target real scene image.

[0080] Video see-through (VST) refers to the process in which a head-mounted display device, such as AR glasses, captures images of the real-world scene using miniature cameras mounted on the glasses. The AR glasses then use scene understanding and analysis to overlay the desired information and image signals onto the camera's video signal, merging the virtual scene generated by the AR glasses with the real-world scene. Finally, the result is presented to the user through the AR glasses' display screen.

[0081] Optionally, after the step of displaying the target real scene image, the method further includes: adjusting the virtual image brightness of the virtual scene image based on the real image brightness of the target real scene image to obtain a target virtual scene image; and merging and displaying the target virtual scene image and the target real scene image.

[0082] When the target real-world scene image is displayed, its brightness may differ from that of the original virtual scene image, which can cause eye discomfort and break the user's sense of immersion. Therefore, after the target exposure of the first camera has been determined from the image content of the region of interest and the brightness of the real-world scene image captured by the first camera has changed, the brightness of the original virtual scene image must be adjusted in step. To this end, the virtual image brightness of the virtual scene image is adjusted based on the real image brightness of the target real-world scene image to obtain the target virtual scene image. Preferably, the virtual image brightness is adjusted to equal the real image brightness. Furthermore, because the human eye does not notice every detail in the field of vision, only the area near the central visual focus is clear, and clarity gradually falls off beyond a preset angle range (e.g., 5 degrees) from the center of the gaze area. The virtual image brightness within the region of interest is therefore set equal to the real image brightness, while the virtual image brightness outside the region of interest is set lower than the real image brightness. Finally, the adjusted target virtual scene image and the acquired target real scene image are merged and displayed. Besides avoiding the brightness mismatch that would result from adjusting the real scene image alone, making the two images consistent in brightness, or setting the virtual image brightness lower than the real image brightness, also brings the viewing experience on the head-mounted display device closer to the natural viewing habits of the human eye.
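The brightness matching described above can be sketched as a gain applied to the virtual scene image. The example below is an illustration under stated assumptions: images are lists of rows of 8-bit luminance values, "brightness" is taken to be mean luminance, and `outside_factor` (how much dimmer the virtual image is outside the region of interest) is a hypothetical parameter that the text does not specify.

```python
def mean_luma(image):
    """Mean of all pixel values in a list-of-rows luminance image."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def match_virtual_brightness(virtual, real_luma, roi, outside_factor=0.8):
    """Return a brightness-adjusted copy of the virtual scene image.

    A single gain brings the virtual image's mean luminance to real_luma.
    Pixels inside roi = (x0, y0, x1, y1), end-exclusive, get that gain;
    pixels outside it are dimmed further by outside_factor (assumed < 1).
    """
    x0, y0, x1, y1 = roi
    gain = real_luma / max(mean_luma(virtual), 1.0)  # guard against black input
    adjusted = []
    for y, row in enumerate(virtual):
        new_row = []
        for x, p in enumerate(row):
            inside = x0 <= x < x1 and y0 <= y < y1
            scale = gain if inside else gain * outside_factor
            new_row.append(min(255, round(p * scale)))  # clamp to 8-bit range
        adjusted.append(new_row)
    return adjusted
```

For a uniform virtual image of level 100 and a real image brightness of 150, pixels inside the region of interest become 150 and pixels outside it become 120 with the default factor.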

[0083] In this embodiment, the display method is applied to a head-mounted display device, on which a first camera for capturing real-scene images is installed. The display method includes: determining a region of interest for the user in the real-scene image; determining the target exposure of the first camera based on the image content of the region of interest; acquiring a target real-scene image captured by the first camera based on the target exposure; and displaying the target real-scene image.

[0084] First, the exposure of the first camera that captures the real scene image is determined based on the image content of the user's region of interest in the real scene image. Then, the target real scene image captured by the first camera based on the target exposure is displayed.

[0085] This solves the technical problem that images displayed by head-mounted display devices are too dark or too bright within the user's field of view because of inappropriate exposure, and so fail to meet the user's viewing needs. By displaying real scene images of appropriate brightness, the method improves the VST (video see-through) display effect of the head-mounted display device and enhances the user experience.

[0086] Furthermore, in another embodiment of the display method of the present invention, the step of determining the user's region of interest in a real scene image includes: acquiring an image of the user's eyes, and determining the user's region of interest in a real scene image based on the image of the eyes.

[0087] Gaze tracking, also known as eye tracking, uses sensors such as infrared cameras to capture and extract eye feature information, measure eye movement, and thus estimate the direction of gaze or the position of the eye's gaze point. In head-mounted display devices such as AR glasses, the second camera that captures images of the user's eyes is called an eye-tracking camera. By acquiring images of the user's eyes through the second camera of the head-mounted display device, the user's region of interest in the real-world scene can be determined based on these images.

[0088] Optionally, the step of determining the user's region of interest in the real scene image based on the eye image includes: calculating the user's gaze coordinates based on the eye image, and determining the user's region of interest in the real scene image based on the gaze coordinates.

[0089] After the user's eye image is acquired through the second camera of the head-mounted display device, the user's gaze coordinates are calculated from the eye image, and the region of interest is then determined from the gaze coordinates. In this embodiment, the pupil-corneal reflection method is used to determine the gaze coordinates. With the positions of the infrared light source and the eye-tracking camera in the gaze tracking system of the head-mounted display device fixed, and based on the structure of the eyeball model, the corneal curvature center is calculated from the glint (the corneal reflection of the light source) and the position of the light source. The pupil center is calculated using image processing, and the optical axis of the eyeball is obtained by connecting the corneal curvature center and the pupil center. The actual gaze direction, i.e., the visual axis, and the gaze coordinates are then calculated using the angle between the optical axis and the visual axis.
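The last two steps of this geometry (optical axis, then visual axis and gaze coordinates) can be sketched as follows. This is a simplified illustration, not the patent's implementation: it assumes the corneal curvature center and pupil center have already been estimated as 3D points, uses a coordinate frame with z pointing from the eye toward the display, models the optical-to-visual-axis offset as two per-user angles (`kappa_h`, `kappa_v`), and treats the display as the plane z = `plane_z`. All of these names and conventions are assumptions.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def optical_axis(cornea_center, pupil_center):
    """Unit vector from the corneal curvature center through the pupil center."""
    return normalize(tuple(p - c for p, c in zip(pupil_center, cornea_center)))


def visual_axis(optical, kappa_h=0.0, kappa_v=0.0):
    """Offset the optical axis by horizontal/vertical angles (radians)."""
    x, y, z = optical
    yaw = math.atan2(x, z) + kappa_h
    pitch = math.asin(max(-1.0, min(1.0, y))) + kappa_v
    return (math.cos(pitch) * math.sin(yaw),
            math.sin(pitch),
            math.cos(pitch) * math.cos(yaw))


def gaze_point_on_plane(cornea_center, direction, plane_z):
    """Intersect the gaze ray with the plane z = plane_z; returns (x, y)."""
    if direction[2] <= 0:
        raise ValueError("gaze direction does not point toward the display plane")
    t = (plane_z - cornea_center[2]) / direction[2]
    return (cornea_center[0] + t * direction[0],
            cornea_center[1] + t * direction[1])
```

In practice the two offset angles would come from a per-user calibration, and the resulting plane coordinates would still need to be mapped into pixel coordinates of the real-world scene image.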

[0090] Optionally, the step of determining the user's region of interest in the real scene image based on the gaze point coordinates includes: determining a preset angle range corresponding to when the content being gazed at by the human eye is clear; and determining the region of interest in the real scene image based on the gaze point coordinates and the preset angle range.

[0091] During visual perception, the human eye does not notice all details in the field of vision; only the area near the central visual focus is clear. Any area exceeding a preset angle range (e.g., 5 degrees) beyond the center of the eye's fixation zone gradually loses clarity and is considered an ignored area. This is because the concentration of cone cells on the retina responsible for observing color and detail varies. The area with a high density of cone cells is called the fovea, which corresponds to the fixation point in the human eye's visual field. Due to the structure of the human retina, the fovea has the highest resolution, while the visual quality of the peripheral field of vision is relatively lower. The location of the fovea is called the fixation point region. In VST technology, the camera's focus area must coincide with the fixation point region in real time to ensure that the perceived view remains consistent with the visual range. Therefore, in this embodiment, the region of interest is determined in the real-world scene image based on the fixation point coordinates and a preset angle range. The determined region of interest can be circular or rectangular, and its shape or size must at least include the area within the preset angle range centered on the fixation point coordinates.
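To make the relation between the preset angle range and the size of the region of interest concrete, the sketch below converts an angle into a pixel radius and builds a rectangular region around the gaze point. It assumes a pinhole camera model with a known horizontal field of view and gaze coordinates already expressed in image pixels; the function name and parameters are hypothetical, and the radius is computed as if the gaze point were at the image center, which is only an approximation off-axis.

```python
import math


def roi_from_gaze(gaze_xy, image_size, hfov_deg, angle_deg=5.0):
    """Rectangular ROI (x0, y0, x1, y1), end-exclusive, around the gaze point.

    The rectangle covers at least the circle of angular radius angle_deg
    centered on the gaze point, clipped to the image bounds.
    """
    width, height = image_size
    # Pinhole model: focal length in pixels from the horizontal field of view.
    focal_px = (width / 2) / math.tan(math.radians(hfov_deg) / 2)
    radius = math.ceil(focal_px * math.tan(math.radians(angle_deg)))
    gx, gy = round(gaze_xy[0]), round(gaze_xy[1])
    x0 = max(0, gx - radius)
    y0 = max(0, gy - radius)
    x1 = min(width, gx + radius + 1)
    y1 = min(height, gy + radius + 1)
    return (x0, y0, x1, y1)
```

With a 1000 x 800 image and a 90 degree horizontal field of view, a 5 degree range corresponds to a radius of 44 pixels, so a gaze point at (500, 400) gives the region (456, 356, 545, 445).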

[0092] In this embodiment, the user's eye images are first acquired using an eye-tracking camera and an infrared LED. An eye-tracking algorithm is then used for image preprocessing, including grayscale conversion, binarization, and edge detection. Pupil center localization and corneal reflective spot center localization are then performed to calculate the user's gaze coordinates, thus determining the user's gaze direction and specific gaze coordinates. Furthermore, the region of interest (ROI) in the real-world scene image can be determined based on the gaze coordinates and a preset angle range. Therefore, in addition to determining the ROI through the user's voice information and by selection or circling, the ROI can also be determined solely based on the user's gaze point, achieving seamless and intelligent fusion display and enhancing the augmented reality effect and experience of the head-mounted display device.
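The preprocessing and localization steps named above can be illustrated with a deliberately simple sketch: grayscale conversion, binarization by threshold, and a centroid as the center estimate. It assumes an infrared eye image in which the pupil is the darkest blob and the corneal reflection is the brightest, and it omits the edge detection step. The thresholds and function names are illustrative assumptions; a real eye tracker would typically fit an ellipse to the pupil contour rather than take a raw centroid.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples to luminance using BT.601 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def binarize(gray, keep):
    """Binary mask: 1 where keep(pixel) is true, else 0."""
    return [[1 if keep(p) else 0 for p in row] for row in gray]


def centroid(mask):
    """Mean (x, y) of the set pixels in a binary mask, or None if empty."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, v in enumerate(row):
            if v:
                sum_x += x
                sum_y += y
                count += 1
    return (sum_x / count, sum_y / count) if count else None


def locate_pupil_and_glint(rgb_image, pupil_max=50, glint_min=200):
    """Estimate the pupil center (dark blob) and glint center (bright spot)."""
    gray = to_gray(rgb_image)
    pupil = centroid(binarize(gray, lambda p: p < pupil_max))
    glint = centroid(binarize(gray, lambda p: p > glint_min))
    return pupil, glint
```

The vector between the two returned centers is the quantity that pupil-corneal reflection methods feed into the gaze calculation.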

[0093] Furthermore, in another embodiment of the display method of the present invention, the display method further includes:

[0094] On the acquired real-world scene image, determine the peripheral region corresponding to the region of interest, and determine the peripheral brightness corresponding to the peripheral region;

[0095] The peripheral scene image is obtained by adjusting the real scene image corresponding to the peripheral region to the peripheral brightness;

[0096] Display the target real scene image and the peripheral scene image.

[0097] After determining the region of interest (ROI), a corresponding peripheral region is defined on the acquired real-world scene image. This peripheral region can be a ring-shaped area surrounding the ROI, or a rectangular area with equal width on all four sides. The method of defining the peripheral region, as well as its size and shape, are not limited in this embodiment. Considering that the human eye does not notice all details in its field of vision due to varying concentrations of cone cells on the retina responsible for observing color and detail, only the area near the central visual focus is clear. Any area exceeding a preset angle range (e.g., 5 degrees) beyond the center of the eye's gaze area will gradually lose clarity. Therefore, this embodiment considers not only the ROI with the highest clarity and the area most easily noticed by the user's eyes, but also the corresponding peripheral region. This peripheral region is used as a corresponding area with gradually decreasing brightness, and the real-world scene image corresponding to the peripheral region is adjusted to the peripheral brightness to obtain the peripheral scene image. Finally, the target real-world scene image and the peripheral scene image are displayed to simulate and reproduce the user's realistic viewing experience as closely as possible. This reduces the sense of disconnect between virtual and reality when users wear and use head-mounted displays, providing a viewing experience that closely resembles human vision.

[0098] Optionally, the step of determining the peripheral brightness corresponding to the peripheral region includes:

[0099] Determine the true image brightness of the target real scene image, and determine the peripheral brightness corresponding to the peripheral region based on the true image brightness;

[0100] The real image brightness is greater than the peripheral brightness.

[0101] After the peripheral region is determined, the peripheral brightness corresponding to it is determined based on the real image brightness of the target real scene image. Further, the step of determining the peripheral brightness based on the real image brightness includes: determining the partition level of the peripheral region, and determining, from that level and the real image brightness of the target real scene image, the peripheral brightness of the peripheral scene image. In one approach, the area outside the region of interest is divided in advance into distance levels, and each level is preset with a brightness that decreases step by step outward from the real image brightness; a peripheral region then takes its peripheral brightness directly from its partition level. Alternatively, the area outside the region of interest is divided in advance, and peripheral regions at different distances are assigned brightness weights that are less than 1 and decrease with distance; the peripheral brightness is then determined from the distance of the peripheral region and the brightness weight for that distance.
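The weight-based variant can be sketched as below. This is one possible reading under explicit assumptions: the area outside a rectangular region of interest is split into concentric rings of fixed pixel width, level 1 is the ring nearest the region, and the weight values are invented for illustration. The text fixes neither the ring geometry nor the weights.

```python
import math


def ring_level(x, y, roi, ring_width):
    """Partition level of pixel (x, y): 0 inside the ROI, 1 for the nearest ring.

    roi = (x0, y0, x1, y1) is end-exclusive; distance is measured as the
    larger of the horizontal and vertical gaps to the ROI rectangle.
    """
    x0, y0, x1, y1 = roi
    dx = max(x0 - x, 0, x - (x1 - 1))
    dy = max(y0 - y, 0, y - (y1 - 1))
    return math.ceil(max(dx, dy) / ring_width)


def peripheral_brightness(real_luma, level, weights=(0.9, 0.75, 0.6)):
    """Peripheral brightness for a ring level (1 = nearest the ROI).

    Weights must be below 1 and strictly decreasing with distance, so the
    result is always lower than real_luma. Levels past the last weight
    reuse the last weight.
    """
    if level < 1:
        raise ValueError("level 0 is the region of interest itself")
    decreasing = all(b < a for a, b in zip(weights, weights[1:]))
    if not decreasing or not all(0 < w < 1 for w in weights):
        raise ValueError("weights must be in (0, 1) and strictly decreasing")
    return real_luma * weights[min(level, len(weights)) - 1]
```

A caller would compute `ring_level` for each pixel or tile outside the region of interest and scale that part of the real scene image toward the returned brightness.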

[0102] This again reflects the fact that, because the concentration of cone cells responsible for color and detail varies across the retina, the human eye does not notice every detail in the field of vision: only the area near the central visual focus is clear, and clarity gradually falls off beyond a preset angle range (e.g., 5 degrees) from the center of the gaze area. Therefore, the real image brightness of the target real scene image is set greater than the peripheral brightness, which highlights the region of interest in the image displayed on the head-mounted display device.

[0103] Furthermore, in another embodiment of the display method of the present invention, the display method further includes:

[0104] identifying the main object in the region of interest;

[0105] determining the target exposure of the first camera based on the image content of the main object.

[0106] Besides determining the target exposure of the first camera based on the image content of the region of interest (ROI), the target exposure can also be determined accurately from the main object within the ROI alone. Further, the step of identifying the main object within the ROI includes identifying each candidate object within the ROI and then applying one of the following: selecting the candidate object with the largest area as the main object; selecting the candidate object on the topmost layer as the main object; or determining whether each candidate object is in the foreground or the background, and selecting one or more candidate objects in the foreground as the main object. The main object can also be chosen from multiple candidate objects through voice, gestures, or other user operations. The specific steps are similar to those described above for determining the ROI on the acquired real scene image and are not repeated here.
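The three automatic selection rules can be sketched as a single function. The `Candidate` fields, the strategy names, and the convention that a larger `layer` value means closer to the top are assumptions for illustration; the text does not specify how candidate objects are detected or represented.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    """Hypothetical description of a candidate object found in the ROI."""
    name: str
    area: float          # area in pixels
    layer: int           # assumed convention: higher value = closer to the top
    is_foreground: bool

def select_main_objects(candidates, strategy="largest_area"):
    """Return the main object(s) chosen from the candidates by one rule."""
    if not candidates:
        return []
    if strategy == "largest_area":
        return [max(candidates, key=lambda c: c.area)]
    if strategy == "topmost_layer":
        return [max(candidates, key=lambda c: c.layer)]
    if strategy == "foreground":
        # This rule may return one or more objects.
        return [c for c in candidates if c.is_foreground]
    raise ValueError(f"unknown strategy: {strategy}")
```

A list is returned in every case because the foreground rule may yield more than one main object.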

[0107] The main object is identified in the region of interest, and the target exposure of the first camera is determined based on the image content of the main object. This reduces the amount of data and computation when the head-mounted display device is used for display, improves the processing efficiency of the head-mounted display device, and further improves the accuracy of exposure, ultimately enhancing the display effect of the head-mounted display device and the user's augmented reality experience.
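The text does not state how image content is converted into a target exposure. One common approach, shown here purely as an assumption, is to meter the mean luminance inside the main object's bounding box and scale the current exposure time so that this mean moves toward a mid-gray target. The function names, the target value of 118, and the exposure limits are hypothetical.

```python
def mean_luma(image, box):
    """Mean luminance of a grayscale image (list of rows) inside a box.

    box is (left, top, right, bottom) with right and bottom exclusive.
    """
    left, top, right, bottom = box
    values = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
    if not values:
        raise ValueError("metering box is empty")
    return sum(values) / len(values)

def target_exposure(current_exposure_ms, image, box,
                    target_luma=118.0, min_ms=0.1, max_ms=33.0):
    """Scale the exposure time so the metered region approaches target_luma.

    Assumes sensor response is roughly linear in exposure time, which is
    a simplification. The result is clamped to [min_ms, max_ms].
    """
    measured = max(mean_luma(image, box), 1.0)  # avoid division by zero
    proposed = current_exposure_ms * target_luma / measured
    return min(max(proposed, min_ms), max_ms)
```

Because only the pixels of the main object are metered, the amount of data examined is smaller than when the whole region of interest or the whole frame is used, which is the efficiency benefit the paragraph above describes.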

[0108] Referring to Figure 3, in another embodiment of the display method of the present invention, the user first puts on a head-mounted display device such as VR/AR glasses. Once the camera, the eye-tracking camera, and the IR-LED are working properly for image acquisition, a mature gaze tracking algorithm is invoked to calculate the user's gaze direction. Specifically, clear eye images are acquired using the eye-tracking camera and the IR-LED. After image preprocessing (grayscale conversion, image filtering, binarization, and edge detection), the pupil center and the center of the corneal reflection spot are located and used to calculate the user's gaze point. Then, based on the image content and the gaze point location information, the target exposure of the first camera that acquires the real scene image is determined. Finally, the virtual scene image and the target real scene image acquired by the first camera based on the target exposure are fused and displayed on the display screen of the VR/AR glasses. This solves the technical problem that images displayed by head-mounted display devices are too dark or too bright because the exposure is inappropriate for the region the human eye is viewing, and therefore fail to meet the user's viewing needs. By displaying a real scene image of appropriate brightness, the VST video perspective display effect of the head-mounted display device is improved, thereby enhancing the user experience.
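The localization step can be illustrated with a deliberately simplified sketch: the grayscale eye image is binarized with two thresholds, the centroid of the dark pixels is taken as the pupil center, and the centroid of the bright pixels is taken as the corneal reflection center. The thresholds and function names are assumptions, filtering and edge detection are omitted, and mapping the resulting vector to a gaze point would require a per-user calibration that is not shown.

```python
def centroid(gray, predicate):
    """Centroid (x, y) of all pixels whose value satisfies predicate, or None."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(gray):
        for x, value in enumerate(row):
            if predicate(value):
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return sum_x / count, sum_y / count

def pupil_glint_vector(gray, pupil_max=50, glint_min=220):
    """Return the vector from the corneal reflection center to the pupil center.

    Dark pixels (<= pupil_max) are treated as pupil and bright pixels
    (>= glint_min) as the IR-LED reflection; both thresholds are hypothetical.
    Returns None if either feature is not found.
    """
    pupil = centroid(gray, lambda v: v <= pupil_max)
    glint = centroid(gray, lambda v: v >= glint_min)
    if pupil is None or glint is None:
        return None
    return pupil[0] - glint[0], pupil[1] - glint[1]
```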

[0109] In addition, referring to Figure 4, the present invention also provides a display device, the display device comprising:

[0110] The region of interest determination module M1 is used to determine the user's region of interest in a real-world scene image;

[0111] The target exposure determination module M2 is used to determine the target exposure of the first camera based on the image content of the region of interest;

[0112] Display module M3 is used to acquire the target real scene image captured by the first camera based on the target exposure, and to display the target real scene image.

[0113] Optionally, the region of interest determination module M1 is also used to acquire the user's eye image and determine the user's region of interest in the real scene image based on the eye image.

[0114] Optionally, the region of interest determination module M1 is further configured to calculate the user's gaze coordinates based on the eye image, and determine the user's region of interest in the real scene image based on the gaze coordinates.

[0115] Optionally, the region of interest determination module M1 is further used to determine the preset angle range within which content gazed at by the human eye appears clear;

[0116] and to determine the region of interest in the real scene image based on the gaze point coordinates and the preset angle range.

[0117] Optionally, the display device further includes: an advanced fusion module, used to adjust the virtual image brightness of the virtual scene image based on the real image brightness of the target real scene image to obtain the target virtual scene image;

[0118] and to fuse and display the target virtual scene image and the target real scene image.
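The brightness adjustment performed by the advanced fusion module can be sketched as below. Claim 1 additionally specifies that the virtual image brightness inside the region of interest equals the real image brightness and that it is lower outside; the sketch follows that rule. Treating brightness as the mean pixel value of a grayscale image, applying a per-region gain, and the 0.7 ratio for the non-region of interest are assumptions for illustration only.

```python
def adjust_virtual_brightness(virtual, real_brightness, roi, non_roi_ratio=0.7):
    """Rescale a grayscale virtual image (list of rows) to match real brightness.

    Pixels inside roi (left, top, right, bottom; right/bottom exclusive) are
    scaled so their mean equals real_brightness. Pixels outside are scaled so
    their mean equals real_brightness * non_roi_ratio (hypothetical ratio < 1).
    Values are clipped to 255.
    """
    left, top, right, bottom = roi

    def in_roi(x, y):
        return left <= x < right and top <= y < bottom

    gains = {}
    for inside, target in ((True, real_brightness),
                           (False, real_brightness * non_roi_ratio)):
        values = [v for y, row in enumerate(virtual)
                  for x, v in enumerate(row) if in_roi(x, y) == inside]
        mean = sum(values) / len(values) if values else 0.0
        # Leave the region unchanged if it is empty or completely black.
        gains[inside] = target / mean if mean > 0 else 1.0

    return [[min(255.0, v * gains[in_roi(x, y)]) for x, v in enumerate(row)]
            for y, row in enumerate(virtual)]
```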

[0119] Optionally, the display device further includes: a peripheral region fusion module, used to determine the peripheral region corresponding to the region of interest on the acquired real scene image, and to determine the peripheral brightness corresponding to the peripheral region;

[0120] to adjust the real scene image corresponding to the peripheral region to the peripheral brightness to obtain the peripheral scene image;

[0121] and to display the target real scene image and the peripheral scene image.

[0122] Optionally, the peripheral region fusion module is further configured to determine the real image brightness of the target real scene image, and to determine the peripheral brightness corresponding to the peripheral region based on the real image brightness;

[0123] wherein the real image brightness is greater than the peripheral brightness.

[0124] The display device provided by this invention, employing the display method described in the above embodiments, solves the technical problem that the brightness of the images displayed by existing head-mounted display devices is either too dim or too bright, failing to meet the viewing needs of the user's eyes. Compared with the prior art, the beneficial effects of the display device provided by the embodiments of this invention are the same as those of the display method provided in the above embodiments, and other technical features of this display device are the same as those disclosed in the methods of the above embodiments, and will not be repeated here.

[0125] Furthermore, embodiments of the present invention also provide a head-mounted display device, the head-mounted display device comprising: a first camera for acquiring images of a real scene, a second camera for acquiring images of a user's eyes, a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the display method as described in any of the preceding embodiments.

[0126] Furthermore, embodiments of the present invention also provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the display method as described in any of the preceding embodiments.

[0127] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or system. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or system that includes that element.

[0128] The sequence numbers of the above embodiments of the present invention are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.

[0129] From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform. They can also be implemented in hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, or the part of it that contributes over the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disk) and includes several instructions that cause a terminal device (which may be a mobile phone, computer, server, air conditioner, network device, etc.) to execute the methods described in the various embodiments of the present invention.

[0130] The above are merely preferred embodiments of the present invention and do not limit the scope of the patent. Any equivalent structural or procedural transformations made based on the description and drawings of the present invention, or direct or indirect applications in other related technical fields, are similarly included within the scope of patent protection of the present invention.

Claims

1. A display method, characterized in that the display method is applied to a head-mounted display device on which a first camera for capturing real scene images is installed, and the display method includes: determining the user's region of interest in a real scene image, wherein the region of interest includes a region of interest in the real scene image determined by the user's voice information or first gesture action, or a region of interest in the real scene image determined according to the user's gaze point coordinates and the preset angle range within which content gazed at by the human eye appears clear; determining the target exposure of the first camera based on the image content of the region of interest; acquiring the target real scene image captured by the first camera based on the target exposure, and displaying the target real scene image; adjusting the virtual image brightness of a virtual scene image based on the real image brightness of the target real scene image to obtain a target virtual scene image, wherein the virtual image brightness in the region of interest of the virtual scene image is set to be the same as the real image brightness, and the virtual image brightness in the non-region of interest is set to be lower than the real image brightness; and fusing and displaying the target virtual scene image and the target real scene image; the display method further includes: determining, on the acquired real scene image, the outer region corresponding to the region of interest, wherein the outer region is the outer ring region of the region of interest and is a region with progressively decreasing brightness, the outer region is pre-divided into different distance levels, and each distance level is preset with a brightness that gradually decreases outward based on the real image brightness; determining the real image brightness of the target real scene image, and determining the outer brightness corresponding to the outer region based on the real image brightness, including determining the division level of the outer region and determining, based on the division level, the outer brightness of the outer scene image corresponding to the real image brightness of the target real scene image, wherein the real image brightness is greater than the outer brightness; adjusting the real scene image corresponding to the outer region to the outer brightness to obtain the outer scene image; and displaying the target real scene image and the outer scene image.

2. The display method as described in claim 1, characterized in that the step of determining the user's region of interest in a real scene image includes: acquiring an image of the user's eyes, and determining the user's region of interest in the real scene image based on the eye image.

3. The display method as described in claim 2, characterized in that the step of determining the user's region of interest in the real scene image based on the eye image includes: calculating the user's gaze point coordinates based on the eye image, and determining the user's region of interest in the real scene image based on the gaze point coordinates.

4. The display method as described in claim 3, characterized in that the step of determining the user's region of interest in the real scene image based on the gaze point coordinates includes: determining the preset angle range within which content gazed at by the human eye appears clear; and determining the region of interest in the real scene image based on the gaze point coordinates and the preset angle range.

5. A display device, characterized in that the display device includes: a region of interest determination module, used to determine the user's region of interest in a real scene image, wherein the region of interest includes a region of interest in the real scene image determined by the user's voice information or first gesture action, or a region of interest in the real scene image determined according to the user's gaze point coordinates and the preset angle range within which content gazed at by the human eye appears clear; a target exposure determination module, used to determine the target exposure of the first camera based on the image content of the region of interest; a display module, used to acquire the target real scene image captured by the first camera based on the target exposure, and to display the target real scene image; an advanced fusion module, used to adjust the virtual image brightness of a virtual scene image based on the real image brightness of the target real scene image to obtain a target virtual scene image, wherein the virtual image brightness in the region of interest of the virtual scene image is set to be the same as the real image brightness, and the virtual image brightness in the non-region of interest is set to be lower than the real image brightness, and to fuse and display the target virtual scene image and the target real scene image; and a peripheral region fusion module, used to determine, on the acquired real scene image, the peripheral region corresponding to the region of interest, wherein the peripheral region is the outer ring region of the region of interest and is a region with progressively decreasing brightness, the peripheral region is pre-divided into different distance levels, and each distance level is preset with a brightness that gradually decreases outward based on the real image brightness; to determine the real image brightness of the target real scene image, and to determine the peripheral brightness corresponding to the peripheral region based on the real image brightness, including determining the division level of the peripheral region and determining, based on the division level, the peripheral brightness of the peripheral scene image corresponding to the real image brightness of the target real scene image, wherein the real image brightness is greater than the peripheral brightness; to adjust the real scene image corresponding to the peripheral region to the peripheral brightness to obtain the peripheral scene image; and to display the target real scene image and the peripheral scene image.

6. A head-mounted display device, characterized in that the head-mounted display device includes: a first camera for capturing images of a real scene, a second camera for capturing images of the user's eyes, a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the display method as described in any one of claims 1 to 4.

7. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed by a processor, implements the steps of the display method as described in any one of claims 1 to 4.