Light field display method and system, and storage medium

The light field display method addresses the limitation of 2D imaging by integrating active detection and data processing to enhance 3D display through interaction data collection and rearrangement, improving user experience.

JP2026518914A · Pending · Publication Date: 2026-06-11 · BEIJING SHIYAN TECH CO LTD

Patent Information

Authority / Receiving Office: JP · JP
Patent Type: Applications
Current Assignee / Owner: BEIJING SHIYAN TECH CO LTD
Filing Date: 2024-05-22
Publication Date: 2026-06-11

Application Information

IPC: G06F3/01; G06T19/00; G06T7/70; G06F3/0481

CPC: G06F3/017; G06F3/013; G06F3/012; G06F3/0487; G06F3/04815; G06F3/011; G06F3/04842; G02B30/33

AI Tagging

Application Domain

Input/output for user-computer interaction; Image analysis

AI Technical Summary

Technical Problem

Conventional display solutions show only 2D images and cannot fully express the various dimensions of visual information, which degrades the display effect.

Method used

A light field display method that collects raw image data, identifies interaction data including spatial position of face, gaze point, and gesture, and rearranges data to display light field data, using grayscale, IR, and depth cameras for active interaction.

Benefits of technology

Enables efficient active interaction and improved user experience by reconstructing light field information, enhancing 3D display capabilities through active detection and data processing.

✦ Generated by Eureka AI based on patent content.

Abstract

This application discloses a light field display method and system, as well as a storage medium, relating to the technical field of three-dimensional imaging. The method includes the steps of: collecting raw image data; identifying the raw image data and obtaining corresponding interaction data; transmitting the interaction data to a content generation unit and obtaining displayable content rendered by the content generation unit based on the interaction data; and combining the interaction data with the displayable content to obtain and display light field data by rearranging the data according to the displayable content, wherein the raw image data includes a face image, an eye image, and a gesture image, and the interaction data includes the spatial position of the face, the spatial position of the gaze point, the spatial position of the gesture, and the gesture category. The present invention can effectively reduce interaction delay.

Description

【Technical Field】【0001】 Cross-reference to Related Applications: This disclosure claims priority to the Chinese patent application with application number 202310665056.8, filed on June 6, 2023, and titled "Light Field Display Method and System, and Storage Medium", the entire content of which is incorporated herein by reference. 【0002】 This disclosure relates to the technical field of 3D imaging, and particularly to an active interactive detection light field display method, an active interactive detection light field display system, and a storage medium. 【Background Art】【0003】 In nature, the human eye receives light emitted from an object and thereby sees a three-dimensional object. However, conventional display solutions can only display 2D images, and when content is generated and displayed, the various dimensions of visual information cannot be fully expressed, which degrades the display effect. 【0004】 It should be noted that the information disclosed in the above background art section is only intended to deepen the understanding of the background of this disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
【Summary of the Invention】【0005】 According to one aspect of the present invention, a light field display method applicable to a light field display system is provided, the method comprising the steps of: collecting raw image data; identifying the raw image data and obtaining corresponding interaction data; transmitting the interaction data to a content generation unit and obtaining display target content rendered by the content generation unit based on the interaction data; and combining the interaction data with the display target content to obtain and display light field data by rearranging the data according to the display target content, wherein the interaction data includes one or more combinations of the spatial position of a face, the spatial position of a gaze point, the spatial position of a gesture, and a gesture category. 【0006】 In one exemplary embodiment of the present invention, the step of obtaining the display target content rendered by the content generation unit based on the interaction data includes the steps of: calculating an overlapping area based on the spatial position of the gesture and the position of the current display content; and, if an overlapping area is determined to exist, determining an updated content corresponding to the current display content according to the gesture category, rendering the updated content, and obtaining the display target content. 【0007】 In one exemplary embodiment of the present invention, the step of rendering the updated content to obtain the content to be displayed includes the step of determining a high-resolution region and a low-resolution region of the updated content according to the spatial position of the point of gaze, and the step of rendering the high-resolution region and the low-resolution region, respectively, to obtain the content to be displayed. 
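The gaze-dependent split in 【0007】 can be illustrated with a minimal sketch. It assumes the spatial position of the gaze point has already been projected to pixel coordinates on the content frame; the names `Rect`, `high_res_region`, `render_scale`, the square window shape, and the 0.25 low-resolution scale are hypothetical choices for illustration and are not taken from the patent text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int


def high_res_region(gaze_xy: Tuple[int, int], frame_w: int, frame_h: int,
                    radius: int) -> Rect:
    """Square window around the gaze point, clipped to the frame bounds."""
    gx, gy = gaze_xy
    x0, y0 = max(0, gx - radius), max(0, gy - radius)
    x1, y1 = min(frame_w, gx + radius), min(frame_h, gy + radius)
    return Rect(x0, y0, max(0, x1 - x0), max(0, y1 - y0))


def render_scale(px: int, py: int, region: Rect, low_scale: float = 0.25) -> float:
    """Full resolution inside the gaze window, reduced resolution elsewhere."""
    inside = (region.x <= px < region.x + region.w
              and region.y <= py < region.y + region.h)
    return 1.0 if inside else low_scale


# Example: gaze near the top-left corner of a 1920x1080 frame.
region = high_res_region((100, 50), 1920, 1080, radius=200)
print(region, render_scale(10, 10, region), render_scale(500, 500, region))
```

A renderer would then draw the two regions separately and composite them; how that compositing is done is not specified in the source.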
【0008】 In one exemplary embodiment of the present invention, the raw image data includes a face image, and the step of identifying the raw image data and obtaining corresponding interaction data includes recognizing the face image and determining the ROI region of the face within the face image; determining the spatial position of the corresponding face based on the ROI region of the face; and updating the ROI region of the face according to a preset rule and performing face tracking based on the updated ROI region of the face. 【0009】 In one exemplary embodiment of the present invention, the raw image data includes an image of an eye, and the step of identifying the raw image data and obtaining corresponding interaction data includes performing image recognition on the image of the eye to determine the ROI region of the eye within the image of the eye, determining the spatial position of the corresponding gaze point based on the ROI region of the eye, and updating the ROI region of the eye according to a set of rules and performing eye tracking using the updated ROI region of the eye. 【0010】 In one exemplary embodiment of the present invention, the raw image data includes a gesture image, and the step of identifying the raw image data and obtaining corresponding interaction data includes performing image recognition on the gesture image to determine a hand ROI region, determining the corresponding spatial position of the hand based on the hand ROI region, updating the hand ROI region according to a predefined rule and performing hand tracking using the updated hand ROI region, and identifying the hand ROI region using a gesture recognition model and determining the corresponding gesture category. 【0011】 In one exemplary embodiment of the present invention, the method further includes the step of setting a corresponding ROI region for each sensor used to collect raw image data in response to a trigger operation. 
【0012】 In one exemplary embodiment of the present invention, the method further includes the step of setting the rendering mode of the content generation unit in response to a setting command. 【0013】 In one exemplary embodiment of the present invention, the step of collecting raw image data includes collecting a face image using several grayscale cameras, collecting an eye image using an IR camera, and collecting a gesture image using a depth camera, wherein the several grayscale cameras, the IR camera, and the depth camera are installed on one side of a light field display unit. 【0014】 According to one aspect of the present invention, a light field display system is provided, the system including: an active detection unit configured to collect raw image data; an interaction information calculation unit configured to receive the raw image data, identify the raw image data, obtain the corresponding interaction data, and transmit the interaction data to a content generation unit; a display data processing unit configured to receive the content to be displayed rendered by the content generation unit, and to rearrange the data in combination with the interaction data and a pixel placement map to obtain light field data; and a light field display unit configured to display the light field data, wherein the interaction data includes one or a combination of the following: the spatial position of the face, the spatial position of the point of gaze, the spatial position of the gesture, and the gesture category. 【0015】 In one exemplary embodiment of the present invention, the interaction information calculation unit sets the ROI region of each sensor in the active detection unit in response to trigger information. 
【0016】 In one exemplary embodiment of the present invention, the display data processing unit transmits a synchronization signal to the interaction information calculation unit so that the interaction information calculation unit completes the calculation of the interaction data in response to the synchronization signal. 【0017】 In one exemplary embodiment of the present invention, the active detection unit is installed on one side of the light field display unit and includes several grayscale cameras, an IR camera, and a depth camera, the grayscale cameras being used to collect face images, the IR camera being used to collect eye images, and the depth camera being used to collect gesture images. 【0018】 According to one aspect of the present invention, a computer-readable storage medium storing a computer program is provided. When the computer program is executed by a processor, the light field display method described in any of the above is realized. 【0019】 According to the light field display method provided by the present invention, by collecting and identifying raw images, corresponding interaction data can be determined, and content to be displayed can be generated based on the interaction data using a content generation unit. By rearranging the data for the content to be displayed, light field data can be displayed, enabling active interaction. 【0020】 It should be understood that the general description above and the detailed description below are illustrative and explanatory only and do not limit the invention. 【0021】 The accompanying drawings, which are incorporated herein and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with this specification, serve to explain the principles of the present disclosure. 
Clearly, the drawings in the following description are only some embodiments of the present disclosure, and those skilled in the art can obtain other drawings based on these without any creative effort. 【Brief Description of the Drawings】【0022】 [Figure 1] It is a configuration diagram schematically showing a light field display system according to an exemplary embodiment of the present invention. [Figure 2] It is a schematic diagram schematically showing a method of arranging cameras. [Figure 3] It is a schematic diagram schematically showing a light field display method according to an exemplary embodiment of the present invention. [Figure 4] It is a schematic diagram schematically showing a flow of a method for calculating the spatial position of a face according to an exemplary embodiment of the present invention. [Figure 5] It is a schematic diagram schematically showing a flow of a method for calculating the spatial position of a fixation point according to an exemplary embodiment of the present invention. [Figure 6] It is a schematic diagram schematically showing a flow of a method for calculating the spatial position of a hand and the type of gesture according to an exemplary embodiment of the present invention. [Figure 7] It is a schematic diagram schematically showing the ROI region marking of a face according to an exemplary embodiment of the present invention. [Figure 8] It is a schematic diagram schematically showing the ROI region marking of eyes according to an exemplary embodiment of the present invention. [Figure 9] It is a schematic diagram schematically showing the ROI region marking of a hand according to an exemplary embodiment of the present invention. 【Modes for Carrying Out the Invention】【0023】 Next, exemplary embodiments will be described in more detail with reference to the accompanying drawings. 
However, the exemplary embodiments can be implemented in various forms and should not be construed as being limited to the embodiments described herein. Rather, these embodiments are provided so that this disclosure will be more thorough and complete, and so that the concept of the exemplary embodiments will be fully conveyed to those skilled in the art. The features, configurations, or characteristics described can be combined in any suitable manner in one or more embodiments. 【0024】 Furthermore, the drawings are merely schematic diagrams of the present disclosure and are not necessarily drawn to exact scale. Identical reference numerals in the drawings indicate identical or similar parts, and redundant descriptions are omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software form, in one or more hardware modules or integrated circuits, or in different networks and / or processor devices and / or microcontroller devices. 【0025】 In consideration of the shortcomings and deficiencies of the prior art, this exemplary embodiment provides a light field display system 10. The light field display system 10 can be used as a display terminal. As shown in Figure 1, the system 10 includes an active detection unit 101, an interaction information calculation unit 102, a display data processing unit 103, and a light field display unit 104. The active detection unit 101 is connected to the interaction information calculation unit 102, the interaction information calculation unit 102 is connected to the display data processing unit 103, and the display data processing unit 103 is connected to the light field display unit 104. The active detection unit 101 can be used to collect raw image data, including face images, eye images, and gesture images. 
The interaction information calculation unit 102 may be configured to receive the raw image data collected by the active detection unit, identify the raw image data, obtain corresponding interaction data, and transmit the interaction data to a content generation unit 201. The display data processing unit 103 may be configured to receive the content to be displayed rendered by the content generation unit 201, rearrange the data in combination with the interaction data and a pixel placement map to obtain light field data, and transmit it to the light field display unit 104. The light field display unit 104 may be configured to display the light field data. 【0026】 Of these, the active detection unit can receive ROI (region of interest) coordinates (x, y, w, h) and other ROI control data from the interaction information calculation unit via I2C (Inter-Integrated Circuit), SPI (Serial Peripheral Interface), or the like. It can also transmit the raw image data collected in real time to the interaction information calculation unit via MIPI (Mobile Industry Processor Interface), USB (Universal Serial Bus), or the like. 【0027】 Specifically, the active detection unit may include several grayscale cameras, IR cameras, depth cameras, and infrared LEDs. Of these, the grayscale cameras can collect grayscale images and detect the spatial position of a face relative to the screen. The number of such cameras is 2n, where n is a positive integer. The spatial position of the face relative to the screen can be determined by binocular distance measurement. By setting up multiple groups of grayscale cameras, wide-range face tracking can be achieved. IR cameras (infrared cameras) can collect infrared images to accurately capture the movement of a person's pupil and obtain the spatial gaze position of the person's eye based on the movement of the pupil. Similarly, by setting up multiple groups of IR cameras and combining them, a wider detection range can be achieved. 
Depth cameras (including, but not limited to, structured-light cameras and TOF cameras) can be used to collect depth images, and these depth images can be used to capture the movement of a person's hand and determine the corresponding gesture behavior. Infrared LEDs provide 850 nm auxiliary light to the IR cameras to improve image quality. As shown in Figure 2, the cameras included in the active detection unit 101 can be spaced apart from one another. For example, an infrared LED 21, a depth camera 22, an IR camera group 23, and a grayscale camera group 24 are arranged in a certain direction to form a camera group. Multiple camera groups may be configured. 【0028】 The light field display unit can receive data processed by the display data processing unit and display it to reconstruct the actual light field information. The light field display unit can consist of an ultra-high resolution display panel and an optical light control device. The pixel arrangement of the display panel may be a conventional RGB arrangement or a pixel island RGB arrangement. The optical light control device may be a physical lens or a liquid crystal lens for controlling light. 【0029】 The light field display system can exchange data with the system end 20. For example, the system end may be a smart terminal device such as a computer or tablet computer. The system end 20 may include a content generation unit 201 used to receive the interaction data transmitted by the interaction information calculation unit, perform calculations using the interaction data, generate the content necessary for light field display, and render the display content. Based on the calculated face center / eyebrow center coordinates, 2 to n cameras can be placed on the left and right sides according to the optical design, and image content can be collected from multiple viewpoints and then transmitted to the display data processing unit. 
Here, the interaction data includes the spatial position of the face, the spatial position of the gaze point, the spatial position of the gesture, and the gesture category. 【0030】 For example, by providing the interaction information calculation unit and the display data processing unit at the display end, the light field display system can process the collected image data at the display front end and acquire ROI regions within the image, enabling efficient data processing and low-latency interaction. A display device with active detection capabilities is thereby provided. 【0031】 This exemplary embodiment provides a light field display method applicable to the light field display system described above. As shown in Figure 3, the method may include the following: 【0032】 Step S11: Collect raw image data. 【0033】 Step S12: Identify the raw image data and obtain the corresponding interaction data. Here, the interaction data includes the spatial position of the face, the spatial position of the point of gaze, the spatial position of the gesture, and the gesture category. 【0034】 Step S13: Send the interaction data to the content generation unit, and obtain the content to be displayed that the content generation unit renders based on the interaction data. 【0035】 Step S14: Combine the interaction data with the content to be displayed, rearrange the data to obtain the light field data, and display it. 【0036】 The light field display method provided in this exemplary embodiment integrates the acquisition, recognition, and recapture of raw image data at the display end without occupying system resources. 【0037】 The steps of the light field display method in this exemplary embodiment will be described in more detail below with reference to the attached drawings and examples. 【0038】 In step S11, raw image data is collected. 
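The flow of steps S11 to S14 can be summarized in a short sketch. This is only an illustration under stated assumptions: `InteractionData` and `run_frame` are hypothetical names, and the five callables stand in for the active detection unit, the interaction information calculation unit, the content generation unit, the display data processing unit, and the light field display unit, whose real interfaces are not given in the source.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class InteractionData:
    """Interaction data fields named in the description (all optional here)."""
    face_pos: Optional[Vec3] = None
    gaze_pos: Optional[Vec3] = None
    gesture_pos: Optional[Vec3] = None
    gesture_category: Optional[str] = None


def run_frame(collect: Callable[[], Any],
              identify: Callable[[Any], InteractionData],
              render: Callable[[InteractionData], Any],
              rearrange: Callable[[InteractionData, Any], Any],
              display: Callable[[Any], None]) -> Any:
    """Run one pass of the S11-S14 flow with caller-supplied stages."""
    raw = collect()                                  # S11: collect raw image data
    interaction = identify(raw)                      # S12: obtain interaction data
    content = render(interaction)                    # S13: content to be displayed
    light_field = rearrange(interaction, content)    # S14: rearrange the data
    display(light_field)                             # S14: display light field data
    return light_field


# Example with trivial stand-in stages.
shown = []
result = run_frame(
    collect=lambda: {"face": "gray-frame"},
    identify=lambda raw: InteractionData(face_pos=(0.0, 0.0, 0.6)),
    render=lambda data: ["view0", "view1"],
    rearrange=lambda data, content: (data.face_pos, tuple(content)),
    display=shown.append,
)
print(result)
```

Passing the stages in as callables keeps the sketch independent of any particular camera or rendering API.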
【0039】 In this exemplary embodiment, an active detection unit is provided for collecting raw image data, which includes facial images, eye images, and gesture images. As shown in Figure 2, the active detection unit includes several grayscale cameras, an IR camera, a depth camera, and an infrared LED. Among these, several grayscale cameras can be used to collect facial images, an IR camera to collect eye images, and a depth camera to collect gesture images. For example, a glasses-free 3D display device is provided with the light field display system described above. The active detection device may be positioned on one side of the display device. The user can maintain a constant spatial position relative to the display device, for example, by standing in front of the display device. The active detection unit is used to collect the user's facial images, eye images, and gesture images in real time. Initially, the raw images collected by each camera may be images within the current area. 【0040】 In step S12, the raw image data is identified and the corresponding interaction data is obtained. Here, the interaction data includes the spatial position of the face, the spatial position of the point of gaze, the spatial position of the gesture, and the gesture category. 【0041】 In this exemplary embodiment, step S12 above may include the following: 【0042】 Step S21: Recognize the face image and determine the ROI region of the face within the face image. 【0043】 Step S22: Determine the spatial position of the corresponding face based on the ROI region of the face. 【0044】 Step S23: Update the ROI region of the face according to a pre-defined rule, and perform face tracking based on the updated ROI region of the face. 【0045】 Specifically, an interactive information computing unit is provided that receives raw image data transmitted from an active detection unit and performs calculations on the image data. 
When calculating the spatial position of a face, the interactive information computing unit can receive grayscale images captured by the grayscale cameras as face images, and apply a binocular ranging algorithm to the grayscale images to determine the spatial position of the face relative to the display device screen. If the active detection unit has multiple groups of grayscale cameras, it can provide multiple groups of grayscale images to the interactive information computing unit. Based on the multiple groups of grayscale images, the interactive information computing unit can calculate the spatial positions of multiple faces. 【0046】 Initially, the grayscale camera collects all content within the current field of view as an initial face image and sends it to the interaction information computing unit. The interaction information computing unit can first perform image recognition on the initial face image to determine the face's ROI region and its corresponding coordinate information (x, y, w, h). Next, it uses the binocular ranging algorithm to calculate the spatial position of the face from its ROI region. Specifically, the interaction information computing unit can use a deep learning-based face detection model to perform face recognition on the collected grayscale image and determine the ROI region of the face within the grayscale image. Then, it applies the binocular ranging algorithm to the face ROI region of the acquired initial face image and determines the initial spatial position of the face. For example, a binocular ranging algorithm typically requires first calibrating the binocular cameras to obtain the intrinsic and extrinsic parameters and the homography matrix of the two cameras. Next, the collected initial face images can be corrected (rectified) according to the calibration results, so that the two corrected images lie on the same plane and are parallel to each other. 
The two corrected images are matched pixel by pixel to obtain a disparity image. The depth of each pixel is calculated based on the matching result, and a depth map is obtained. Then, the spatial position of the face is determined. The spatial position calculation based on the binocular ranging algorithm can be obtained by conventional means, and this disclosure does not particularly limit the specific steps of this algorithm. Furthermore, the interactive information calculation unit can also set the ROI region information and bounding box size of the grayscale camera and push it to the active detection unit. This allows the grayscale camera to collect small-sized grayscale images of subsequent frames according to the set ROI region information and bounding box size, thereby enabling accurate collection of face images. 【0047】 For example, for the initial face image collected, the interactive information computing unit can mark the face's ROI region using a bounding box and mark the corresponding coordinate information (x, y, w, h) as updated ROI region information. For example, as shown in Figure 7, the face's ROI region is marked using a bounding box and the updated ROI region information is sent to the active detection unit. The active detection unit receives the updated ROI region information and, when collecting the next frame of the face image, can re-collect the grayscale image of the next frame according to the coordinate information in the updated ROI region information. At the same time, during the collection of subsequent frames, the size of the collected image can be reconstructed according to the updated ROI region information. 
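As a reference for the binocular ranging step described above, the following sketch shows the standard depth-from-disparity relation for a rectified stereo pair, Z = f * B / d, followed by pinhole back-projection. The function name `triangulate` and all parameter values are illustrative assumptions; the source does not limit the specific algorithm. The result is expressed in the left camera's coordinate frame, so a further calibrated transform (not shown) would be needed to express it relative to the screen.

```python
from typing import Tuple


def triangulate(u_left: float, u_right: float, v: float,
                focal_px: float, baseline_m: float,
                cx: float, cy: float) -> Tuple[float, float, float]:
    """3D point (metres, left-camera frame) from a rectified stereo match.

    u_left / u_right: column of the same feature in the left / right image.
    v: row of the feature (identical in both images after rectification).
    focal_px: focal length in pixels; baseline_m: camera separation in metres.
    cx, cy: principal point of the left camera in pixels.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity   # depth from disparity
    x = (u_left - cx) * z / focal_px        # back-project column to metres
    y = (v - cy) * z / focal_px             # back-project row to metres
    return (x, y, z)


# Example: the centre of a face ROI matched in both images (made-up numbers).
print(triangulate(700.0, 660.0, 400.0, focal_px=800.0, baseline_m=0.06,
                  cx=640.0, cy=360.0))
```

In practice the matched point could be the center of the face ROI, the center of the eyebrows, or another landmark, depending on which spatial position is wanted.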
For example, the size of the ROI region can be enlarged according to a pre-defined rule to obtain an enlarged size, and the grayscale image of the next frame can be re-collected based on the updated coordinate information and enlarged size of the ROI region, and the grayscale image of the next frame can be sent to the interactive information computing unit to calculate the spatial position of the face in the image of the next frame. This process is repeated to achieve continuous tracking of the face image. If face tracking fails during the face image tracking process using images collected based on updated ROI area information, continuous tracking can be achieved by re-collecting all content within the grayscale camera's field of view and re-running face recognition and ROI area updates. 【0048】 For example, as shown in Figure 4, in the case of the interactive information computing unit, the face ROI region can be initialized first. In the initial state, face detection is performed on the collected face region image to determine the corresponding face ROI region, and based on the calculation result of the face ROI region, the spatial position of the face in the current grayscale image is calculated using binocular ranging. At the same time, the coordinate information of the face ROI region can be updated using the face ROI region acquired above, and a smaller image of the next frame is reconstructed using the updated face ROI region information, face recognition is performed to determine whether tracking was successful. If tracking is successful, the ROI region information is updated according to the face ROI region information in the frame image until tracking is complete. Alternatively, if tracking is determined to have failed in the next frame image, initialization is performed again, the entire full-size face image is re-collected, face detection is performed again, and the above method flow is repeated. 
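The ROI update and fallback logic described above can be sketched as follows. This is a minimal illustration, not the patented rule: the 1.5x enlargement factor, the function names `enlarge_roi` and `next_capture_roi`, and the assumption that the detector reports ROIs in full-frame pixel coordinates are all hypothetical.

```python
from typing import Optional, Tuple

Roi = Tuple[int, int, int, int]  # (x, y, w, h) in full-frame pixels


def enlarge_roi(roi: Roi, scale: float, frame_w: int, frame_h: int) -> Roi:
    """Grow an ROI about its centre by `scale`, keeping it inside the frame."""
    x, y, w, h = roi
    new_w = min(frame_w, int(round(w * scale)))
    new_h = min(frame_h, int(round(h * scale)))
    cx, cy = x + w / 2, y + h / 2
    new_x = int(round(cx - new_w / 2))
    new_y = int(round(cy - new_h / 2))
    # Shift the window back inside the frame if it spills over an edge.
    new_x = max(0, min(new_x, frame_w - new_w))
    new_y = max(0, min(new_y, frame_h - new_h))
    return (new_x, new_y, new_w, new_h)


def next_capture_roi(detected: Optional[Roi], frame_w: int, frame_h: int,
                     scale: float = 1.5) -> Roi:
    """ROI to push to the camera for the next frame.

    If the target was found, capture an enlarged window around it;
    if tracking failed (detected is None), fall back to the full field of view.
    """
    if detected is None:
        return (0, 0, frame_w, frame_h)
    return enlarge_roi(detected, scale, frame_w, frame_h)


print(next_capture_roi((100, 100, 40, 40), 640, 480))  # tracking succeeded
print(next_capture_roi(None, 640, 480))                # tracking failed
```

Enlarging the window gives the target room to move between frames while still keeping the captured image small.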
At the same time, after the ROI region information is updated, the image acquisition position information of the grayscale camera is set according to the updated face ROI region. This allows the grayscale camera to acquire a smaller grayscale image of the face. The spatial position of the face obtained by the above method may be the spatial position of the center point of the face, the spatial position of the center of the eyebrows, the spatial position of the center of the eyes, and so on. 【0049】 In this exemplary embodiment, step S12 described above may further include the following: 【0050】 Step S31: Perform image recognition on the eye image to determine the ROI region of the eye within the eye image. 【0051】 Step S32: Determine the spatial position of the corresponding gaze point based on the ROI region of the eye. 【0052】 Step S33: Update the ROI region of the eye according to a pre-set rule, and perform eye tracking using the updated ROI region of the eye. 【0053】 Specifically, the raw image data includes an image of an eye. The corresponding interaction data may be the spatial position of the gaze point obtained based on the eye image. Specifically, an IR camera in the active detection unit can be used to collect an image of the pupil of a human eye and transmit it to the interactive information computing unit. The interactive information computing unit can use the received human eye pupil image to estimate the gaze point of the human eye, obtain the spatial position of the gaze point of the human eye, and track the ROI region of the eye. 【0054】 For example, as shown in Figure 5, in the case of an interactive information computing unit, the ROI region of the human eye can be initialized first. In the initial state, an initial image collected by an IR camera can be received, and the initial image may be an initial image corresponding to all content within the field of view of the IR camera. 
The interactive information computing unit can use a human eye detection model to directly perform human eye detection on the initial image, determine the ROI region of the human eye, and track the ROI region of the human eye. For example, as shown in Figure 8, the ROI region of the human eye can be marked using a bounding box. Alternatively, in some embodiments, face detection can be performed on the initial image first to determine the face region. Then, by accurately extracting and aligning human eye features to the face region image, human eye detection can be achieved and the ROI region of the human eye can be determined. Next, based on the determined ROI region of the human eye, the spatial position of the gaze point can be calculated using a trained gaze point estimation model, and the spatial position information of the gaze point can be determined. At the same time, the ROI region information of the human eye may be updated based on the currently determined ROI region information of the human eye. The updated ROI region information for human eyes allows the IR camera to set an ROI region, enabling it to accurately acquire images of human eyes based on the set ROI region information during subsequent image acquisition, without needing to acquire complete scene or face images. 【0055】 In this exemplary embodiment, step S12 described above may further include the following: 【0056】 Step S41: Image recognition is performed on the gesture image to determine the ROI region of the hand. 【0057】 Step S42: Determine the spatial position of the hand based on the ROI region of the hand. 【0058】 Step S43: Update the hand's ROI region according to a pre-defined rule and perform hand tracking using the updated hand's ROI region. 【0059】 Step S44: Use the gesture recognition model to identify the ROI region of the hand and determine the corresponding gesture category. 【0060】 Specifically, the raw image data includes gesture images. 
The interactive information computing unit can receive depth images collected in real time by the depth camera in the active detection unit. It then performs image recognition on the depth images to determine the ROI region of the hand within the depth images and the corresponding gesture category. Based on the determined ROI region of the hand, the spatial position of the hand can be calculated and the hand can be tracked. 【0061】 For example, as shown in Figure 6, the interactive information computing unit can first perform initialization and, in the initial state, receive an initial image collected by the depth camera in the active detection unit. Here, the initial image may be a depth image captured by the depth camera, containing all the content in the current field of view. A trained recognition model can be used to perform hand detection on the depth image and determine the hand's ROI region in the depth image. For example, as shown in Figure 9, a bounding box can be used to mark the hand's ROI region, and the spatial position information of the hand can be calculated based on the hand's ROI region information and the corresponding depth information. At the same time, a trained gesture recognition model can be used to perform gesture recognition on the hand's ROI region and determine the corresponding gesture category. The ROI region can be updated with the currently acquired hand ROI region information, and it can be determined whether the hand region is being successfully tracked in the current depth image. If the current tracking is successful, the next frame of the depth image can be collected using the updated ROI region. At the same time, the updated ROI region information can be used to set the image acquisition range and position coordinates of the depth camera for the next frame. 
Alternatively, if tracking of the current frame image is determined to have failed, i.e., if no hand information is recognized in the depth image, initialization may be performed again to collect the initial depth image and re-execute the method flow described above to obtain the spatial position information of the hand. 【0062】 In step S13, the interaction data is sent to the content generation unit, and the content to be displayed, rendered by the content generation unit based on the interaction data, is obtained. 【0063】 In this exemplary embodiment, the interaction information calculation unit receives a synchronization signal from the display data processing unit and transmits the calculated spatial position of the face (x1, y1, z1), the spatial position of the point of gaze (x2, y2, z2), the spatial position of the gesture (x3, y3, z3), and the gesture category to the content generation unit via USB / WIFI for content generation. Simultaneously, it can also transmit this information to the display data processing unit via I2C / SPI for content processing. 【0064】 In this exemplary embodiment, the step of obtaining the display target content rendered by the content generation unit based on the interaction data includes the steps of: calculating an overlapping region based on the spatial position of the gesture and the position of the current display content; and, if an overlapping region is determined to exist, determining an updated content corresponding to the current display content according to the gesture category, rendering the updated content, and obtaining the display target content. 【0065】 Specifically, for collected user gesture images, a coordinate system for the user's hand is established, and the spatial position of the gesture is marked within that coordinate system. For the currently displayed content on the screen, a screen coordinate system can be established, and the spatial position of the currently displayed content can be marked. 
Based on a pre-calculated transformation matrix between the hand coordinate system and the screen coordinate system, the spatial position of the hand can be transformed into the screen coordinate system, and the intersection between the coordinates of the user's hand and the currently displayed virtual 3D content can be calculated. If the two intersect, an overlapping region exists and the user is determined to be interacting with the virtual object. If they do not overlap, it is determined that the user's hand is not interacting with the virtual object. When an interaction occurs, the currently displayed content is updated according to the recognized user gesture category. For example, user gesture categories include single click, double click, and rotation. Correspondingly, the position and state of the currently displayed virtual 3D content are updated according to pre-set rules, and the updated content is obtained. 【0066】 In this exemplary embodiment, the step of rendering the updated content to obtain the content to be displayed includes: determining a high-resolution region and a low-resolution region of the updated content according to the spatial position of the gaze point; and rendering the high-resolution region and the low-resolution region, respectively, to obtain the content to be displayed. 【0067】 Specifically, n views can be set for the virtual 3D display content currently shown on the screen (where n is a positive integer). Based on the spatial position of the identified human eye, n / 2 viewpoints can be set for the left eye and the right eye, respectively. Using multi-view rendering, the focus region corresponding to the current display content can be calculated based on the gaze position of the human eye, and the focus region can be set as the high-resolution region and rendered at high resolution. The other regions can be used as low-resolution regions and rendered at low resolution. 
Next, the content to be displayed is obtained by combining the high-resolution and low-resolution regions. 【0068】 For example, the content generation unit may be an intelligent terminal device such as a computer or tablet computer, or it may be a server that performs calculations for display content. The content generation unit is primarily used to generate content necessary for light field display. Specifically, the content generation unit can obtain 3D display content using ray tracing. Generally, the ray tracing method involves the following steps: a. Emit multiple rays from the center of a human eye / center of a human face / center of an eyebrow. b. Extend the rays to connect the center of the lens and the center of a subpixel. c. The rays continue through the subpixel until they collide with an object, and the color and brightness information of the collided object is assigned to the corresponding pixel. d. Obtain the 3D display content of the matching panel. 【0069】 In this exemplary embodiment, the method further includes the step of setting the rendering mode of the content generation unit in response to a setting command. 【0070】 Specifically, users can set the rendering mode for displayed content on their terminal device. Specifically, rendering modes include low-resolution rendering mode, high-resolution rendering mode, and gaze-point rendering mode. Low-resolution rendering mode is a configuration where the displayed content is shown at a relatively low resolution, thereby reducing rendering requirements and transmission bandwidth. High-resolution rendering mode may be a configuration where the displayed content is shown at a relatively high resolution. Gaze-point rendering mode may be a mode that, based on the user's gaze point information, renders the gaze-point area of the displayed content in high-resolution mode and the rest in low-resolution mode. 
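The three rendering modes described above can be illustrated with a minimal sketch. It is not the implementation of this disclosure: the names RenderMode and tile_scale, the division of the screen into tiles, the circular focus region with radius focus_radius, and the 0.5 low-resolution scale are all assumptions introduced here for illustration, and the gaze point is assumed to have already been projected to 2D screen coordinates.

```python
from enum import Enum


class RenderMode(Enum):
    """Hypothetical names for the three rendering modes in the text."""
    LOW = "low"                # whole frame at reduced resolution
    HIGH = "high"              # whole frame at full resolution
    GAZE_POINT = "gaze_point"  # full resolution only around the gaze point


def tile_scale(mode, tile_center, gaze_xy, focus_radius, low_scale=0.5):
    """Return the resolution scale (1.0 = full resolution) for one screen tile.

    tile_center and gaze_xy are (x, y) screen coordinates in pixels.
    focus_radius and low_scale are illustrative assumptions, not source values.
    """
    if mode is RenderMode.HIGH:
        return 1.0
    if mode is RenderMode.LOW:
        return low_scale
    # Gaze-point mode: tiles whose center lies inside the circular focus
    # region are rendered at full resolution, all others at reduced resolution.
    dx = tile_center[0] - gaze_xy[0]
    dy = tile_center[1] - gaze_xy[1]
    inside = dx * dx + dy * dy <= focus_radius * focus_radius
    return 1.0 if inside else low_scale


# Example: gaze at (110, 90) with a 50-pixel focus radius.
near = tile_scale(RenderMode.GAZE_POINT, (100, 100), (110, 90), 50)
far = tile_scale(RenderMode.GAZE_POINT, (400, 300), (110, 90), 50)
```

In this sketch the tile near the gaze point receives a scale of 1.0 and the distant tile receives 0.5; the two sets of tiles would then be rendered separately and combined into the content to be displayed.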
【0071】 In step S14, the data is sorted according to the content to be displayed, in combination with the interaction data, to obtain and display the light field data. 【0072】 In this exemplary embodiment, a display data processing unit is provided. On the one hand, the display data processing unit transmits a synchronization signal to the interaction information calculation unit and, according to the synchronization timing, receives the interaction data transmitted from the interaction information calculation unit, including the spatial position of the face, the spatial position of the gaze point, and the spatial position of the gesture. On the other hand, it receives the display content transmitted from the content generation unit. It then sorts the data according to the pixel arrangement map of the light field display, the interaction information, and the content so as to satisfy the data requirements of the light field display, and transmits the sorted data to the light field display unit. 【0073】 The light field display unit receives the data processed by the display data processing unit and displays it to reconstruct the actual light field information. The light field display unit consists of an ultra-high-resolution display panel and an optical light control device, and the pixel arrangement of the panel may be a conventional RGB arrangement or a pixel island RGB arrangement. The optical light control device may be a physical lens or a liquid crystal lens for controlling light. Generally, light field display is a technique for constructing 3D objects through ray tracing. Its principle can be expressed by a seven-dimensional plenoptic function whose parameters are the position of the eye (x, y, z), the horizontal angle of the light θ, the vertical angle of the light φ, the wavelength of the light λ, and the time t, and whose value is the intensity of the light. 
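Written out, the seven-dimensional plenoptic function described above takes the following form. The symbols L (light intensity) and P (the plenoptic function) are names chosen here for illustration; the seven arguments are those listed in the text.

```latex
L = P(x, y, z, \theta, \varphi, \lambda, t)
```

Here (x, y, z) is the position of the eye, θ and φ are the horizontal and vertical angles of the light, λ is its wavelength, and t is time.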
【0074】 For example, a multi-viewpoint fusion algorithm can be used to fuse images for multi-viewpoint autostereoscopic 3D display and obtain the light field data. Taking five views as an example, the multi-viewpoint image fusion algorithm includes: 1) determining the tilt angle α, grid period D, and subpixel width Dh of the cylindrical grid arrangement, and calculating the number of subpixels N covered by one grid period; 2) calculating a subpixel mapping table based on the tilt angle α, the number of subpixels N, and the row and column coordinates (k, l) of the subpixels; 3) arranging the five viewpoint images vertically according to the resolution of the display screen and compositing them into one complete image; 4) converting the image's color space from YCrCb to RGB; 5) resizing the image and rotating it 90 degrees clockwise; and 6) writing the RGB subpixel components of the five viewpoints into the corresponding RGB components of the multi-viewpoint stereo composite image according to the obtained subpixel mapping table. Generally, in a grid-type autostereoscopic 3D display, if the grid is arranged at a constant tilt angle and the grid period is D, one grid period covers N subpixels in the horizontal direction. The multi-view subpixel mapping algorithm determines, for each of the RGB components of a pixel P at a given location on the display screen, which viewpoint image the component is taken from. 【0075】 Furthermore, it is possible to determine whether the user's eyes are positioned at the optimal viewpoint based on the coordinates of both of the user's eyes and the coordinates of the multiple viewpoints. 【0076】 The light field display method provided in the embodiments of the present invention is applicable to light field display systems. 
By integrating an active detection unit, an interactive information calculation unit, and a display data processing unit at the display end, the efficiency of information processing and transmission is improved, latency is reduced, and the data processing load at the system end is reduced. This improves the user experience of 3D display. 【0077】 It should be noted that the above drawings are merely schematic diagrams of processes included in the method according to exemplary embodiments of the present invention, and are not limiting. It is readily apparent that the processes shown in the above drawings do not indicate or limit the temporal order of these processes. It is also readily apparent that these processes can be executed synchronously or asynchronously, for example, in multiple modules. 【0078】 While the detailed description above refers to multiple modules or units of the device for execution, such division is not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of a single module or unit described above may be further divided and embodied in multiple modules or units. 【0079】 The attached flowcharts and block diagrams illustrate possible implementation architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each box in the flowchart or block diagram may represent a module, program segment, or part of code and may contain one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions described within a block may occur in a different order than shown in the diagram. 
For example, two blocks shown consecutively may actually be executed almost in parallel, or in reverse order depending on the related functions. It should also be noted that each block in the block diagram or flowchart, and any combination of blocks in the block diagram or flowchart, may be implemented by a dedicated hardware-based system that performs a specified function or operation, or by a combination of dedicated hardware and computer instructions. 【0080】 The units relating to embodiments of the present invention can be implemented in software or hardware, and the described units can also be configured within a processor. The names of these units do not necessarily impose any limitations on the units themselves. 【0081】 In another embodiment, this disclosure further provides a storage medium included in an electronic device, or it may exist independently without being incorporated into an electronic device. One or more programs are stored in the storage medium. When the one or more programs are executed by the electronic device, the electronic device implements the method described in the above embodiments. 【0082】 It should be noted that the storage medium may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium is, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media include, but are not limited to, electrical connections having one or more conductors, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, optical fiber, portable compact disc read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above. 
In the present invention, a computer-readable storage medium is any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device. In this invention, a computer-readable signal medium includes data signals propagated in the baseband or as part of a carrier wave, in which computer-readable program code is carried. Such propagated data signals can take various forms, such as electromagnetic signals, optical signals, or combinations of the aforementioned signals. The computer-readable signal medium may also be any storage medium other than a computer-readable storage medium that can transmit, propagate, or transfer programs used by or in connection with an instruction execution system, apparatus, or device. The program code contained in the storage medium can be transmitted using any suitable medium, including but not limited to wireless, wired, or a suitable combination thereof. 【0083】 Furthermore, the above drawings are merely schematic diagrams of the processes included in the methods according to the exemplary embodiments of this disclosure and are not intended to limit them. It is readily apparent that the processes shown in the above drawings do not indicate or limit the chronological order of these processes. It is also readily apparent that these processes can be executed synchronously or asynchronously, for example, in multiple modules. 【0084】 Other embodiments of this disclosure will become apparent to those skilled in the art through practice of this specification and the disclosed inventions. This application covers any modification, use, or adaptation of this disclosure, including common or customary technical means in the art not invented by this disclosure, in accordance with the general principles of this disclosure. 
This specification and examples are intended to be considered illustrative only, and the true scope and spirit of this disclosure are shown by the following claims.
[Explanation of symbols]
【0085】
10 Light Field Display System
20 System End
21 Infrared LEDs
22 Depth Camera
23 IR Camera Group
24 Grayscale Camera Group
101 Active Detection Unit
102 Interaction Information Computation Unit
103 Display Data Processing Unit
104 Light Field Display Unit
201 Content Generation Unit

Claims

[Claim 1] A light field display method applicable to a light field display system, the method comprising: collecting raw image data; identifying the raw image data to obtain corresponding interaction data; sending the interaction data to a content generation unit and obtaining content to be displayed that is rendered by the content generation unit based on the interaction data; and rearranging data according to the content to be displayed, in combination with the interaction data, to obtain light field data, and displaying the light field data; wherein the interaction data includes one or any combination of the following: the spatial position of a face, the spatial position of a gaze point, the spatial position of a gesture, and a gesture category.
[Claim 2] The light field display method according to claim 1, wherein the step of obtaining the content to be displayed rendered by the content generation unit based on the interaction data includes: calculating an overlapping region based on the spatial position of the gesture and the position of the currently displayed content; and, if it is determined that an overlapping region exists, determining updated content corresponding to the currently displayed content according to the gesture category, and rendering the updated content to obtain the content to be displayed.
[Claim 3] The light field display method according to claim 2, wherein the step of rendering the updated content to obtain the content to be displayed includes: determining a high-resolution region and a low-resolution region of the updated content according to the spatial position of the gaze point; and rendering the high-resolution region and the low-resolution region, respectively, to obtain the content to be displayed.
[Claim 4] The light field display method according to claim 1, wherein the raw image data includes a face image, and the step of identifying the raw image data to obtain the corresponding interaction data includes: performing image recognition on the face image to determine the ROI region of the face within the face image; determining the corresponding spatial position of the face based on the ROI region of the face; and updating the ROI region of the face according to a pre-set rule, and performing face tracking based on the updated ROI region of the face.
[Claim 5] The light field display method according to claim 1, wherein the raw image data includes an eye image, and the step of identifying the raw image data to obtain the corresponding interaction data includes: performing image recognition on the eye image to determine the ROI region of the eye within the eye image; determining the spatial position of the corresponding gaze point based on the ROI region of the eye; and updating the ROI region of the eye according to a pre-set rule, and performing eye tracking using the updated ROI region of the eye.
[Claim 6] The light field display method according to claim 1, wherein the raw image data includes a gesture image, and the step of identifying the raw image data to obtain the corresponding interaction data includes: performing image recognition on the gesture image to determine the ROI region of the hand; determining the spatial position of the hand based on the ROI region of the hand; updating the ROI region of the hand according to a pre-set rule, and performing hand tracking using the updated ROI region of the hand; and identifying the ROI region of the hand using a gesture recognition model to determine the corresponding gesture category.
[Claim 7] The light field display method according to claim 1, further comprising the step of setting a corresponding ROI region for each sensor used to collect the raw image data in response to a trigger operation.
[Claim 8] The light field display method according to claim 1, further comprising the step of setting the rendering mode of the content generation unit in response to a setting command.
[Claim 9] The light field display method according to claim 1, wherein the step of collecting the raw image data includes collecting face images using several grayscale cameras, collecting eye images using an IR camera, and collecting gesture images using a depth camera, and wherein several grayscale cameras, IR cameras, and depth cameras are installed on one side of the light field display unit.
[Claim 10] A light field display system, comprising: an active detection unit configured to collect raw image data; an interaction information calculation unit configured to receive the raw image data, identify the raw image data to obtain corresponding interaction data, and transmit the interaction data to a content generation unit; a display data processing unit configured to receive content to be displayed that is rendered by the content generation unit, and to sort data in combination with the interaction data and a pixel arrangement map to obtain light field data; and a light field display unit configured to display the light field data; wherein the interaction data includes one or any combination of the following: the spatial position of a face, the spatial position of a gaze point, the spatial position of a gesture, and a gesture category.
[Claim 11] The light field display system according to claim 10, wherein the interaction information calculation unit sets the ROI region of each sensor in the active detection unit in response to trigger information.
[Claim 12] The light field display system according to claim 10, wherein the display data processing unit transmits a synchronization signal to the interaction information calculation unit so that the interaction information calculation unit completes the calculation of the interaction data in response to the synchronization signal.
[Claim 13] The light field display system according to claim 10, wherein the active detection unit is installed on one side of the light field display unit and includes several grayscale cameras, IR cameras, and depth cameras, and collects face images using a grayscale camera, eye images using an IR camera, and gesture images using a depth camera.
[Claim 14] A storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, the light field display method according to any one of claims 1 to 9 is realized.