Digital human display control method and apparatus, and device, storage medium and program product

By acquiring digital human image data and its associated features, a multi-view rendering table is determined and used to map the image data to the screen of a 3D display device, which reduces distortion in 3D digital human display and improves the stereoscopic effect and user experience.

WO2026139078A1 · PCT designated stage · Publication Date: 2026-07-02 · GRAVITYXR ELECTRONICS & TECH CO LTD

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Application
Current Assignee / Owner: GRAVITYXR ELECTRONICS & TECH CO LTD
Filing Date: 2025-12-26
Publication Date: 2026-07-02

AI Technical Summary

Technical Problem

Existing 3D digital human display technology is prone to distortion when the head moves, resulting in a poor user experience and unsatisfactory stereoscopic effect.

Method used

By acquiring image data of digital humans and their associated features, a multi-view rendering table is determined. The image data is then mapped to the screen of a 3D display device using the multi-view rendering table, thereby improving the mapping accuracy between pixels and screen points and adjusting the display effect of the digital human.

Benefits of technology

The method reduces distortion of the 3D digital human as it changes, improving the stereoscopic effect and user experience.



Abstract

Provided in the embodiments of the present application are a digital human display control method and apparatus, and a device, a storage medium and a program product. The digital human display control method is applied to a three-dimensional display device, and the method comprises: acquiring image data of a digital human and an associated feature of the image data; on the basis of the associated feature of the image data, determining a multi-viewpoint rendering table corresponding to the digital human, wherein the multi-viewpoint rendering table corresponding to the digital human is used for representing mapping relationships between points in a screen of a three-dimensional display device and pixel points in the image data of the digital human; and on the basis of the multi-viewpoint rendering table corresponding to the digital human, mapping the image data to a screen, so as to adjust the digital human, which is displayed by the three-dimensional display device. A dynamic and three-dimensional digital human is displayed on the basis of a multi-viewpoint rendering table, and the stereoscopic effect of the digital human is improved, thereby improving the user experience.

Description

Digital human display control method, apparatus, device, storage medium, and program product

[0001] This application claims priority to Chinese Patent Application No. 202411964059.2, filed on December 28, 2024, entitled "Digital Human Display Control Method, Apparatus, Device, Storage Medium and Program Product", the entire contents of which are incorporated herein by reference.

Technical Field

[0002] This application relates to the field of three-dimensional display technology, and in particular to a digital human display control method, apparatus, device, storage medium, and program product.

Background Technology

[0003] Digital humans are virtual avatars that mimic human appearances and are generated using digital technology. They can improve the naturalness, realism, and intelligence of human-computer interaction.

[0004] Existing digital humans are primarily designed for two-dimensional screens, displaying pre-generated videos of the digital human on those screens. For three-dimensional displays, the digital human image is often treated as a regular image, and the image is synthesized using only the parameters of the three-dimensional display to achieve the display of the three-dimensional digital human. This results in poor quality and a poor stereoscopic effect for the three-dimensional digital human.

[0005] Therefore, there is an urgent need to provide a high-quality 3D digital human display solution.

Summary of the Invention

[0006] This application provides a digital human display control method, apparatus, device, storage medium, and program product. A multi-view rendering table for displaying the digital human is determined from the associated features of the image data. Selecting an appropriate multi-view rendering table allows the mapping between pixels in the image and points on the screen to be determined more accurately while the three-dimensional digital human is displayed, thereby improving the quality and stereoscopic effect of the displayed three-dimensional digital human.

[0007] In a first aspect, embodiments of this application provide a digital human display control method applied to a three-dimensional display device. The method includes: acquiring image data of a digital human and its associated features; determining a multi-view rendering table corresponding to the digital human based on the associated features of the image data, the multi-view rendering table corresponding to the digital human being used to characterize the mapping relationship between each point on the screen of the three-dimensional display device and the pixels in the image data of the digital human; and mapping the image data to the screen based on the multi-view rendering table corresponding to the digital human, so as to adjust the digital human displayed by the three-dimensional display device.

[0008] In one possible implementation, the associated features of the image data include at least one of the following: the viewing position of at least one user; the display area of the digital human on the screen; and the posture features of the digital human.

[0009] In one possible implementation, the display area of the digital human on the screen, as associated with the image data, is determined based on a first input instruction and/or based on the horizontal position of the at least one user; the horizontal position is a position in a plane perpendicular to the direction of the viewing height.

[0010] In one possible implementation, the pose features of the digital human associated with the image data are determined based on a second input instruction, and / or based on features extracted from the image data and / or the position of the digital human relative to the at least one user.

[0011] In one possible implementation, determining the multi-view rendering table corresponding to the digital human based on the associated features of the image data includes: determining at least one search term based on the associated features of the image data; and searching for the multi-view rendering table corresponding to the digital human among multiple multi-view rendering tables based on the at least one search term.

[0012] In one possible implementation, the at least one search term includes a first search term and a second search term, the first search term being associated with the head pose of the digital human and the second search term being associated with the viewing height of the at least one user.

[0013] In one possible implementation, mapping the image data to the screen based on the multi-view rendering table corresponding to the digital human includes: for each point on the screen located within the display area of the digital human, determining the pixel in the image data mapped to that point based on the multi-view rendering table corresponding to the digital human; determining the pixel value of that point based on the pixel values of multiple pixels within a preset range of the pixel mapped to that point; and driving the three-dimensional display device based on the pixel value of each point on its screen to display the adjusted digital human.

[0014] In one possible implementation, the method further includes: acquiring initial image data; extracting feature points from the initial image data, and generating digital human template data corresponding to the initial viewpoint based on the extracted feature points; the digital human template data including multiple grids with the feature points as vertices; for viewpoints other than the initial viewpoint among the multiple viewpoints, transforming the digital human template data corresponding to the initial viewpoint based on the positional relationship between the viewpoint and the initial viewpoint to obtain digital human template data corresponding to the viewpoint; and determining the pixel points in the initial image data mapped to each point in the screen based on the digital human template data corresponding to each viewpoint among the multiple viewpoints, as well as the screen coordinates and the viewpoint to which each point in the screen belongs, to obtain a multi-view rendering table corresponding to the digital human.

[0015] In one possible implementation, the method further includes: acquiring the viewing height of at least one user; determining a target preset height that matches the viewing height from a plurality of preset heights; and determining the viewpoint to which each point in the screen belongs based on the viewpoint mapping relationship corresponding to the target preset height.
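The height matching described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the text does not say how a preset height "matches" the viewing height, so nearest-value matching is an assumption, and the names `match_preset_height`, `viewpoint_of_points`, and `viewpoint_maps` are hypothetical.

```python
def match_preset_height(viewing_height, preset_heights):
    """Pick the preset height closest to the measured viewing height.

    Nearest-value matching is an assumption; the source only says the
    target preset height "matches" the viewing height.
    """
    return min(preset_heights, key=lambda h: abs(h - viewing_height))


def viewpoint_of_points(viewing_height, viewpoint_maps):
    """Return the target preset height and its viewpoint mapping.

    viewpoint_maps is a hypothetical structure:
    {preset_height: {screen_point: viewpoint_index}}.
    """
    target = match_preset_height(viewing_height, list(viewpoint_maps))
    return target, viewpoint_maps[target]


# Example with three preset heights (metres) and a single screen point.
maps = {
    1.2: {(0, 0): 0},
    1.5: {(0, 0): 1},
    1.8: {(0, 0): 2},
}
target_height, point_to_viewpoint = viewpoint_of_points(1.62, maps)
```

Precomputing one viewpoint mapping per preset height keeps the run-time work to a single nearest-value search and a dictionary lookup.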

[0016] In one possible implementation, the method further includes: for each of the plurality of preset heights, based on the light emission angle of each point on the screen, determining the intersection point of the main ray emitted from each point on the screen and the horizontal plane where the preset height is located, to obtain the intersection point corresponding to each point on the screen; and determining the viewpoint with the closest horizontal distance to the corresponding intersection point as the viewpoint to which each point on the screen belongs.
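A rough sketch of this viewpoint assignment is given below. Several things are assumptions made only for illustration: the coordinate frame (y is the height axis, x and z span the horizontal plane), the representation of the light emission angle as a 3D direction vector, and the function names. The source does not specify how emission angles are encoded.

```python
import math


def ray_plane_hit(origin, direction, height):
    """Intersect a main ray with the horizontal plane y = height.

    origin and direction are (x, y, z) tuples. Returns the (x, z)
    coordinates of the intersection, or None if the ray is parallel to
    the plane or points away from it.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    if dy == 0:
        return None
    t = (height - oy) / dy
    if t < 0:
        return None
    return (ox + t * dx, oz + t * dz)


def assign_viewpoints(points, height, viewpoints):
    """For each screen point, pick the viewpoint nearest to its intersection.

    points: list of (origin, direction) pairs, one per screen point.
    viewpoints: list of (x, z) viewpoint positions in the horizontal plane.
    Returns a list of viewpoint indices (None where no intersection exists).
    """
    result = []
    for origin, direction in points:
        hit = ray_plane_hit(origin, direction, height)
        if hit is None:
            result.append(None)
            continue
        nearest = min(range(len(viewpoints)),
                      key=lambda i: math.dist(hit, viewpoints[i]))
        result.append(nearest)
    return result
```

Running `assign_viewpoints` once per preset height would yield the per-height viewpoint mappings that are later selected by viewing height.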

[0017] In one possible implementation, the method further includes: adjusting the brightness and skin tone of the displayed digital human based on the detected brightness and color temperature of ambient light.

[0018] In one possible implementation, the screen of the 3D display device has a border, and the displayed digital human has edges that match the attributes of the border; the attributes of the border include at least one of color and shape.

[0019] In a second aspect, embodiments of this application provide a digital human display control apparatus applied to a three-dimensional display device. The apparatus includes: a data acquisition module for acquiring image data of a digital human and its associated features; a rendering table determination module for determining a multi-view rendering table corresponding to the digital human based on the associated features of the image data, the multi-view rendering table corresponding to the digital human being used to characterize the mapping relationship between each point on the screen of the three-dimensional display device and the pixels in the image data of the digital human; and a mapping module for mapping the image data to the screen based on the multi-view rendering table corresponding to the digital human, so as to adjust the digital human displayed by the three-dimensional display device.

[0020] In a third aspect, embodiments of this application provide a three-dimensional display device, including: a three-dimensional display, a memory, and a processor; the memory stores computer-executable instructions; the processor executes the computer-executable instructions stored in the memory, causing the processor to perform the method provided in the first aspect above and/or in the various possible implementations of the first aspect.

[0021] In a fourth aspect, embodiments of this application provide a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method provided in the first aspect above and/or in the various possible implementations of the first aspect.

[0022] In a fifth aspect, embodiments of this application provide a computer program product, including a computer program that, when executed by a processor, implements the method provided in the first aspect above and/or in the various possible implementations of the first aspect.

[0023] The digital human display control method, apparatus, device, storage medium, and program product provided in this application target scenarios where digital humans are displayed through a 3D display device. Based on the associated features of the image data used to drive the digital human, a multi-view rendering table corresponding to the digital human is determined. The multi-view rendering table is a pre-generated mapping relationship between pixels in the image driving the digital human and points on the screen of the 3D display device. Based on this multi-view rendering table, image data is mapped to the screen of the 3D display device, thereby displaying a new frame of the digital human or adjusting the shape of the digital human. Using the multi-view rendering table for mapping improves the accuracy of the pixel value determined for each point on the screen, improves the accuracy of the parallax image seen by the user, reduces distortion during the transformation of the 3D digital human, and enhances the stereoscopic effect of the digital human seen by the user.

Brief Description of the Drawings

[0024] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application.

[0025] Figure 1 is a flowchart illustrating a digital human display control method provided in an embodiment of this application;

[0026] Figure 2 is a schematic diagram of the display area adjustment process provided in an embodiment of this application;

[0027] Figure 3 is a schematic diagram of the digital human pose feature determination process provided in an embodiment of this application;

[0028] Figure 4 is a flowchart illustrating another digital human display control method provided in an embodiment of this application;

[0029] Figure 5 is a schematic diagram of the multi-view rendering table lookup process provided in an embodiment of this application;

[0030] Figure 6 is a schematic diagram of a digital human with edges matching the screen border provided in an embodiment of this application;

[0031] Figure 7 is a flowchart illustrating the multi-view rendering table generation method provided in an embodiment of this application;

[0032] Figure 8 is a schematic diagram of the digital human template data in the embodiment shown in Figure 7 of this application;

[0033] Figure 9 is a schematic diagram of the viewpoints to which the points on the screen belong at different heights according to the embodiments of this application;

[0034] Figure 10 is a structural schematic diagram of a three-dimensional display device provided in an embodiment of this application.

[0035] The accompanying drawings illustrate specific embodiments of this application, which will be described in more detail below. These drawings and descriptions are not intended to limit the scope of the concept in any way, but rather to illustrate the concept of this application to those skilled in the art through reference to particular embodiments.

Detailed Description

[0036] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.

[0037] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, stored data, displayed data, etc.) involved in one or more embodiments of this specification are all information and data authorized by the user or fully authorized by all parties. Furthermore, the collection, use and processing of related data must comply with relevant laws, regulations and standards, and corresponding operation entry points are provided for users to choose to authorize or refuse.

[0038] Three-dimensional display devices, also known as 3D display devices, do not require auxiliary equipment such as 3D glasses. Instead, they use the principle of parallax to allow viewers to see stereoscopic images, making them convenient and immersive.

[0039] 3D display devices can be used as standalone devices or as displays for any type of device, such as robots or head-mounted displays (HMDs).

[0040] In the field of human-computer interaction, digital humans can be used to enhance the interactive experience. However, existing digital human display strategies are designed for two-dimensional display devices. They use collected data to drive the digital human, obtain a digital human image, and display it on the two-dimensional display device. For three-dimensional display devices, the digital human image is often simply treated as a regular image. By using the display parameters of the three-dimensional display device, the digital human image is processed into a parallax image, thereby forming a three-dimensional image in the user's mind.

[0041] This method of displaying 3D digital humans is only suitable for static scenes, where the digital human's posture remains unchanged and only the mouth or eyes change while the head remains stationary. It cannot be applied to more dynamic digital human display scenarios. When the digital human's head moves, the default mapping method will cause the displayed 3D digital human to be distorted, resulting in a poor user experience.

[0042] To improve the display effect of 3D digital humans, this application provides a digital human display control method that selects a suitable multi-view rendering table for the digital human based on the associated features of the image data and the display scene. The facial data used to drive the digital human is thereby mapped more accurately to the screen of the 3D display device, improving the display effect of the 3D digital human and reducing the probability that it is distorted as it changes.

[0043] Figure 1 is a flowchart illustrating a digital human display control method according to an embodiment of this application. The digital human display control method provided in this embodiment is applied to a three-dimensional display device. As shown in Figure 1, the digital human display control method includes the following steps:

[0044] Step S101: Obtain image data of the digital human and its associated features.

[0045] The image data of a digital human can be image data of the user whom the digital human imitates, mainly facial images or data describing facial changes.

[0046] The image data of a digital human can be generated by the 3D display device, or it can be generated by another device, such as a cloud server, and sent to the 3D display device.

[0047] The associated features of the image data are features of the displayed digital human, or of the user viewing the digital human, at the time the digital human is driven based on the image data, that is, when the adjusted digital human is displayed.

[0048] The image data of a digital human can be sent to a 3D display device from other devices, or it can be generated by the 3D display device itself.

[0049] For situations where 3D display devices generate image data for digital humans, the devices can generate the image data of the digital human based on the facial data of the user corresponding to the digital human. For example, by using a pre-trained digital human driving model, an image of the digital human can be generated based on the facial data of the user, thus obtaining the image data of the digital human.

[0050] When generating image data of a digital human based on the facial data of the user corresponding to the digital human, the matching facial image can be selected from multiple stored facial images of the digital human using the facial data of the user corresponding to the digital human, and the data of the matching facial image can be extracted to obtain the image data of the digital human.

[0051] To reduce the memory occupied by digital human facial images, the face can be split, such as splitting it vertically or according to different parts, and then the images of each split part in different shapes can be stored. The split parts are matched with facial data to obtain the corresponding part images. By stitching the images of each part together, a complete facial image is obtained.

[0052] For example, the face can be divided into an upper half and a lower half, with the upper half including the eyes and the lower half including the mouth.
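The split-and-stitch storage scheme can be illustrated with a small sketch. Everything here is hypothetical and simplified: images are nested lists of pixel rows, the face is split into an upper and a lower half, and the stored parts are keyed by made-up state labels.

```python
def assemble_face(upper_bank, lower_bank, eye_state, mouth_state):
    """Stitch a stored upper-half image and lower-half image into one face.

    upper_bank / lower_bank map a state label to an image, where an image
    is a list of pixel rows. Stacking the rows of the two halves
    vertically gives the complete facial image.
    """
    upper = upper_bank[eye_state]
    lower = lower_bank[mouth_state]
    return upper + lower


# Two upper halves and two lower halves cover four distinct faces,
# so storage grows with the sum of the part counts, not their product.
upper_bank = {"eyes_open": [[1, 1]], "eyes_closed": [[0, 0]]}
lower_bank = {"mouth_a": [[5, 5]], "mouth_m": [[7, 7]]}
face = assemble_face(upper_bank, lower_bank, "eyes_closed", "mouth_a")
```

In practice the matching step would compare the user's facial data against the stored parts instead of using fixed labels; the labels only stand in for that matching.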

[0053] Optionally, generating the facial image data of the digital human based on the facial data of the user corresponding to the digital human includes: based on the facial data, determining a target lip shape image and a target facial image from multiple stored lip shape images and multiple stored facial images of the digital human, wherein the lip region of each stored facial image shows a default lip shape or is filled with a default color; and fusing the target lip shape image and the target facial image to obtain the facial image data of the digital human.

[0054] Specifically, the target facial image can be determined from multiple facial images of a digital human based on eye data in the facial data, and the target lip image can be determined from multiple lip images of the digital human corresponding to the target user based on mouth data in the facial data.

[0055] To fuse the target lip shape image and the target facial image, the two are stitched together: the target lip shape image is stitched onto the lip region of the target facial image.
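As a minimal sketch of this stitching step, the following function pastes a lip image over a rectangular lip region of a facial image. The rectangular region, the `top`/`left` parameters, and the nested-list image representation are assumptions for illustration; the source does not describe the region shape or any blending at the seam.

```python
def fuse_lip(face, lip, top, left):
    """Paste the lip image onto the face at row `top`, column `left`.

    face and lip are lists of pixel rows. The input face is left
    unmodified; a fused copy is returned.
    """
    fused = [row[:] for row in face]  # copy so the stored face is untouched
    for r, lip_row in enumerate(lip):
        fused[top + r][left:left + len(lip_row)] = lip_row
    return fused


# 4x4 face whose lip region holds a default value (0), and a 2x2 lip patch.
face = [[0] * 4 for _ in range(4)]
lip = [[1, 1], [1, 1]]
fused = fuse_lip(face, lip, top=2, left=1)
```

Copying the face before writing keeps the stored facial image reusable for the next frame.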

[0056] Facial images of the target user under various facial expressions can be generated using a large expression-driving model. The mouth region of each generated facial image (which may include only the mouth, or both the mouth and the nose) can then be set to the default color or replaced with an image of the default mouth shape. The processed facial images can then be stored on the 3D display device for later use.

[0057] Optionally, generating the facial image data of the digital human based on the facial data of the user corresponding to the digital human includes: determining the gaze direction based on the eye data in the facial data; determining, based on the gaze direction, a target eye image from multiple stored eye images of the digital human; generating a lip image of the digital human based on the mouth data in the facial data and a pre-trained lip-shape driving model; and obtaining the facial image data of the digital human based on the target eye image and the generated lip image.

[0058] The associated features of the image data of a digital human may include internal features extracted from facial data, or external features. Internal features include characteristics of the user or digital human described in the image data, such as head orientation, while external features may be characteristics of the user viewing the digital human, and may include the location of the user viewing the digital human.

[0059] In some embodiments, the display area of the digital human is only a portion of the screen of the three-dimensional display device, and the display area can vary. Therefore, the associated features of the image data also include the current display position of the digital human.

[0060] Step S102: Based on the associated features of the image data, determine the multi-view rendering table corresponding to the digital human.

[0061] The multi-view rendering table corresponding to the digital human is used to characterize the mapping relationship between each point on the screen of the three-dimensional display device and the pixel points in the image data of the digital human.

[0062] After obtaining the image data of the digital human and its associated features, the associated features of the image data can be used as an index to determine the multi-view rendering table corresponding to the digital human from multiple stored multi-view rendering tables.

[0063] In some embodiments, the three-dimensional display device is used to display multiple digital humans. In that case, the identity of the digital human also serves as an index: from the multi-view rendering tables belonging to that digital human, the table that matches the associated features of its image data is selected as the table to be used, that is, the multi-view rendering table that maps the image data to the screen in step S103.
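The indexed lookup can be sketched as a dictionary keyed by quantized associated features. This is only an illustration under assumptions: the key layout (digital human identifier, head yaw bin, viewing height bin), the bin sizes, and the names `make_key` and `find_rendering_table` are all hypothetical and not taken from the application.

```python
def make_key(human_id, head_yaw_deg, viewing_height_m,
             yaw_step=15.0, height_step=0.25):
    """Quantize the associated features into a lookup key.

    The bin sizes are illustrative assumptions. The key holds integer bin
    indices so that it can be compared exactly.
    """
    yaw_bin = round(head_yaw_deg / yaw_step)
    height_bin = round(viewing_height_m / height_step)
    return (human_id, yaw_bin, height_bin)


def find_rendering_table(tables, human_id, head_yaw_deg, viewing_height_m):
    """Return the stored rendering table for the given features.

    Raises KeyError if no table was pre-generated for that combination.
    """
    return tables[make_key(human_id, head_yaw_deg, viewing_height_m)]


# Stored tables are represented by placeholder strings here.
tables = {
    ("avatar_a", 0, 6): "table_yaw0_h1.50",
    ("avatar_a", 1, 6): "table_yaw15_h1.50",
    ("avatar_b", 0, 6): "table_b_yaw0_h1.50",
}
table = find_rendering_table(tables, "avatar_a", 17.0, 1.6)
```

Coarser bins mean fewer stored tables but a larger mismatch between the actual features and those the table was generated for.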

[0064] Step S103: Based on the multi-view rendering table corresponding to the digital human, the image data is mapped to the screen to adjust the digital human displayed by the three-dimensional display device.

[0065] Specifically, the mapping relationship between each pixel in the image data and each point on the screen can be determined based on the mapping relationship in the multi-view rendering table corresponding to the digital human. Based on this mapping relationship and the pixel value of each pixel in the image data, the pixel value of each point on the screen can be determined. The screen can then be driven based on the pixel value of each point on the screen to achieve the display adjustment of the digital human.

[0066] Points on the screen are also called sub-pixels. To avoid confusion with pixels in an image, this application refers to sub-pixels on the screen as simply points on the screen.

[0067] The pixel value of each point on the screen can be the pixel value of the pixel in the image data that the point is mapped to.

[0068] To further improve the accuracy of the pixel value determined for each point on the screen, and thus the display effect of the 3D digital human, the pixel value of a point can be determined jointly from the pixel values of multiple pixels surrounding the pixel in the image data to which the point is mapped.

[0069] Optionally, mapping the image data to the screen based on the multi-view rendering table corresponding to the digital human includes: for each point on the screen located within the display area of the digital human, determining the pixel in the image data mapped to that point based on the multi-view rendering table corresponding to the digital human; determining the pixel value of that point based on the pixel values of multiple pixels within a preset range of the pixel mapped to that point; and driving the three-dimensional display device based on the pixel value of each point on its screen to display the adjusted digital human.

[0070] The preset range can correspond to a filtering window, such as a 2×2, 3×3, or other size window. Multiple pixels are sampled from the filtering window, and their average or weighted average is used as the pixel value of the corresponding point on the screen.

[0071] For example, the average pixel value of the four pixels adjacent to the mapped pixel (above, below, left, and right) can be used as the pixel value of the point on the screen of the three-dimensional display device to which that pixel is mapped.
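The window-based mapping can be sketched as follows, under simplifying assumptions: the image is a single-channel nested list, the rendering table is a dictionary from screen point to image pixel `(row, col)`, and the window is a square centered on the mapped pixel with an unweighted average. The names `sample_window` and `map_to_screen` are hypothetical; a four-neighbour or weighted average would be a variant of the same idea.

```python
def sample_window(image, row, col, radius=1):
    """Average the pixels in a square window centered on (row, col).

    radius=1 gives a 3x3 window. The window is clipped at the image
    borders, so edge pixels average over fewer samples.
    """
    height, width = len(image), len(image[0])
    total, count = 0.0, 0
    for r in range(max(0, row - radius), min(height, row + radius + 1)):
        for c in range(max(0, col - radius), min(width, col + radius + 1)):
            total += image[r][c]
            count += 1
    return total / count


def map_to_screen(image, rendering_table, radius=1):
    """Compute a value for every screen point listed in the rendering table.

    rendering_table: {screen_point: (row, col)} giving the image pixel
    that each screen point is mapped to.
    """
    return {point: sample_window(image, row, col, radius)
            for point, (row, col) in rendering_table.items()}


image = [
    [0, 10, 20],
    [30, 40, 50],
    [60, 70, 80],
]
rendering_table = {(0, 0): (1, 1), (0, 1): (0, 0)}
screen_values = map_to_screen(image, rendering_table)
```

With `radius=0` the window contains only the mapped pixel, which reduces to copying that pixel's value directly to the screen point.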

[0072] Based on the pixel value of each point on the screen, the 3D display device is driven so that each sub-pixel presents the corresponding color value, thereby displaying the adjusted digital human.

[0073] The digital human display control method provided in this embodiment targets scenarios where digital humans are displayed through a 3D display device. It determines a multi-view rendering table corresponding to the digital human based on the associated features of the image data used to drive the digital human. The multi-view rendering table is a pre-generated mapping relationship between pixels in the image driving the digital human and points on the screen of the 3D display device. Based on this multi-view rendering table, the image data is mapped to the screen of the 3D display device, thereby displaying a new frame of the digital human or adjusting the shape of the digital human. Using the multi-view rendering table for mapping improves the accuracy of the pixel value determined for each point on the screen, improves the accuracy of the parallax image seen by the user, reduces distortion during the transformation of the 3D digital human, and improves the stereoscopic effect of the digital human seen by the user.

[0074] Optionally, the associated features of the image data include at least one of the following: the viewing position of at least one user; the display area of the digital human on the screen; and the posture features of the digital human.

[0075] If the viewing position is one-dimensional, it represents the user's viewing height; if it is two-dimensional, it represents the user's position in the horizontal plane (with the screen oriented vertically).

[0076] The posture features of a digital human include head pose features, such as the angles of the head in each dimension, which describe the orientation of the head.

[0077] Users move relative to the 3D display device. If fixed parameters are used to render the digital human's image data, the default viewpoint is still used for 3D rendering after the user's viewpoint changes. The user's left and right eyes then see images that differ from those expected, the resulting parallax images no longer produce the intended 3D effect, and the 3D digital human appears distorted. Therefore, the user's viewing position should be used as an associated feature when the digital human's display area remains unchanged; the display area should be used as an associated feature when the user's viewing position is fixed but the display area changes; and both should be used when both can change. Determining the multi-view rendering table from these associated features ensures that, when the user's viewpoint changes, the corresponding multi-view rendering table is used for mapping, so that the images seen by the user's left and right eyes match expectations.

[0078] In some embodiments, the digital human's head does not keep the same posture. For example, the digital human may turn or tilt its head, which shifts the feature points in the digital human's image data and may also change the points on the screen to which the pixels in the image are mapped. Therefore, the digital human's posture features should also be used as associated features, so that a more accurate multi-view rendering table is determined for mapping the image data and the display quality of the 3D digital human is improved.

[0079] The three types of associated features above can be chosen according to the specific display scenario of the digital human. For example, when the relative position of the digital human and the viewing user is fixed, the posture features of the digital human alone can be used as the associated feature, which reduces the number of multi-view rendering tables. When both that relative position and the posture features of the digital human change, the viewing position, the display area, and the posture features all need to be used as associated features to ensure the stereoscopic effect of the displayed digital human.

[0080] By matching multi-view rendering tables to digital humans using multi-dimensional associated features, the accuracy of the rendering mapping can be further improved, thereby enhancing the quality of the displayed 3D digital human.

[0081] Optionally, the location of the digital human associated with the image data in the display area of the screen is determined based on a first input instruction, and/or based on the location of the at least one user.

[0082] The first instruction can be an instruction issued by a user interacting with or viewing the digital human, or an instruction issued by an authenticated user, such as a voice instruction, touch screen instruction, or button instruction. It can also be an instruction issued by other electronic devices interacting with the 3D display device.

[0083] For example, Figure 2 is a schematic diagram of the display area adjustment process provided in the embodiment of this application. As shown in Figure 2, the digital human is initially displayed in a default display area, such as the lower right corner of the 3D display device. A tall user may find it inconvenient to look down at the digital human during a conversation. The user can then issue a voice command to adjust the display area of the digital human, such as "raise the position a little higher". The 3D display device responds to the voice command and moves the display area upward, for example by adjusting the height in fixed steps. The adjusted display area is shown in the lower part of Figure 2.

[0084] In some embodiments, the first instruction from the user is usually vague, such as "Your position is too high", and does not specify the exact location of the display area. Therefore, after the first instruction is received, the position of at least one user, such as the user who issued the first instruction or the target user, also needs to be detected, and the accurate display area is determined based on that position.

[0085] In some embodiments, the user does not need to actively issue a first command; the 3D display device will automatically determine the current display area of the digital human based on the user's position detected actively or periodically.

[0086] When a 3D display device is deployed in a high place, such as on a wall at a street corner, the display area of the digital human can be determined based on the horizontal position of the user viewing the digital human, so that the digital human can follow the user's movement.

[0087] When a 3D display device is deployed within the height range of human eyes, the display area of the digital human can be determined based on the 3D position of the user viewing the digital human. While following the user's movement, the center of the digital human's display area is positioned at the same horizontal line or close to the height of the center of the user's eyes, so that the user can interact with the digital human in a head-up manner.

[0088] By supporting changes in the display area of the digital human, the agility and richness of the digital human's display control are further improved; by automatically adjusting the display area of the digital human based on the detected user position, the intelligence and accuracy of the digital human's display area adjustment are improved.
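The two deployment cases in [0086] and [0087] can be sketched as follows. This is a minimal illustration only: the `UserPosition` type, the `display_area_center` function, the clamping limits, and the convention that coordinates are measured in meters from the screen center are all assumptions made for the example, not details given in this application.

```python
from dataclasses import dataclass


@dataclass
class UserPosition:
    """Hypothetical detected user position, in meters, relative to the screen center."""
    x: float  # horizontal offset
    y: float  # eye height offset
    z: float  # distance from the screen


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def display_area_center(user, default_center, half_width, half_height, mounted_high):
    """Return the (x, y) center of the digital human's display area.

    half_width / half_height bound how far the center may move (assumed limits).
    mounted_high=True  -> follow only the user's horizontal position ([0086]).
    mounted_high=False -> also align the center with the user's eye height ([0087]).
    """
    center_x = clamp(user.x, -half_width, half_width)
    if mounted_high:
        center_y = default_center[1]  # keep the default vertical placement
    else:
        center_y = clamp(user.y, -half_height, half_height)
    return (center_x, center_y)


if __name__ == "__main__":
    user = UserPosition(x=0.3, y=0.1, z=1.5)
    print(display_area_center(user, (0.0, 0.0), 0.2, 0.15, mounted_high=True))
    print(display_area_center(user, (0.0, 0.0), 0.2, 0.15, mounted_high=False))
```

Clamping keeps the display area on the screen when the user stands beyond its edges; a real device would derive the limits from the screen size and the size of the display area.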

[0089] Optionally, the pose features of the digital human associated with the image data are determined based on a second input instruction, and / or based on features extracted from the image data and / or the angle of the digital human relative to the at least one user.

[0090] Postural features can specifically include features describing head posture (referred to as head posture) and features describing trunk posture (referred to as trunk posture), such as trunk orientation. Trunk posture is similar to head posture, only the body part it refers to is different. The following explanation will use head posture as an example. The corresponding solution for trunk posture can be obtained adaptively, which will not be discussed further.

[0091] The second instruction can be an instruction issued by a user interacting with or viewing the digital human, or an instruction issued by an authenticated user, such as a voice instruction, touch screen instruction, or button instruction. It can also be an instruction issued by other electronic devices that interact with the 3D display device.

[0092] For example, Figure 3 is a schematic diagram of the digital human posture feature determination process provided in the embodiment of this application. As shown in Figure 3, the head posture of the digital human is the default posture in the silent state, such as keeping the head naturally upright. If the user is located on the left side of the digital human, the user can issue a voice command to adjust the head posture of the digital human, such as "turn the head to the right". The three-dimensional display device responds to the voice command and adjusts the head posture of the digital human to the posture after the head is rotated to the right by a certain angle. This angle can be a default value, such as 5°, or it can be determined based on the angle of the user relative to the digital human, so that the head of the digital human faces the user.

[0093] In some embodiments, the second instruction from the user is usually vague, such as "turn around a bit," and does not provide specific adjustment parameters for the posture features. Therefore, after the second instruction is received, the angle of the digital human relative to at least one user, such as the user who issued the second instruction or the target user, also needs to be taken into account to determine the posture features of the digital human, for example so that the digital human faces that user.

[0094] Weights can be set for the second instruction, the features extracted from the image data, and the angle of the digital human relative to the at least one user. For example, the second instruction can be given the highest weight, or the weights can be adjusted for the specific scenario. When the second instruction is received, the pose features of the digital human are determined from these three inputs according to their weights, with the highest-weighted input taking precedence.

[0095] In some embodiments, the user does not need to actively issue a second command; the 3D display device will automatically determine the current posture characteristics of the digital human based on the user's gaze direction detected actively or periodically, features extracted from image data, etc.

[0096] Specifically, the head pose of the user corresponding to the digital human can be extracted from the image data. If the digital human is facing the target user in this head pose, or the relative angle between the digital human and the target user is small (less than or equal to a preset angle), then this head pose is taken as the head pose of the digital human. If the relative angle between the digital human and the target user in this head pose is large (greater than a preset angle), then the head pose of the digital human is determined based on the angle between the digital human and the target user. The target user can be the user (which can be one or more) captured by the image acquisition device of the 3D display device that is closest to the 3D display device, or the user facing the 3D display device.

[0097] The angle of the digital human relative to the target user is the angle between the digital human's gaze direction and the target user's gaze direction. The target user's gaze direction can be identified based on the acquired image of the target user.
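The threshold rule in [0096] can be sketched as below. The sketch reduces the head pose to a single yaw angle in degrees, and it assumes that "determined based on the angle between the digital human and the target user" means turning the head to face the user; the function name, parameter names, and the 10° default for the preset angle are hypothetical.

```python
def select_head_yaw(extracted_yaw_deg, facing_user_yaw_deg, preset_angle_deg=10.0):
    """Choose the head yaw of the digital human, in degrees.

    extracted_yaw_deg:   head yaw extracted from the image data.
    facing_user_yaw_deg: yaw at which the digital human would face the target user
                         (assumed to be derived from the detected user position).
    preset_angle_deg:    preset angle threshold (hypothetical default).
    """
    relative_angle = abs(extracted_yaw_deg - facing_user_yaw_deg)
    if relative_angle <= preset_angle_deg:
        # Small deviation: keep the pose extracted from the image data.
        return extracted_yaw_deg
    # Large deviation: re-orient the head toward the target user.
    return facing_user_yaw_deg


if __name__ == "__main__":
    print(select_head_yaw(5.0, 0.0))   # within the threshold, extracted pose is kept
    print(select_head_yaw(25.0, 0.0))  # beyond the threshold, head faces the user
```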

[0098] By supporting posture adjustment of the digital human, the flexibility of the displayed digital human is further improved, making the digital human more realistic and further enhancing the immersive experience of users interacting with the digital human.

[0099] Figure 4 is a flowchart illustrating another digital human display control method provided in this application embodiment. This embodiment is a further refinement of steps S102 and S103 based on the embodiment shown in Figure 1. As shown in Figure 4, the digital human display control method provided in this embodiment may specifically include the following steps:

[0100] Step S401: Obtain the image data of the digital human and its associated features.

[0101] Step S402: Based on the associated features of the image data, determine at least one search term.

[0102] Optionally, the at least one search term includes a first search term and a second search term, wherein the first search term is associated with the head pose of the digital human and the second search term is associated with the viewing height of the at least one user.

[0103] The associated features can be used directly as search terms to look up the multi-view rendering table.

[0104] For example, one can directly use a portion of the associated features as search terms, such as the pose features of a digital human, and calculate other search terms based on another portion of the associated features.

[0105] Data transformation or processing can be performed on each item in the associated features of image data to obtain the corresponding search items.

[0106] To improve the lookup speed of the multi-view rendering table, the search term can be a hash value, and different search terms can be calculated using the same or different hash functions.

[0107] Taking the target user's viewing position as an example, the value of the height dimension within the viewing position, i.e., the viewing height, can be used as the search term. Alternatively, the viewing height calculated based on the viewing position can be used as the search term. Furthermore, the viewing height can be approximated, such as by rounding or retaining two decimal places, and the processed viewing height can be used as the search term. The process for determining the search term corresponding to the head pose is similar to that for the viewing height and will not be elaborated further.
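As an illustration of [0106] and [0107], the sketch below derives a search term from a viewing height by rounding and, optionally, hashes it. The choice of SHA-256 truncated to 16 hexadecimal characters is an assumption made for the example; the application does not name a hash function, and the function names are hypothetical.

```python
import hashlib


def height_search_term(viewing_height_m, decimals=2):
    """Approximate the viewing height (meters) by rounding to a fixed number of decimals."""
    return round(viewing_height_m, decimals)


def hashed_search_term(value):
    """Return a short, deterministic hash of a search term.

    SHA-256 is an assumed choice; any stable hash function would serve.
    """
    digest = hashlib.sha256(repr(value).encode("utf-8")).hexdigest()
    return digest[:16]


if __name__ == "__main__":
    term = height_search_term(1.6349)
    print(term)
    print(hashed_search_term(term))
```

Rounding before hashing matters: two nearby measured heights only produce the same hash, and therefore hit the same rendering table, if they are first reduced to the same approximated value.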

[0108] Step S403: Based on the at least one search term, find the multi-view rendering table corresponding to the digital human from multiple multi-view rendering tables.

[0109] After at least one search term corresponding to the associated features of the digital human's image data has been determined, the search term or terms are used as an index into the stored multi-view rendering tables to find the multi-view rendering table corresponding to the digital human.

[0110] For example, Figure 5 is a schematic diagram of the multi-view rendering table lookup process provided in an embodiment of this application. As shown in Figure 5, the multi-view rendering tables stored in the 3D display device correspond to three dimensions of features: digital human, viewing height, and head posture. The multi-view rendering tables corresponding to digital human i are rendering tables i1 to iN, where i takes the value of a positive integer from 1 to M. Rendering table imn represents the multi-view rendering tables corresponding to digital human i, viewing height m, and head posture n. The value of m ranges from 1 to M1, and the value of n ranges from 1 to N1. Assuming the search term for the associated features is (1,1,2), which is the associated feature of the image data of digital human 1, where the viewing height and head posture are viewing height 1 and head posture 2 respectively, then rendering table 112 is determined as the search result, and rendering table 112 is used as the multi-view rendering table for adjusting the currently displayed digital human 1.
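The lookup in the Figure 5 example can be sketched with a dictionary keyed by the three search terms. The store layout, the function names, and the use of strings in place of real rendering tables are assumptions made for illustration.

```python
def build_table_store(num_humans, num_heights, num_poses):
    """Build a toy store keyed by (digital human i, viewing height m, head posture n).

    Each value is a label standing in for an actual multi-view rendering table.
    """
    return {
        (i, m, n): f"rendering table {i}{m}{n}"
        for i in range(1, num_humans + 1)
        for m in range(1, num_heights + 1)
        for n in range(1, num_poses + 1)
    }


def find_rendering_table(store, human_id, height_index, pose_index):
    """Return the table for the given search terms, or None if no table is stored."""
    return store.get((human_id, height_index, pose_index))


if __name__ == "__main__":
    store = build_table_store(num_humans=2, num_heights=3, num_poses=3)
    # Search term (1, 1, 2): digital human 1, viewing height 1, head posture 2.
    print(find_rendering_table(store, 1, 1, 2))
```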

[0111] Step S404: For each point on the screen located within the display area of the digital human, determine the pixel in the image data mapped to that point based on the multi-view rendering table corresponding to the digital human.

[0112] Step S405: Determine the pixel value of the point based on the pixel values of multiple pixels within a preset range of the pixel points mapped to the point.

[0113] Step S406: Based on the pixel values of each point in the three-dimensional display device, drive the three-dimensional display device to display the adjusted digital human.
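Steps S404 and S405 can be sketched as follows. The sketch assumes that the rendering table is a dictionary from screen points to source pixel coordinates, that the image is a single-channel list of rows, and that the "preset range" is a square window whose pixel values are averaged; the application does not specify how the neighboring pixel values are combined, so the averaging is an assumption.

```python
def sample_neighborhood(image, u, v, radius=1):
    """Average the pixel values within `radius` of pixel (u, v).

    image: list of rows of numeric pixel values; u is the column, v is the row.
    The window is clipped at the image borders.
    """
    height, width = len(image), len(image[0])
    total, count = 0.0, 0
    for row in range(max(0, v - radius), min(height, v + radius + 1)):
        for col in range(max(0, u - radius), min(width, u + radius + 1)):
            total += image[row][col]
            count += 1
    return total / count


def render_display_area(image, rendering_table, radius=1):
    """Map image data to screen points (S404) and compute their pixel values (S405).

    rendering_table: dict mapping a screen point (sx, sy) to a source pixel (u, v).
    """
    return {
        point: sample_neighborhood(image, u, v, radius)
        for point, (u, v) in rendering_table.items()
    }


if __name__ == "__main__":
    image = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
    table = {(5, 5): (1, 1), (5, 6): (0, 0)}
    print(render_display_area(image, table))
```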

[0114] In this embodiment, the search terms are obtained from the associated features, and the required multi-view rendering table is found among the multiple stored multi-view rendering tables. Because this lookup is fast, the multi-view rendering table is determined more quickly, which in turn speeds up updates of the 3D digital human display.

[0115] Optionally, the digital human display control method further includes adjusting the brightness and skin tone of the displayed digital human based on the detected ambient light brightness and color temperature.

[0116] The 3D display device may include an ambient light sensor (ALS) to detect the brightness and color temperature of ambient light.

[0117] The brightness and color temperature of ambient light can also be obtained through image recognition. By acquiring images of the external environment and using image recognition algorithms, the brightness and color temperature of the ambient light in the external environment can be extracted.

[0118] Each time the displayed digital human is adjusted, the brightness and skin tone of the digital human are adjusted based on the brightness and color temperature of the currently detected ambient light, so that the displayed digital human blends with the environment, more closely resembles the state of human skin under ambient light, and further improves the realism of the digital human.

[0119] Optionally, the screen of the 3D display device has a border, and the displayed digital human has edges that match the attributes of the border; the attributes of the border include at least one of color and shape.

[0120] The color of the displayed digital human's edge matches the color of the border; it can be the same color or a gradient of the same color family.

[0121] Taking a black border as an example, the edge of the displayed digital human can be a black edge or a black gradient edge. For a black gradient edge, the pixel value of the sub-pixels immediately adjacent to the border is black, and the pixel values of the subsequent sub-pixels can gradually change in a certain pattern to present a gradient black edge. For example, the pixel value of the sub-pixels immediately adjacent to the digital human can be dark gray, such as an RGB value of (30,30,30).
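One possible gradient for the black edge in [0121] is a linear ramp from black at the border to dark gray next to the digital human. The linear pattern and the function name are assumptions; the application only requires that the values change gradually "in a certain pattern".

```python
def gradient_edge(num_steps, inner_rgb=(30, 30, 30), border_rgb=(0, 0, 0)):
    """Return RGB values for an edge that fades from the border color to the inner color.

    Index 0 is adjacent to the screen border; the last index is adjacent to the
    digital human. A linear ramp is used here as one example pattern.
    """
    if num_steps < 1:
        return []
    if num_steps == 1:
        return [border_rgb]
    ramp = []
    for k in range(num_steps):
        t = k / (num_steps - 1)  # 0.0 at the border, 1.0 next to the digital human
        ramp.append(tuple(round(b + t * (i - b)) for b, i in zip(border_rgb, inner_rgb)))
    return ramp


if __name__ == "__main__":
    for rgb in gradient_edge(4):
        print(rgb)
```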

[0122] The shape of the digital human's edge matches the shape of the border. Specifically, the shape of the digital human's edge can be complementary to the shape of the border, so that the area on the screen used to display the digital human is a preset shape, such as a rectangle, an ellipse, or other shapes.

[0123] For example, Figure 6 is a schematic diagram of a digital human with an edge matching the screen border provided in an embodiment of this application. As shown in Figure 6, a three-dimensional display device is deployed on the robot's head. Its screen is a curved screen. The area where the screen and its border are located is region 60, and the area where the screen is located is region 61. The area obtained by subtracting region 61 from region 60 is the area where the screen border is located. Its color is black, and its shape is irregular. The shape of the screen border is shown in Figure 6. The display area of the digital human is a rectangular area in the center of the screen, namely region 62. The area of the screen excluding the display area of the digital human, namely the area obtained by subtracting region 62 from region 61, is the aforementioned edge. Its color is a black gradient, and its shape is complementary to the border.

[0124] Figure 7 is a flowchart illustrating the multi-view rendering table generation method provided in an embodiment of this application. This method is used to generate multi-view rendering tables in any of the aforementioned embodiments. As shown in Figure 7, the multi-view rendering table generation method includes:

[0125] Step S701: Obtain initial image data.

[0126] The initial image data can be any image data of the digital human that is used to generate the multi-view rendering table of the specified digital human, for example a frontal photo or a head image of the user corresponding to the digital human.

[0127] To generate multi-view rendering tables for different pose features or their corresponding search terms, initial images of the user under different pose features can be collected to obtain different initial image data. Through subsequent steps, multi-view rendering tables corresponding to different pose features can be generated.

[0128] In some embodiments, the initial image data can be the user image data input when generating the digital human corresponding to the user. That is, when generating the digital human corresponding to the user, it is necessary to generate one or more multi-view rendering tables corresponding to the digital human based on the user image.

[0129] Step S702: Extract feature points from the initial image data, and generate digital human template data corresponding to the initial viewpoint based on the extracted feature points.

[0130] Step S703: For viewpoints other than the initial viewpoint among multiple viewpoints, based on the positional relationship between the viewpoint and the initial viewpoint, the digital human template data corresponding to the initial viewpoint is transformed to obtain the digital human template data corresponding to the viewpoint.

[0131] The initial viewpoint is the viewpoint corresponding to the initial image data; for example, in a frontal shot, the viewpoint is 0°. The digital human template data consists of multiple meshes with feature points as vertices, and the meshes do not overlap.

[0132] Any feature extraction algorithm can be used to extract feature points from the initial image; this application does not impose any restrictions on this.

[0133] Taking the initial image data as head image data as an example, the image coordinates and pixel values of each feature point in the head can be obtained through feature extraction.

[0134] After obtaining multiple feature points from the image data, any meshing algorithm can be used to obtain a mesh of a preset shape, such as a triangular mesh or a rectangular mesh, using the feature points as vertices. Taking a triangular mesh as an example, it can be obtained by triangulating the feature points.

[0135] For example, the digital human template data includes multiple triangular grids covering the face. Figure 8 is a schematic diagram of the digital human template data in the embodiment shown in Figure 7 of this application. Figure 8 only shows a part of the facial area, specifically the eye area. As shown in Figure 8, multiple feature points on the face corresponding to the initial image data are obtained through feature extraction. Connecting three adjacent feature points with straight lines yields a triangle. Traversing each feature point yields a digital human template. The data describing the digital human template is called digital human template data, which may include attributes such as the coordinates, normals, and pixel values of the feature points that make up each triangle.

[0136] In some embodiments, after obtaining the feature points of the initial image data, it is also necessary to adjust the coordinates of the feature points, including translation transformation, magnification, etc.

[0137] After obtaining the 3D coordinates (in pixels) of the feature points, such as (x0, y0, z0), a translation transformation is performed on (x0, y0, z0) to obtain the coordinates in a 3D coordinate system with the origin at the center of the screen surface. User edges, such as face edges, are identified in the initial image and used as image boundaries. The initial image is then magnified to obtain the magnified coordinates of each feature point, such as (x1, y1, z1). (x1, y1, z1) is then transformed to the world coordinate system to obtain the world coordinates of the feature points. The world coordinate system can be defined with its origin at a distance of d centimeters outward from the center of the screen surface, where d is the viewing distance of the 3D display device. The units of the world coordinates also need to be converted to millimeters. This can be done by using the user's interpupillary distance to determine the physical length corresponding to a single pixel. Based on the Euclidean distance r of the pupil feature points and the raster period p of the 3D display device, the units of the world coordinates are converted to millimeters, resulting in coordinates (x1*p / r, y1*p / r, z1*p / r+d), denoted as (x2, y2, z2). When calculating (x2,y2,z2), the intrinsic parameters of the camera that took the initial image or more precise depth information can also be considered.
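The unit conversion at the end of [0137], from magnified pixel coordinates (x1, y1, z1) to (x2, y2, z2) = (x1*p / r, y1*p / r, z1*p / r + d), can be sketched as below. The function names are hypothetical, and the sketch assumes that p and d are already expressed in the same physical unit as the desired result.

```python
import math


def pupil_distance_px(left_pupil, right_pupil):
    """Euclidean distance r between the two pupil feature points, in pixels."""
    return math.dist(left_pupil, right_pupil)


def to_world_coordinates(point_px, r, p, d):
    """Apply (x2, y2, z2) = (x1*p/r, y1*p/r, z1*p/r + d) from [0137].

    point_px: magnified feature point coordinates (x1, y1, z1), in pixels.
    r: pupil distance in pixels; p and d: as defined in [0137], assumed here to
    share one physical unit so that the result is in that unit.
    """
    x1, y1, z1 = point_px
    scale = p / r  # physical length represented by one pixel
    return (x1 * scale, y1 * scale, z1 * scale + d)


if __name__ == "__main__":
    r = pupil_distance_px((100.0, 200.0, 0.0), (160.0, 200.0, 0.0))  # 60 pixels
    print(to_world_coordinates((30.0, -15.0, 6.0), r, p=30.0, d=500.0))
```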

[0138] Based on (x2, y2, z2), the camera coordinates of each feature point are determined in a new camera coordinate system with the origin at the viewing position corresponding to the initial viewpoint. Through perspective transformation, the map coordinates of each feature point are obtained. Through orthogonal projection, the final coordinates of the feature points, such as (x3, y3), can be obtained, which are the same as the screen coordinates set in the 3D display.

[0139] If the 3D display device has only one viewpoint, namely the initial viewpoint, then step S703 is not required and step S704 is executed directly. Based on the digital human template data corresponding to the initial viewpoint, as well as the screen coordinates and viewpoint of each point on the screen, the rendering table corresponding to the initial viewpoint is obtained, that is, the mapping relationship between each point on the screen and the pixels in the initial image data, with the viewpoint being the initial viewpoint.

[0140] If there are multiple viewpoints in a 3D display device, then for viewpoints other than the initial viewpoint (denoted as other viewpoints), it is necessary to use the positional relationship between the other viewpoints and the initial viewpoint to convert the digital human template data corresponding to the initial viewpoint to obtain the digital human template data corresponding to that viewpoint.

[0141] Specifically, for any other viewpoint, the coordinates of each feature point in the digital human template data corresponding to the initial viewpoint can be transformed based on the transformation relationship between the camera coordinate system corresponding to the other viewpoint and the camera coordinate system corresponding to the initial viewpoint, so as to obtain the digital human template data corresponding to the other viewpoint.

[0142] Assume that another viewpoint deviates to the left by β degrees relative to the initial viewpoint. For a feature point with coordinates $(x_2, y_2, z_2)$ in the digital human template data corresponding to the initial viewpoint, the coordinates of the same feature point in the digital human template data corresponding to the other viewpoint can be expressed as $\left(\cos\!\left(\beta+\arctan\frac{z_2}{x_2}\right)\sqrt{x_2^{2}+z_2^{2}},\; y_2,\; \sin\!\left(\beta+\arctan\frac{z_2}{x_2}\right)\sqrt{x_2^{2}+z_2^{2}}\right)$. Based on this relationship, and the perspective transformation matrix and orthographic projection matrix between image coordinates and camera coordinates, the digital human template data corresponding to the initial viewpoint can be transformed into digital human template data corresponding to other viewpoints.
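The expression in [0142] is a rotation of the point by β about the y-axis, written in polar form in the x-z plane. The sketch below implements it and compares it with the equivalent rotation-matrix form. It uses `math.atan2(z2, x2)` in place of arctan(z2 / x2), which is a deliberate substitution: the two agree for x2 > 0, and `atan2` also handles x2 ≤ 0. The function names are hypothetical.

```python
import math


def rotate_to_viewpoint(point, beta_deg):
    """Polar form: keep the x-z radius and add beta to the angle arctan(z2 / x2)."""
    x2, y2, z2 = point
    beta = math.radians(beta_deg)
    radius = math.hypot(x2, z2)   # sqrt(x2^2 + z2^2)
    alpha = math.atan2(z2, x2)    # equals arctan(z2 / x2) when x2 > 0
    return (math.cos(beta + alpha) * radius, y2, math.sin(beta + alpha) * radius)


def rotate_y_matrix(point, beta_deg):
    """Equivalent matrix form: x' = x cos(b) - z sin(b), z' = x sin(b) + z cos(b)."""
    x2, y2, z2 = point
    beta = math.radians(beta_deg)
    return (
        x2 * math.cos(beta) - z2 * math.sin(beta),
        y2,
        x2 * math.sin(beta) + z2 * math.cos(beta),
    )


if __name__ == "__main__":
    p = (3.0, 2.0, 4.0)
    print(rotate_to_viewpoint(p, 15.0))
    print(rotate_y_matrix(p, 15.0))
```

The y-coordinate is unchanged and the distance from the y-axis is preserved, which is why only the depth ordering of grids, not their size, needs extra handling after the transformation.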

[0143] Although the grids in the digital human template data corresponding to the initial viewpoint are non-overlapping, for other viewpoints, after the aforementioned transformation, grid overlap may occur. Therefore, when saving the digital human template data corresponding to each viewpoint, it is also necessary to save the depth information of each feature point, such as the z-axis coordinate, in order to determine the front-to-back relationship of the overlapping grids.

[0144] Step S704: Based on the digital human template data corresponding to each viewpoint in the multiple viewpoints, and the screen coordinates and viewpoints of each point on the screen, determine the pixel points in the initial image data mapped to each point on the screen, and obtain a multi-viewpoint rendering table corresponding to the digital human.

[0145] The viewpoint to which each point on the screen belongs can be obtained through a viewpoint mapping table.

[0146] For each point on the screen, the viewpoint to which the point belongs is determined by querying the viewpoint mapping table. If that viewpoint is not the initial viewpoint, then, based on the coordinates of the point, the grids that contain the point are determined from the digital human template data corresponding to that viewpoint, yielding multiple candidate grids. Based on the depth information of the vertices of the candidate grids, the grid with the smallest depth is selected as the target grid. The target grid maps to the grid with the same vertices in the digital human template data corresponding to the initial viewpoint; based on the coordinates of the vertices of that grid, the pixel in the initial image data mapped to the point, and its pixel value, are determined. By traversing each point on the screen, the pixels in the initial image data mapped to each point on the screen are obtained, and the multi-viewpoint rendering table is built from these mappings.

[0147] For a point on the screen whose viewpoint is the initial viewpoint, the pixel in the initial image data mapped to that point can be determined directly based on the coordinates of that point.

[0148] The barycentric coordinate method can be used to determine the grid in which the point lies.

[0149] Weighting information can be determined based on the coordinates of a point and the coordinates of each vertex of the target grid corresponding to the point. Using the weighting information, the coordinates of the vertices of the corresponding grid in the digital human template data for the initial viewpoint (the grid to which the target grid maps) are weighted to obtain the coordinates of the pixel mapped to the point.

[0150] By utilizing the distribution of digital human feature points from different viewpoints, a digital human template is constructed. Using the positional relationship between the points on the screen and the grids in the digital human template, the mapping between the points on the screen and the pixels in the initial image data is established, which improves the accuracy of the mapping and thus enhances the stereoscopic effect of the digital human seen from different viewpoints.
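The per-point mapping in step S704 can be sketched for triangular grids as follows. The sketch assumes that the weighting information is the set of barycentric weights of the point within a triangle, and that the depth of a candidate triangle is compared at the point itself by interpolating the vertex depths; the data layout (dictionaries with `view`, `depth`, and `initial` keys) and the function names are hypothetical.

```python
def barycentric_weights(p, a, b, c):
    """Return the barycentric weights of 2D point p in triangle (a, b, c), or None if degenerate."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        return None
    w_a = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w_b = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return (w_a, w_b, 1.0 - w_a - w_b)


def map_point(point, triangles):
    """Map a screen point to pixel coordinates in the initial image data.

    Each triangle is a dict with:
      "view":    three (x, y) vertices in the template of the point's viewpoint,
      "depth":   the depth (z) of those three vertices,
      "initial": the same three vertices in the initial-viewpoint template,
                 taken here to be pixel coordinates in the initial image.
    Returns None if no triangle contains the point.
    """
    best = None
    for tri in triangles:
        weights = barycentric_weights(point, *tri["view"])
        if weights is None or min(weights) < 0:
            continue  # degenerate triangle, or the point lies outside it
        depth = sum(w * z for w, z in zip(weights, tri["depth"]))
        if best is None or depth < best[0]:
            best = (depth, weights, tri)  # keep the candidate nearest to the viewer
    if best is None:
        return None
    _, weights, tri = best
    u = sum(w * vx for w, (vx, _) in zip(weights, tri["initial"]))
    v = sum(w * vy for w, (_, vy) in zip(weights, tri["initial"]))
    return (u, v)


if __name__ == "__main__":
    near = {"view": [(0, 0), (4, 0), (0, 4)], "depth": [1, 1, 1],
            "initial": [(10, 10), (14, 10), (10, 14)]}
    far = {"view": [(0, 0), (4, 0), (0, 4)], "depth": [5, 5, 5],
           "initial": [(100, 100), (104, 100), (100, 104)]}
    print(map_point((1, 1), [far, near]))
```

A point is inside a triangle exactly when all three weights are non-negative, so the same weights serve both the containment test of [0148] and the coordinate weighting of [0149].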

[0151] In addition to the offline generation of multi-view rendering tables provided in the aforementioned embodiments, an online generation method can also be used. That is, the initial image data can be the image data of the digital human provided in the aforementioned embodiments, thereby generating the multi-view rendering table corresponding to the digital human in real time. The specific generation process can be referred to the embodiment shown in Figure 7, and will not be repeated here.

[0152] To accommodate users of different heights, multi-view rendering tables corresponding to different heights can be generated. Because different user heights result in different viewing angles, the viewpoint to which each point on the screen belongs must be modified accordingly, after which the steps above are followed to generate the multi-view rendering tables corresponding to the different heights.

[0153] For each preset height, a viewpoint mapping relationship corresponding to that preset height can be established in advance. Then, by traversing each preset height and adjusting the viewpoint to which each point on the screen belongs in the aforementioned step S704, a multi-viewpoint rendering table corresponding to that preset height can be generated, thereby obtaining multi-viewpoint rendering tables with different posture features and different heights.

[0154] The preset heights can be obtained by taking values within a certain range at a certain step size. For example, within the range of 0.5 meters to 2 meters, 16 preset heights can be obtained with a step size of 0.1 meters.

[0155] Optionally, the multi-view rendering table generation method further includes: obtaining the viewing height of at least one user; determining a target preset height that matches the viewing height from multiple preset heights; and determining the viewpoint to which each point in the screen belongs based on the viewpoint mapping relationship corresponding to the target preset height.

[0156] The at least one user can be each user of the 3D display device, such as each authenticated user. Taking a home robot as an example, the at least one user is each family member.
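The preset heights of [0154] and the matching step of [0155] can be sketched as below. Choosing the nearest preset height as the "matching" one is an assumption, since the application does not define the matching rule; the function names are hypothetical.

```python
def preset_heights(low=0.5, high=2.0, step=0.1):
    """List the preset heights from low to high (inclusive) at the given step, in meters."""
    count = round((high - low) / step) + 1
    # Round each value to suppress floating-point accumulation error.
    return [round(low + k * step, 10) for k in range(count)]


def match_preset_height(viewing_height, heights):
    """Return the preset height closest to the viewing height (assumed matching rule)."""
    return min(heights, key=lambda h: abs(h - viewing_height))


if __name__ == "__main__":
    heights = preset_heights()
    print(len(heights))                        # number of preset heights
    print(match_preset_height(1.73, heights))  # target preset height for 1.73 m
```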

[0157] Optionally, the multi-viewpoint rendering table generation method further includes: for each of the plurality of preset heights, based on the light emission angle of each point on the screen, determining the intersection point of the main ray emitted by each point on the screen and the horizontal plane where the preset height is located, to obtain the intersection point corresponding to each point on the screen; determining the viewpoint with the closest horizontal distance to the corresponding intersection point as the viewpoint to which each point on the screen belongs.

[0158] Figure 9 is a schematic diagram of the viewpoints to which a point on the screen belongs at different heights, provided in the embodiments of this application. Figure 9 shows the world coordinate system xyz, with the screen center O as the origin. The initial observation position P1 is (xobs, 0, zobs), and the observation position P2 after moving vertically is (xobs, yobs, zobs). The propagation path of the principal ray emitted from a point (x0, y0, z0) on the screen of the 3D display device is shown as L90 in Figure 9. This propagation path is determined by the light emission angle of the point, which can be determined by the normal vector of the point and the tilt angle of the beam splitting unit. The intersection points of L90 with the horizontal planes of the observation positions (xobs, 0, zobs) and (xobs, yobs, zobs) are D91 and D92, respectively. The distances along the x-axis between each intersection point and each viewpoint (represented by a circle in Figure 9) in the corresponding horizontal plane can be calculated, and the viewpoint closest to the intersection point is taken as the viewpoint of the point. Thus, for observation position P1 the viewpoint of the point (x0, y0, z0) is viewpoint V1, and for observation position P2 it is viewpoint V2. It can be seen that the same point on the screen belongs to different viewpoints at different heights, i.e., at different y-axis coordinates.
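The viewpoint assignment described for Figure 9 can be sketched as a ray-plane intersection followed by a nearest-neighbor search along the x-axis. The sketch assumes that the principal ray is given as a direction vector in the world coordinate system (y vertical) and that the viewpoints in a horizontal plane are described by their x-coordinates; the function names and numeric values are hypothetical.

```python
def intersect_horizontal_plane(origin, direction, plane_y):
    """Intersect a ray with the horizontal plane y = plane_y.

    origin: the screen point (x0, y0, z0); direction: principal ray direction.
    Returns the intersection point, or None if the ray never reaches the plane.
    """
    x0, y0, z0 = origin
    dx, dy, dz = direction
    if dy == 0:
        return None  # ray is parallel to the plane
    t = (plane_y - y0) / dy
    if t < 0:
        return None  # plane lies behind the ray origin
    return (x0 + t * dx, plane_y, z0 + t * dz)


def nearest_viewpoint(intersection, viewpoint_xs):
    """Return the index of the viewpoint whose x-coordinate is closest to the intersection."""
    return min(range(len(viewpoint_xs)), key=lambda i: abs(viewpoint_xs[i] - intersection[0]))


if __name__ == "__main__":
    point = (0.0, -0.25, 0.0)
    ray = (0.125, 0.25, 1.0)
    viewpoints = [-0.125, 0.125, 0.25]
    for height in (0.0, 0.25):
        hit = intersect_horizontal_plane(point, ray, height)
        print(height, hit, nearest_viewpoint(hit, viewpoints))
```

With these example values the same screen point is assigned to a different viewpoint index at each of the two heights, which is the effect Figure 9 illustrates.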

[0159] By determining corresponding multi-view rendering tables for multiple preset heights, the system supports the selection of appropriate multi-view rendering tables for rendering or displaying digital humans based on the height (or eye height) of different users. This allows users of different heights to see a stereoscopic image of the digital human from the front, further improving the quality of digital human display.

[0160] Corresponding to the digital human display control method provided in the foregoing embodiments, this application also provides a digital human display control device applied to a three-dimensional display device. The digital human display control device includes: a data acquisition module for acquiring image data of a digital human and its associated features; a rendering table determination module for determining a multi-view rendering table corresponding to the digital human based on the associated features of the image data; the multi-view rendering table corresponding to the digital human is used to characterize the mapping relationship between each point on the screen of the three-dimensional display device and the pixels in the image data of the digital human; and a mapping module for mapping the image data to the screen based on the multi-view rendering table corresponding to the digital human, so as to adjust the digital human displayed by the three-dimensional display device.

[0161] Optionally, the rendering table determination module is specifically used to: determine at least one search term based on the associated features of the image data; and, based on the at least one search term, search for the multi-view rendering table corresponding to the digital human among multiple multi-view rendering tables.
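As an illustration of such a lookup, the following Python sketch derives two search terms, one from the head posture of the digital human and one from the viewing height of the user (the pairing named in claim 6), and uses them as a key into a collection of precomputed tables. Reducing the head posture to a yaw angle, the quantization steps, and all function names are assumptions made for this sketch only.

```python
from typing import Dict, Optional, Tuple

SearchKey = Tuple[int, int]


def build_search_key(head_yaw_deg: float, viewing_height_m: float,
                     yaw_step_deg: float = 15.0,
                     height_step_m: float = 0.1) -> SearchKey:
    """Quantize the associated features into a pair of search terms.

    The step sizes are illustrative; a real system would use whatever
    granularity its stored rendering tables were generated with.
    """
    yaw_term = round(head_yaw_deg / yaw_step_deg)
    height_term = round(viewing_height_m / height_step_m)
    return (yaw_term, height_term)


def find_rendering_table(tables: Dict[SearchKey, str],
                         head_yaw_deg: float,
                         viewing_height_m: float) -> Optional[str]:
    """Return the stored table for the given features, or None if absent."""
    return tables.get(build_search_key(head_yaw_deg, viewing_height_m))


# Table contents are represented by placeholder names here.
tables = {(1, 17): "table_A", (0, 17): "table_B"}
selected = find_rendering_table(tables, head_yaw_deg=20.0,
                                viewing_height_m=1.66)
```

A dictionary keyed by quantized terms keeps the search constant-time; how a missing key should be handled (for example, falling back to the nearest stored table) is left open here because the text does not specify it.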

[0162] Optionally, the mapping module is specifically used for: determining, for each point on the screen within the display area of the digital human, the pixel in the image data mapped to that point, based on the multi-view rendering table corresponding to the digital human; determining the pixel value of the point based on the pixel values of multiple pixels within a preset range of the pixel mapped to the point; and driving the three-dimensional display device to display the adjusted digital human based on the pixel values of each point in the three-dimensional display device.
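The mapping step can be sketched in Python as follows. The sketch assumes a single-channel image stored as a list of rows, a rendering table stored as a dictionary from screen points to image pixel coordinates, and a plain average over a square neighbourhood as the way of combining the pixels within the preset range. The text does not state how the pixel values are combined, so the averaging, the data layout, and the function names are all assumptions.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]


def neighbourhood_mean(image: List[List[float]], px: int, py: int,
                       radius: int) -> float:
    """Average the pixels within `radius` of (px, py), clipped to the image."""
    height, width = len(image), len(image[0])
    if not (0 <= px < width and 0 <= py < height):
        raise ValueError("mapped pixel lies outside the image")
    total, count = 0.0, 0
    for y in range(max(0, py - radius), min(height, py + radius + 1)):
        for x in range(max(0, px - radius), min(width, px + radius + 1)):
            total += image[y][x]
            count += 1
    return total / count


def map_to_screen(image: List[List[float]],
                  rendering_table: Dict[Point, Point],
                  radius: int = 1) -> Dict[Point, float]:
    """Compute a value for every screen point listed in the rendering table."""
    return {
        screen_point: neighbourhood_mean(image, px, py, radius)
        for screen_point, (px, py) in rendering_table.items()
    }


image = [[0.0, 10.0, 20.0],
         [30.0, 40.0, 50.0],
         [60.0, 70.0, 80.0]]
# Screen point (0, 0) maps to image pixel (1, 1); (1, 0) maps to (0, 0).
rendering_table = {(0, 0): (1, 1), (1, 0): (0, 0)}
screen_values = map_to_screen(image, rendering_table)
```

Combining several neighbouring pixels rather than copying a single one smooths the result when adjacent screen points map to widely separated image pixels; a weighted or bilinear scheme could be substituted for the plain mean.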

[0163] Optionally, the digital human display control device further includes a rendering table generation module, used for: acquiring initial image data; extracting feature points from the initial image data, and generating digital human template data corresponding to the initial viewpoint based on the extracted feature points; the digital human template data includes multiple grids with the feature points as vertices; for viewpoints other than the initial viewpoint among the multiple viewpoints, transforming the digital human template data corresponding to the initial viewpoint based on the positional relationship between the viewpoint and the initial viewpoint to obtain the digital human template data corresponding to the viewpoint; and determining the pixel points in the initial image data mapped to each point on the screen based on the digital human template data corresponding to each viewpoint among the multiple viewpoints, as well as the screen coordinates and the viewpoint to which each point belongs, to obtain a multi-view rendering table corresponding to the digital human.

[0164] Optionally, the digital human display control device further includes a viewpoint adjustment module, used to: acquire the viewing height of at least one user; determine a target preset height that matches the viewing height from a plurality of preset heights; and determine the viewpoint to which each point in the screen belongs based on the viewpoint mapping relationship corresponding to the target preset height.
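A minimal Python sketch of this height matching is given below. It assumes that "matching" means choosing the preset height numerically closest to the measured viewing height, and that each preset height has a stored mapping from screen points to viewpoint indices. Both assumptions, the units (metres), and the names are illustrative only.

```python
from typing import Dict, Sequence, Tuple

ScreenPoint = Tuple[int, int]
ViewpointMap = Dict[ScreenPoint, int]


def match_preset_height(viewing_height: float,
                        preset_heights: Sequence[float]) -> float:
    """Pick the preset height closest to the user's viewing height."""
    if not preset_heights:
        raise ValueError("at least one preset height is required")
    return min(preset_heights, key=lambda h: abs(h - viewing_height))


def viewpoints_for_user(viewing_height: float,
                        mappings: Dict[float, ViewpointMap]) -> ViewpointMap:
    """Return the screen-point-to-viewpoint mapping for the matched height."""
    target = match_preset_height(viewing_height, list(mappings))
    return mappings[target]


# Hypothetical mappings for two preset heights, in metres.
mappings = {
    1.5: {(0, 0): 1, (1, 0): 2},
    1.7: {(0, 0): 2, (1, 0): 3},
}
current = viewpoints_for_user(1.66, mappings)
```

Because the mappings are computed in advance for each preset height, switching users only requires a lookup rather than recomputing the ray intersections.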

[0165] Optionally, the digital human display control device further includes a viewpoint determination module, used to: for each of the plurality of preset heights, determine, based on the light emission angle of each point on the screen, the intersection point of the principal ray emitted by that point with the horizontal plane at the preset height, thereby obtaining the intersection point corresponding to each point on the screen; and determine, for each point on the screen, the viewpoint whose horizontal distance to the corresponding intersection point is smallest as the viewpoint to which that point belongs.

[0166] Optionally, the digital human display control device further includes a brightness and skin tone adjustment module for adjusting the brightness and skin tone of the displayed digital human based on the detected ambient light brightness and color temperature.

[0167] The digital human display control device provided in this embodiment can execute the digital human display control method provided in the above embodiment. Its implementation principle and technical effect are similar, and will not be described in detail here.

[0168] Figure 10 is a schematic diagram of the structure of a three-dimensional display device provided in an embodiment of this application. As shown in Figure 10, the three-dimensional display device provided in this embodiment includes: a processor 1001, a memory 1002, and a three-dimensional display 1005. The memory 1002 stores computer-executable instructions.

[0169] In a specific implementation, at least one processor 1001 executes the computer-executable instructions stored in the memory 1002, causing the at least one processor 1001 to perform the above-described method.

[0170] Optionally, the three-dimensional display device further includes a communication component 1003. The processor 1001, memory 1002, and communication component 1003 are connected via a bus 1004.

[0171] The specific implementation process of processor 1001 can be found in the above method embodiments, and its implementation principle and technical effect are similar. It will not be repeated here.

[0172] In the above embodiments, it should be understood that the processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), etc. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in this invention can be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules within the processor.

[0173] The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), such as at least one disk storage device.

[0174] The bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, etc. Buses can be categorized as address buses, data buses, control buses, etc. For ease of illustration, the buses shown in the accompanying drawings are not limited to a single bus or a single type of bus.

[0175] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the above-described method.

[0176] This application also provides a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, implement the above-described method.

[0177] The aforementioned readable storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk. The readable storage medium can be any available medium accessible to a general-purpose or special-purpose computer.

[0178] An exemplary readable storage medium is coupled to a processor, enabling the processor to read information from and write information to the readable storage medium. Of course, the readable storage medium can also be a component of the processor. The processor and the readable storage medium can reside in an Application Specific Integrated Circuit (ASIC). Alternatively, the processor and the readable storage medium can exist as discrete components in the device.

[0179] The division of units is merely a logical functional division; in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or other forms.

[0180] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0181] In addition, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.

[0182] If a function is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0183] Those skilled in the art will understand that all or part of the steps of the above-described method embodiments can be implemented by hardware related to program instructions. The aforementioned program can be stored in a computer-readable storage medium. When executed, the program performs the steps of the above-described method embodiments; and the aforementioned storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disks, or optical disks.

[0184] Finally, it should be noted that other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention that follow the general principles of the invention and include common knowledge or customary techniques in the art not disclosed herein, and is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of the invention is limited only by the appended claims.

Claims

1. A digital human display control method, characterized in that the method is applied to a three-dimensional display device and comprises: acquiring image data of a digital human and associated features of the image data; determining, based on the associated features of the image data, a multi-view rendering table corresponding to the digital human, wherein the multi-view rendering table corresponding to the digital human is used to characterize the mapping relationship between each point on the screen of the three-dimensional display device and the pixel points in the image data of the digital human; and mapping, based on the multi-view rendering table corresponding to the digital human, the image data to the screen, so as to adjust the digital human displayed by the three-dimensional display device.

2. The method of claim 1, wherein the associated features of the image data include at least one of the following: a viewing location of at least one user; a display area of the digital human on the screen; and posture features of the digital human.

3. The method of claim 2, wherein the display area of the digital human on the screen, associated with the image data, is determined based on a first input instruction and/or based on the viewing location of the at least one user.

4. The method of claim 2, wherein the posture features of the digital human associated with the image data are determined based on a second input instruction, and/or based on features extracted from the image data, and/or based on the angle of the digital human relative to the at least one user.

5. The method according to any one of claims 1 to 4, characterized in that determining, based on the associated features of the image data, the multi-view rendering table corresponding to the digital human comprises: determining at least one search term based on the associated features of the image data; and searching, based on the at least one search term, for the multi-view rendering table corresponding to the digital human among multiple multi-view rendering tables.

6. The method of claim 5, wherein the at least one search term includes a first search term and a second search term, the first search term being associated with the head posture of the digital human and the second search term being associated with the viewing height of the at least one user.

7. The method according to any one of claims 1 to 6, characterized in that mapping the image data to the screen based on the multi-view rendering table corresponding to the digital human comprises: for each point on the screen located within the display area of the digital human, determining the pixel point in the image data mapped to that point based on the multi-view rendering table corresponding to the digital human; determining the pixel value of the point based on the pixel values of multiple pixel points within a preset range of the pixel point mapped to that point; and driving, based on the pixel values of each point in the three-dimensional display device, the three-dimensional display device to display the adjusted digital human.

8. The method according to any one of claims 1 to 7, characterized in that the method further comprises: acquiring initial image data; extracting feature points from the initial image data, and generating digital human template data corresponding to an initial viewpoint based on the extracted feature points, wherein the digital human template data includes multiple grids with the feature points as vertices; for each viewpoint other than the initial viewpoint among multiple viewpoints, transforming the digital human template data corresponding to the initial viewpoint based on the positional relationship between that viewpoint and the initial viewpoint, to obtain the digital human template data corresponding to that viewpoint; and determining, based on the digital human template data corresponding to each of the multiple viewpoints, as well as the screen coordinates of each point on the screen and the viewpoint to which each point belongs, the pixel points in the initial image data mapped to each point on the screen, to obtain the multi-view rendering table corresponding to the digital human.

9. The method of claim 8, wherein the method further comprises: acquiring the viewing height of at least one user; determining, from a plurality of preset heights, a target preset height that matches the viewing height; and determining the viewpoint to which each point on the screen belongs based on the viewpoint mapping relationship corresponding to the target preset height.

10. The method of claim 9, wherein the method further comprises: for each of the plurality of preset heights, determining, based on the light emission angle of each point on the screen, the intersection point of the principal ray emitted from that point with the horizontal plane at the preset height, to obtain the intersection point corresponding to each point on the screen; and determining, for each point on the screen, the viewpoint whose horizontal distance to the corresponding intersection point is smallest as the viewpoint to which that point belongs.

11. The method according to any one of claims 1 to 10, characterized in that the method further comprises: adjusting the brightness and skin tone of the displayed digital human based on the detected ambient light brightness and color temperature.

12. The method according to any one of claims 1 to 11, characterized in that the screen of the three-dimensional display device has a border, and the displayed digital human has edges that match the attributes of the border, wherein the attributes of the border include at least one of color and shape.

13. A digital human display control apparatus, characterized in that the apparatus is applied to a three-dimensional display device and comprises: a data acquisition module, used to acquire image data of a digital human and associated features of the image data; a rendering table determination module, used to determine a multi-view rendering table corresponding to the digital human based on the associated features of the image data, wherein the multi-view rendering table corresponding to the digital human is used to characterize the mapping relationship between each point on the screen of the three-dimensional display device and the pixel points in the image data of the digital human; and a mapping module, used to map the image data to the screen based on the multi-view rendering table corresponding to the digital human, so as to adjust the digital human displayed by the three-dimensional display device.

14. A three-dimensional display device, characterized by comprising a three-dimensional display, a memory, and a processor, wherein the memory stores computer-executable instructions, and the processor executes the computer-executable instructions stored in the memory, causing the processor to perform the method according to any one of claims 1 to 12.

15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the method according to any one of claims 1 to 12.

16. A computer program product, characterized by comprising a computer program that, when executed by a processor, implements the method according to any one of claims 1 to 12.