Digital content display method, head-mounted display device and computer-readable medium

By acquiring object images from multiple perspectives to train a digital content display model, the problem of poor simulation effect and long time consumption in existing 3D digital object display technologies has been solved, realizing efficient and immersive 3D digital object display in head-mounted display devices.

WO2026139102A1 · PCT designated stage · Publication Date: 2026-07-02 · LINGBAN INTELLIGENT (HANGZHOU) INFORMATION TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: LINGBAN INTELLIGENT (HANGZHOU) INFORMATION TECHNOLOGY CO LTD
Filing Date: 2026-02-24
Publication Date: 2026-07-02

Abstract

Disclosed in the embodiments of the present disclosure are a digital content display method, a head-mounted display device and a computer-readable medium. A specific implementation manner of the method comprises: acquiring an object image set corresponding to a target object (201); on the basis of the object image set, performing model training on a pre-constructed digital content display model, so as to generate digital content display information (202); loading the digital content display information into a pre-created digital content editing space (203); and on the basis of the digital content display information in the digital content editing space, displaying, in a head-mounted display device, a projection picture of the target object that corresponds to target pose information (204). The implementation manner provides a digital content display manner, such that imaging of a 3D digital object in each dimension can be displayed at a higher fidelity under lower computing power overheads and time overheads, thereby improving the visual experience of a user.

Description

Digital content display method, head-mounted display device, and computer readable medium

[0001] Cross-reference to Related Applications

[0002] This application claims priority to Chinese Patent Application No. 202411932085.7, filed on December 25, 2024, with the Chinese Patent Office, the content of which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

[0003] Embodiments of the present disclosure relate to the field of computer technology, and specifically to a digital content display method, a head-mounted display device, and a computer readable medium.

BACKGROUND

[0004] Digital content refers to various three-dimensional digital objects or scenes displayed in virtual spaces such as augmented reality (AR), virtual reality (VR), or digital twin. A head-mounted display device can provide a more stereoscopic visual experience for a user by displaying three-dimensional digital content (e.g., 3D virtual objects) in a scene space. Currently, when displaying digital content, the head-mounted display device usually adopts the following way: pre-making 3D digital objects through CAD, or reconstructing 3D digital objects through multiple view stereo (MVS) technology or neural network technology, and displaying the obtained 3D digital objects.

[0005] However, when displaying digital content in the above-mentioned way, the following technical problems often exist: CAD models usually lack natural materials and light and shadow effects, the simulation effect is poor from the user's perspective, and the model making takes a long time. The digital objects constructed by the multiple view stereo reconstruction technology have poor precision, and when the multiple view resolution is low, the reconstructed digital objects contain a large amount of noise. In addition, the neural network model for reconstructing 3D digital object models has a large computational cost, and the reconstruction takes a long time.

[0006] The above information disclosed in this Background section is only for the purpose of enhancing the understanding of the background of the present inventive concepts, and therefore it may contain information that does not form prior art already known in this country to those skilled in the art.

SUMMARY

[0007] The Summary section is provided to introduce concepts briefly in a simplified form, which will be described in detail in the detailed description section below. The Summary section does not intend to identify key or essential features of the claimed technology nor is it intended to be used to limit the scope of the claimed technology.

[0008] Some embodiments of the present disclosure propose a digital content display method, a head-mounted display device and a computer readable medium to solve one or more of the technical problems mentioned in the background section.

[0009] In a first aspect, some embodiments of the present disclosure provide a digital content display method, comprising: obtaining a set of object images corresponding to a target object, wherein each object image in the set of object images is collected at a different viewing angle for the target object, and the target object is a three-dimensional object; performing model training on a pre-constructed digital content display model based on the set of object images to generate digital content display information; loading the digital content display information into a pre-created digital content editing space; and displaying a projection image of the target object corresponding to target pose information in a head-mounted display device according to the digital content display information in the digital content editing space, wherein the target pose information represents an observation pose of a target user, and the target user is a user wearing the head-mounted display device.

[0010] In an implementation, the model training on the pre-constructed digital content display model based on the set of object images to generate digital content display information comprises: training the digital content display model according to the set of object images; in response to determining that the trained digital content display model does not converge, performing the model training again; and in response to determining that the trained digital content display model converges, exporting model weight information of the trained digital content display model as the digital content display information.

[0011] In an implementation, before the loading of the digital content display information into the pre-created digital content editing space, the method further comprises: obtaining object display encryption information and encryption protocol information, wherein the object display encryption information and the encryption protocol information correspond to the target object; and performing encryption processing on the digital content display information according to the object display encryption information and the encryption protocol information to update the digital content display information.

[0012] In an implementation, the loading of the digital content display information into the pre-created digital content editing space comprises: loading a pre-created digital content editing space, wherein the digital content editing space is a scene space for adjusting a three-dimensional model of the target object represented by the digital content display information; and loading the digital content display information into the digital content editing space.

[0013] In an implementation, the displaying, in the head-mounted display device, of the projection picture of the target object corresponding to the target pose information according to the digital content display information in the digital content editing space comprises: in response to determining that a cloud rendering request issued by the head-mounted display device is received, performing the following steps: rendering the digital content display information according to the target pose information included in the cloud rendering request to generate the projection picture of the target object; and sending the projection picture to the head-mounted display device so that the head-mounted display device displays the projection picture; and in response to determining that a terminal rendering request issued by the head-mounted display device is received, synchronizing the digital content display information in the digital content editing space to the head-mounted display device so that the head-mounted display device displays the projection picture, wherein the projection picture displayed by the head-mounted display device is generated according to the received digital content display information.
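The two request types in this implementation can be sketched as a small server-side dispatcher. This is a minimal illustration only: the class names, the flat pose encoding, and the `render` callback are hypothetical and do not appear in the disclosure, which does not specify message formats or a renderer interface.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union

Pose = Sequence[float]  # hypothetical flat encoding of the target pose information


@dataclass
class CloudRenderRequest:
    """Request asking the server to render; carries the target pose (assumed shape)."""
    target_pose: Pose


@dataclass
class TerminalRenderRequest:
    """Request asking for the display information so the device renders locally."""


@dataclass
class ProjectionPicture:
    """Reply to a cloud rendering request: the rendered 2D picture."""
    pixels: bytes


@dataclass
class DisplayInfoSync:
    """Reply to a terminal rendering request: the display information itself."""
    display_info: bytes


def handle_request(
    request: Union[CloudRenderRequest, TerminalRenderRequest],
    display_info: bytes,
    render: Callable[[bytes, Pose], bytes],
) -> Union[ProjectionPicture, DisplayInfoSync]:
    """Dispatch on the request type as described in the paragraph above."""
    if isinstance(request, CloudRenderRequest):
        # Cloud rendering: render on the server for the requested pose, send the picture.
        return ProjectionPicture(render(display_info, request.target_pose))
    if isinstance(request, TerminalRenderRequest):
        # Terminal rendering: synchronize the display information to the device.
        return DisplayInfoSync(display_info)
    raise TypeError(f"unsupported request type: {type(request).__name__}")
```

In this sketch the choice between the two paths rests entirely with the head-mounted display device, which matches the text: the server only reacts to whichever request it receives.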

[0014] In a second aspect, some embodiments of the present disclosure provide a head-mounted display device, comprising: one or more processors; at least one optical module, the optical module comprising at least one display screen and optical elements, the at least one display screen being configured to display a scene space, a target object and / or digital content; and a storage device having one or more programs stored thereon.

[0015] When the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation of the first aspect.

[0016] In a third aspect, some embodiments of the present disclosure provide a computer-readable medium having stored thereon a computer program, wherein the program is executed by a processor to implement the method described in any implementation of the first aspect.

[0017] The above various embodiments of the present disclosure have the following beneficial effects: the digital content display method based on the head-mounted display device of some embodiments of the present disclosure can provide a digital content display method, which can display the imaging of a 3D digital object in each dimension with a higher restoration degree at a lower computational power and time cost, thereby improving the visual experience of the user. Specifically, the reason for the poor quality of the constructed digital object model or the large time and computational power cost of digital content reconstruction is that the CAD model usually lacks natural material and light and shadow effects, the simulation effect is poor from the perspective of the user, and the model making takes a long time. Moreover, the digital object constructed by using the multi-view stereo vision reconstruction technology has poor precision, and when the multi-view resolution is low, the reconstructed digital object contains a large amount of noise. Furthermore, the reconstruction of the 3D digital object model by using the neural network model has a large computational power cost and takes a long time. Based on this, the digital content display method based on the head-mounted display device of some embodiments of the present disclosure first acquires a set of object images corresponding to a target object. Each object image in the set of object images is collected at a different perspective of the target object. The target object is a three-dimensional object. Thus, by collecting object images at multiple perspectives, the visual features such as light and shadow, texture, and shape of the target object at different perspectives can be captured. Then, based on the set of object images, a pre-constructed digital content display model is trained to generate digital content display information. 
Thus, training the digital content display model using multi-perspective images can obtain neural network weights describing the performance of the target object at different perspectives. Furthermore, compared with the traditional multi-view reconstruction technology, the digital content display model can learn the complex light and shadow changes and object structure of the target object through a large amount of training, thereby generating a result with higher precision. Moreover, using the digital content display model for implicit modeling greatly reduces the model computational power and time cost compared with directly using the neural network model for three-dimensional reconstruction. Subsequently, the digital content display information is loaded into a pre-created digital content editing space. Thus, the user can intuitively preview and adjust the display effect of the digital object in the digital content editing space, thereby achieving convenient adjustment of the digital object. Finally, according to the digital content display information in the digital content editing space, a projection image of the target object corresponding to target pose information is displayed in the head-mounted display device. The target pose information represents the observation pose of the target user. The target user is the user wearing the head-mounted display device. Thus, by real-time rendering, when the user wearing the head-mounted display device observes the three-dimensional virtual object from different angles, a two-dimensional image projection matching the perspective can be obtained, thereby enhancing the immersion and interactivity of the user experience. 
Also, in the actual process of constructing a 3D object model, the various construction methods each have their own defects: for example, the constructed 3D object has poor structure, the construction complexity is high, or large scenes are not supported. However, for a 3D object model placed in the three-dimensional scene space, the wearing user still sees it as a 2D view; that is, the model can only be viewed from a single viewpoint at any given moment. Therefore, the excellent presentation capability of the neural network in the 2D view can be directly used to generate the two-dimensional projection of the 3D digital object in the current view of the wearing user. Thus, a display mode of the 3D digital object can be provided for the wearing user, and the imaging of the 3D digital object in each dimension is displayed with a high restoration degree at low computing power and time cost, thereby improving the visual viewing experience of the user.

BRIEF DESCRIPTION OF DRAWINGS

[0018] The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by describing in detail some embodiments thereof with reference to the attached drawings. The same or similar components have the same or similar reference labels. It should be understood that the drawings are schematic and elements and features are not necessarily to scale.

[0019] FIG. 1 is an architecture diagram of an exemplary system to which some embodiments of the present disclosure can be applied;

[0020] FIG. 2 is a flowchart of some embodiments of a digital content display method based on a head-mounted display device according to the present disclosure;

[0021] FIG. 3 is a flowchart of another embodiment of a digital content display method based on a head-mounted display device according to the present disclosure;

[0022] FIG. 4 is a schematic internal test screenshot of a digital content editing space in a digital content display method based on a head-mounted display device according to the present disclosure;

[0023] FIG. 5 is a schematic diagram of digital content display information encryption in a digital content display method based on a head-mounted display device according to the present disclosure;

[0024] FIG. 6 is a scene schematic diagram of a digital content display method based on a head-mounted display device according to the present disclosure;

[0025] FIG. 7 is a structural schematic diagram of an electronic device suitable for implementing some embodiments of the present disclosure.

DETAILED DESCRIPTION

[0026] Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although some embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure can be implemented in various forms, and should not be interpreted as being limited to the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure can be more thoroughly and completely understood. It should be understood that the drawings and embodiments of the present disclosure are for exemplary purposes only, and are not intended to limit the scope of protection of the present disclosure.

[0027] In addition, it needs to be noted that only parts related to the present application are shown in the drawings for the convenience of description. The embodiments in the present disclosure and the features in the embodiments can be combined with each other without conflict.

[0028] It should be noted that the concepts of "first", "second", and the like mentioned in the present disclosure are only used to distinguish different devices, modules or units, and are not used to limit the order or interdependence of the functions performed by these devices, modules or units.

[0029] It should be noted that the modifiers "one" and "multiple" mentioned in the present disclosure are illustrative rather than restrictive, and those skilled in the art should understand that, unless the context explicitly indicates otherwise, they should be understood as "one or more".

[0030] The names of the messages or information exchanged between the plurality of devices in the embodiments of the present disclosure are only for illustrative purposes, and are not used to limit the scope of the messages or information.

[0031] The present disclosure will be described in detail below with reference to the drawings and in conjunction with embodiments.

[0032] FIG. 1 shows an exemplary system architecture 100 of a digital content presentation method based on a head-mounted display device, to which some embodiments of the present disclosure can be applied.

[0033] As shown in FIG. 1, the system architecture 100 can include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 can include various connection types, such as wired, wireless communication links, or optical fiber cables, etc.

[0034] A user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages, etc. Various communication client applications can be installed on the terminal devices 101, 102, 103, such as web browser applications, program development applications, search applications, instant messaging tools, email clients, social platform software, etc. The terminal device 103 can be a head-mounted display device. Various screen projection applications can be installed on the terminal device 103.

[0035] The terminal devices 101, 102 can be hardware or software. When the terminal devices 101, 102 are hardware, they can be various electronic devices with a display screen and supporting information display, including but not limited to a smart phone, a tablet computer, an electronic book reader, a laptop computer, a desktop computer, and the like. When the terminal devices 101, 102 are software, they can be installed in the above-mentioned electronic devices. They can be implemented as a plurality of software or software modules for providing distributed services, or as a single software or software module. No specific limitation is made herein.

[0036] The server 105 can be a server providing various services, such as a background server supporting information displayed on the terminal devices 101, 102, 103. The background server can analyze and process received request data, and feed back the processing result to the terminal device.

[0037] It should be noted that the method for displaying digital content based on a head-mounted display device provided by the embodiments of the present disclosure can be executed by the terminal device 103.

[0038] It should be noted that the server can be hardware or software. When the server is hardware, it can be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When the server is software, it can be implemented as a plurality of software or software modules for providing distributed services, or as a single software or software module. No specific limitation is made herein.

[0039] With reference to FIG. 2, a flow 200 of some embodiments of the method for displaying digital content based on a head-mounted display device according to the present disclosure is shown. The method for displaying digital content based on a head-mounted display device includes the following steps:

[0040] In step 201, a set of object images corresponding to a target object is obtained.

[0041] In some embodiments, the executing entity of the digital content display method based on a head-mounted display device, such as the cloud server 105 shown in Figure 1, can acquire a set of object images corresponding to the target object. Each object image in the set is captured from a different viewpoint of the target object. The target object is a three-dimensional object. A three-dimensional object can be a real-world three-dimensional entity or a virtual three-dimensional object in a virtual scene. For example, the target object can be a real-world coffee machine, a figurine, or any other three-dimensional entity. Alternatively, the target object can be a virtual three-dimensional object or a virtual landscape in a virtual scene. The head-mounted display device can be a device for a user to view a virtual scene. For example, the head-mounted display device can be, but is not limited to, one of the following: AR glasses, Mixed Reality (MR) glasses, or VR glasses.

[0042] Step 202: Based on the object image set, train the pre-built digital content display model to generate digital content display information.

[0043] In some embodiments, the executing entity can train a pre-built digital content display model based on a set of object images to generate digital content display information. The digital content display model can be a neural network model used to generate different two-dimensional images of a target object viewed from different poses. The digital content display model can be deployed to a cloud server. As an example, the digital content display model can be, but is not limited to, one of the following: a Neural Radiance Fields (NeRF) model, a 3DS model, or other neural network models capable of implicitly representing three-dimensional object models. The digital content display information can be a set of binary codes used to display different two-dimensional images of the target object presented from different viewing angles.

[0044] In some alternative implementations of certain embodiments, the executing entity may generate digital content display information by training a pre-built digital content display model based on a set of object images through the following steps:

[0045] The first step is to train the digital content display model based on the set of object images. In practice, the executing entity first inputs each object image from the set into the digital content display model and determines the loss function value corresponding to each object image. As an example, the loss function used by the digital content display model can be the mean squared error loss function. Then, the executing entity can determine the mean of the loss function values and update the digital content display model using a model optimization algorithm (e.g., gradient descent with backpropagation).

[0046] The second step, in response to the determination that the trained digital content display model has not converged, is to repeat the above model training steps. In practice, the executing entity can determine that the mean of the model loss function of the digital content display model has not converged, and then repeat the above model training steps.

[0047] The third step, in response to the convergence of the trained digital content display model, is to export the neural network weight information corresponding to the trained digital content display model as digital content display information. In practice, the executing entity can export the neural network weight information corresponding to the trained digital content display model as digital content display information in response to the convergence of the mean of the model loss function of the digital content display model.
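The three steps above (train, repeat until the mean loss converges, export the weights) can be sketched as follows. This is a toy illustration under stated assumptions, not the disclosed model: the two-weight `predict` function stands in for the neural network, each "object image" is reduced to a hypothetical (viewing angle, brightness) pair, and the convergence tolerance, learning rate, and binary export layout are choices made only for this example.

```python
import math
import struct
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (viewing angle in radians, observed brightness); hypothetical


def predict(weights: Sequence[float], angle: float) -> float:
    """Toy stand-in for the display model: brightness as a function of viewing angle."""
    return weights[0] + weights[1] * math.cos(angle)


def train_epoch(weights: Sequence[float], samples: Sequence[Sample], lr: float):
    """One pass: per-sample squared error, mean loss, one gradient descent update."""
    n = len(samples)
    grad = [0.0, 0.0]
    total = 0.0
    for angle, target in samples:
        err = predict(weights, angle) - target
        total += err * err                      # squared error for this sample
        grad[0] += 2.0 * err / n                # d(mean loss)/d(w0)
        grad[1] += 2.0 * err * math.cos(angle) / n  # d(mean loss)/d(w1)
    new_weights = [w - lr * g for w, g in zip(weights, grad)]
    return new_weights, total / n               # mean loss measured before the update


def train_until_converged(samples: Sequence[Sample], lr: float = 0.1,
                          tol: float = 1e-10, max_epochs: int = 10_000) -> bytes:
    """Repeat training until the mean loss stops changing, then export the weights."""
    weights: List[float] = [0.0, 0.0]
    previous = float("inf")
    for _ in range(max_epochs):
        weights, mean_loss = train_epoch(weights, samples, lr)
        if abs(previous - mean_loss) < tol:     # assumed convergence criterion
            # Export the weights as little-endian doubles (assumed binary layout).
            return struct.pack(f"<{len(weights)}d", *weights)
        previous = mean_loss
    raise RuntimeError("model did not converge within max_epochs")
```

For example, with eight samples generated from brightness `0.5 + 0.3 * cos(angle)` at evenly spaced angles, the exported bytes unpack to weights close to 0.5 and 0.3.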

[0048] Step 203: Load the digital content display information into the pre-created digital content editing space.

[0049] In some embodiments, the executing entity may load digital content display information into a pre-created digital content editing space. This digital content editing space can be a spatial scene used to render and adjust the position and display angle of a two-dimensional image corresponding to the target object presented by the digital content display information.

[0050] Figure 4 shows a schematic screenshot of an internal test of the digital content editing space. The digital content editing space displayed here is loaded with digital content display information. This information can be a collection of images captured from different perspectives of a 3D virtual dinosaur-themed landscape and the neural network weights generated by the digital content display model. The loaded digital content display information can present a two-dimensional image of the 3D virtual dinosaur-themed landscape from a specific perspective. Figure 4 also shows the relevant controls and attributes for adjusting the loaded digital content display information, including controls for controlling the space's occlusion type and interaction type, as well as positioning type and other attribute information.

[0051] In some alternative implementations of certain embodiments, the executing entity may load digital content display information into the digital content editing space of the head-mounted display device through the following steps:

[0052] The first step is to load the pre-created digital content editing space. This digital content editing space is used to adjust and render the displayed information of the loaded digital content. In practice, the executing entity can load the pre-created digital content editing space.

[0053] The second step is to load the digital content display information into the digital content editing space. In practice, the executing entity can load the generated digital content display information into the digital content editing space. Alternatively, the target user can interactively drag and drop the generated digital content display information into the digital content editing space. The target user can be a user of the head-mounted display device or a developer. Interactive methods can include, but are not limited to, at least one of the following: voice interaction, gesture interaction, touch interaction, and button interaction.

[0054] Optionally, before loading the digital content display information into the digital content editing space of the head-mounted display device, the above method may further include the following steps:

[0055] The first step is to obtain the model encryption information and encryption protocol information. The model encryption information and the aforementioned encryption protocol information correspond to the target object. The model encryption information can be information used to encrypt the digital content display information. As an example, the model encryption information can be a set of model weight offsets. In practice, the model weight offsets can be represented in binary encoding form. The encryption protocol information can be information representing the specific encryption method. As an example, the encryption protocol information can be the network layer range of the digital content display model, representing the encryption of each model weight in the digital content display information that corresponds to the aforementioned network layer range.

[0056] The second step involves encrypting the digital content display information based on the model encryption information and the aforementioned encryption protocol information to update the digital content display information. In practice, the executing entity can add the model weight offsets included in the model encryption information to the corresponding model weights included in the digital content display information, according to the network layer range represented by the encryption protocol information, to encrypt the digital content display information.

[0057] As shown in Figure 5, to encrypt the digital content display information, the executing entity adds the model weight offsets included in the model encryption information to the corresponding model weights included in the digital content display information. Therefore, even if the encrypted digital content display information is leaked, the weights perturbed by the offsets can hardly achieve a satisfactory content display effect.
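The offset-based encryption described above can be sketched as below. The per-layer dictionary layout, the inclusive `(first, last)` layer range, and the function names are hypothetical. The `decrypt` helper is also an assumption: the disclosure describes only adding the offsets, and subtracting them is simply the natural inverse.

```python
from typing import Dict, List, Tuple

Weights = Dict[int, List[float]]  # layer index -> weights of that layer (hypothetical layout)


def _apply_offsets(weights: Weights, offsets: Weights,
                   layer_range: Tuple[int, int], sign: float) -> Weights:
    """Add sign * offset to every weight in layers first..last (inclusive)."""
    first, last = layer_range
    result: Weights = {}
    for layer, values in weights.items():
        if first <= layer <= last:
            layer_offsets = offsets[layer]
            if len(layer_offsets) != len(values):
                raise ValueError(f"offset count mismatch in layer {layer}")
            result[layer] = [v + sign * o for v, o in zip(values, layer_offsets)]
        else:
            # Layers outside the protocol's range are left untouched.
            result[layer] = list(values)
    return result


def encrypt(weights: Weights, offsets: Weights, layer_range: Tuple[int, int]) -> Weights:
    """Encrypt by adding the model weight offsets to the selected layers."""
    return _apply_offsets(weights, offsets, layer_range, 1.0)


def decrypt(weights: Weights, offsets: Weights, layer_range: Tuple[int, int]) -> Weights:
    """Assumed inverse: subtract the same offsets from the same layers."""
    return _apply_offsets(weights, offsets, layer_range, -1.0)
```

Because floating-point addition and subtraction are not exactly inverse in general, a round trip through `encrypt` and `decrypt` should be expected to recover the weights only up to rounding error.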

[0058] Step 204: Based on the digital content display information in the digital content editing space, display the projected image of the target object corresponding to the target pose information in the head-mounted display device.

[0059] In some embodiments, the executing entity can display a projected image of a target object corresponding to target pose information in a head-mounted display device based on digital content display information in the digital content editing space. The target pose information represents the observation pose of the target user. The target user is a user wearing the head-mounted display device. The target pose information can be information generated or determined by the head-mounted display device that represents the current position and posture of the user. As an example, the target pose information may include three-dimensional coordinates and a rotation matrix. The projected image can be a two-dimensional image of the target object presented at the observation angle represented by the target pose information.
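To illustrate how a pose made of three-dimensional coordinates and a rotation matrix determines where a point of the target object lands in the two-dimensional picture, the sketch below uses a standard pinhole projection. This is not the disclosed rendering path, in which the trained model produces the picture. The camera-to-world rotation convention, the focal length, and the image center are assumptions chosen for the example.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Sequence[float]
Mat3 = Sequence[Sequence[float]]


def project_point(point: Vec3, position: Vec3, rotation: Mat3,
                  focal: float = 500.0,
                  center: Tuple[float, float] = (320.0, 240.0)
                  ) -> Optional[Tuple[float, float]]:
    """Project a world-space point into pixel coordinates for a given pose.

    `position` is the observer's 3D coordinates and `rotation` is assumed to be a
    camera-to-world rotation matrix (its columns are the camera axes in world space).
    Returns None when the point lies behind the observer.
    """
    # Vector from the observer to the point, in world coordinates.
    d = [point[i] - position[i] for i in range(3)]
    # World -> camera coordinates: multiply by the transpose of the rotation.
    x, y, z = (sum(rotation[r][c] * d[r] for r in range(3)) for c in range(3))
    if z <= 0.0:
        return None
    # Pinhole model: perspective divide, then shift to the image center.
    return (focal * x / z + center[0], focal * y / z + center[1])


IDENTITY: Mat3 = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
```

With the identity rotation and the observer at (0, 0, -2), the origin projects to the image center and a point one unit to the right projects 250 pixels to the right of it, so a change in pose changes the resulting 2D picture.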

[0060] As shown in Figure 6, the scene diagram includes a three-dimensional space 601, a target object 602, and head-mounted display devices 603, 604, and 605 at three different viewing positions. Specifically, when the user observes the target object 602 from a vertical angle on the left, the two-dimensional image of the target object 602 displayed on the head-mounted display device 603 is shown as image 606. When the user observes the target object 602 vertically downwards from above, the two-dimensional image of the target object 602 displayed on the head-mounted display device 604 is shown as image 607. When the user observes the target object 602 downwards at a 45-degree angle from above, the two-dimensional image of the target object 602 displayed on the head-mounted display device 605 is shown as image 608.

[0061] The above-described embodiments of this disclosure have the following beneficial effects: The digital content display method based on a head-mounted display device, through some embodiments of this disclosure, can provide a digital content display method that can display the imaging of 3D digital objects in various dimensions with high fidelity and low computational and time overhead, thereby improving the user's visual experience. Specifically, the reasons for poor quality of the constructed digital object model or high time and computational overhead in digital content reconstruction are: CAD models usually lack natural materials and lighting effects, resulting in poor simulation effects from the user's perspective, and the model production is time-consuming. Digital objects constructed using multi-view stereoscopic vision reconstruction technology have poor accuracy, and when the multi-view resolution is low, the reconstructed digital objects contain significant noise. Furthermore, reconstructing 3D digital object models using neural network models incurs high computational overhead and is time-consuming. Based on this, the digital content display method based on a head-mounted display device according to some embodiments of this disclosure first acquires a set of object images corresponding to the target object. Each object image in the object image set is acquired from different perspectives of the target object. The target object is a three-dimensional object. Therefore, by acquiring object images from multiple perspectives, visual features such as light and shadow, texture, and shape of the target object can be captured from different viewpoints. Then, based on the object image set, a pre-built digital content display model is trained to generate digital content display information. 
Thus, training the digital content display model using multi-view images yields neural network weights describing the target object's appearance from different perspectives. Furthermore, compared to traditional multi-view reconstruction techniques, the digital content display model can learn complex light and shadow changes and object structures through extensive training, generating higher-precision results. Moreover, implicit modeling using the digital content display model significantly reduces computational and time overhead compared to directly using neural network models for 3D reconstruction. Next, the digital content display information is loaded into a pre-created digital content editing space. Users can then intuitively preview and conveniently adjust the display effect of digital objects within this space. Finally, based on the digital content display information in the editing space, a projected image of the target object, corresponding to the target pose information, is displayed on a head-mounted display device. The target pose information represents the observation pose of the target user. The target user is the user wearing the head-mounted display device. Therefore, through real-time rendering, when a user observes a 3D virtual object from different angles, a 2D image projection matching that viewpoint can be obtained, thus enhancing the immersive and interactive experience. In the actual process of constructing 3D object models, the various construction methods each have their own shortcomings, such as poor structure of the constructed 3D objects, high construction complexity, or lack of support for large scenes. However, even for 3D object models placed in a 3D scene space, the wearer still views them from a 2D perspective, meaning the wearer can only view them from a single perspective at any given moment.
Therefore, the superior ability of neural networks to render 2D images can be leveraged directly to generate a 2D projection of the 3D digital object from the wearer's current viewpoint. This provides a way to display 3D digital objects to the wearer, displaying the imaging of 3D digital objects in each dimension with high fidelity and relatively low computational and time overhead, thereby improving the user's visual experience.

[0062] Referring further to Figure 3, a flow 300 of another embodiment of the digital content display method based on a head-mounted display device according to the present disclosure is shown. This digital content display method based on a head-mounted display device includes the following steps:

[0063] Step 301: Obtain the set of object images corresponding to the target object.

[0064] Step 302: Based on the object image set, train the pre-built digital content display model to generate digital content display information.

[0065] Step 303: Load the digital content display information into the pre-created digital content editing space.

[0066] In some embodiments, for the specific implementation of steps 301-303 and the resulting technical effects, reference may be made to steps 201-203 in the embodiments corresponding to Figure 2; they are not repeated here.
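
As a minimal sketch of step 302, the following Python example shows a loop that trains, checks convergence, and exports weights, of the kind recited in claim 2. The one-variable linear model, the mean-squared-error loss, the loss-change convergence test, and all names (`train_until_converged`, `lr`, `tol`, `max_rounds`) are assumptions made for illustration; they are not the digital content display model of this disclosure.

```python
import json


def train_until_converged(weights, samples, lr=0.1, tol=1e-10, max_rounds=10_000):
    """Fit y = w0 + w1 * x by gradient descent and export the weights as JSON.

    Training is repeated until the loss stops improving by more than `tol`.
    """
    w0, w1 = weights
    prev_loss = float("inf")
    for _ in range(max_rounds):
        g0 = g1 = loss = 0.0
        for x, y in samples:
            err = (w0 + w1 * x) - y
            loss += err * err
            g0 += 2.0 * err
            g1 += 2.0 * err * x
        n = len(samples)
        loss /= n
        if abs(prev_loss - loss) < tol:
            # Converged: export the weight information instead of training again.
            return json.dumps({"weights": [w0, w1]})
        # Not converged: take another training step.
        w0 -= lr * g0 / n
        w1 -= lr * g1 / n
        prev_loss = loss
    raise RuntimeError("model did not converge within max_rounds")
```

In this sketch the exported JSON string plays the role of the digital content display information; a real model would export a much larger set of network weights in whatever serialization the implementation chooses.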

[0067] Step 304: In response to confirming that a cloud rendering request has been received from the head-mounted display device, perform the following steps:

[0068] Step 3041: Render the digital content display information according to the target pose information included in the cloud rendering request to generate a projected image of the target object.

[0069] In some embodiments, the executing entity can render digital content display information based on the target pose information included in the cloud rendering request to generate a projected image of the target object. In practice, the executing entity can initialize the model weights of the digital content display model using the digital content display information. Then, the executing entity can perform ray tracing and sampling in three-dimensional space, starting from the user's observation point represented by the target pose information, to obtain the three-dimensional coordinates of each sampling point. Finally, the executing entity can input the three-dimensional coordinates of each sampling point into the initialized digital content display model to perform volumetric rendering to generate a projected image of the target object.
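
The sampling and volumetric rendering described above can be sketched for a single ray as follows. This is an illustrative example only: it assumes evenly spaced samples and the commonly used alpha-compositing quadrature for volume rendering, and `toy_field` is a hypothetical stand-in (a red sphere) for the initialized digital content display model. The function names and the `near`, `far`, and `n_samples` parameters do not come from this disclosure.

```python
import math


def sample_along_ray(origin, direction, near, far, n_samples):
    """Return evenly spaced 3D sample points on a ray and the spacing between them."""
    norm = math.sqrt(sum(c * c for c in direction))
    unit = [c / norm for c in direction]
    step = (far - near) / n_samples
    points = []
    for i in range(n_samples):
        t = near + (i + 0.5) * step  # midpoint of each interval
        points.append([o + t * u for o, u in zip(origin, unit)])
    return points, step


def toy_field(point):
    """Hypothetical stand-in for the trained model: a red sphere of radius 1 at the origin."""
    inside = sum(c * c for c in point) <= 1.0
    density = 5.0 if inside else 0.0
    return density, (1.0, 0.0, 0.0)


def render_ray(origin, direction, field, near=0.0, far=6.0, n_samples=64):
    """Composite colour along one ray; returns (rgb, accumulated opacity)."""
    points, step = sample_along_ray(origin, direction, near, far, n_samples)
    transmittance = 1.0  # fraction of light not yet absorbed
    color = [0.0, 0.0, 0.0]
    for p in points:
        density, rgb = field(p)
        alpha = 1.0 - math.exp(-density * step)  # opacity of this segment
        weight = transmittance * alpha
        color = [c + weight * v for c, v in zip(color, rgb)]
        transmittance *= 1.0 - alpha
    return color, 1.0 - transmittance
```

Under these assumptions, a projected image would be formed by casting one such ray per pixel from the observation point given by the target pose information.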

[0070] Step 3042: Send the projected image to the head-mounted display device so that the head-mounted display device can display the projected image.

[0071] In some embodiments, the executing entity may send the projected image to a head-mounted display device for the head-mounted display device to display the projected image. In practice, after receiving the projected image, the head-mounted display device may display the projected image in the scene space or virtual space.

[0072] Step 305: In response to confirming that a terminal rendering request has been received from the head-mounted display device, the digital content display information in the digital content editing space is loaded into the head-mounted display device so that the head-mounted display device can generate and display the projected image.

[0073] In some embodiments, in response to receiving a terminal rendering request from a head-mounted display device, the executing entity may load digital content display information from the digital content editing space into the head-mounted display device for the head-mounted display device to generate and display a projected image. In practice, the executing entity may send the digital content display information to the head-mounted display device. The head-mounted display device can then use the digital content display information to initialize the model weights of a pre-stored digital content display model. Then, the head-mounted display device can perform ray tracing and sampling in three-dimensional space, starting from the user's observation point represented by the target pose information, to obtain the three-dimensional coordinates of each sampling point. Finally, the head-mounted display device can input the three-dimensional coordinates of each sampling point into the initialized digital content display model to perform neural network inference and volumetric rendering to generate a projected image of the target object.
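
A minimal sketch of the terminal-side initialization and inference is given below. It assumes, purely for illustration, that the digital content display information arrives as JSON-encoded layer weights and that the pre-stored model is a small fully connected network with ReLU hidden layers that maps a 3D coordinate to a density and a colour. The format, the layer sizes in `LAYER_SIZES`, and the names `init_model` and `infer` are hypothetical.

```python
import json
import math

# Hypothetical architecture pre-stored on the device: xyz -> hidden -> (density, r, g, b)
LAYER_SIZES = [3, 4, 4]


def init_model(display_info_json, layer_sizes=LAYER_SIZES):
    """Check the received weights against the pre-stored architecture and return the layers."""
    layers = json.loads(display_info_json)["layers"]
    if len(layers) != len(layer_sizes) - 1:
        raise ValueError("layer count does not match the pre-stored model")
    for layer, n_in, n_out in zip(layers, layer_sizes, layer_sizes[1:]):
        w, b = layer["weights"], layer["biases"]
        if len(w) != n_out or len(b) != n_out or any(len(row) != n_in for row in w):
            raise ValueError("weight shape does not match the pre-stored model")
    return layers


def infer(layers, point):
    """Run one sample point through the network; returns (density, (r, g, b))."""
    x = list(point)
    for index, layer in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(layer["weights"], layer["biases"])]
        if index < len(layers) - 1:
            x = [max(0.0, v) for v in x]  # ReLU on hidden layers
    density = max(0.0, x[0])  # keep density non-negative
    rgb = tuple(1.0 / (1.0 + math.exp(-v)) for v in x[1:])  # squash colours into (0, 1)
    return density, rgb
```

The shape check reflects a design choice in this sketch: because only weights are transmitted, the device has to confirm that they fit the architecture it already holds before running inference.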

[0074] As can be seen from Figure 3, compared with the description of some embodiments corresponding to Figure 2, flow 300 refines the specific manner of displaying the projected image corresponding to the target object: the projected image can be generated and displayed through either cloud rendering or terminal rendering. When the projected image is generated through cloud rendering, the user can view the two-dimensional image of the target object from the current perspective in real time. During the interaction, only the target pose information and the generated two-dimensional image need to be transmitted, thereby reducing the amount of data transmitted and improving data transmission efficiency. When terminal rendering is used, since the generated digital content display information is essentially the weight information of the trained neural network model, the data volume is small and easy to transmit. Moreover, once the model has been trained, the computing power required of the hardware is reduced, so that real-time rendering can be realized on the head-mounted display device. In addition, this disclosure exploits the fact that neural network weights lend themselves to multi-batch and heterogeneous computation, so that the terminal or the cloud can concurrently render large numbers of objects and multiple user viewing angles; this protects privacy when rendering is performed on the terminal and enables concurrent computation in the cloud.
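
The two request paths of steps 304 and 305, and what each one transmits, can be sketched as a simple dispatcher. The request and response types, the `"cloud"` and `"terminal"` mode labels, and the `render` callback are assumptions introduced for this example; the disclosure does not specify a message format.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class RenderRequest:
    mode: str                               # "cloud" or "terminal" (hypothetical labels)
    pose: Optional[Sequence[float]] = None  # target pose information, used by cloud rendering


@dataclass
class RenderResponse:
    kind: str       # "image" for a projected image, "weights" for display information
    payload: bytes


def handle_request(request: RenderRequest,
                   display_info: bytes,
                   render: Callable[[bytes, Sequence[float]], bytes]) -> RenderResponse:
    """Answer a request from the head-mounted display device."""
    if request.mode == "cloud":
        # Step 304: render on the server and return only the 2D projected image.
        if request.pose is None:
            raise ValueError("cloud rendering request must carry target pose information")
        return RenderResponse("image", render(display_info, request.pose))
    if request.mode == "terminal":
        # Step 305: send the display information so the device renders locally.
        return RenderResponse("weights", display_info)
    raise ValueError(f"unknown rendering mode: {request.mode!r}")
```

In this sketch the cloud path exchanges a pose and an image on every request, while the terminal path transfers the weight information in response to the request and leaves subsequent rendering to the device.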

[0075] Referring now to Figure 7, a schematic diagram of a head-mounted display device 700 (e.g., terminal device 103 in Figure 1) suitable for implementing some embodiments of the present disclosure is shown. The head-mounted display device shown in Figure 7 is merely an example and should not impose any limitation on the functionality and scope of the embodiments of the present disclosure.

[0076] As shown in Figure 7, the head-mounted display device 700 may include a processing unit 701 (e.g., a central processing unit, a graphics processing unit, etc.), which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the head-mounted display device 700. The processing unit 701, ROM 702, and RAM 703 are interconnected via a bus 704. An input / output (I / O) interface 705 is also connected to the bus 704.

[0077] Typically, the following devices can be connected to the I / O interface 705: input devices 706 including, for example, a touchscreen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 707 including, for example, at least one display screen, speaker, vibrator, etc.; and communication devices 709. The at least one display screen can be combined with optical elements to image display content in front of the user's eyes. The communication device 709 can allow the head-mounted display device 700 to communicate with other devices, wirelessly or by wire, to exchange data. Although Figure 7 shows a head-mounted display device 700 with various devices, it should be understood that it is not required to implement or have all the devices shown. Alternatively, more or fewer devices may be implemented or provided. Each box shown in Figure 7 can represent one device or multiple devices as needed.

[0078] The head-mounted display device may also include an optical module. The optical module includes at least one display screen and optical elements. The display screen is used to display scene space, target objects, and / or digital content.

[0079] Optionally, the head-mounted display device may include a head-mounted display device body and a smart terminal. The smart terminal is communicatively connected to the head-mounted display device body.

[0080] In particular, according to some embodiments of this disclosure, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, some embodiments of this disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via communication device 709, or installed from storage device 708, or installed from ROM 702. When the computer program is executed by processing unit 701, it performs the functions defined in the methods of some embodiments of this disclosure.

[0081] It should be noted that, in some embodiments of this disclosure, the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium, or any combination thereof. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof. In some embodiments of this disclosure, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In some embodiments of this disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such propagated data signals may take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium can be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted using any suitable medium, including but not limited to: wires, optical fibers, RF (radio frequency), etc., or any suitable combination thereof.

[0082] In some implementations, clients and servers can communicate using any currently known or future-developed network protocol, such as HTTP (Hypertext Transfer Protocol), and can be interconnected with digital data communication of any form or medium (e.g., a communication network). Examples of communication networks include local area networks (“LANs”), wide area networks (“WANs”), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed networks.

[0083] The aforementioned computer-readable medium may be included in the aforementioned head-mounted display device; or it may exist independently and not assembled into the head-mounted display device. The aforementioned computer-readable medium carries one or more programs that, when executed by the head-mounted display device, cause the head-mounted display device to: acquire a set of object images corresponding to a target object, wherein each object image in the set is acquired from different viewpoints of the target object, and the target object is a three-dimensional object; train a pre-constructed digital content display model based on the set of object images to generate digital content display information; load the digital content display information into a pre-created digital content editing space; and, based on the digital content display information in the digital content editing space, display a projected image of the target object corresponding to the target pose information in the head-mounted display device, wherein the target pose information represents the observation pose of the target user, and the target user is a user wearing the aforementioned head-mounted display device.

[0084] Computer program code for performing operations of some embodiments of this disclosure can be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer can be connected to the user's computer via any type of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computer (e.g., via the Internet using an Internet service provider).

[0085] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.

[0086] The functions described above in this document can be performed, at least in part, by one or more hardware logic components. For example, exemplary types of hardware logic components that can be used include, without limitation: Field Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), Systems-on-Chip (SoCs), Complex Programmable Logic Devices (CPLDs), and so on.

[0087] The above description is merely of preferred embodiments of this disclosure and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of the invention involved in the embodiments of this disclosure is not limited to technical solutions formed by the specific combination of the above-described technical features, but should also cover other technical solutions formed by any combination of the above-described technical features or their equivalents without departing from the above-described inventive concept, for example, technical solutions formed by replacing the above-described features with technical features having similar functions disclosed in (but not limited to) the embodiments of this disclosure.

Claims

1. A method for digital content presentation, comprising: obtaining a set of object images corresponding to a target object, wherein each object image in the set of object images is captured at a different view angle for the target object, and the target object is a three-dimensional object; based on the set of object images, performing model training on a pre-constructed digital content presentation model to generate digital content presentation information; loading the digital content presentation information into a pre-created digital content editing space; and presenting, in a head-mounted display device, a projection image of the target object corresponding to target pose information according to the digital content presentation information in the digital content editing space, wherein the target pose information represents an observation pose of a target user, and the target user is a user wearing the head-mounted display device.

2. The method of claim 1, wherein the model training on the pre-constructed digital content presentation model based on the set of object images to generate digital content presentation information comprises: training the digital content presentation model according to the set of object images; in response to determining that the trained digital content presentation model does not converge, performing the model training step again; and in response to determining that the trained digital content presentation model converges, exporting model weight information of the trained digital content presentation model as the digital content presentation information.

3. The method of claim 1, wherein, before loading the digital content presentation information into the pre-created digital content editing space, the method further comprises: obtaining object presentation encryption information and encryption protocol information, wherein the object presentation encryption information and the encryption protocol information correspond to the target object; and encrypting the digital content presentation information according to the object presentation encryption information and the encryption protocol information to update the digital content presentation information.

4. The method of claim 1, wherein the loading of the digital content presentation information into the pre-created digital content editing space comprises: loading a pre-created digital content editing space, wherein the digital content editing space is a scene space for adjusting a three-dimensional model of the target object represented by the digital content presentation information; and loading the digital content presentation information into the digital content editing space.

5. The method of claim 1, wherein the presenting, in the head-mounted display device, of the projection image of the target object corresponding to the target pose information according to the digital content presentation information in the digital content editing space comprises: in response to determining that a cloud rendering request issued by the head-mounted display device is received, performing the following steps: rendering the digital content presentation information according to target pose information included in the cloud rendering request to generate the projection image of the target object; sending the projection image to the head-mounted display device for presenting the projection image by the head-mounted display device; and in response to determining that a terminal rendering request issued by the head-mounted display device is received, synchronizing the digital content presentation information in the digital content editing space to the head-mounted display device for presenting the projection image by the head-mounted display device, wherein the projection image presented by the head-mounted display device is generated according to the received digital content presentation information.

6. A head-mounted display device, comprising: one or more processors; at least one optical module, the optical module comprising at least one display screen for displaying a scene space, a target object and / or digital content, and an optical element; and a storage device having stored thereon one or more programs, wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-5.

7. The head-mounted display device of claim 6, wherein the head-mounted display device comprises a head-mounted display device body and a smart terminal, and the smart terminal is in communication connection with the head-mounted display device body.

8. A computer readable medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-5.