Digital human display method, device, equipment, medium and program product

By acquiring and processing user data to generate personalized welcome messages, the problem of insufficient personalization of welcome messages in digital human systems is solved, thus improving the user interaction experience.

CN122289481A · Pending · Publication Date: 2026-06-26 · JINGDONG TECH HLDG CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: JINGDONG TECH HLDG CO LTD
Filing Date: 2026-04-29
Publication Date: 2026-06-26

Patent Timeline

29 Apr 2026: Application
26 Jun 2026: Publication (CN122289481A)

IPC: G06T13/40; G06N3/0455; G06F40/30; G06F40/186

AI Tagging

Technology Topics

Personalization Engineering


Abstract

This disclosure presents embodiments of a digital human display method, apparatus, device, medium, and program product. One specific implementation of the method includes: acquiring real-time user data corresponding to a target user; performing feature encoding processing on the real-time user data to obtain user encoding information; determining strategy information for generating a welcome message based on the user encoding information; generating a welcome message text that combines the user's contextual semantic content based on the strategy information and the user encoding information; and rendering and displaying the corresponding digital human based on the welcome message text. This implementation is related to artificial intelligence and can achieve personalized generation of welcome messages based on the personalized content corresponding to the target user, thereby enhancing interaction with the target user.


Description

Technical Field

[0001] Embodiments of this disclosure relate to the field of computer technology, and more specifically to digital human display methods, apparatus, electronic devices, computer-readable media, and program products.

Background Technology

[0002] In existing digital human live streaming and virtual customer service systems, a welcome message is typically used to increase interaction when communication with a user begins. Such welcome messages are usually generated from fixed phrases or simple conditional statements.

[0003] However, the inventors discovered that the following technical problems often arise when using the above method: The generated welcome messages suffer from low personalization and a lack of context awareness, resulting in mechanical and rigid responses. When applied to the field of digital humans, this negatively impacts immersion and realism.

Summary of the Invention

[0004] The summary portion of this disclosure is intended to provide a brief overview of the concepts, which will be described in detail in the detailed description portion. This summary portion is not intended to identify key or essential features of the claimed technical solutions, nor is it intended to limit the scope of the claimed technical solutions.

[0005] Some embodiments of this disclosure provide digital human display methods, apparatuses, electronic devices, computer-readable media, and program products to address the technical problems mentioned in the background section above.

[0006] In a first aspect, some embodiments of this disclosure provide a digital human display method, including: acquiring real-time user data corresponding to a target user; performing feature encoding processing on the real-time user data to obtain user encoding information; determining strategy information for generating a welcome message based on the user encoding information; generating a welcome message text that combines the user's contextual semantic content based on the strategy information and the user encoding information; and rendering and displaying the corresponding digital human based on the welcome message text.

[0007] Optionally, the strategy information includes first strategy information based on template filling and second strategy information based on a generative model; and determining the strategy information for generating a welcome message based on the user encoding information includes: determining scenario information for generating a welcome message based on the user encoding information; in response to the scenario information indicating a first applicable scenario corresponding to the first strategy information, determining the first strategy information as the strategy information, wherein the first applicable scenario is a scenario whose determinism satisfies a first condition; and in response to the scenario information indicating a second applicable scenario corresponding to the second strategy information, determining the second strategy information as the strategy information, wherein the second applicable scenario is a scenario whose naturalness satisfies a second condition.

[0008] Optionally, the above-mentioned generation of a welcome message text based on the above-mentioned strategy information and the above-mentioned user coding information, combined with the semantic content of the user context, includes: in response to the above-mentioned strategy information being the above-mentioned first strategy information, loading the corresponding welcome message template from the template library; adding the corresponding parameter content in the above-mentioned user coding information to the above-mentioned welcome message template to obtain initial welcome message information; and performing post-processing on the above-mentioned initial welcome message information to obtain the welcome message text.
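The template-filling path described in [0008] can be sketched roughly as follows. This is only an illustration under stated assumptions, not the applicant's implementation: the template library contents, the template identifiers (`returning_user`, `default`), the parameter names, and the choice of whitespace cleanup as the post-processing step are all hypothetical.

```python
from string import Template

# Hypothetical template library; identifiers and wording are assumptions.
TEMPLATE_LIBRARY = {
    "returning_user": Template(
        "Good $time_slot!   The $last_topic you asked about is back in stock."
    ),
    "default": Template("Good $time_slot! Welcome to the live stream."),
}


def fill_welcome_template(template_id: str, user_encoding: dict) -> str:
    """Load a template, fill in parameters, then post-process the result."""
    template = TEMPLATE_LIBRARY.get(template_id, TEMPLATE_LIBRARY["default"])
    # Only string-valued fields are used as template parameters.
    params = {k: v for k, v in user_encoding.items() if isinstance(v, str)}
    # safe_substitute leaves unknown placeholders in place instead of raising.
    initial_text = template.safe_substitute(params)
    # Post-processing (assumed here): collapse repeated whitespace.
    return " ".join(initial_text.split())


welcome = fill_welcome_template(
    "returning_user", {"time_slot": "evening", "last_topic": "foundation"}
)
```

In a real system the post-processing step might instead cover length limits or sensitive-word filtering; the source does not specify it at this point.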

[0009] Optionally, generating the welcome message text that combines the user's contextual semantic content based on the strategy information and the user encoding information includes: in response to the strategy information being the second strategy information, generating welcome message generation prompt information based on the user encoding information; inputting the welcome message generation prompt information into a pre-trained generative model to obtain an initial welcome message text; and performing post-processing on the initial welcome message text to obtain the final welcome message text.

[0010] Optionally, determining the scenario information for generating the welcome message based on the aforementioned user coding information includes: in response to the existence of a welcome message template in the template library with a rule matching confidence level higher than a first confidence level with respect to the aforementioned user coding information, and / or the current system environment load meets the target load condition, generating scenario information characterized as the aforementioned first applicable scenario; in response to the rule matching confidence level corresponding to each welcome message template in the template library being lower than a second confidence level, and / or the user value corresponding to the aforementioned target user meeting the preset value condition, generating scenario information characterized as the aforementioned second applicable scenario.
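One possible reading of the scenario decision in [0010] is sketched below. The source leaves several points open, so the following are assumptions made purely for illustration: every threshold value, the interpretation of "and / or" as a logical OR, the interpretation of the target load condition as "load is high, so prefer the cheaper template path", checking the first scenario before the second, and falling back to the first scenario when neither branch fires.

```python
from dataclasses import dataclass

FIRST_SCENARIO = "first_applicable_scenario"    # template filling
SECOND_SCENARIO = "second_applicable_scenario"  # generative model


@dataclass(frozen=True)
class ScenarioThresholds:
    # All values are illustrative assumptions, not taken from the source.
    first_confidence: float = 0.8
    second_confidence: float = 0.5
    high_load: float = 0.9
    high_value: float = 0.7


def determine_scenario(template_confidences, system_load, user_value,
                       thresholds=None):
    """Return the scenario label for one welcome-message request."""
    t = thresholds or ScenarioThresholds()
    # Best rule-matching confidence over all templates (0.0 if none exist).
    best = max(template_confidences, default=0.0)
    # First scenario: a confident template match, or the system is loaded.
    if best > t.first_confidence or system_load >= t.high_load:
        return FIRST_SCENARIO
    # Second scenario: every template matches poorly, or a high-value user.
    if best < t.second_confidence or user_value >= t.high_value:
        return SECOND_SCENARIO
    # Neither condition holds: assumed fallback to the template path.
    return FIRST_SCENARIO
```

Checking the first scenario first means a heavily loaded system always takes the template path, even for high-value users; a deployment could reasonably order the checks the other way.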

[0011] Optionally, the above-mentioned feature encoding processing of the real-time user data to obtain user encoding information includes: using a dynamic weight adjustment method to dynamically allocate weights to each context feature content in the real-time user data to obtain allocated user data; performing standardized label transformation on the allocated user data to obtain transformed data; and performing feature encoding processing on the transformed data to obtain user encoding information.

[0012] Optionally, determining the scenario information for generating the welcome message based on the user coding information includes: determining the scenario information as the first applicable scenario or the second applicable scenario based on the user coding information using a pre-built reinforcement learning agent.

[0013] Optionally, generating the welcome text based on the strategy information and the user coding information, combined with the semantic content of the user's context, includes: extracting coded content related to the user's historical dialogue from the user coding information; generating unprocessed intents corresponding to the coded content; and generating the welcome text based on the strategy information, the user coding information, and the unprocessed intents, wherein the welcome text may be a text for guiding and continuing the processing of the unprocessed intents.

[0014] Optionally, the above method further includes: obtaining welcome message feedback information corresponding to the target user; storing the welcome message feedback information; and updating the strategy information used to generate the welcome message based on the stored welcome message feedback information when the strategy update time is reached.

[0015] Optionally, the above-mentioned rendering and displaying of the corresponding digital human based on the above-mentioned welcome text includes: generating the emotion tag corresponding to the above-mentioned welcome text; and rendering and displaying the corresponding digital human based on the above-mentioned welcome text and the above-mentioned emotion tag.

[0016] In a second aspect, some embodiments of this disclosure provide a digital human display device, including: an acquisition unit configured to acquire real-time user data corresponding to a target user; a processing unit configured to perform feature encoding processing on the real-time user data to obtain user encoding information; a determination unit configured to determine strategy information for generating a welcome message based on the user encoding information; a generation unit configured to generate a welcome message text that combines the user's contextual semantic content based on the strategy information and the user encoding information; and a display unit configured to render and display the corresponding digital human based on the welcome message text.

[0017] Optionally, the strategy information includes first strategy information based on template filling and second strategy information based on a generative model; and the determination unit can be configured to: determine scenario information for generating a welcome message based on the user encoding information; in response to the scenario information indicating a first applicable scenario corresponding to the first strategy information, determine the first strategy information as the strategy information, wherein the first applicable scenario is a scenario whose determinism satisfies a first condition; and in response to the scenario information indicating a second applicable scenario corresponding to the second strategy information, determine the second strategy information as the strategy information, wherein the second applicable scenario is a scenario whose naturalness satisfies a second condition.

[0018] Optionally, the generation unit can be configured to: in response to the above-mentioned strategy information being the first strategy information, load the corresponding welcome message template from the template library; add the corresponding parameter content in the above-mentioned user coding information to the above-mentioned welcome message template to obtain initial welcome message information; and perform post-processing on the above-mentioned initial welcome message information to obtain welcome message text.

[0019] Optionally, the generation unit can be configured to: in response to the strategy information being the second strategy information, generate welcome message generation prompt information based on the user encoding information; input the welcome message generation prompt information into a pre-trained generative model to obtain an initial welcome message text; and perform post-processing on the initial welcome message text to obtain a final welcome message text.

[0020] Optionally, the determining unit can be configured to: in response to the existence of a welcome message template in the template library with a rule matching confidence higher than a first confidence level with respect to the aforementioned user coding information, and / or the load of the current system environment meets the target load condition, generate scenario information characterized as the aforementioned first applicable scenario; in response to the rule matching confidence level corresponding to each welcome message template in the template library being lower than a second confidence level, and / or the user value corresponding to the aforementioned target user meeting the preset value condition, generate scenario information characterized as the aforementioned second applicable scenario.

[0021] Optionally, the processing unit can be configured to: dynamically allocate weights to each context feature content in the real-time user data using a dynamic weight adjustment method to obtain allocated user data; perform standardized label transformation on the allocated user data to obtain transformed data; and perform feature encoding processing on the transformed data to obtain user encoding information.

[0022] Optionally, the determining unit can be configured to: determine the scenario information as the first applicable scenario or the second applicable scenario based on the aforementioned user coding information and using a pre-built reinforcement learning agent.

[0023] Optionally, the generation unit can be configured to: extract encoded content related to the user's historical dialogue from the aforementioned user encoded information; generate unprocessed intents corresponding to the aforementioned encoded content; and generate the aforementioned welcome message text based on the aforementioned policy information, the aforementioned user encoded information, and the aforementioned unprocessed intents, wherein the aforementioned welcome message text can be a message text that guides the continuation of the aforementioned unprocessed intents.

[0024] Optionally, the device is further configured to: acquire welcome message feedback information corresponding to the target user; store the welcome message feedback information; and, in response to the strategy update time being reached, update the strategy information used to generate the welcome message based on the stored welcome message feedback information.

[0025] Optionally, the display unit can be configured to: generate an emotion tag corresponding to the welcome message text; and render and display the corresponding digital human based on the welcome message text and the emotion tag.

[0026] In a third aspect, some embodiments of this disclosure provide an electronic device, including: one or more processors; and a storage device having one or more programs stored thereon, such that when the one or more programs are executed by the one or more processors, the one or more processors implement the method as described in any implementation of the first aspect.

[0027] In a fourth aspect, some embodiments of this disclosure provide a computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method as described in any implementation of the first aspect.

[0028] In a fifth aspect, some embodiments of this disclosure provide a computer program product, including a computer program that, when executed by a processor, implements the method as described in any implementation of the first aspect.

[0029] The above embodiments of this disclosure have the following beneficial effects: Through the digital human display method of some embodiments of this disclosure, personalized welcome messages and corresponding emotional tags can be generated based on the personalized content corresponding to the target user, thereby enhancing interaction with the target user. Specifically, existing methods often suffer from low personalization and a lack of context awareness in the generated welcome messages, resulting in mechanical and rigid messages. In the field of digital humans, this affects immersion and realism. Therefore, the digital human display method of some embodiments of this disclosure first acquires real-time user data corresponding to the target user, so as to realize the personalized generation of subsequent welcome messages based on the real-time user content corresponding to the target user, adapting to the initial interaction with the target user and improving interactivity. Then, the real-time user data is subjected to feature encoding processing to obtain user encoding information, which encodes and integrates the unstructured data corresponding to the real-time user data, facilitating the determination of subsequent strategy information and the use of strategies for generating corresponding welcome messages. Next, based on the user encoding information, the most suitable strategy information for generating welcome messages can be selected, so as to ensure the accurate and efficient generation of subsequent welcome message text. Furthermore, based on the aforementioned strategy information and user coding information, a welcome message text that incorporates the user's contextual semantic content can be accurately and efficiently generated. Here, the generated welcome message text can be customized based on the target user's personal information, ensuring it aligns with the target user's welcome preferences and enhancing interaction. 
Finally, based on the welcome message text, the corresponding digital human is rendered and displayed. Rendering and displaying the digital human in this way makes the personalized interaction with the target user more vivid and improves the user's experience. In summary, by using real-time user data corresponding to the target user and selecting appropriate strategy information, a personalized welcome message can be generated accurately and efficiently for the target user, and displaying it through the digital human enhances interaction with the target user.

Description of the Drawings

[0030] The above and other features, advantages, and aspects of the embodiments of this disclosure will become more apparent from the accompanying drawings and the following detailed description. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic, and elements are not necessarily drawn to scale.

[0031] Figure 1 is a schematic diagram of an application scenario of a digital human display method according to some embodiments of the present disclosure; Figure 2 is a flowchart of some embodiments of the digital human display method according to the present disclosure; Figure 3 is a flowchart of other embodiments of the digital human display method according to the present disclosure; Figure 4 is a schematic structural diagram of some embodiments of the digital human display device according to the present disclosure; Figure 5 is a schematic structural diagram of an electronic device suitable for implementing some embodiments of the present disclosure.

Detailed Description

[0032] Embodiments of this disclosure will now be described in more detail with reference to the accompanying drawings. While some embodiments of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided to provide a more thorough and complete understanding of this disclosure. It should be understood that the accompanying drawings and embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.

[0033] It should also be noted that, for ease of description, only the parts relevant to the invention are shown in the accompanying drawings. Unless otherwise specified, the embodiments and features described in this disclosure can be combined with each other.

[0034] It should be noted that the concepts of "first" and "second" mentioned in this disclosure are used only to distinguish different devices, modules or units, and are not used to limit the order of functions performed by these devices, modules or units or their interdependencies.

[0035] It should be noted that the terms "a" and "a plurality of" used in this disclosure are illustrative rather than restrictive, and those skilled in the art should understand that, unless otherwise expressly indicated in the context, they should be understood as "one or more".

[0036] The names of messages or information exchanged between multiple devices in the embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.

[0037] Before performing any of the operations involving the collection, storage, or use of user personal information (such as real-time user data) disclosed in this disclosure, the relevant organizations or individuals shall fulfill their obligations, including conducting personal information security impact assessments, informing personal information subjects, and obtaining prior authorization and consent from personal information subjects.

[0038] This disclosure will now be described in detail with reference to the accompanying drawings and embodiments.

[0039] Figure 1 is a schematic diagram of an application scenario of a digital human display method according to some embodiments of the present disclosure.

[0040] In the application scenario of Figure 1, the electronic device 101 can first acquire real-time user data 102 corresponding to the target user. In this application scenario, the real-time user data 102 may be: User ID=U12345; Profile: female, 28 years old, high-spending, Interest=skincare; History: inquired about "foundation" 3 days ago; Time: Saturday 19:30; Location: Shanghai, sunny, 22℃. Then, the electronic device 101 can perform feature encoding processing on the real-time user data 102 to obtain user encoding information 103. Next, the electronic device 101 can determine the strategy information 104 for generating a welcome message based on the user encoding information 103. In this application scenario, the strategy information 104 may be "strategy information for generating a welcome message using a large language model". Furthermore, the electronic device 101 can generate a welcome message text 105 that combines the user's contextual semantic content based on the strategy information 104 and the user encoding information 103. In this application scenario, the welcome message text 105 could be, "Good evening! That foundation you were looking at before is now in stock, would you like to take a look?" Finally, the electronic device 101 can render and display the corresponding digital human 106 based on the welcome message text 105.

[0041] It should be noted that the aforementioned electronic device 101 can be either hardware or software. When the electronic device is hardware, it can be implemented as a distributed cluster consisting of multiple servers or terminal devices, or as a single server or a single terminal device. When the electronic device is software, it can be installed in the hardware devices listed above. It can be implemented as, for example, multiple software programs or software modules used to provide distributed services, or as a single software program or software module. No specific limitations are made here.

[0042] It should be understood that the number of electronic devices shown in Figure 1 is merely illustrative. Any number of electronic devices can be used depending on the implementation requirements.

[0043] With continued reference to Figure 2, a flowchart 200 of some embodiments of a digital human display method according to the present disclosure is shown. The digital human display method includes the following steps:

Step 201: Obtain real-time user data corresponding to the target user.

[0044] In some embodiments, the entity executing the digital human display method (e.g., the electronic device 101 shown in Figure 1) can acquire real-time user data corresponding to the target user via a wired or wireless connection. The target user can be a user who is about to interact and who initiates a conversation. For example, the target user can be the initiator of a 1v1 digital human live stream or a user engaging in a conversation. Real-time user data can be user data related to the target user acquired at the current time. In practice, user data can represent the target user's profile and interactive behavior. For the initiator of a 1v1 digital human live stream, the executing entity can retrieve real-time user data from the backend database based on the user ID (identifier). For example, real-time user data can include: user profile, chat history, current time, geographical location, and weather conditions. The user profile can include: age, gender, consumption level, and interest tags (e.g., "beauty" and "digital"). The chat history can include: the content of the last conversation, the conversation topic, and conversation keywords. The current time can be accurate to the minute and is the storage time of the current real-time user data. The geographical location can be the geographical location of the device used by the target user. In practice, the city can also be located based on IP or GPS, and a weather API can be called to obtain the weather conditions (sunny / rainy / snowy) and temperature. For example, real-time user data could be: User ID=U12345; Profile: female, 28 years old, high-spending, Interest=skincare; History: inquired about "foundation" 3 days ago; Time: Saturday 19:30; Location: Shanghai, sunny, 22℃.

[0045] Here, in the context of digital human live streaming, the executing entity can be a digital human live streaming server, which is used to generate welcome messages and render the corresponding digital human.

[0046] It should be noted that the scenario of acquiring real-time user data and generating a welcome message based on that data can be a 1v1 digital human live stream scenario. After receiving a 1v1 digital human live stream request from a user terminal, the executing entity (e.g., the digital human live stream server) uses a data acquisition module to obtain the real-time user data corresponding to the target user through bidirectional communication.

[0047] Step 202: Perform feature encoding processing on the above real-time user data to obtain user encoding information.

[0048] In some embodiments, the execution entity may perform feature encoding processing on the real-time user data to obtain user-encoded information. The user-encoded information may be in the form of a structured vector. The user-encoded information can represent the semantic content of the features of each user feature in the real-time user data. Here, the real-time user data may be heterogeneous data; through feature encoding processing, heterogeneous data can be uniformly encoded into a structured context feature vector for subsequent policy matching.

[0049] For example, the content displayed in JSON format corresponding to user encoding information could be: { "user_type": "return", "interests": ["skincare", "makeup"], "last_topic": "foundation", "time_slot": "evening", "weekday": "Saturday", "holiday": false, "location_city": "Shanghai", "weather": "sunny" }
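As a rough illustration of how heterogeneous real-time data might be turned into the structured record shown in [0049], consider the sketch below. The raw input layout (`timestamp`, `history`, `city`, and so on), the time-slot boundaries, and the `"new"` label for users without history are assumptions introduced for the example; only the output field names come from the source.

```python
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def time_slot(hour: int) -> str:
    """Bucket an hour of day; the boundaries are illustrative assumptions."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 23:
        return "evening"
    return "night"


def encode_user_record(raw: dict) -> dict:
    """Map a hypothetical raw record onto the structured fields of [0049]."""
    ts = datetime.fromisoformat(raw["timestamp"])
    history = raw.get("history") or []
    return {
        "user_type": "return" if history else "new",
        "interests": list(raw.get("interests", [])),
        "last_topic": history[-1]["topic"] if history else None,
        "time_slot": time_slot(ts.hour),
        "weekday": WEEKDAYS[ts.weekday()],  # datetime.weekday(): Monday is 0
        "holiday": bool(raw.get("holiday", False)),
        "location_city": raw.get("city"),
        "weather": raw.get("weather"),
    }


raw = {
    "timestamp": "2026-05-02T19:30",  # a Saturday evening
    "interests": ["skincare", "makeup"],
    "history": [{"topic": "foundation"}],
    "city": "Shanghai",
    "weather": "sunny",
}
encoded = encode_user_record(raw)
```

With this input, `encoded` carries the same fields and values as the JSON example above.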

[0050] As an example, the aforementioned execution entity can input real-time user data into the FeatureCrossing model to generate high-dimensional feature vectors as user-encoded information.

[0051] In some optional implementations of certain embodiments, the execution entity may perform feature encoding processing on the real-time user data to obtain user encoding information through the following steps: The first step is to use a dynamic weight adjustment method to dynamically allocate weights to the content of each contextual feature in the real-time user data, obtaining the allocated user data. The dynamic weight adjustment method can be a method of dynamically adjusting the feature importance weight corresponding to each contextual feature. Furthermore, the real-time user data may include user data from multiple time steps, with a different data weight for each time step: the closer the time step is to the current time, the higher the corresponding data weight. These data weights can also be determined using the dynamic weight adjustment method. The contextual features here can be user-related features. For example, a contextual feature can be one of the following: gender, consumption level, interest, geographic location, or chat history. The contextual feature content can be the feature value corresponding to the contextual feature. Each contextual feature has a corresponding feature importance weight, which represents how important the corresponding feature content is to the subsequent generation of the welcome message. In practice, the feature importance weight can be a value between 0 and 1; the higher the value, the more important the corresponding feature content. The dynamic weight adjustment method can be based on user type and current time. For example, for VIP users (i.e., users with higher user privileges), the weight of the chat history can be automatically increased so that the welcome message is closer to the user's historical interests. As another example, the dynamic weight adjustment method can be a contextual feature weight allocation method based on a machine learning model, such as XGBoost or a neural network.
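
The weight allocation described in this step can be sketched as follows. This is a minimal rule-based illustration, not the disclosed method: the base weights, the VIP and evening adjustments, the exponential decay factor, and all function names are assumptions introduced here. A machine learning model such as XGBoost could replace the hand-written rules.

```python
# Hypothetical base importance weights per contextual feature, each in [0, 1].
BASE_WEIGHTS = {
    "gender": 0.2,
    "consumption_level": 0.5,
    "interest": 0.6,
    "geographic_location": 0.3,
    "chat_history": 0.6,
}


def feature_weights(user_type: str, hour: int) -> dict[str, float]:
    """Adjust feature importance weights by user type and current hour (0-23)."""
    weights = dict(BASE_WEIGHTS)
    if user_type == "vip":
        # Assumed rule: raise the chat-history weight for VIP users.
        weights["chat_history"] = min(1.0, weights["chat_history"] + 0.3)
    if 18 <= hour < 23:
        # Assumed rule: interests count slightly more in the evening.
        weights["interest"] = min(1.0, weights["interest"] + 0.1)
    return weights


def time_step_weights(num_steps: int, decay: float = 0.5) -> list[float]:
    """Return normalized data weights, oldest step first.

    Steps closer to the current time receive higher weights.
    """
    raw = [decay ** (num_steps - 1 - i) for i in range(num_steps)]
    total = sum(raw)
    return [w / total for w in raw]
```

With these assumed rules, a VIP user gets a chat-history weight of about 0.9 instead of 0.6, and the most recent of three time steps receives the largest share of the data weight.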

[0052] The second step is to standardize and label the user data after allocation to obtain the converted data.

[0053] As an example, the aforementioned execution entity can use coding rules to map unstructured behavioral data (e.g., "last week's consultation on foundation") to standardized labels (e.g., last_topic=foundation).
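
A rule-based label conversion of this kind might look like the sketch below. The regular expressions and label keys other than last_topic are hypothetical; the text does not specify the form of the coding rules.

```python
import re

# Hypothetical coding rules: each pattern captures one value for one label key.
CODING_RULES = [
    (re.compile(r"consultation on (\w+)"), "last_topic"),
    (re.compile(r"purchased (\w+)"), "last_purchase"),
]


def to_standard_labels(text: str) -> dict[str, str]:
    """Map unstructured behavioral text to standardized key=value labels."""
    labels = {}
    lowered = text.lower()
    for pattern, key in CODING_RULES:
        match = pattern.search(lowered)
        if match:
            labels[key] = match.group(1)
    return labels


print(to_standard_labels("last week's consultation on foundation"))
```

Text that matches no rule yields an empty dictionary, so downstream encoding can treat missing labels explicitly.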

[0054] The third step is to perform feature encoding processing on the above-mentioned transformed data to obtain user encoding information.

[0055] As an example, feature encoding techniques (e.g., using Sentence-BERT or similar models) can be used to generate high-dimensional feature vectors corresponding to the transformed data, thereby improving the expressive power of the model.

[0056] Step 203: Based on the user coding information mentioned above, determine the strategy information used to generate the welcome message.

[0057] In some embodiments, the execution entity can determine strategy information for generating welcome messages based on the user encoding information. The strategy information can be the strategy content of the strategy used to generate the welcome messages. For example, the strategy information can be a strategy identifier for generating welcome messages. For example, the strategy information can correspond to one of the following strategies: decision tree-based strategy information, large language model-based strategy information, or vector retrieval-based strategy information. Here, the decision tree-based strategy information corresponds to a strategy that matches welcome messages based on a decision tree. Each node in the decision tree corresponds to different user features. The large language model-based strategy information corresponds to a strategy that uses a large language model to generate welcome messages. Here, the large language model can be a pre-trained large language model based on the Transformer architecture. The vector retrieval-based strategy information corresponds to a strategy that selects the welcome message with the highest vector similarity by determining the vector similarity between the user encoding information and each welcome message in the welcome message vector library.
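
The vector retrieval-based strategy can be illustrated with a short sketch. It assumes vectors are plain Python lists and uses cosine similarity as the similarity measure; the text does not fix a particular measure, and the function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_welcome_message(user_vector, library):
    """Return (message, similarity) for the library entry most similar to the user.

    `library` is a list of (message, vector) pairs standing in for the
    welcome message vector library.
    """
    scored = [(msg, cosine_similarity(user_vector, vec)) for msg, vec in library]
    return max(scored, key=lambda item: item[1])
```

In a production setting a vector index would typically replace the linear scan, but the selection criterion stays the same.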

[0058] As an example, first, a first degree of matching is determined between the user encoding information and the decision tree-based strategy information. Then, a second degree of matching is determined between the user encoding information and the large language model-based strategy information. Next, a third degree of matching is determined between the user encoding information and the vector retrieval-based strategy information. Finally, the strategy information corresponding to the highest of the first, second, and third degrees of matching is selected as the strategy information.
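
The selection in this example reduces to taking the strategy with the highest degree of matching. A minimal sketch, assuming the three degrees have already been computed and are passed in as a dictionary (the keys are hypothetical identifiers):

```python
def select_strategy(matching_degrees: dict[str, float]) -> str:
    """Return the strategy identifier with the highest degree of matching."""
    if not matching_degrees:
        raise ValueError("at least one candidate strategy is required")
    return max(matching_degrees, key=matching_degrees.get)


print(select_strategy({
    "decision_tree": 0.62,
    "large_language_model": 0.81,
    "vector_retrieval": 0.74,
}))
```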

[0059] In some optional implementations of certain embodiments, the aforementioned strategy information includes: a first strategy information based on template filling and a second strategy information based on a generative model. The first strategy information based on template filling may be a method of generating a welcome message by adding user-related information to a welcome message template. The welcome message template may be a preset template that only requires parameter content to be filled in to generate the welcome message. Each welcome message template may be stored in a template library. The second strategy information based on a generative model may be a method of generating the welcome message using a generative model. The generative model may be a large language model that generates content based on prompt engineering. For example, the generative model may be a specified generative model (e.g., the Qwen model or the ChatGLM model). In specific application scenarios, the second strategy information based on a generative model may be a method of generating the welcome message by calling a configured generative model interface, or it may be a generation method using a lightweight generative model deployed at the edge (e.g., a small Transformer or LSTM). Specifically, for lightweight generative models deployed at the edge, model compression techniques (distillation, quantization) can be used to make the LLM lightweight. The terminal generates the welcome message text based on locally cached user profiles and context, and only sends data back for model updates when necessary.

[0060] It should be noted that the model parameters of the generative model can be dynamically updated under the online learning mechanism, and the strategy effect can be optimized without downtime.

[0061] Optionally, the aforementioned executing entity may determine strategy information for generating the welcome message based on the aforementioned user coding information through the following steps: The first step is to determine the scenario information used to generate the welcome message based on the user coding information mentioned above. This scenario information can be the various scenario features of the digital human scenario in which the dialogue with the target user takes place. For example, scenario features may include: naturalness features, fluency features, and certainty features.

[0062] Specifically, different strategy information is used to generate welcome messages for different scenario features.

[0063] As an example, the aforementioned execution entity can generate the user intent and scene intent corresponding to the user coding information. Then, it queries a scene feature mapping table to retrieve at least one piece of scene feature information corresponding to the user intent and scene intent, which serves as the scenario information. The scene intent can be the information requested during the subsequent dialogue with the digital human. The scene feature mapping table can characterize the mapping relationship between scene features and intents.

[0064] The second step involves, in response to the aforementioned scenario information being characterized as corresponding to a first applicable scenario for the first strategy information, defining the first strategy information as strategy information. The first applicable scenario is a scenario where the certainty of the scenario content satisfies a first condition. This certainty of the scenario content can be characteristic content corresponding to a certainty feature. The first condition can be that the certainty is higher than a target certainty value. The target certainty value can be a threshold used to measure whether the scenario content requires high certainty. That is, when the certainty is higher than the target certainty value, the scenario content in the subsequent digital human scenario is required to be sufficiently accurate and reliable. Certainty is used to measure the predictability, accuracy, and stability of the generated content, ensuring that the output strictly conforms to preset rules or facts. Certainty can be information in numerical form. The higher the certainty value, the higher the requirements for the accuracy and reliability of the scenario content.

[0065] The third step is, in response to the aforementioned scenario information characterizing a second applicable scenario corresponding to the second strategy information, to determine the second strategy information as the strategy information. The second applicable scenario is a scenario where the naturalness of the scenario content satisfies a second condition. The naturalness of the scenario content can be a numerical value corresponding to a naturalness feature. Naturalness can represent the rendering naturalness and content naturalness in the digital human rendering process; it measures how closely the generated content resembles naturally produced human content, that is, whether it is smooth, realistic, and consistent with human habits. The feature content corresponding to the naturalness feature can be in numerical form. The higher the value, the smoother the digital human rendering and the more natural the content. The second condition can be that the naturalness is higher than a naturalness threshold. The naturalness threshold can be a numerical value used to measure whether the digital human content is high-naturalness content.

[0066] In some optional implementations of certain embodiments, the execution entity may determine the scenario information for generating the welcome message based on the user coding information, including the following steps: The first step involves generating scenario information representing the first applicable scenario, in response to the existence of a welcome message template in the template library with a rule matching confidence score higher than the first confidence score for the aforementioned user-coded information, and / or the current system load meeting the target load condition. Here, the rule matching confidence score can be the degree of matching between the user-coded information and the template in the template library (i.e., the welcome message template). The higher the rule matching confidence score, the more suitable the corresponding welcome message template is for application in the user context scenario corresponding to the user-coded information. In practice, the rule matching confidence score can be in numerical form. The higher the value, the better the match. The first confidence score can be a threshold used to measure whether the rule matching confidence score is high enough (i.e., whether the welcome message template matches). The first confidence score can be set based on historical experience. For example, the first confidence score can be 0.8. The current system environment can be the system environment corresponding to the current executing entity (e.g., the server status corresponding to the digital human live streaming server). The target load condition can be that the current system load is too high. Specifically, excessive load can be determined by determining whether the actual load level is higher than the load level threshold.

[0067] Here, in situations where the template matching degree is high and / or the system load is high, in order to ensure response speed, a template-based first strategy information approach can be used to generate the welcome message.

[0068] It should be noted that the load threshold can be dynamically updated and adjusted to ensure a rapid response under high load conditions. The various welcome message templates in the template library can also be dynamically updated without system downtime.

[0069] In practice, the semantic similarity between template vectors in template vector libraries (e.g., FAISS, Milvus) and user-encoded information can be used as the confidence level for rule matching.

[0070] The second step involves generating scenario information representing the second applicable scenario in response to situations where the rule matching confidence of each welcome message template in the template library is lower than the second confidence level, and / or the user value corresponding to the target user meets the preset value condition. The second confidence level is a threshold used to measure whether the rule matching confidence is low enough (i.e., whether a suitable welcome message template cannot be matched). The second confidence level can be set based on historical experience. For example, the second confidence level can be 0.6. The user value corresponding to the target user can be the value created by the user or the value embodied by the user. For example, the user value corresponding to the target user can be measured by whether the user is a high-value customer (i.e., whether they are a VIP customer). The preset value condition can be that the user value is higher than the target value (i.e., representing the target user as a high-value user). That is, for high-value users, a generative model-based strategy can be used to generate appropriate welcome messages to improve the experience.
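
The two steps above can be sketched as a single threshold check. The thresholds 0.8 and 0.6 are the example values from the text; the parameter names, the order in which the conditions are tested, and the fallback for cases that satisfy neither condition are assumptions, since the text does not specify how overlapping or uncovered cases are resolved.

```python
from enum import Enum


class Scenario(Enum):
    FIRST = "template_filling"    # first applicable scenario
    SECOND = "generative_model"   # second applicable scenario


FIRST_CONFIDENCE = 0.8   # example value from the text
SECOND_CONFIDENCE = 0.6  # example value from the text


def determine_scenario(template_confidences, system_load, load_threshold,
                       user_value, target_value) -> Scenario:
    """Pick the applicable scenario from template confidence, load, and user value."""
    best = max(template_confidences, default=0.0)
    high_load = system_load > load_threshold
    high_value = user_value > target_value
    # Assumption: the first-scenario conditions are checked first, so a
    # high system load takes priority over a high-value user.
    if best > FIRST_CONFIDENCE or high_load:
        return Scenario.FIRST
    # Every template is below the second confidence exactly when the best one is.
    if best < SECOND_CONFIDENCE or high_value:
        return Scenario.SECOND
    # Assumption: remaining cases fall back to the cheaper template strategy.
    return Scenario.FIRST
```

For instance, a best template confidence of 0.7 for a high-value user under normal load yields the second applicable scenario, while the same user under high load falls back to the template-based scenario.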

[0071] In some optional implementations of certain embodiments, determining the scenario information for generating the welcome message based on the aforementioned user coding information includes: The aforementioned execution entity can determine the scenario information as either the first applicable scenario or the second applicable scenario based on the user's encoded information and a pre-built reinforcement learning agent. The reinforcement learning agent can automatically learn which type of welcome message strategy (rule template or generative model) to use to generate the welcome message within the given context, based on historical interaction feedback (e.g., user response rate, dwell time). The reinforcement learning agent optimizes strategy selection using the Q-learning algorithm to maximize long-term user interaction rate. Specifically, in the reinforcement learning model corresponding to the reinforcement learning agent, the state can be a contextual feature vector (profile + time + location, etc.). The action can be selecting a template or calling the generative model. The reward can be subsequent user behavior (whether to interact, whether to click on a product). The model output is the optimal strategy selection path.
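
A tabular Q-learning agent for this strategy choice can be sketched as follows. This is an illustration under simplifying assumptions: the contextual feature vector is discretized into a hashable tuple, exploration is epsilon-greedy, and the class name, hyperparameters, and reward values are hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = ("template", "generative")


class StrategyAgent:
    """Tabular Q-learning over (state, action) pairs."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor for long-term reward
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)
        self.q = defaultdict(float)  # (state, action) -> estimated value

    def choose(self, state):
        """Explore with probability epsilon, otherwise take the best-known action."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update toward reward + gamma * max_a Q(next_state, a)."""
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

A state such as `("vip", "evening", "shanghai")` stands in for the profile, time, and location context; the reward could be 1.0 when the user interacts or clicks on a product and 0.0 otherwise.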

[0072] Step 204: Based on the above strategy information and user coding information, generate a welcome message text that combines the user's contextual semantic content.

[0073] In some embodiments, the executing entity may generate a welcome message text that incorporates the user's contextual semantic content, based on the aforementioned policy information and user encoding information. The welcome message text may be a text-based welcome message.

[0074] For example, the semantic content of the user context is: User gender: female, age: 28, consumption level: high, interest tags: skincare, makeup, last inquiry: foundation (3 days ago). Current time: Saturday 19:30, City: Shanghai, Weather: Sunny, 22℃, Regular customer: Yes. Corresponding welcome message: "Hi~ Good evening, fairy! The weather in Shanghai is amazing today. I just confirmed that the foundation you were looking at before is in stock now, would you like to take a look?"

[0075] As an example, the aforementioned executing entity can input user coding information into the corresponding policy information to obtain the welcome message text.

[0076] In some optional implementations of certain embodiments, the execution entity may generate a welcome message text that combines user context semantic content based on the aforementioned strategy information and user encoding information, including the following steps: The first step, in response to the aforementioned strategy information (specifically, the first strategy information), is to load the corresponding welcome message template from the template library. The template library stores templates of various message types.

[0077] The second step is to add the corresponding parameter content from the user coding information to the welcome message template to obtain the initial welcome message information. This initial welcome message information can be the message information before any post-processing. The parameter content can be the feature content corresponding to each key user feature in the user coding information.

[0078] The third step involves post-processing the initial welcome message information to obtain the final welcome text. This post-processing may include: length control, compliance filtering, and grammar correction. Length control involves controlling the length of the text corresponding to the initial welcome message information. For example, the final generated welcome text should not exceed 50 characters. Compliance filtering involves filtering inappropriate expressions using a sensitive word database. Grammar correction involves using a lightweight grammar model to correct obvious grammatical errors.
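
The three steps of the template-based path can be sketched as below. The template text, the sensitive word list, and the function names are hypothetical; the 50-character limit is the example value from the text. Grammar correction is omitted because it would require a separate model.

```python
import re

MAX_LENGTH = 50  # example limit from the text
SENSITIVE_WORDS = ("guaranteed", "cheapest")  # hypothetical sensitive word database


def fill_template(template: str, params: dict[str, str]) -> str:
    """Insert parameter content into a welcome message template."""
    return template.format(**params)


def post_process(text: str) -> str:
    """Apply compliance filtering, then length control."""
    for word in SENSITIVE_WORDS:
        text = re.sub(re.escape(word), "", text, flags=re.IGNORECASE)
    # Collapse whitespace left behind by removed words.
    text = re.sub(r"\s+", " ", text).strip()
    if len(text) > MAX_LENGTH:
        # Naive truncation; a real system might cut at a sentence boundary.
        text = text[:MAX_LENGTH].rstrip()
    return text


draft = fill_template("Good evening! Your {topic} is back in stock.",
                      {"topic": "foundation"})
print(post_process(draft))
```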

[0079] In some optional implementations of certain embodiments, the execution entity may generate a welcome message text that combines user context semantic content based on the aforementioned strategy information and user encoding information, including the following steps: The first step, in response to the aforementioned second strategy information, is to generate a welcome message prompt based on the aforementioned user coding information. This welcome message prompt can be generated using a prompt-generating model.

[0080] As an example, the aforementioned executing entity can add the feature content corresponding to each user feature of the user coding information to an initial welcome message generation prompt template to obtain the welcome message generation prompt information.

[0081] The second step is to input the above welcome message generation prompts into the pre-trained generative model to obtain the initial welcome message text.

[0082] The third step involves post-processing the initial welcome message text to obtain the final welcome message text. Specific post-processing details will not be elaborated upon here.
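
The generative path can be sketched in the same spirit. The prompt wording and function names are hypothetical, and the generative model is represented by any callable that maps a prompt string to text, so no specific model interface is assumed.

```python
from typing import Callable

# Hypothetical initial welcome message generation prompt template.
PROMPT_TEMPLATE = (
    "You are a digital human host. Write one short, friendly welcome message.\n"
    "User features:\n{features}\n"
    "Keep it under {max_chars} characters."
)


def build_prompt(user_features: dict[str, str], max_chars: int = 50) -> str:
    """Fill user feature content into the prompt template."""
    lines = "\n".join(f"- {name}: {value}" for name, value in user_features.items())
    return PROMPT_TEMPLATE.format(features=lines, max_chars=max_chars)


def generate_welcome(user_features: dict[str, str],
                     model: Callable[[str], str],
                     max_chars: int = 50) -> str:
    """Build the prompt, call the model, and apply simple length control."""
    draft = model(build_prompt(user_features, max_chars))
    return draft.strip()[:max_chars]
```

Passing the model as a callable keeps the sketch independent of whether the text is produced by a remote interface or a lightweight model deployed at the edge.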

[0083] Step 205: Render and display the corresponding digital human based on the above welcome text.

[0084] In some embodiments, the aforementioned executing entity may render and display the corresponding digital human based on the aforementioned welcome message text.

[0085] As an example, firstly, the aforementioned executing entity can call the interface protocol of the digital human driving module to render the digital human corresponding to the aforementioned welcome message text. Then, the digital human is displayed.

[0086] As an example, the aforementioned execution entity can render and display the digital human corresponding to the aforementioned welcome message text by performing at least one of the following: using the emotional tone control built into the TTS engine (e.g., "happy", "friendly"); automatically mapping the emotional polarity (positive or negative) of the welcome message text to a preset expression number; and using keywords to trigger actions.

[0087] In some optional implementations of certain embodiments, after step 205, the steps further include: The first step is to generate the sentiment tags corresponding to the above welcome text. These sentiment tags can be the emotional content of the text itself. For example, the sentiment tag could be "happy".

[0088] As an example, the aforementioned executing entity can input the welcome message text into a pre-trained sentiment analysis model to obtain the sentiment tags.

[0089] The second step is to render and display the corresponding digital human based on the aforementioned welcome text and emotional tags.

[0090] As an example, firstly, the aforementioned executing entity can call the interface protocol of the digital human driving module and generate corresponding voice and animation commands through multimodal fusion technology to render the aforementioned welcome text and the digital human corresponding to the aforementioned emotion tags. Then, the digital human is displayed.

[0091] During the rendering of the corresponding digital human, the digital human driving module parses the emotion tags, calls TTS to generate speech with the corresponding intonation, and simultaneously plays preset animation resources. Precise real-time synchronization between speech and animation is achieved through timestamp alignment.

[0092] Optionally, the emotion label can be dynamically adjusted based on user feedback (e.g., speech recognition results) to adapt to the user's emotional state. For example, if a user shows impatience, the system will automatically adjust the emotion label to "calm" to soothe the user.

[0093] In some optional implementations of certain embodiments, after step 205, the steps further include: The first step is to obtain the welcome message feedback information for the target users. This feedback information can be the target users' responses to the welcome message broadcast by the digital human. In practice, the welcome message feedback information may include: dwell time, interaction level, and conversion rate.

[0094] The second step is to store the feedback information for the above welcome message.

[0095] The third step involves updating the strategy information used to generate the welcome messages based on the stored feedback information from each welcome message. Specifically, for the first strategy information, the content of each welcome message template in the template library can be updated. For the second strategy information, the generative model can be trained.

[0096] As an example, by collecting real-time user interaction feedback data (such as response speed, product clicks, and order placements), online learning algorithms (such as Online Gradient Descent) are used to dynamically adjust strategy weights. This mechanism, through incremental learning, ensures that the model can quickly adapt to changes in data distribution. A multi-armed bandit framework can also be built to compare the effects of different strategy groups and automatically select the optimal strategy. Bayesian optimization improves the efficiency and accuracy of strategy selection. Furthermore, an automated operations platform enables hot updates of strategies and dynamic fine-tuning of the model without manual intervention. This platform, through CI / CD (Continuous Integration / Continuous Deployment) processes, ensures the stability and reliability of the system.
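
The multi-armed bandit comparison mentioned above can be illustrated with an epsilon-greedy sketch. The choice of epsilon-greedy, the arm names, and the reward definition are assumptions; the text does not name a specific bandit algorithm.

```python
import random


class EpsilonGreedyBandit:
    """Compare strategy groups by their running average reward."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {arm: 0 for arm in self.arms}
        self.values = {arm: 0.0 for arm in self.arms}  # mean reward per arm

    def select(self):
        """Explore with probability epsilon, otherwise exploit the best arm so far."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[arm])

    def update(self, arm, reward):
        """Incrementally update the mean reward of the chosen arm."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

Each piece of stored feedback (for example, 1.0 for a product click and 0.0 otherwise) is fed to `update`, so the estimates adapt incrementally without retraining from scratch.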

[0097] The above embodiments of this disclosure have the following beneficial effects: Through the digital human display method of some embodiments of this disclosure, personalized welcome messages and corresponding emotional tags can be generated based on the personalized content corresponding to the target user, thereby enhancing interaction with the target user. Specifically, existing methods often suffer from low personalization and a lack of context awareness in the generated welcome messages, resulting in mechanical and rigid messages. In the field of digital humans, this affects immersion and realism. Therefore, the digital human display method of some embodiments of this disclosure first acquires real-time user data corresponding to the target user, so as to realize the personalized generation of subsequent welcome messages based on the real-time user content corresponding to the target user, adapting to the initial interaction with the target user and improving interactivity. Then, the real-time user data is subjected to feature encoding processing to obtain user encoding information, which encodes and integrates the unstructured data corresponding to the real-time user data, facilitating the determination of subsequent strategy information and the use of strategies for generating corresponding welcome messages. Next, based on the user encoding information, the most suitable strategy information for generating welcome messages can be selected, so as to ensure the accurate and efficient generation of subsequent welcome message text. Furthermore, based on the aforementioned strategy information and user coding information, a welcome message and sentiment tags that combine the user's contextual semantic content can be accurately and efficiently generated. 
Here, the generated welcome message and sentiment tags can be customized based on the target user's personal information, ensuring that the welcome message aligns with the target user's preferences and enhancing interaction. Finally, based on the aforementioned welcome message and sentiment tags, the corresponding digital human is rendered and displayed. Rendering and displaying the digital human in this way vividly achieves personalized interaction with the target user, improving the interaction experience. In summary, by using the real-time user data corresponding to the target user and selecting appropriate strategy information, personalized welcome messages for the target user can be accurately and efficiently generated. Furthermore, by generating sentiment tags and displaying a digital human, interaction with the target user can be enhanced.

[0098] With further reference to Figure 3, a flowchart 300 of some other embodiments of the digital human display method according to the present disclosure is shown. The digital human display method includes the following steps: Step 301: Obtain real-time user data corresponding to the target user.

[0099] Step 302: Perform feature encoding processing on the above real-time user data to obtain user encoding information.

[0100] Step 303: Based on the user coding information mentioned above, determine the strategy information used to generate the welcome message.

[0101] Step 304: Extract the encoded content related to the user's historical dialogue from the above user encoded information.

[0102] In some embodiments, the executing entity (e.g., the electronic device 101 shown in Figure 1) can extract encoded content related to the user's historical dialogue from the aforementioned user encoded information. The user's historical dialogue can be a record of user dialogues from a historical time period.

[0103] Step 305: Generate the unprocessed intent corresponding to the above encoded content.

[0104] In some embodiments, the aforementioned executing entity may generate unprocessed intents corresponding to the aforementioned encoded content. These unprocessed intents may be incomplete consultation or purchase intents recorded in historical user conversations.

[0105] As an example, the aforementioned executing entity can generate the unprocessed intent corresponding to the encoded content using an intent recognition model (i.e., a model that captures key information based on an attention mechanism).

[0106] Step 306: Generate the welcome message text based on the above-mentioned strategy information, user coding information, and unprocessed intent.

[0107] In some embodiments, the executing entity may generate the welcome message text based on the policy information, the user coding information, and the unprocessed intent. The welcome message text may be a message text that guides the continuation of the unprocessed intent. For example, the welcome message text after adding the unprocessed intent could be, "The product you asked about before is now in stock."

[0108] As an example, in response to the strategy information being the first strategy information, the aforementioned executing entity can add the key parameter content corresponding to the user encoding information and the aforementioned unprocessed intent to an intent continuation template to obtain the aforementioned welcome message text. The intent continuation template can be dynamically populated with variables using natural language generation (NLG) technology to ensure that the sentences read naturally. In response to the strategy information being the second strategy information, the aforementioned welcome message text is generated by prompting the generative model based on the aforementioned user encoding information and the aforementioned unprocessed intent.
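
The intent continuation idea can be sketched with a simple rule-based stand-in. The disclosure describes an attention-based intent recognition model operating on encoded content; the sketch below instead scans a plain event log, and the event format, the template wording, and the assumption that the product is in stock are all hypothetical.

```python
from typing import Optional

CONTINUATION_TEMPLATE = "The {product} you asked about before is now in stock."


def find_unprocessed_intent(history: list[dict]) -> Optional[str]:
    """Return the most recently inquired product with no later purchase, if any.

    `history` is ordered oldest first; each event has "type" and "product" keys.
    """
    purchased = set()
    for event in reversed(history):
        if event["type"] == "purchase":
            purchased.add(event["product"])
        elif event["type"] == "inquiry" and event["product"] not in purchased:
            return event["product"]
    return None


def continuation_message(history: list[dict], default: str = "Welcome!") -> str:
    """Fill the intent continuation template, or fall back to a default greeting."""
    product = find_unprocessed_intent(history)
    if product is None:
        return default
    return CONTINUATION_TEMPLATE.format(product=product)
```

A real deployment would also check inventory before claiming availability; the template here only mirrors the example sentence in the text.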

[0109] Step 307: Render and display the corresponding digital human based on the above welcome text.

[0110] In some embodiments, the specific implementation of steps 301-303 and 307 and their resulting technical effects can be found in steps 201-203 and 205 of the embodiments corresponding to Figure 2, and will not be repeated here.

[0111] As can be seen from Figure 3, compared with the description of some embodiments corresponding to Figure 2, the process 300 of the digital human display method in some embodiments corresponding to Figure 3 can enhance the user's sense of being understood and improve the user experience by identifying unfinished intents in the user's historical dialogue and using them to guide the generation of the welcome message text.

[0112] With further reference to Figure 4, as an implementation of the methods shown in the above figures, this disclosure provides some embodiments of a digital human display device. These device embodiments correspond to the method embodiments shown in Figure 2, and the digital human display device can be specifically applied to various electronic devices.

[0113] As shown in Figure 4, a digital human display device 400 includes: an acquisition unit 401, a processing unit 402, a determining unit 403, a generating unit 404, and a display unit 405. The acquisition unit 401 is configured to acquire real-time user data corresponding to a target user; the processing unit 402 is configured to perform feature encoding processing on the real-time user data to obtain user encoding information; the determining unit 403 is configured to determine strategy information for generating a welcome message based on the user encoding information; the generating unit 404 is configured to generate a welcome message text that combines the user's contextual semantic content based on the strategy information and the user encoding information; and the display unit 405 is configured to render and display the corresponding digital human based on the welcome message text.

[0114] In some optional implementations of some embodiments, the aforementioned strategy information includes: a first strategy information based on template filling and a second strategy information based on a generative model; and the determining unit 403 may be further configured to: determine scenario information for generating a welcome message based on the aforementioned user coding information; in response to the aforementioned scenario information characterizing a first applicable scenario corresponding to the aforementioned first strategy information, determine the aforementioned first strategy information as strategy information, wherein the aforementioned first applicable scenario is a scenario in which the certainty of the scenario content satisfies a first condition; in response to the aforementioned scenario information characterizing a second applicable scenario corresponding to the aforementioned second strategy information, determine the aforementioned second strategy information as strategy information, wherein the aforementioned second applicable scenario is a scenario in which the naturalness of the scenario content satisfies a second condition.

[0115] In some optional implementations of some embodiments, the generation unit 404 may be further configured to: in response to the above-mentioned strategy information being the above-mentioned first strategy information, load the corresponding welcome message template from the template library; add the corresponding parameter content in the above-mentioned user coding information to the above-mentioned welcome message template to obtain initial welcome message information; and perform post-processing on the above-mentioned initial welcome message information to obtain welcome message text.

[0116] In some optional implementations of certain embodiments, the generation unit 404 may be further configured to: in response to the above-mentioned strategy information being the above-mentioned second strategy information, generate welcome message generation prompt information according to the above-mentioned user encoding information; input the above-mentioned welcome message generation prompt information into a pre-trained generative model to obtain an initial welcome message text; and perform post-processing on the above-mentioned initial welcome message text to obtain a welcome message text.

[0117] In some optional implementations of some embodiments, the determining unit 403 may be further configured to: generate scenario information representing the first applicable scenario in response to the existence of a welcome message template in the template library with a rule matching confidence higher than a first confidence level with respect to the user coding information, and / or the load of the current system environment meeting the target load condition; and generate scenario information representing the second applicable scenario in response to the rule matching confidence of each welcome message template in the template library being lower than a second confidence level, and / or the user value corresponding to the target user meeting the preset value condition.

[0118] In some optional implementations of certain embodiments, the processing unit 402 may be further configured to: dynamically allocate weights to each context feature content in the real-time user data using a dynamic weight adjustment method to obtain allocated user data; perform standardized label conversion on the allocated user data to obtain converted data; and perform feature encoding processing on the converted data to obtain user encoding information.
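
As a non-limiting illustration, the three processing steps might be sketched as follows. The recency-based weighting with a half-life, the two-bucket label conversion, and the one-hot vocabulary are assumptions chosen for this sketch; the disclosure does not fix the dynamic weight adjustment method or the encoding scheme.

```python
def allocate_weights(
    features: dict[str, float],
    age_seconds: dict[str, float],
    half_life_s: float = 3600.0,
) -> dict[str, float]:
    """Assumed dynamic weighting: newer context features receive larger weights."""
    raw = {k: 0.5 ** (age_seconds.get(k, 0.0) / half_life_s) for k in features}
    total = sum(raw.values()) or 1.0
    # Scale each feature value by its normalized weight.
    return {k: features[k] * raw[k] / total for k in features}


def to_labels(weighted: dict[str, float], threshold: float = 0.5) -> dict[str, str]:
    """Assumed standardized label conversion: bucket each weighted value."""
    return {k: ("high" if v >= threshold else "low") for k, v in weighted.items()}


def encode(labels: dict[str, str], vocabulary: list[str]) -> list[int]:
    """One-hot style encoding over a fixed, hypothetical 'feature:label' vocabulary."""
    tokens = {f"{k}:{v}" for k, v in labels.items()}
    return [1 if token in tokens else 0 for token in vocabulary]
```

For example, with features `cart` (just observed) and `search` (one half-life old), both with value 1.0, the weighted values are 2/3 and 1/3, which convert to the labels `high` and `low` respectively.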

[0119] In some optional implementations of some embodiments, the determining unit 403 may be further configured to: determine, based on the user encoding information and using a pre-built reinforcement learning agent, that the scenario information represents the first applicable scenario or the second applicable scenario.
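
As a non-limiting illustration, one simple form such an agent could take is a tabular epsilon-greedy learner. The choice of algorithm, the use of a tuple as the state, and the hyperparameters are assumptions of this sketch; the disclosure only states that a pre-built reinforcement learning agent is used.

```python
import random


class ScenarioAgent:
    """Tabular epsilon-greedy agent choosing between the two applicable scenarios."""

    ACTIONS = ("first", "second")

    def __init__(self, epsilon: float = 0.1, learning_rate: float = 0.5, seed: int = 0):
        self.epsilon = epsilon
        self.learning_rate = learning_rate
        self.rng = random.Random(seed)
        # Maps a (hashable) user encoding to the estimated value of each action.
        self.q: dict[tuple, dict[str, float]] = {}

    def _values(self, state: tuple) -> dict[str, float]:
        return self.q.setdefault(state, {a: 0.0 for a in self.ACTIONS})

    def select(self, user_encoding: tuple) -> str:
        values = self._values(user_encoding)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.ACTIONS)  # explore
        return max(self.ACTIONS, key=values.get)  # exploit; ties go to "first"

    def update(self, user_encoding: tuple, action: str, reward: float) -> None:
        """Move the estimate for the chosen action toward the observed reward."""
        values = self._values(user_encoding)
        values[action] += self.learning_rate * (reward - values[action])
```

The reward could, for instance, be derived from whether the user responded to the welcome message, although the reward definition is likewise an assumption.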

[0120] In some optional implementations of some embodiments, the generation unit 404 may be further configured to: extract encoded content related to the user's historical dialogue from the user encoding information; generate unprocessed intents corresponding to the encoded content; and generate the welcome message text and emotion tags based on the strategy information, the user encoding information, and the unprocessed intents, wherein the welcome message text may be a message text that guides the continuation of the unprocessed intents.
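
As a non-limiting illustration, extracting unprocessed intents and producing a guiding welcome message might be sketched as follows. The `DialogueTurn` structure, the intent labels, and the continuation phrases are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class DialogueTurn:
    intent: str     # hypothetical intent label, e.g. "track_order"
    resolved: bool  # whether the intent was completed in that turn


def unprocessed_intents(history: list[DialogueTurn]) -> list[str]:
    """Return intents whose latest turn is unresolved, most recent first."""
    status: dict[str, bool] = {}
    for turn in history:
        # Remove and re-insert so that dict order tracks recency of each intent.
        status.pop(turn.intent, None)
        status[turn.intent] = turn.resolved
    return [intent for intent, done in reversed(list(status.items())) if not done]


# Hypothetical mapping from intent labels to continuation phrases.
CONTINUATIONS = {
    "track_order": "Would you like to continue checking your order status?",
    "apply_refund": "Shall we pick up your refund request where we left off?",
}


def guiding_welcome(name: str, history: list[DialogueTurn]) -> str:
    """Build a welcome message that guides continuation of the latest open intent."""
    for intent in unprocessed_intents(history):
        if intent in CONTINUATIONS:
            return f"Welcome back, {name}. {CONTINUATIONS[intent]}"
    return f"Welcome back, {name}."
```

In this sketch an intent that was raised and later resolved is no longer treated as unprocessed, so only genuinely open requests are surfaced in the welcome message.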

[0121] In some optional implementations of certain embodiments, the apparatus 400 further includes: an information acquisition unit, a storage unit, and an update unit (not shown in the figure). The information acquisition unit can be configured to acquire welcome message feedback information corresponding to the target user. The storage unit can be configured to store the welcome message feedback information. The update unit can be configured to update the strategy information used to generate the welcome message based on the stored welcome message feedback information upon reaching the strategy update time.
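
As a non-limiting illustration, storing feedback and updating the strategy information only once the strategy update time is reached might be sketched as follows. The fixed update interval, the numeric feedback score, and the per-strategy average used as the "update" are assumptions of this sketch.

```python
import time
from collections import defaultdict


class FeedbackStore:
    """Buffers welcome message feedback and applies it when the update time is reached."""

    def __init__(self, update_interval_s: float, clock=time.monotonic):
        self.update_interval_s = update_interval_s
        self.clock = clock
        self.last_update = clock()
        self.buffer: list[tuple[str, float]] = []  # (strategy name, feedback score)
        self.strategy_score: dict[str, float] = defaultdict(float)

    def record(self, strategy: str, score: float) -> None:
        """Store one piece of welcome message feedback."""
        self.buffer.append((strategy, score))

    def maybe_update(self) -> bool:
        """Update strategy scores only once the strategy update time has been reached."""
        if self.clock() - self.last_update < self.update_interval_s:
            return False
        totals, counts = defaultdict(float), defaultdict(int)
        for strategy, score in self.buffer:
            totals[strategy] += score
            counts[strategy] += 1
        for strategy in totals:
            self.strategy_score[strategy] = totals[strategy] / counts[strategy]
        self.buffer.clear()
        self.last_update = self.clock()
        return True
```

Injecting the clock makes the update schedule easy to exercise in tests without waiting for real time to pass.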

[0122] In some optional implementations of certain embodiments, the apparatus 400 further includes a tag generation unit and a digital human rendering unit (not shown in the figure). The tag generation unit can be configured to generate emotion tags corresponding to the welcome message text. The digital human rendering unit can be configured to render and display the corresponding digital human based on the welcome message text and the emotion tags.
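
As a non-limiting illustration, tag generation and the hand-off to rendering might be sketched as follows. The keyword lists, the default tag, and the JSON payload shape are hypothetical; an actual tag generation unit would more likely rely on a trained classifier, and the renderer interface is not specified by the disclosure.

```python
import json

# Hypothetical keyword lists used only for this sketch.
EMOTION_KEYWORDS = {
    "apologetic": ("sorry", "apologize"),
    "excited": ("congratulations", "great news"),
}


def emotion_tag(welcome_text: str) -> str:
    """Return the first emotion tag whose keywords appear in the text."""
    lowered = welcome_text.lower()
    for tag, keywords in EMOTION_KEYWORDS.items():
        if any(word in lowered for word in keywords):
            return tag
    return "friendly"  # assumed default tag


def render_request(welcome_text: str) -> str:
    """Build the payload a (hypothetical) digital human renderer could consume."""
    payload = {"text": welcome_text, "emotion": emotion_tag(welcome_text)}
    return json.dumps(payload, ensure_ascii=False)
```

The renderer could then use the `emotion` field to select the facial expression and voice style that accompany the spoken text.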

[0123] It can be understood that the units described in the digital human display device 400 correspond to the steps of the method described with reference to Figure 2. Therefore, the operations, features, and beneficial effects described above for the method also apply to the digital human display device 400 and the units contained therein, and will not be repeated here.

[0124] Referring now to Figure 5, a schematic structural diagram is shown of an electronic device 500 (e.g., the electronic device 101 in Figure 1) suitable for implementing some embodiments of this disclosure. The electronic device shown in Figure 5 is merely an example and should not be construed as limiting the functionality and scope of the embodiments of this disclosure.

[0125] As shown in Figure 5, the electronic device 500 may include a processing unit (e.g., a central processing unit, a graphics processing unit, etc.) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory 502 or a program loaded from a storage device 508 into a random access memory 503. The random access memory 503 also stores various programs and data required for the operation of the electronic device 500. The processing unit 501, the read-only memory 502, and the random access memory 503 are interconnected via a bus 504. An input/output interface 505 is also connected to the bus 504.

[0126] Typically, the following devices can be connected to the input/output interface 505: input devices 506 including, for example, a touchscreen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 507 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; storage devices 508 including, for example, magnetic tape, hard disk, etc.; and communication devices 509. The communication device 509 allows the electronic device 500 to communicate wirelessly or by wire with other devices to exchange data. Although Figure 5 shows an electronic device 500 with various devices, it should be understood that it is not required to implement or possess all of the devices shown. More or fewer devices may alternatively be implemented or possessed. Each block shown in Figure 5 may represent one device or multiple devices as needed.

[0127] In particular, according to some embodiments of this disclosure, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, some embodiments of this disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via the communication device 509, or installed from the storage device 508, or installed from the read-only memory 502. When the computer program is executed by the processing unit 501, it performs the functions defined above in the methods of some embodiments of this disclosure.

[0128] It should be noted that, in some embodiments of this disclosure, the computer-readable medium described above may be a computer-readable signal medium or a computer-readable storage medium, or any combination thereof. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof. In some embodiments of this disclosure, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In some embodiments of this disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such propagated data signals may take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium can be any computer-readable medium other than a computer-readable storage medium, and it can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted using any suitable medium, including but not limited to: wires, optical fibers, RF (radio frequency), etc., or any suitable combination thereof.

[0129] In some implementations, clients and servers can communicate using any currently known or future-developed network protocol such as HTTP (Hypertext Transfer Protocol) and can interconnect with digital data communication (e.g., communication networks) of any form or medium. Examples of communication networks include local area networks (“LANs”), wide area networks (“WANs”), the Internet (e.g., the Internet of Things), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed networks.

[0130] The aforementioned computer-readable medium may be included in the aforementioned electronic device; or it may exist independently and not assembled into the electronic device. The aforementioned computer-readable medium carries one or more programs that, when executed by the electronic device, cause the electronic device to: acquire real-time user data corresponding to the target user; perform feature encoding processing on the real-time user data to obtain user encoding information; determine strategy information for generating a welcome message based on the user encoding information; generate a welcome message text that combines the user's contextual semantic content based on the strategy information and the user encoding information; and render and display the corresponding digital human based on the welcome message text.
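
As a non-limiting illustration, the five operations listed above might be chained as in the following sketch. Every callable parameter is a placeholder supplied by the caller and is an assumption of this sketch; the function only fixes the order in which the steps run.

```python
from typing import Callable


def display_digital_human(
    user_id: str,
    acquire: Callable[[str], dict],
    encode: Callable[[dict], dict],
    choose_strategy: Callable[[dict], str],
    generate: Callable[[str, dict], str],
    render: Callable[[str], None],
) -> str:
    """Run the five steps in order and return the welcome message text."""
    realtime_data = acquire(user_id)                   # acquire real-time user data
    user_encoding = encode(realtime_data)              # feature encoding processing
    strategy = choose_strategy(user_encoding)          # determine strategy information
    welcome_text = generate(strategy, user_encoding)   # generate welcome message text
    render(welcome_text)                               # render and display the digital human
    return welcome_text
```

Keeping each step behind its own callable mirrors the unit structure of the apparatus and lets any single step be replaced without touching the others.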

[0131] Computer program code for performing operations of some embodiments of this disclosure can be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving remote computers, the remote computer can be connected to the user's computer via any type of network—including a local area network (LAN) or a wide area network (WAN)—or can be connected to an external computer (e.g., via the Internet using an Internet service provider).

[0132] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.

[0133] The units described in some embodiments of this disclosure can be implemented in software or hardware. The described units can also be housed in a processor; for example, a processor may be described as including an acquisition unit, a processing unit, a determining unit, a generating unit, and a display unit. The names of these units do not necessarily limit the specific unit; for example, an acquisition unit may also be described as "a unit that acquires real-time user data corresponding to a target user."

[0134] The functions described above in this document can be performed at least in part by one or more hardware logic components. For example, exemplary types of hardware logic components that can be used, without limitation, include: field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), system-on-a-chip (SoCs), complex programmable logic devices (CPLDs), and so on.

[0135] Some embodiments of this disclosure also provide a computer program product, including a computer program that, when executed by a processor, implements any of the digital human display methods described above.

[0136] The above description is merely a selection of preferred embodiments of this disclosure and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of the invention involved in the embodiments of this disclosure is not limited to technical solutions formed by specific combinations of the above-described technical features, but should also cover other technical solutions formed by arbitrary combinations of the above-described technical features or their equivalents without departing from the above-described inventive concept, for example, technical solutions formed by interchanging the above-described features with (but not limited to) technical features with similar functions disclosed in the embodiments of this disclosure.

Claims

1. A method for displaying a digital human, comprising: acquiring real-time user data corresponding to a target user; performing feature encoding processing on the real-time user data to obtain user encoding information; determining, based on the user encoding information, strategy information for generating a welcome message; generating, based on the strategy information and the user encoding information, a welcome message text that combines the user's contextual semantic content; and rendering and displaying a corresponding digital human based on the welcome message text.

2. The method according to claim 1, wherein the strategy information includes first strategy information based on template filling and second strategy information based on a generative model; and determining, based on the user encoding information, the strategy information for generating the welcome message comprises: determining, based on the user encoding information, scenario information for generating the welcome message; in response to the scenario information representing a first applicable scenario corresponding to the first strategy information, determining the first strategy information as the strategy information, wherein the first applicable scenario is a scenario in which the certainty of the scenario content satisfies a first condition; and in response to the scenario information representing a second applicable scenario corresponding to the second strategy information, determining the second strategy information as the strategy information, wherein the second applicable scenario is a scenario in which the naturalness of the scenario content satisfies a second condition.

3. The method according to claim 2, wherein generating, based on the strategy information and the user encoding information, the welcome message text that combines the user's contextual semantic content comprises: in response to the strategy information being the first strategy information, loading a corresponding welcome message template from a template library; adding corresponding parameter content from the user encoding information to the welcome message template to obtain initial welcome message information; and post-processing the initial welcome message information to obtain the welcome message text.

4. The method according to claim 2, wherein generating, based on the strategy information and the user encoding information, the welcome message text that combines the user's contextual semantic content comprises: in response to the strategy information being the second strategy information, generating welcome message generation prompt information based on the user encoding information; inputting the welcome message generation prompt information into a pre-trained generative model to obtain an initial welcome message text; and post-processing the initial welcome message text to obtain the welcome message text.

5. The method according to claim 2, wherein determining, based on the user encoding information, the scenario information for generating the welcome message comprises: in response to a welcome message template existing in a template library whose rule matching confidence with respect to the user encoding information is higher than a first confidence, and/or the load of the current system environment meeting a target load condition, generating scenario information representing the first applicable scenario; and in response to the rule matching confidence of each welcome message template in the template library being lower than a second confidence, and/or the user value corresponding to the target user meeting a preset value condition, generating scenario information representing the second applicable scenario.

6. The method according to claim 1, wherein performing feature encoding processing on the real-time user data to obtain the user encoding information comprises: dynamically allocating weights to each context feature content in the real-time user data using a dynamic weight adjustment method to obtain allocated user data; performing standardized label conversion on the allocated user data to obtain converted data; and performing feature encoding processing on the converted data to obtain the user encoding information.

7. The method according to claim 2, wherein determining, based on the user encoding information, the scenario information for generating the welcome message comprises: determining, based on the user encoding information and using a pre-built reinforcement learning agent, that the scenario information represents the first applicable scenario or the second applicable scenario.

8. The method according to claim 1, wherein generating, based on the strategy information and the user encoding information, the welcome message text that combines the user's contextual semantic content comprises: extracting encoded content related to the user's historical dialogue from the user encoding information; generating an unprocessed intent corresponding to the encoded content; and generating the welcome message text based on the strategy information, the user encoding information, and the unprocessed intent, wherein the welcome message text may be a message text that guides the continuation of the unprocessed intent.

9. The method according to claim 1, wherein the method further comprises: acquiring welcome message feedback information corresponding to the target user; storing the welcome message feedback information; and in response to reaching a strategy update time, updating the strategy information used to generate the welcome message based on the stored welcome message feedback information.

10. The method according to claim 1, wherein rendering and displaying the corresponding digital human based on the welcome message text comprises: generating emotion tags corresponding to the welcome message text; and rendering and displaying the corresponding digital human based on the welcome message text and the emotion tags.

11. A digital human display device, comprising: an acquisition unit configured to acquire real-time user data corresponding to a target user; a processing unit configured to perform feature encoding processing on the real-time user data to obtain user encoding information; a determining unit configured to determine, based on the user encoding information, strategy information for generating a welcome message; a generation unit configured to generate, based on the strategy information and the user encoding information, a welcome message text that combines the user's contextual semantic content; and a display unit configured to render and display a corresponding digital human based on the welcome message text.

12. An electronic device, comprising: one or more processors; and a storage device on which one or more programs are stored, wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-10.

13. A computer-readable medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-10.

14. A computer program product comprising a computer program that, when executed by a processor, implements the method according to any one of claims 1-10.