Virtual avatar-based data processing method and apparatus, and readable storage medium
By displaying and updating the virtual avatars of both parties in the instant chat interface, and generating chat messages by combining voice and image data, the problem of not being able to record and display historical chat messages is solved, enabling rich display methods and real-time status presentation.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- TENCENT TECHNOLOGY (SHENZHEN) CO LTD
- Filing Date
- 2021-10-28
- Publication Date
- 2026-06-23
AI Technical Summary
In existing real-time conversation scenarios, virtual avatars cannot record and display historical conversation messages, and the display methods are relatively simple, failing to present the object's state in real time.
The virtual avatars of both parties in the conversation are displayed in the conversation interface, and the virtual avatars are updated by triggering operations. Conversation messages are generated by combining voice and image data, and the virtual avatars are updated in real time to reflect the status of the objects. At the same time, historical conversation messages are recorded and displayed.
It enables the normal recording and display of historical conversation messages in real-time conversation scenarios with virtual avatars, enriches the display methods, can present the object status in real time, and enhances the sense of presence in the conversation.
Figure CN116048310B_ABST
Description
Technical Field
[0001] This application relates to the field of Internet technology, and in particular to a data processing method, apparatus and readable storage medium based on virtual avatars.
Background Technology
[0002] With the continuous development of Internet technology, more and more users tend to communicate with others through applications with instant messaging capabilities. During instant messaging, users can send various multimedia data such as text, images, voice, and video according to their own needs, thereby achieving the purpose of information exchange and dissemination.
[0003] In existing real-time conversation scenarios, such as text-based or voice-based communication, it is difficult to visually represent the states of both parties. While audio-visual conversations allow for real-time display of each other's states (e.g., emotions) through changes in virtual avatars, enhancing the sense of presence, this method suffers from the difficulty of tracing back historical conversation messages. Therefore, existing real-time conversation display methods are relatively limited and cannot record and display historical conversation messages in scenarios using virtual avatars.
Summary of the Invention
[0004] This application provides a data processing method, apparatus, and readable storage medium based on virtual avatars, which can enrich the display methods of real-time conversations and maintain the normal recording and display of historical conversation messages in real-time conversation scenarios using virtual avatars.
[0005] One embodiment of this application provides a data processing method based on virtual avatars, including:
[0006] The conversation interface is displayed when the first object and the second object are having an instant conversation. In the virtual avatar display area of the conversation interface, the first virtual avatar of the first object and the second virtual avatar of the second object are displayed.
[0007] In response to a trigger operation on the conversation interface, a first conversation message sent by the first object to the second object is displayed in the message display area of the conversation interface; the first conversation message carries first object media data associated with the first object;
[0008] In the virtual avatar display area containing the second virtual avatar, the first virtual avatar is updated to the first virtual updated avatar; the first virtual updated avatar is obtained by updating the first virtual avatar based on the first object media data.
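As a non-limiting illustration, the three steps above (display both avatars, display the sent message in the message display area, update the sender's avatar from the carried media data) can be sketched as follows. The class `ConversationInterface`, its fields, and the string labels used for avatar states are assumptions made for this sketch and do not appear in the application.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationInterface:
    """Hypothetical model of the conversation interface and its two display areas."""
    # Virtual avatar display area: object id -> current avatar state label.
    avatar_area: dict = field(default_factory=dict)
    # Message display area: recorded conversation messages, oldest first.
    message_area: list = field(default_factory=list)

    def open(self, first_object: str, second_object: str) -> None:
        # Both avatars are displayed when the instant conversation starts.
        self.avatar_area[first_object] = "default"
        self.avatar_area[second_object] = "default"

    def send(self, sender: str, receiver: str, media_data: str) -> None:
        # The message carrying the sender's object media data is displayed in
        # the message display area and thereby kept as a historical message.
        self.message_area.append(
            {"from": sender, "to": receiver, "media": media_data}
        )
        # The sender's avatar is updated based on the media data, while the
        # other avatar stays displayed unchanged.
        self.avatar_area[sender] = media_data


ui = ConversationInterface()
ui.open("first_object", "second_object")
ui.send("first_object", "second_object", "laughing")
```

In this sketch the message list is never cleared when an avatar changes, which mirrors the stated goal of keeping historical conversation messages available alongside the avatars.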
[0009] One embodiment of this application provides a data processing method based on virtual avatars, including:
[0010] The conversation interface is displayed when the first object and the second object are having an instant conversation. In the virtual avatar display area of the conversation interface, the first virtual avatar of the first object and the second virtual avatar of the second object are displayed.
[0011] In response to a trigger operation on the conversation interface, a voice control and an image capture area for capturing object image data of a first object are output. When the first object enters voice information through the voice control, the conversation image data of the first object during the real-time conversation is displayed in the image capture area.
[0012] The message display area of the conversation interface displays the first conversation message sent by the first object to the second object; the first conversation message carries the first object media data associated with the first object; the first object media data is determined based on the conversation image data and voice information.
[0013] This data processing method also includes:
[0014] In the virtual avatar display area containing the second virtual avatar, the first virtual avatar is updated to the first virtual updated avatar; the first virtual updated avatar is obtained by updating the first virtual avatar based on the first object media data.
[0015] This data processing method also includes:
[0016] When displaying audio-visual effects and animations carrying conversation image data with voice information in the message display area, the voice information is processed by voice conversion to obtain the second converted text information corresponding to the voice information, and the second converted text information is highlighted synchronously in the message display area.
[0017] One embodiment of this application provides a data processing device based on a virtual avatar, including:
[0018] The first display module is used to display the conversation interface when the first object and the second object are having a real-time conversation. In the virtual image display area of the conversation interface, the first virtual image of the first object and the second virtual image of the second object are displayed.
[0019] The second display module is used to respond to a trigger operation on the session interface and display a first session message sent by the first object to the second object in the message display area of the session interface; the first session message carries first object media data associated with the first object;
[0020] The first update module is used to update the first virtual image to a first virtual updated image in the virtual image display area containing the second virtual image; the first virtual updated image is obtained by updating the first virtual image based on the first object media data.
[0021] The first display module includes:
[0022] The first display unit is used to display the conversation interface when the first object and the second object are having a real-time conversation.
[0023] The image determination unit is used to take the partial virtual image representing the first object as the first virtual image of the first object, and the partial virtual image representing the second object as the second virtual image of the second object.
[0024] The second display unit is used to display the first virtual avatar and the second virtual avatar in the virtual avatar display area of the session interface.
[0025] The conversation interface includes a message display area for showing historical conversation messages; the historical conversation messages are recorded conversation messages between the first object and the second object during an instant conversation; the device also includes:
[0026] The area hiding module is used to respond to the operation of hiding the message display area, hide the message display area in the conversation interface, and use the display interface where the virtual image display area is located as the conversation update interface.
[0027] The second update module is used to update the first virtual image from a partial virtual image of the first object to a complete virtual image of the first object in the session update interface, and to update the second virtual image from a partial virtual image of the second object to a complete virtual image of the second object; when the first object and the second object are having an instant conversation, the second session message sent by the first object to the second object is displayed in the session update interface.
[0028] The device also includes:
[0029] The state switching module is used to respond to the business state switching operation of the first object, update the business state of the first object to the business update state, and update the first virtual image to the second virtual update image that matches the business update state in the virtual image display area containing the second virtual image.
[0030] The second display module includes:
[0031] The first data determination unit is used to determine, in response to a trigger operation on the session interface, a first type of image data that characterizes the object state of the first object during an instant session.
[0032] The first message generation unit is used to take the first type of image data as the first object media data associated with the first object, generate a first session message based on the first object media data for sending to the second object, and display the first session message in the message display area of the session interface.
[0033] The first data determining unit includes:
[0034] The text mapping subunit is used to respond to a trigger operation on the text input control in the session interface, display the text information entered through the text input control; when it is detected that the text information carries state mapping text, it displays the first type of image data mapped by the state mapping text, which represents the object state of the first object when conducting an instant session.
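One possible reading of the text mapping subunit is a keyword lookup over the entered text. The sketch below assumes a hypothetical table `STATE_MAPPING` and hypothetical image identifiers; the application does not specify which texts are state mapping texts or how the match is performed.

```python
# Hypothetical table: state mapping text -> identifier of first type image data.
STATE_MAPPING = {
    "haha": "sticker_laugh",
    "sigh": "sticker_sad",
}


def map_state_text(text: str) -> list[str]:
    """Return the image data mapped by any state mapping text found in the input."""
    lowered = text.lower()
    return [image for keyword, image in STATE_MAPPING.items() if keyword in lowered]
```

For example, `map_state_text("Haha, that is great")` yields the laugh sticker, while text without any state mapping text yields an empty list and leaves the avatar unchanged.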
[0035] The first data determining unit includes:
[0036] The data selection subunit is used to output an image selection panel associated with the status display control in response to a trigger operation on the status display control in the session interface; and to use the image data corresponding to the selection operation as a first type of image data to represent the object state of the first object during an instant session in response to a selection operation on the image selection panel.
[0037] The first data determining unit includes:
[0038] The data determination subunit is used to determine the target image data in the session interface in response to the determination operation of the target image data, and to use the target image data as a first type of image data to characterize the object state of the first object during the instant session.
[0039] The second display module includes:
[0040] The data capture unit is used to respond to the trigger operation of the conversation interface and, when the first object enters voice information through the voice control, call the camera to capture the object image data of the first object.
[0041] The second data determination unit is used to adjust the image state of the first virtual image based on the object image data when the object image data is captured, and to generate a second type of image data to represent the object state of the first object when conducting an instant conversation based on the first virtual image after the image state is adjusted.
[0042] The second message generation unit is used to integrate the second type of image data with voice information to obtain the first object media data associated with the first object, generate a first session message based on the first object media data for sending to the second object, and display the first session message in the message display area of the session interface.
[0043] The second data determining unit includes:
[0044] The state detection subunit is used to perform state detection on the object image data when the object image data is captured, and to use the detected state as the object state to characterize the first object when it is conducting an instant session.
[0045] The state adjustment subunit is used to acquire the first virtual image, adjust the image state of the first virtual image based on the object state, and generate a second type of image data to represent the object state based on the first virtual image after adjusting the image state.
[0046] The second message generation unit includes:
[0047] The first upload subunit is used to upload the second type of image data to the server, and when the second type of image data is successfully uploaded, it obtains the image resource identifier corresponding to the second type of image data.
[0048] The second uploading subunit is used to upload voice information to the server. When the voice information is successfully uploaded, the corresponding voice resource identifier is obtained.
[0049] The first integration subunit is used to integrate the second type of image data carrying image resource identifiers and the voice information carrying voice resource identifiers to obtain the first object media data associated with the first object.
[0050] The device also includes:
[0051] The message generation module is used to determine a third type of image data based on the first virtual image when no object image data is captured, integrate the third type of image data with voice information to obtain the first object media data associated with the first object, and generate a first session message to be sent to the second object based on the first object media data.
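The capture, upload, integration, and fallback steps described above can be sketched as follows, under several assumptions: the function `upload` is a stand-in that only fabricates a resource identifier, the avatar and its adjusted state are represented by plain strings, and the fallback path reuses the same upload step, which the application does not state explicitly.

```python
import itertools
from typing import Optional

_ids = itertools.count(1)


def upload(kind: str, payload: bytes) -> str:
    """Stand-in for a server upload; returns a hypothetical resource identifier."""
    return f"{kind}-{next(_ids)}"


def build_media_data(avatar: str, voice: bytes, detected_state: Optional[str]) -> dict:
    if detected_state is not None:
        # Object image data was captured: adjust the avatar's image state
        # to obtain second type image data.
        image_type, image = "second_type", f"{avatar}:{detected_state}"
    else:
        # Nothing was captured: fall back to third type image data
        # determined from the avatar alone.
        image_type, image = "third_type", avatar
    image_id = upload("image", image.encode())
    voice_id = upload("voice", voice)
    # Integrate the image data and the voice information, each carrying its
    # resource identifier, into the first object media data.
    return {
        "image_type": image_type,
        "image": image,
        "image_resource_id": image_id,
        "voice_resource_id": voice_id,
    }


with_capture = build_media_data("avatar_a", b"\x00\x01", "smiling")
without_capture = build_media_data("avatar_a", b"\x00\x01", None)
```

Carrying identifiers rather than raw bytes in the media data keeps the session message small; the receiving side would resolve the identifiers against the server when the message is played.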
[0052] The second message generation unit includes:
[0053] The speech conversion subunit is used to perform voice conversion processing on the voice information, obtain the converted text information corresponding to the voice information, and display the converted text information in the image capture area used to capture the object image data.
[0054] The second integration subunit is used to integrate the converted text information with the media data of the first object to obtain a first session message for sending to the second object.
[0055] The device also includes:
[0056] The voice playback module is used to respond to a trigger operation on the first object media data carried in the first session message, play the voice information, display the sound effects and animations associated with the voice information in the message display area, and synchronously highlight the converted text information contained in the first session message.
[0057] Specifically, the first update module is used to update the first virtual image based on the first object media data to obtain a first virtual updated image that matches the first object media data, and to update the first virtual image to the first virtual updated image in the virtual image display area containing the second virtual image.
[0058] The first update module includes:
[0059] The first updating unit is used to update the first virtual image based on the first type of image data when the first object media data contains the first type of image data, so as to obtain a first virtual updated image that matches the first type of image data.
[0060] The second updating unit is used to update the first virtual image to the first updated virtual image in the virtual image display area containing the second virtual image;
[0061] The device also includes:
[0062] The third display module is used to display the first type of image data in the virtual image display area.
[0063] The first update unit includes:
[0064] The data detection subunit is used to detect media data in the first session message through the message manager. If the first object media data carried in the first session message contains first type of image data, a state trigger event is generated and sent to the virtual avatar processor. The state trigger event includes the object identifier of the first object and a list of image data. The list of image data is used to record the first type of image data contained in the first object media data.
[0065] The update subunit is used to update the first virtual image associated with the object identifier based on the first type of image data in the image data list when the virtual image processor receives a state trigger event, so as to obtain a first virtual updated image that matches the first type of image data.
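The interaction between the message manager and the virtual avatar processor can be illustrated with a small event-dispatch sketch. The class names, the dictionary layout of a session message, and the rule that the last entry in the image data list determines the avatar are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class StateTriggerEvent:
    object_id: str          # object identifier of the sender
    image_data_list: list   # first type image data found in the message


class AvatarProcessor:
    def __init__(self) -> None:
        self.avatars = {}  # object id -> current avatar state

    def on_state_trigger(self, event: StateTriggerEvent) -> None:
        # Update the avatar associated with the object identifier; letting the
        # last list entry win is an illustrative choice only.
        if event.image_data_list:
            self.avatars[event.object_id] = event.image_data_list[-1]


class MessageManager:
    def __init__(self, processor: AvatarProcessor) -> None:
        self.processor = processor

    def handle(self, message: dict) -> bool:
        # Detect first type image data among the media carried by the message.
        images = [
            m["value"] for m in message["media"] if m["type"] == "first_type_image"
        ]
        if not images:
            return False
        self.processor.on_state_trigger(StateTriggerEvent(message["sender"], images))
        return True


processor = AvatarProcessor()
manager = MessageManager(processor)
fired = manager.handle(
    {"sender": "obj1", "media": [{"type": "first_type_image", "value": "sticker_laugh"}]}
)
```

Messages that carry no first type image data produce no event, so the avatar processor is only invoked when an update is actually needed.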
[0066] The first update module includes:
[0067] The third updating unit is used to update the first virtual image based on the second type of image data in response to a triggering operation on the first object media data when the first object media data contains the second type of image data, so as to obtain a first virtual updated image that matches the second type of image data.
[0068] The fourth update unit is used to update the first virtual image to the first updated virtual image in the virtual image display area containing the second virtual image.
[0069] The device also includes:
[0070] The background update module is used to use the virtual background associated with the first virtual image and the second virtual image as the original virtual background in the virtual image display area; when the background mapping text is detected in the first session message, the original virtual background is updated to a virtual update background based on the background mapping text; and the first virtual update image, the second virtual image and the virtual update background are fused to obtain a fused background virtual image for display in the virtual image display area.
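A minimal sketch of the background update follows, assuming a hypothetical table `BACKGROUND_MAPPING` and representing the fusion step as a simple record of layers rather than actual image compositing. The keywords and background names are invented for illustration.

```python
# Hypothetical table: background mapping text -> virtual background name.
BACKGROUND_MAPPING = {
    "happy new year": "fireworks",
    "good night": "starry_sky",
}


def update_background(original: str, message_text: str) -> str:
    """Return the updated virtual background, or the original if no mapping text is found."""
    lowered = message_text.lower()
    for keyword, background in BACKGROUND_MAPPING.items():
        if keyword in lowered:
            return background
    return original


def fuse(first_avatar: str, second_avatar: str, background: str) -> dict:
    # Stand-in for image fusion: the scene simply records its layers.
    return {"background": background, "avatars": [first_avatar, second_avatar]}


scene = fuse("first_updated", "second", update_background("plain", "Happy New Year!"))
```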
[0071] The device also includes:
[0072] The fourth display module is used to display the third session message in the message display area of the session interface when the third session message sent by the second object is received; the third session message carries the media data of the second object associated with the second object.
[0073] The third update module is used to update the second virtual image to a third virtual updated image in the virtual image display area containing the second virtual image; the third virtual updated image is obtained by updating the second virtual image based on the second object media data.
[0074] The first display module includes:
[0075] The resource request unit is used to obtain the first image resource identifier and the first business status corresponding to the first object, and to obtain the second image resource identifier and the second business status corresponding to the second object. Based on the first image resource identifier, the first business status, the second image resource identifier, and the second business status, it generates an image resource acquisition request and sends the image resource acquisition request to the server. When the server receives the image resource acquisition request, it generates the first image resource address corresponding to the first virtual image based on the first image resource identifier and the first business status, and generates the second image resource address corresponding to the second virtual image based on the second image resource identifier and the second business status. It then returns the first image resource address and the second image resource address.
[0076] The image acquisition unit is used to receive the first image resource address and the second image resource address returned by the server, acquire the first virtual image associated with the first object based on the first image resource address, and acquire the second virtual image associated with the second object based on the second image resource address.
[0077] The third display unit is used to display the first virtual avatar and the second virtual avatar in the virtual avatar display area of the session interface.
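The request and response exchange described for the resource request unit and the image acquisition unit might look like the sketch below. The base URL, the request layout, and the address format are assumptions; the application only states that each address is generated from an image resource identifier and a business status.

```python
# Hypothetical base URL; the source does not specify an address format.
BASE_URL = "https://example.invalid/avatars"


def build_request(first_id: str, first_status: str, second_id: str, second_status: str) -> dict:
    """Client side: bundle both identifiers and business statuses into one request."""
    return {
        "objects": [
            {"image_resource_id": first_id, "business_status": first_status},
            {"image_resource_id": second_id, "business_status": second_status},
        ]
    }


def handle_request(request: dict) -> list:
    """Server side: derive one image resource address per object."""
    return [
        f"{BASE_URL}/{o['image_resource_id']}/{o['business_status']}.png"
        for o in request["objects"]
    ]


addresses = handle_request(build_request("a1", "online", "b2", "busy"))
```

Because the business status is part of the address, the same object can resolve to different avatar resources as its status changes, which is consistent with the state switching module described earlier.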
[0078] One embodiment of this application provides a data processing device based on a virtual avatar, including:
[0079] The first display module is used to display the conversation interface when the first object and the second object are having a real-time conversation. In the virtual image display area of the conversation interface, the first virtual image of the first object and the second virtual image of the second object are displayed.
[0080] The voice input module is used to respond to the trigger operation of the conversation interface, output the voice control and the image capture area for capturing the object image data of the first object. When the first object inputs voice information through the voice control, the conversation image data of the first object during the real-time conversation is displayed in the image capture area.
[0081] The second display module is used to display a first session message sent by the first object to the second object in the message display area of the session interface; the first session message carries first object media data associated with the first object; the first object media data is determined based on session image data and voice information.
[0082] The device also includes:
[0083] The image update module is used to update the first virtual image to a first virtual updated image in the virtual image display area containing the second virtual image; the first virtual updated image is obtained by updating the first virtual image based on the first object media data.
[0084] The first object media data includes conversational image data carrying voice information; the conversational image data is generated based on the captured object image data of the first object, or the conversational image data is obtained by adjusting the image state of the first virtual avatar based on the captured object image data of the first object;
[0085] The device also includes:
[0086] The third display module is used to respond to a trigger operation on the first object media data carried by the first session message, play voice information, and display sound effects and animations carrying the session image data in the message display area.
[0087] The first session message also includes first converted text information obtained by performing voice conversion processing on the voice information in the image capture area;
[0088] The device also includes:
[0089] The fourth display module is used to synchronously highlight the first conversion text information in the message display area when displaying audio-visual effects and animations carrying conversation image data with voice information in the message display area.
[0090] The device also includes:
[0091] The fifth display module is used to perform voice conversion processing on the voice information when displaying audio-visual animations carrying conversation image data in the message display area, to obtain the second converted text information corresponding to the voice information, and to synchronously highlight the second converted text information in the message display area.
[0092] One embodiment of this application provides a computer device, including: a processor and a memory;
[0093] The processor is connected to a memory, which stores a computer program. When the computer program is executed by the processor, it causes the computer device to perform the method provided in the embodiments of this application.
[0094] One aspect of this application provides a computer-readable storage medium storing a computer program adapted to be loaded and executed by a processor, so that a computer device having the processor performs the method provided in this application.
[0095] One embodiment of this application provides a computer program product or computer program, which includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the computer device to perform the method provided in this application embodiment.
[0096] In this embodiment, when the first object and the second object are in an instant conversation, a conversation interface for the instant conversation can be displayed. In the virtual avatar display area of the conversation interface, a first virtual avatar of the first object and a second virtual avatar of the second object are displayed. Furthermore, in response to a trigger operation on the conversation interface, a first conversation message sent by the first object to the second object can be displayed in the message display area of the conversation interface. Further, since the first conversation message carries first object media data associated with the first object, the first virtual avatar can be updated to a first updated virtual avatar based on the first object media data in the virtual avatar display area containing the second virtual avatar. Therefore, during real-time conversations via the conversation interface, the virtual avatars of both parties (i.e., the first object and the second object) can be displayed in the virtual avatar display area, and the conversation messages generated during the real-time conversation (i.e., historical conversation messages, such as the first conversation message) can be displayed in the message display area. This makes it convenient for both parties to review historical conversation messages without encountering a situation where historical conversation messages cannot be recorded and displayed normally. In other words, the embodiments of this application can maintain the normal recording and display of historical conversation messages in real-time conversation scenarios using virtual avatars. In addition, the first virtual avatar can be updated based on the first object media data carried by the first conversation message to present the object status of the first object in real time, thereby enriching the display methods of real-time conversations.
Attached Figure Description
[0097] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0098] Figure 1 is a schematic diagram of a network architecture provided in an embodiment of this application;
[0099] Figure 2 is a schematic diagram of a data processing scenario based on a virtual avatar provided in an embodiment of this application;
[0100] Figure 3 is a flowchart illustrating a data processing method based on a virtual avatar provided in an embodiment of this application;
[0101] Figure 4 is a schematic diagram of a conversational interface provided in an embodiment of this application;
[0102] Figure 5 is a schematic diagram of a scene for determining image data provided in an embodiment of this application;
[0103] Figure 6 is a schematic diagram of a scene for selecting image data provided in an embodiment of this application;
[0104] Figure 7 is a schematic diagram of a scene for determining image data provided in an embodiment of this application;
[0105] Figure 8 is a schematic diagram of a scenario for recording voice information provided in an embodiment of this application;
[0106] Figure 9 is a schematic diagram of a scenario for recording voice information provided in an embodiment of this application;
[0107] Figure 10 is a flowchart illustrating a data processing method based on a virtual avatar provided in an embodiment of this application;
[0108] Figure 11 is an interactive schematic diagram of obtaining a virtual avatar provided in an embodiment of this application;
[0109] Figure 12 is a schematic diagram of a process for recording voice information provided in an embodiment of this application;
[0110] Figure 13 is a schematic diagram of a process for sending session messages provided in an embodiment of this application;
[0111] Figure 14 is a schematic diagram of a process for updating a virtual avatar provided in an embodiment of this application;
[0112] Figure 15 is a schematic diagram of a scene for updating a virtual background provided in an embodiment of this application;
[0113] Figure 16 is a flowchart illustrating a data processing method based on a virtual avatar provided in an embodiment of this application;
[0114] Figure 17 is a flowchart illustrating a data processing method based on a virtual avatar provided in an embodiment of this application;
[0115] Figure 18 is a schematic diagram of the structure of a data processing device based on a virtual image provided in an embodiment of this application;
[0116] Figure 19 is a schematic diagram of the structure of a data processing device based on a virtual image provided in an embodiment of this application;
[0117] Figure 20 is a schematic diagram of the structure of a computer device provided in an embodiment of this application.
Detailed Implementation
[0118] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.
[0119] Please refer to Figure 1, which is a schematic diagram of a network architecture provided in an embodiment of this application. As shown in Figure 1, the network architecture may include a server 100 and a terminal cluster. The terminal cluster may include user terminals 200a, 200b, 200c, ..., 200n. Communication connections may exist between the user terminals in the terminal cluster; for example, there may be a communication connection between user terminals 200a and 200b, or between user terminals 200a and 200c. At the same time, any user terminal in the terminal cluster may have a communication connection with the server 100; for example, there may be a communication connection between user terminal 200a and server 100. The communication connection may be established directly or indirectly through wired communication, wireless communication, or other methods; this application does not limit the connection method.
[0120] It should be understood that each user terminal in the terminal cluster shown in Figure 1 can have an application client installed. When the application client runs on a user terminal, it can exchange data with the server 100 shown in Figure 1. The application client can be an instant messaging application, social networking application, live streaming application, short video application, video application, music application, shopping application, game application, novel application, payment application, browser, or other application client with instant conversation capabilities. This application client can be a standalone client or an embedded sub-client integrated into another client (such as a social networking client, game client, etc.), without limitation. Instant conversation, also known as instant messaging or instant chat, is a system service on the Internet used for real-time communication, supporting the real-time transmission of information streams such as text, voice, video, images, and documents. Taking an instant messaging application as an example, server 100 may include multiple servers such as a backend server and a data processing server corresponding to the instant messaging application. Therefore, each user terminal can transmit data with server 100 through the application client corresponding to the instant messaging application. Each user terminal can also engage in instant conversations with other user terminals through server 100 to achieve communication and sharing anytime, anywhere. For example, different user terminals can communicate instantly by sending and receiving conversation messages.
[0121] For ease of understanding, let's take user terminal 200a and user terminal 200b as examples. User terminal 200a can generate session message A through its installed instant messaging application, and then send session message A to server 100. Subsequently, user terminal 200b can receive session message A through server 100 and display session message A in the corresponding session interface of user terminal 200b. Similarly, user terminal 200b can also send session message B to user terminal 200a through its installed instant messaging application via server 100. This realizes an instant conversation between user terminal 200a and user terminal 200b.
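The relay described above can be sketched as follows. This is a minimal in-memory illustration, not the application's implementation: the class names `RelayServer`, `Terminal`, and `SessionMessage` and their fields are assumptions introduced here, and no real network transport is modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class SessionMessage:
    sender: str    # identifier of the sending terminal, e.g. "200a"
    receiver: str  # identifier of the receiving terminal, e.g. "200b"
    content: str


@dataclass
class Terminal:
    terminal_id: str
    # Messages shown in this terminal's session interface (hypothetical field).
    displayed: List[SessionMessage] = field(default_factory=list)

    def receive(self, message: SessionMessage) -> None:
        self.displayed.append(message)


class RelayServer:
    """Hypothetical stand-in for server 100: forwards messages between terminals."""

    def __init__(self) -> None:
        self._terminals: Dict[str, Terminal] = {}

    def register(self, terminal: Terminal) -> None:
        self._terminals[terminal.terminal_id] = terminal

    def forward(self, message: SessionMessage) -> None:
        # Deliver the message to the terminal named as receiver.
        self._terminals[message.receiver].receive(message)


server = RelayServer()
terminal_a, terminal_b = Terminal("200a"), Terminal("200b")
server.register(terminal_a)
server.register(terminal_b)
server.forward(SessionMessage("200a", "200b", "session message A"))
server.forward(SessionMessage("200b", "200a", "session message B"))
```

Both directions use the same `forward` call, mirroring the symmetric roles of the two terminals in the text.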
[0122] The conversation messages may include one or more of the following message types: text messages, voice messages, emoticon messages, image messages (including static and animated images), link messages, mini-program messages, video messages, file messages, and virtual item messages (which can be used to send and receive virtual items such as virtual gifts and virtual red envelopes). This application embodiment does not limit the specific type of conversation messages.
[0123] It is understood that the methods provided in this application embodiment can be executed by computer equipment, which includes, but is not limited to, user terminals or servers. The server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud databases, cloud services, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The user terminal can be a smart terminal capable of running application clients with real-time conversation capabilities, such as a smartphone, tablet, laptop, desktop computer, PDA, mobile internet device (MID), wearable device (e.g., smartwatch, smart bracelet), smart computer, or smart vehicle. The user terminal and server can be connected directly or indirectly via wired or wireless means, and this application embodiment does not impose any limitations on this connection.
[0124] It should be noted that the user terminal can provide a conversation interface for sending and receiving conversation messages, and can also make full use of the virtual avatar of an object during the instant conversation to vividly display the object's state. Here, "object" refers to a user participating in the instant conversation (e.g., user A and user B). The number of objects participating in the same instant conversation can be one or more; this application embodiment does not limit the specific number of objects. For ease of understanding and explanation in the following embodiments, objects participating in the same instant conversation are divided into a first object (e.g., object A) and a second object (e.g., object B). The first object can act as both a sender and a receiver of conversation messages. Similarly, the second object can also act as both a sender and a receiver of conversation messages; this application embodiment does not limit this. The object's state refers to the object's emotions or feelings during the instant conversation, such as happiness, excitement, sadness, fear, or confusion. The object's state can be reflected through facial expressions, body movements, or the conversation messages sent. It should be understood that in virtual social scenarios, objects can appear as virtual avatars. Therefore, in this embodiment, the virtual avatar of the first object can be referred to as the first virtual avatar, and the virtual avatar of the second object can be referred to as the second virtual avatar. The virtual avatar here can be an avatar pre-configured according to the object's needs, for example, composed of avatar resources (such as virtual wearable items, makeup, hairstyles, etc.) selected from an avatar resource library.
It can also be a virtual avatar reconstructed from the object's real appearance, such as a virtual avatar rendered by the rendering engine on the application client based on collected real object data (such as the object's face shape, hairstyle, clothing, etc.). The aforementioned virtual social scenario refers to a 3D (three-dimensional) virtual space based on the future Internet that exhibits convergence and physical persistence through virtually enhanced physical reality and is characterized by connection, perception, and sharing; in other words, an interactive, immersive, and collaborative world.
[0125] Specifically, this application provides a virtual avatar-based instant messaging design. When a first object and a second object are engaged in an instant messaging session, the user terminal of the first object can display a session interface for the session. Within the virtual avatar display area of this session interface, a first virtual avatar of the first object and a second virtual avatar of the second object are displayed. Furthermore, in response to a trigger operation on the session interface, a first session message sent by the first object to the second object is displayed in the message display area of the session interface. Further, since the first session message carries media data associated with the first object, the user terminal can update the first virtual avatar to a first updated virtual avatar in the virtual avatar display area containing the second virtual avatar based on the first object media data. In other words, the session interface provided by this application can include a virtual avatar display area and a message display area. The virtual avatar display area displays the virtual avatars of both parties in the session, while the message display area displays historical session messages generated during the instant messaging process. Therefore, either party can easily browse historical session messages through the corresponding message display area, meaning that historical session messages are traceable. Furthermore, to enrich the display methods of real-time conversations, embodiments of this application can also update the corresponding virtual avatar (e.g., the first virtual avatar) based on the object media data (e.g., the first object media data) carried in the conversation message (e.g., the first conversation message) to present the object state of the object corresponding to the virtual avatar in real time. 
Here, object media data refers to media data associated with an object carried in the conversation message, including but not limited to text data, image data, and voice data. Object media data can affect the virtual avatar of the object. For example, when conversation message X carries image data Y (e.g., a "smiling" emoticon) representing the object state of object A during a real-time conversation, the virtual avatar of object A can change along with the image data Y. It should be understood that in embodiments of this application, the object media data associated with the first object carried in the first conversation message can be referred to as the first object media data. Similarly, the object media data associated with the second object carried in a conversation message sent by the second object to the first object can be referred to as the second object media data.
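As a rough illustration of how object media data might drive an avatar update, the sketch below maps an emoticon carried in a message to an avatar expression. The `Avatar` and `SessionMessage` types, the `EMOTICON_TO_EXPRESSION` table, and the `update_avatar` function are all hypothetical names chosen for this example; the application does not specify such a mapping.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup from an emoticon identifier to an avatar expression.
EMOTICON_TO_EXPRESSION = {"smile": "smiling", "cry": "sad"}


@dataclass(frozen=True)
class Avatar:
    owner: str
    expression: str = "neutral"


@dataclass(frozen=True)
class SessionMessage:
    sender: str
    text: str = ""
    # Object media data carried by the message, e.g. an emoticon id;
    # None when the message carries none.
    object_media: Optional[str] = None


def update_avatar(avatar: Avatar, message: SessionMessage) -> Avatar:
    """Return an updated avatar when the message carries recognised object
    media data from the avatar's owner; otherwise return the avatar unchanged."""
    if message.sender != avatar.owner or message.object_media is None:
        return avatar
    expression = EMOTICON_TO_EXPRESSION.get(message.object_media)
    if expression is None:
        return avatar
    return Avatar(owner=avatar.owner, expression=expression)


first_avatar = Avatar(owner="A")
second_avatar = Avatar(owner="B")
message_x = SessionMessage(sender="A", object_media="smile")

first_updated = update_avatar(first_avatar, message_x)      # first updated virtual avatar
second_unchanged = update_avatar(second_avatar, message_x)  # B's avatar is not affected
```

Only the sender's avatar changes, which matches the text: the first virtual avatar is updated while the second virtual avatar stays in the same display area.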
[0126] For ease of understanding, please refer to Figure 2, which is a schematic diagram of a data processing scenario based on a virtual avatar, provided in an embodiment of this application. The data processing scenario can be implemented in the server 100 shown in Figure 1, in a user terminal, or jointly by a user terminal and the server; no limitation is imposed here. This embodiment uses the interaction between user terminal 20A, user terminal 20B, and server 20C as an example for illustration. Here, user terminal 20A can be any user terminal in the terminal cluster shown in Figure 1, for example, user terminal 200a; user terminal 20B can also be any user terminal in the terminal cluster shown in Figure 1, for example, user terminal 200b; and server 20C can be the server 100 shown in Figure 1.
[0127] As shown in Figure 2, object A and object B are the two parties conducting an instant conversation. Object A can be the first object, and object B can be the second object. Object A is bound to user terminal 20A, and object B is bound to user terminal 20B. Object A and object B can conduct an instant conversation through their respective bound user terminals. For ease of understanding and distinction, in this embodiment, the user terminal corresponding to the first object can be referred to as the first terminal (e.g., user terminal 20A), and the user terminal corresponding to the second object can be referred to as the second terminal (e.g., user terminal 20B). It should be understood that both the first terminal and the second terminal can send and receive conversation messages. Therefore, in this embodiment, when the first terminal sends a conversation message to the second terminal, the first terminal acts as the sending terminal and the second terminal acts as the receiving terminal; when the second terminal sends a conversation message to the first terminal, the second terminal acts as the sending terminal and the first terminal acts as the receiving terminal.
[0128] It should be understood that the process of user terminal 20A sending a session message to user terminal 20B is the same as the process of user terminal 20B sending a session message to user terminal 20A. This application embodiment only uses user terminal 20A as the sending terminal and user terminal 20B as the receiving terminal as an example for illustration.
[0129] Specifically, after user terminal 20A (i.e., the first terminal) opens its installed application client, it can first display the application client's homepage, then display a message list on the homepage, and subsequently, in response to a selection operation on the message list, output a conversation interface 201 (also called the first conversation interface) for real-time conversation with object B. Optionally, in response to a switching operation on the application homepage, it can also display a list of conversation objects, and subsequently, in response to a selection operation on the list of conversation objects, output a conversation interface 201 for real-time conversation with object B. As shown in Figure 2, the conversation interface 201 can display the name of object B, such as "lollipop," to indicate the object that object A is currently communicating with. The conversation interface 201 may also include a virtual avatar display area 201a. This virtual avatar display area 201a can be used to display the virtual avatar of object A (i.e., the first virtual avatar, for example, virtual avatar 20a) and the virtual avatar of object B (i.e., the second virtual avatar, for example, virtual avatar 20b). The virtual avatars displayed here can take many forms; they can be 2D or 3D, static or dynamic, partial or complete. This embodiment does not limit the specific form of the virtual avatar.
[0130] It should be understood that the virtual avatar can be selected by the object as needed. For example, user terminal 20A can respond to object A's trigger operation on the session interface 201 by displaying one or more virtual avatars, and then respond to a virtual avatar selection operation for these one or more virtual avatars, using the virtual avatar corresponding to the selection operation as object A's virtual avatar. Optionally, the corresponding virtual avatar can also be reconstructed based on collected real object data (such as the object's face shape, hairstyle, clothing, etc.). For example, user terminal 20A can call a camera to collect image data of object A, and then extract the real object data of object A from the image data. Subsequently, it can call a rendering engine to render the virtual avatar of object A based on the real object data; or, it can match the collected real object data in an avatar resource library and combine the matched avatar resources to obtain the corresponding virtual avatar. Optionally, object A can also adjust its virtual image. For example, it can adjust the virtual wearable items (e.g., clothing, headwear, hat, glasses, backpack, etc.), hairstyle, makeup (e.g., eye shape, eyeshadow, lip shape, lip color, blush, etc.) of the virtual image to obtain the desired virtual image.
[0131] It should be noted that, optionally, the virtual avatar can be associated with the object's business status. That is, different forms or postures of the virtual avatar can be used to represent different business states of the corresponding object. Business status refers to the object's login status on the application client, which can be set by the object itself, including but not limited to online, offline, busy, gaming, resting, and invisible states. For example, if object A's current business status is online, then the displayed virtual avatar 20a can be a standing position; while if object B's current business status is resting, then the displayed virtual avatar 20b can be a sleeping position. It should be understood that when an object's business status changes, its virtual avatar will also change accordingly.
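The association between business status and avatar pose can be pictured as a simple lookup, sketched below. Only the pairs "online, standing" and "resting, sleeping" come from the text; the `BusinessState` enum, the `pose_for` function, and the fallback pose `"idle"` are assumptions made for this illustration.

```python
from enum import Enum


class BusinessState(Enum):
    ONLINE = "online"
    OFFLINE = "offline"
    BUSY = "busy"
    GAMING = "gaming"
    RESTING = "resting"
    INVISIBLE = "invisible"


# Only these two pairs are taken from the description; other states are
# intentionally left unmapped rather than guessed.
POSE_BY_STATE = {
    BusinessState.ONLINE: "standing",
    BusinessState.RESTING: "sleeping",
}

DEFAULT_POSE = "idle"  # hypothetical fallback for states the text does not cover


def pose_for(state: BusinessState) -> str:
    """Return the avatar pose used to represent a business state."""
    return POSE_BY_STATE.get(state, DEFAULT_POSE)


pose_a = pose_for(BusinessState.ONLINE)   # object A is online
pose_b = pose_for(BusinessState.RESTING)  # object B is resting
```

When an object's business status changes, calling `pose_for` again with the new state yields the pose for the refreshed avatar.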
[0132] As shown in Figure 2, the conversation interface 201 may further include a message display area 201b, in which conversation messages (i.e., historical conversation messages, such as conversation message 201d) generated during an instant conversation between object A and object B can be displayed. Optionally, in the message display area 201b, conversation messages with earlier timestamps can be displayed first, and those with later timestamps can be displayed later. Object A can scroll back through historical conversation messages by performing a browsing operation (e.g., a swipe operation) on the message display area 201b. Furthermore, the user terminal 20A can generate a conversation message (such as a first conversation message) to be sent to object B in response to a trigger operation on the conversation interface 201, and display the conversation message in the message display area 201b of the conversation interface 201. The conversation interface 201 may further include a message input control bar 201c, which may include one or more message input controls, such as text input controls, status display controls, and voice controls. These controls allow users to input relevant information to generate conversation messages. For example, a text input control can be used to input text information (such as the text "Okay"), and a status display control can be used to select image data (such as emoticons) to be sent. Optionally, conversation messages can also be generated without using a control, for example by using historical image data (such as previously used emoticons) displayed in the conversation interface as the conversation message to be sent. This application embodiment does not limit the method of generating conversation messages.
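The optional timestamp ordering of the message display area can be sketched as a sort on the sending timestamp. The `HistoricalMessage` type, its fields, and the sample messages are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class HistoricalMessage:
    sender: str
    text: str
    sent_at: float  # sending timestamp; seconds is an assumed unit


def display_order(messages: Iterable[HistoricalMessage]) -> List[HistoricalMessage]:
    """Return messages with earlier sending timestamps first."""
    return sorted(messages, key=lambda m: m.sent_at)


history = [
    HistoricalMessage("B", "See you at 8?", 20.0),
    HistoricalMessage("A", "Okay", 35.0),
    HistoricalMessage("A", "Hi", 10.0),
]
ordered_texts = [m.text for m in display_order(history)]
```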
[0133] It should be understood that when a session message carries object media data associated with a certain object, the virtual avatar of that object can be updated based on that object media data. Taking session message 201e sent by object A as an example, session message 201e carries object media data associated with object A (i.e., first object media data), such as a smiley face emoticon (which belongs to image data). This object media data can be used to characterize the object state of object A during an instant session; for example, a smiley face emoticon represents happiness. Therefore, after the sending of session message 201e is triggered, user terminal 20A displays session message 201e in the message display area 201b of the session interface 201. (It should be understood that the two session interfaces 201 shown in Figure 2 are the session interfaces of user terminal 20A at different times.) At the same time, user terminal 20A can update the virtual avatar 20a based on the object media data in the session message 201e, thereby obtaining a virtual avatar 201f (i.e., the first updated virtual avatar) that matches the object media data. Then, in the virtual avatar display area 201a containing virtual avatar 20b, virtual avatar 20a is updated to virtual avatar 201f. As shown in Figure 2, the virtual avatar 20a is updated according to the smiley face emoticon in the session message 201e, and the resulting virtual avatar 201f also displays a smiling face. Optionally, the object media data carried in the session message 201e can also be displayed in the virtual avatar display area 201a. For example, the smiley face emoticon can be displayed in the vicinity of the virtual avatar 201f (i.e., around the virtual avatar 201f). It should be understood that the object media data associated with object A may also be other types of data, such as text data or voice data, which may also affect the virtual avatar of object A. This embodiment only uses image data as an example for illustration. For the specific process of updating the virtual avatar based on other types of data, please refer to the embodiment corresponding to Figure 3 below.
[0134] Furthermore, user terminal 20A sends the session message input by object A to server 20C. After receiving the session message, server 20C forwards it to user terminal 20B (i.e., the second terminal). User terminal 20B also updates the virtual avatar of object A based on the object media data associated with object A carried in the session message. Taking session message 201e as an example, as shown in Figure 2, the conversation interface 202 is the conversation interface (also known as the second conversation interface) on the user terminal 20B used for instant conversation between object B and object A. The interface structure of the conversation interface 202 is the same as that of the conversation interface 201 displayed on the user terminal 20A. The conversation interface 202 includes a virtual avatar display area 202a and a message display area 202b. The virtual avatar display area 202a can display the virtual avatars of object B and object A, and the message display area 202b can display historical conversation messages. After receiving the session message 201e sent by user terminal 20A, the application client on user terminal 20B can display the session message 201e in the message display area 202b of its session interface 202. Since the session message 201e carries object media data (e.g., a smiley face emoticon) associated with object A, user terminal 20B can update the virtual avatar 20a based on the object media data in the session message 201e, thereby obtaining a virtual avatar 201f that matches the object media data. In the virtual avatar display area 202a containing the virtual avatar 20b, the original virtual avatar 20a of object A is updated to the virtual avatar 201f. Optionally, the object media data carried in the session message 201e can also be displayed in the virtual avatar display area 202a. For example, the smiley face emoticon can be displayed in the vicinity of the virtual avatar 201f (i.e., around the virtual avatar 201f).
[0135] It should be understood that when object B sends a session message to object A, i.e., user terminal 20B acts as the sending terminal and user terminal 20A acts as the receiving terminal, the interaction process between user terminal 20B and user terminal 20A is consistent with the interaction process described above. That is to say, both user terminals can display the session message (i.e., the third session message) sent by object B to object A in the message display area of their corresponding session interface, and update the virtual image of object B (e.g., virtual image 20b) in the corresponding virtual image display area based on the object media data (i.e., the second object media data) associated with object B carried in the session message. The specific process will not be described in detail.
[0136] As can be seen from the above, the embodiments of this application can enrich the display methods of real-time conversations based on virtual avatars, and at the same time maintain the normal recording and display of historical conversation messages in real-time conversation scenarios using virtual avatars.
[0137] Please refer to Figure 3, which is a flowchart of a data processing method based on a virtual avatar, as provided in an embodiment of this application. This data processing method can be executed by a computer device, which may include the user terminal or server described in Figure 1. In this embodiment, the user terminal corresponding to the first object is referred to as the first terminal (e.g., user terminal 200a), and the user terminal corresponding to the second object is referred to as the second terminal (e.g., user terminal 200b). Both the first and second terminals can be either sending or receiving terminals. For ease of understanding, this embodiment uses the execution of the method by the first terminal as an example for explanation. This data processing method may include at least the following steps S101-S103:
[0138] Step S101: Display the conversation interface when the first object and the second object are having an instant conversation. In the virtual avatar display area of the conversation interface, display the first virtual avatar of the first object and the second virtual avatar of the second object.
[0139] Specifically, the first terminal can display a conversation interface when the first object and the second object are having an instant conversation. The conversation interface corresponding to the instant conversation structure provided in this application embodiment places more emphasis on the virtual image. Therefore, the partial virtual image used to represent the first object can be used as the first virtual image of the first object, and the partial virtual image used to represent the second object can be used as the second virtual image of the second object. Subsequently, the first virtual image and the second virtual image can be displayed in the virtual image display area of the conversation interface. Here, the partial virtual image can be understood as a part of the complete virtual image (e.g., the upper body of the complete virtual image). For example, referring again to the embodiment corresponding to Figure 2 above, the virtual image 20a of object A displayed in the virtual image display area 201a of the conversation interface 201 can be the first virtual image, and the virtual image 20b of object B can be the second virtual image. It can be seen that at this time, the virtual image 20a is a partial virtual image of object A, and the virtual image 20b is a partial virtual image of object B.
[0140] Optionally, the complete virtual image representing the first object can be used as the first virtual image of the first object, and the complete virtual image representing the second object can be used as the second virtual image of the second object. The first and second virtual images can then be displayed in the virtual image display area of the session interface. It is understood that since the display size of the virtual image display area of the session interface is relatively small, the display size of the complete virtual image displayed in the virtual image display area will also be relatively small.
[0141] It should be understood that the virtual avatar display area (e.g., the virtual avatar display area 201a in Figure 2) can be displayed as a floating window, an overlay, or a semi-transparent layer in any area of the conversation interface (e.g., the conversation interface 201 in Figure 2); for example, the virtual avatar display area may be the top area of the conversation interface. Alternatively, it can be displayed in a collapsible interface whose display size can be changed by a trigger operation (such as dragging) and which is smaller than the conversation interface.
[0142] In addition, the conversation interface also includes a message display area for showing historical conversation messages. These historical conversation messages are the recorded conversation messages between the first and second objects during their real-time conversation. Historical conversation messages can be displayed in the message display area in bubble format or in other forms, such as in the message display area 201b of the conversation interface 201 in the embodiment corresponding to Figure 2 above. Optionally, this message display area may also display message sending information associated with historical conversation messages, such as the sending timestamp and sending object information (e.g., the sending object's avatar and name). It should be understood that the message display area can be displayed in any area of the conversation interface as a floating window, an overlay, or a semi-transparent layer; for example, the message display area may be the bottom area of the conversation interface. The virtual avatar display area and the message display area may not overlap, or they may partially overlap. Optionally, the message display area may also be displayed in a collapsible interface whose display size can be changed by a trigger operation (e.g., dragging) and which is smaller than the conversation interface.
[0143] This application also provides a novel conversation interface, called a conversation update interface. Both the first object and the second object can engage in real-time conversations through this conversation update interface, and the first terminal can easily switch between these two conversation interfaces at any time based on the first object's actions. Specifically, the first terminal can respond to a hiding operation on the message display area by hiding the message display area in the conversation interface, and use the display interface containing the virtual avatar display area as the conversation update interface. The hidden message display area can be placed at the bottom layer of the conversation interface, or it can be hidden by extending downwards. This application does not limit the specific method of hiding the message display area. Furthermore, in the conversation update interface, the first virtual avatar can be updated from a partial virtual avatar of the first object to a complete virtual avatar of the first object, and the second virtual avatar can be updated from a partial virtual avatar of the second object to a complete virtual avatar of the second object. Here, a complete virtual avatar can refer to a virtual avatar with a complete body or a virtual avatar integrated with a virtual background.
[0144] Optionally, if the virtual avatar originally displayed in the virtual avatar display area is the complete virtual avatar, then after switching from the session interface to the session update interface, the complete virtual avatar in the virtual avatar display area can be enlarged. In other words, the display size of the complete virtual avatar displayed in the virtual avatar display area of the session interface is smaller than the display size of the complete virtual avatar displayed in the session update interface.
[0145] It should be understood that the message display method of the session update interface is slightly different from that of the session interface. In one embodiment, when the first object and the second object are in an instant session, if the first object sends a second session message to the second object, the second session message can be displayed in the session update interface. Optionally, the second session message can be displayed in the vicinity of the first object's complete virtual avatar, or in other areas of the session update interface associated with the first object. This application embodiment does not limit the display position of the second session message in the session update interface.
[0146] Optionally, the second session message can be displayed in the form of a bubble, a scrolling message, or other forms. This application embodiment does not limit the specific display form of the second session message in the session update interface.
[0147] Optionally, to avoid interfering with the presentation of subsequent new session messages, a corresponding message display threshold can be set for the session messages displayed in the session update interface. For example, when the display duration of the second session message equals the message display threshold, the second session message can be hidden or deleted in the session update interface. Therefore, the session messages displayed in the session update interface are always the latest session messages, i.e., the session messages with the latest sending timestamp. It should be understood that the second session message will still be recorded in the historical session messages. That is, when the first terminal switches back to the original session interface, the second session message can still be displayed in that session interface, meaning that the sent session messages are traceable. In this embodiment, the specific size of the message display threshold is not limited.
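The interplay between the message display threshold and the retained history can be sketched as below. This is an illustrative model under assumed names (`SessionUpdateView`, `display_threshold`, `visible`) and an assumed unit of seconds; the application leaves the threshold value open.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SessionUpdateView:
    display_threshold: float  # assumed to be in seconds
    # Every sent message stays in the history as (sent_at, text), never pruned.
    history: List[Tuple[float, str]] = field(default_factory=list)

    def send(self, text: str, sent_at: float) -> None:
        self.history.append((sent_at, text))

    def visible(self, now: float) -> List[str]:
        """Messages still shown in the session update interface at time `now`.

        A message is hidden once its display duration reaches the threshold.
        """
        return [
            text
            for sent_at, text in self.history
            if now - sent_at < self.display_threshold
        ]


view = SessionUpdateView(display_threshold=5.0)
view.send("second session message", sent_at=0.0)
shown_early = view.visible(now=3.0)  # still within the threshold
shown_late = view.visible(now=5.0)   # display duration equals the threshold
```

Hiding only filters what is shown; `view.history` keeps the message, so switching back to the original session interface can still display it.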
[0148] For ease of understanding, please refer to Figure 4, which is a schematic diagram of a session interface provided in an embodiment of this application. As shown in Figure 4, the conversation interface 401 includes a virtual avatar display area 401a and a message display area 401b. The virtual avatar display area 401a can display a partial virtual avatar (i.e., the first virtual avatar) of object A (i.e., the first object), for example, virtual avatar 4a, and simultaneously display a partial virtual avatar (i.e., the second virtual avatar) of object B (i.e., the second object), for example, virtual avatar 4b. To switch to the other conversation interface, a hiding operation can be performed on the message display area 401b, for example, a swipe operation (such as pulling down the message display area 401b). Optionally, the conversation interface 401 may include an area hiding control 401c. In response to a trigger operation (e.g., a click operation) on the area hiding control 401c, the message display area 401b can be hidden in the conversation interface 401, thereby switching from the conversation interface 401 to the conversation update interface 402 and updating virtual avatars 4a and 4b. Further, as shown in Figure 4, in the session update interface 402, virtual avatar 4a is updated to virtual avatar 4A, which is the complete virtual avatar of object A. At the same time, virtual avatar 4b is updated to virtual avatar 4B, which is the complete virtual avatar of object B. When object A sends a new session message (i.e., a second session message, such as session message 402a) to object B, the session message can be displayed in the session update interface 402. For example, if object A sends session message 402a to object B, the session message 402a can be displayed in an appropriate form (such as a bubble) in the vicinity of virtual avatar 4A displayed in the session update interface 402 (e.g., above the head of virtual avatar 4A), indicating that the session message 402a was sent by object A.
[0149] Furthermore, if the first object wishes to switch back to the session interface, the first terminal can respond to a trigger operation on the session update interface by displaying the message display area in the session update interface, and the display interface containing the message display area can then be used as the session interface. In the session interface, the first virtual avatar can be restored from the complete virtual avatar of the first object to the partial virtual avatar of the first object, and the second virtual avatar can be restored from the complete virtual avatar of the second object to the partial virtual avatar of the second object. Additionally, the previously received second session message can be displayed in the message display area. As shown in Figure 4, the trigger operation on the session update interface 402 can be, for example, a swipe operation (e.g., pulling up the session update interface 402). Optionally, the session update interface 402 may include a region display control 402b; in response to a trigger operation on the region display control 402b (e.g., a click operation), the message display area can be displayed on the session update interface 402, thereby switching back from the session update interface 402 to the session interface 401 and updating the virtual avatars 4A and 4B.
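The switch between the two interfaces can be viewed as a two-state toggle driven by whether the message display area is visible. The sketch below covers only the variant in which partial avatars are shown in the session interface; `SessionScreen` and its members are hypothetical names, not part of the application.

```python
from dataclasses import dataclass


@dataclass
class SessionScreen:
    # True: session interface (message display area shown).
    # False: session update interface (message display area hidden).
    message_area_visible: bool = True

    @property
    def mode(self) -> str:
        if self.message_area_visible:
            return "session_interface"
        return "session_update_interface"

    @property
    def avatar_form(self) -> str:
        # Partial avatars in the session interface, complete avatars otherwise.
        return "partial" if self.message_area_visible else "complete"

    def hide_message_area(self) -> None:
        """Models the hiding operation (e.g. pulling the area down)."""
        self.message_area_visible = False

    def show_message_area(self) -> None:
        """Models the trigger operation that brings the area back."""
        self.message_area_visible = True


screen = SessionScreen()
states = [(screen.mode, screen.avatar_form)]
screen.hide_message_area()
states.append((screen.mode, screen.avatar_form))
screen.show_message_area()
states.append((screen.mode, screen.avatar_form))
```

Deriving both the mode and the avatar form from a single flag keeps the two from drifting apart, which is one simple way to honor the coupling the text describes.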
[0150] Optionally, the first and second virtual avatars can also be used to represent the business status of the corresponding objects, such as online status, offline status, busy status, gaming status, resting status, and invisible status. That is, different business statuses correspond to different forms or poses of the virtual avatars. For example, the virtual avatar 4b shown in Figure 4 is in a sleeping form, which indicates that the business status of the corresponding object B is resting. Therefore, in order to display the virtual avatars in the session interface, the first terminal can obtain the corresponding virtual avatar based on the avatar resource identifier and business status of each object. The specific process is detailed later in step S201 of the embodiment corresponding to Figure 10.
[0151] Step S102: In response to a trigger operation on the session interface, a first session message sent by the first object to the second object is displayed in the message display area of the session interface; the first session message carries media data of the first object associated with the first object.
[0152] Specifically, the first terminal can respond to a trigger operation on the session interface, generate a first session message sent by the first object to the second object, and display the first session message in the message display area of the session interface.
[0153] Optionally, the first object media data carried in the first session message may include a first type of image data. That is, the first terminal may, in response to a trigger operation on the session interface, determine the first type of image data used to characterize the object state of the first object during an instant session, and then use the first type of image data as the first object media data associated with the first object. Furthermore, it may generate a first session message to be sent to the second object based on the first object media data, and subsequently display the first session message in the message display area of the session interface (e.g., session message 201e in Figure 2 above). Here, the first type of image data can be image data input by the first object to represent its object state, such as an emoticon. Optionally, in addition to the first object media data, the first session message may also contain text data entered by the first object or other image data (such as landscape photos) unrelated to the object state of the first object; that is, the first session message may contain one or more types of data.
[0154] This application provides multiple methods for determining the first type of image data, and the first object can use any of these methods, as detailed below:
[0155] In one optional implementation, the first terminal can respond to a trigger operation on a text input control in the conversation interface by displaying the text information entered through the text input control. When it detects that the text information carries state-mapped text, it displays the first type of image data mapped by the state-mapped text, which represents the object state of the first object during the instant conversation. Here, state-mapped text refers to text that can be mapped to a certain object state, and the state-mapped text has a mapping relationship with the first type of image data. For example, the text "hahaha" can be used as state-mapped text that has a mapping relationship with the "laughing" emoticon. Please also refer to Figure 5, which is a schematic diagram of a scene for determining image data provided in an embodiment of this application. As shown in Figure 5, the session interface 500 includes a text input control 500a. In response to a trigger operation (e.g., a click operation) on the text input control 500a, a text input area 500b is output in the session interface 500. Then, in response to a text input operation on the text input area 500b, the text information 500d entered through this text input operation is displayed in a text input box 500c. During this period, the first terminal can perform text detection on the text information 500d. When it detects that there is state-mapped text (e.g., "hahaha") in the text information 500d, it can display a first image list 500e mapped to this state-mapped text in the session interface 500. The first image list 500e can include one or more image data, such as image data 5A, image data 5B, image data 5C, ..., image data 5D. Therefore, the first object can determine the first type of image data it needs from the first image list 500e. 
For example, in response to a first object's selection operation on one or more image data in the first image list 500e, the first terminal may use the image data corresponding to the selection operation as the first type of image data used to characterize its object state. For example, image data 5A may be selected as the first type of image data; or, all of these one or more image data may be used as the first type of image data; or, the first terminal may automatically select any one of these one or more image data as the first type of image data. For example, the image data with the most collections in the first image list 500e may be selected as the first type of image data.
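The text detection and automatic selection described above can be sketched as follows. This is a minimal illustration under stated assumptions: the mapping table `STATE_MAP`, the `ImageData` type, and the `collections` popularity counter are hypothetical and do not come from this application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageData:
    name: str
    collections: int  # hypothetical popularity metric: number of times collected


# Hypothetical mapping from state-mapped text to candidate first-type image data.
STATE_MAP = {
    "hahaha": [ImageData("laugh_a", 120), ImageData("laugh_b", 300)],
    "zzz": [ImageData("sleepy", 45)],
}


def detect_image_list(text: str) -> list:
    """Return the image list mapped by any state-mapped text found in `text`."""
    lowered = text.lower()
    result = []
    for state_text, images in STATE_MAP.items():
        if state_text in lowered:
            result.extend(images)
    return result


def auto_select(images: list):
    """Pick the image with the most collections, or None if the list is empty."""
    return max(images, key=lambda image: image.collections, default=None)


candidates = detect_image_list("That was great, hahaha!")
print([image.name for image in candidates])
print(auto_select(candidates))
```

A simple substring match stands in here for the text detection step; a real client would likely use a more robust matcher.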
[0156] Optionally, when the first object only wants to send plain text conversation messages, even if the text information it enters carries state mapping text and the image data mapped by the state mapping text is displayed in the conversation interface, the first object may not select these image data. Instead, it may use the text information carrying the state mapping text as the first conversation message. In this case, the first terminal can automatically select one image from the image data mapped by the state mapping text as the first type of image data. For example, it can randomly select from multiple image data or select based on popularity priority. Then, the first type of image data can be associated with the first conversation message. It should be understood that although the first type of image data will not be displayed together with the first conversation message in the message display area at this time, the first terminal can still update the first virtual avatar based on the "implicit" first type of image data.
[0157] In one optional implementation, the first terminal may, in response to a trigger operation on a status display control in the session interface, output an image selection panel associated with the status display control, and then, in response to a selection operation on the image selection panel, use the image data corresponding to the selection operation as the first type of image data representing the object state of the first object during an instant session. Please also refer to Figure 6, which is a schematic diagram of a scene for selecting image data provided in an embodiment of this application. As shown in Figure 6, the session interface 600 includes a status display control 600a. In response to a trigger operation (e.g., a click operation) on the status display control 600a, an image selection panel 600b can be output in the session interface 600. The image selection panel 600b can contain one or more image data, such as image data 6A, image data 6B, image data 6C, ..., image data 6H. This image data can be image data previously used by the first object (i.e., historical image data), image data saved by the first object, image data shared by other objects with the first object, or image data recommended by the first terminal. Furthermore, in response to a selection operation by the first object on one or more image data contained in the image selection panel 600b, the image data corresponding to the selection operation can be used as the first type of image data. For example, image data 6C can be selected as the first type of image data.
[0158] In one optional implementation, the first terminal may, in response to an operation that determines target image data in the session interface, use the target image data as the first type of image data characterizing the object state of the first object during an instant session. Please also refer to Figure 7, which is a schematic diagram of a scene for determining image data provided in an embodiment of this application. As shown in Figure 7, a second image list 700a can be directly displayed in the session interface 700. This second image list 700a may include one or more image data for the first object to select from. For example, the second image list 700a may include image data 7A, image data 7B, image data 7C, ..., image data 7D. This image data may be image data previously used by the first object (i.e., historical image data), image data saved by the first object, image data shared by other objects with the first object, or image data recommended by the image system, such as image data recommended based on popularity rankings or the first object's profile. The first object can determine target image data (e.g., image data 7B) from the one or more image data included in the second image list 700a and can use this target image data as the first type of image data.
[0159] Optionally, the first object media data carried in the first session message may include a second type of image data. That is, in response to a trigger operation on the session interface, the first terminal can invoke the camera to capture the object image data of the first object when the first object records voice information via the voice control. Here, the object image data refers to data used to record the first object's facial expressions, body movements, etc., and is the actual image of the first object during the voice information recording process. For example, video data obtained by filming the first object can be used as the object image data. It should be understood that the camera is only activated after obtaining permission granted by the first object, and correspondingly, the object image data is data acquired only after obtaining permission granted by the first object.
[0160] Furthermore, when object image data is captured, the avatar state of the first virtual avatar can be adjusted based on the object image data, and second-type image data representing the object state of the first object during the instant conversation can then be generated based on the adjusted first virtual avatar. That is, the second type of image data is obtained from the adjusted first virtual avatar and is associated with the object image data of the first object captured by the camera. For example, the body movements and facial expressions of the first object while speaking can be captured and fed back to the first virtual avatar, so that the first virtual avatar changes along with the first object's body movements or facial expressions. Simultaneously, the changes of the first virtual avatar are recorded as video, and this video is ultimately converted into corresponding image data (e.g., an animated image such as a GIF). This image data is used as the second type of image data, which in effect turns the first object's own virtual avatar into an emoticon that is ultimately sent to the second object together with the voice information, thereby increasing the diversity of voice interaction. The specific process can be as follows: when object image data is captured, state detection is performed on the object image data, and the detected state is used as the object state representing the first object during the instant conversation; the first virtual avatar is then obtained, and its avatar state is adjusted based on the object state; finally, second-type image data representing the object state is generated based on the first virtual avatar with the adjusted avatar state. The avatar state can include the body movements or facial expressions of the virtual avatar. For example, when the first object is detected to be smiling, the facial expression of the first virtual avatar can also be adjusted to a smiling expression.
[0161] Furthermore, the generated second-type image data can be integrated with the voice information to obtain first-object media data associated with the first object. The specific process can be as follows: the second-type image data is uploaded to the server. When the second-type image data is successfully uploaded, the image resource identifier corresponding to the second-type image data is obtained. At the same time, the voice information is uploaded to the server. When the voice information is successfully uploaded, the voice resource identifier corresponding to the voice information is obtained. Then, the second-type image data carrying the image resource identifier and the voice information carrying the voice resource identifier can be integrated. In other words, the second-type image data and the voice information are bound based on the image resource identifier and the voice resource identifier to obtain first-object media data associated with the first object.
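The upload-and-bind process above can be sketched as follows. This is an illustration only: `FakeServer` is a hypothetical in-memory stand-in for the server, the identifiers are random hex strings chosen for the example, and the `ObjectMediaData` type is an assumed representation of the bound result.

```python
import uuid
from dataclasses import dataclass


class FakeServer:
    """Hypothetical in-memory stand-in for the upload server."""

    def __init__(self) -> None:
        self.store = {}

    def upload(self, payload: bytes) -> str:
        # On a successful upload, a resource identifier is returned.
        resource_id = uuid.uuid4().hex
        self.store[resource_id] = payload
        return resource_id


@dataclass(frozen=True)
class ObjectMediaData:
    """Image data and voice information bound by their resource identifiers."""
    image_resource_id: str
    voice_resource_id: str


def build_media_data(server: FakeServer, image: bytes, voice: bytes) -> ObjectMediaData:
    image_id = server.upload(image)   # second-type image data
    voice_id = server.upload(voice)   # recorded voice information
    return ObjectMediaData(image_id, voice_id)


server = FakeServer()
media = build_media_data(server, b"animated-image-bytes", b"voice-bytes")
print(media)
```

Binding by identifier means the session message only needs to carry two short strings; the receiving side can resolve them to the actual resources when the message is played.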
[0162] Furthermore, a first conversation message for sending to the second object can be generated based on the first object media data. The specific process can be as follows: speech conversion processing is performed on the voice information to obtain the corresponding converted text information, the converted text information is displayed in the image capture area used to capture the object image data, and the converted text information is then integrated with the first object media data to obtain the first conversation message for sending to the second object. Finally, the first conversation message can be displayed in the message display area of the conversation interface.
[0163] Optionally, when no object image data is captured, the first terminal can determine third-type image data based on the first virtual avatar, and then integrate the third-type image data with voice information to obtain first object media data associated with the first object. Then, based on the first object media data, a first session message is generated for sending to the second object. Here, the third-type image data is obtained based on the original first virtual avatar. For example, the first virtual avatar can be directly used as the third-type image data, or a virtual avatar mapped to the first virtual avatar (e.g., a virtual avatar with a specific form) can be defaulted as the third-type image data. That is, when no object image data is captured, this virtual avatar can be used as the default third-type image data. The process of generating first object media data based on the third-type image data and then generating the first session message is similar to the process described above of generating first object media data based on second-type image data and then generating the first session message, and will not be elaborated further here.
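The choice between the second type and the third type of image data can be sketched as follows. This is a simplified illustration under stated assumptions: captured frames are represented as dictionaries that already carry an `expression` label, `detect_state` is a hypothetical placeholder for real state detection, and the returned dictionary is an assumed representation of the generated image data.

```python
from dataclasses import dataclass


@dataclass
class Avatar:
    expression: str = "neutral"


def detect_state(frame: dict) -> str:
    """Hypothetical state detection: reads a precomputed label from the frame."""
    return frame.get("expression", "neutral")


def build_image_data(frames: list, avatar: Avatar) -> dict:
    if not frames:
        # No object image data was captured: fall back to third-type image
        # data derived from the original (unadjusted) virtual avatar.
        return {"kind": "third", "frames": [avatar.expression]}
    recorded = []
    for frame in frames:
        # Adjust the avatar state to the detected object state and record it.
        avatar.expression = detect_state(frame)
        recorded.append(avatar.expression)
    return {"kind": "second", "frames": recorded}


print(build_image_data([], Avatar()))
print(build_image_data([{"expression": "smile"}, {"expression": "nod"}], Avatar()))
```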
[0164] Please also refer to Figure 8, which is a schematic diagram of a scene for recording voice information provided in an embodiment of this application. As shown in Figure 8, the conversation interface 801 may include a voice control 801a. The first terminal can respond to a trigger operation (e.g., a long press) on the voice control 801a by outputting an image capture area 801b for capturing object image data of the first object. When the first object records voice information through the voice control 801a, the converted text information 801c obtained from the voice information is displayed synchronously in the image capture area 801b. It should be understood that during the recording of voice information, the camera continuously attempts to capture object image data of the first object. If no object image data is captured, the image capture area 801b may display only the converted text information 801c. Optionally, if no object image data is captured, a third type of image data may also be displayed in the image capture area 801b. It is understood that the third type of image data displayed at this time is unrelated to the object image data, and therefore the third type of image data will not change with the first object. As shown in Figure 8, after the voice input is completed (for example, the first object releases its finger to end the long press on the voice control 801a), the image capture area 801b can be hidden and the display returns to the conversation interface 802. The finally generated conversation message 802a (i.e., the first conversation message) for sending to the second object can be displayed in the message display area of the conversation interface 802. It can be seen that the conversation message 802a contains the first object media data 802b and the converted text information 802c (i.e., the converted text information 801c).
[0165] Please also refer to Figure 9, which is a schematic diagram of a scene for recording voice information provided in an embodiment of this application. As shown in Figure 9, the first terminal can respond to a trigger operation (e.g., a long press) on the voice control 901a included in the conversation interface 901 by outputting an image capture area 901b for capturing object image data of the first object. When the first object records voice information through the voice control 901a, the converted text information 901d obtained from the voice information is displayed synchronously in the image capture area 901b. Meanwhile, if object image data is captured during voice information recording, for example, when the first object's face is facing the camera, a second type of image data 901c can also be displayed in the image capture area 901b. It can be understood that the displayed second type of image data 901c is related to the object image data and can be used to characterize the object state (e.g., facial expressions, body movements, etc.) of the first object during the instant conversation. As shown in Figure 9, after the voice input is completed (for example, the first object releases its finger to end the long press on the voice control 901a), the image capture area 901b can be hidden and the display returns to the conversation interface 902. The finally generated conversation message 902a (i.e., the first conversation message) for sending to the second object can be displayed in the message display area of the conversation interface 902. It can be seen that the conversation message 902a contains the first object media data 902b and the converted text information 902c (i.e., the converted text information 901d).
[0166] The image capture area can be displayed in any area of the session interface in the form of a floating window, a mask, or a semi-transparent form; alternatively, it can be displayed in a shrinkable interface that can be resized by triggering an operation (such as dragging), and the size of the interface is smaller than that of the session interface.
[0167] It should be noted that the first terminal can respond to a trigger operation on the first object media data carried in the first session message by playing the voice information and displaying a sound effect animation associated with the voice information in the session interface. Furthermore, while the voice information is playing, the converted text information contained in the first session message can be highlighted synchronously, and the first virtual avatar can also be updated synchronously along with the first object media data. Referring again to Figure 9, as shown in the conversation interface 903, in response to a trigger operation (e.g., a click operation) on the first object media data 902b carried by the conversation message 902a in the conversation interface 903, the voice information corresponding to the converted text information 902c can be played in the conversation interface 903. Simultaneously, the sound effect animation 903a associated with the voice information can be displayed. The sound effect animation 903a may include the image animation corresponding to the second type of image data 901c carried by the first object media data 902b. Optionally, it may also include a pulse animation associated with the voice information, for example, a randomly changing pulse animation or a pulse animation that changes according to the volume of the voice information. Optionally, the converted text information 902c can also be highlighted synchronously. For example, the text in the converted text information 902c is highlighted sequentially, following the voice information as it plays. Furthermore, the virtual avatar 903b can also be updated synchronously with the first object media data 902b. In this way, the body movements and facial expressions contained in the first object media data 902b can be reproduced.
[0168] The aforementioned speech conversion process can be implemented using the real-time speech-to-text function of a speech recognition interface, and the resulting converted text information is synchronized with the voice information. That is, when the voice information reaches a certain playback progress, the text position in the converted text information corresponding to that playback progress can be obtained through the speech recognition interface. Therefore, the first terminal can highlight the text at the text position returned by the speech recognition interface.
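The synchronization between playback progress and highlighted text can be sketched as follows. This is a minimal illustration that assumes the speech recognition interface returns, for each recognized text segment, the time at which that segment starts; the `SEGMENTS` data and the function name are hypothetical.

```python
from bisect import bisect_right

# Hypothetical recognition result: (start time in seconds, recognized text).
SEGMENTS = [(0.0, "Hello"), (0.6, " there"), (1.1, ", how are you")]


def highlighted_text(segments: list, progress: float) -> str:
    """Return the portion of the converted text to highlight at `progress` seconds."""
    starts = [start for start, _ in segments]
    # Number of segments whose start time is at or before the playback progress.
    count = bisect_right(starts, progress)
    return "".join(text for _, text in segments[:count])


for t in (0.0, 0.7, 2.0):
    print(t, repr(highlighted_text(SEGMENTS, t)))
```

Because the start times are sorted, a binary search finds the text position for any playback progress without scanning the whole transcript.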
[0169] Understandably, when the voice playback ends, the sound effect animation disappears, and the updated first virtual avatar reverts to its original form. Optionally, the updated first virtual avatar can instead maintain its current form until a new conversation message is generated and it is updated again.
[0170] Optionally, when switching to the conversation update interface for voice input, the first terminal can respond to the trigger operation of the conversation update interface. When the first object inputs voice information through the voice control in the conversation update interface, the camera is invoked to capture the object image data of the first object. If the object image data is captured, the image state of the first virtual avatar is adjusted based on the object image data. It should be understood that the first virtual avatar at this time is the complete virtual avatar of the first object. The first virtual avatar after adjusting its image state can be directly displayed in real time in the conversation update interface. Furthermore, based on the first virtual avatar after adjusting its image state, a second type of image data can be generated to represent the object state of the first object during the instant conversation. For example, the entire process of adjusting the image state of the first virtual avatar can be recorded to obtain corresponding video data, and the image data converted from this video data can then be used as the second type of image data. Conversely, if the object image data is not captured, a third type of image data can be determined based on the first virtual avatar displayed in the conversation update interface. At the same time, a voice conversion area can be output in the conversation update interface to display the converted text information obtained by voice conversion processing of the input voice information in real time. Optionally, the complete first session message (i.e., including the first object's media data and text information) can be displayed in the session update interface. Alternatively, to save display space, only the text information carried by the first session message can be displayed in the session update interface, without displaying the first object's media data. 
In this case, in response to the triggering operation of the playback control related to the text information in the session update interface, the corresponding voice information is played, and during the playback, the complete virtual image of the first object will also change synchronously with the first object's media data. It should be understood that after switching back to the session interface, the complete first session message is still displayed in the message display area of the session interface.
[0171] Step S103: In the virtual avatar display area containing the second virtual avatar, the first virtual avatar is updated to a first updated virtual avatar; the first updated virtual avatar is obtained by updating the first virtual avatar based on the first object media data.
[0172] Specifically, the first terminal can update the first virtual avatar based on the first object media data to obtain a first updated virtual avatar that matches the first object media data, and then update the first virtual avatar to the first updated virtual avatar in the virtual avatar display area containing the second virtual avatar.
[0173] Optionally, when the first object media data contains a first type of image data, the first virtual avatar can be updated based on the first type of image data to obtain a first updated virtual avatar that matches the first type of image data (e.g., the virtual avatar 201f in Figure 2 above). The specific process can be as follows: the message manager performs media data detection on the first session message. If the first object media data carried in the first session message contains first-type image data, a state trigger event is generated and sent to the virtual avatar processor. The state trigger event can include the object identifier of the first object and an image data list. The object identifier is used to mark the first object; for example, the user account of the first object can be used as the object identifier. The image data list records the first-type image data contained in the first object media data. There can be one or more items of first-type image data, and they can be of different kinds. Further, when the virtual avatar processor receives the state trigger event, it can update the first virtual avatar associated with the object identifier based on the first-type image data in the image data list, obtaining a first updated virtual avatar matching the first-type image data. Both the message manager and the virtual avatar processor belong to the application client on the first terminal. Finally, the first virtual avatar can be updated to the first updated virtual avatar in the virtual avatar display area containing the second virtual avatar. Additionally, the first-type image data can be displayed in the virtual avatar display area, for example, in the area adjacent to the first updated virtual avatar.
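The interaction between the message manager and the virtual avatar processor can be sketched as follows. The class names mirror the terms used above, but the method names, the dictionary layout of the media data, and the rule that the last image in the list decides the updated form are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class StateTriggerEvent:
    object_id: str          # e.g., the user account of the first object
    image_data_list: list   # first-type image data found in the message


class VirtualAvatarProcessor:
    def __init__(self) -> None:
        self.avatars = {}  # object identifier -> current avatar form

    def on_state_trigger(self, event: StateTriggerEvent) -> None:
        if not event.image_data_list:
            return
        # Assumption: the most recent image decides the updated avatar form.
        latest = event.image_data_list[-1]
        self.avatars[event.object_id] = f"avatar matching {latest}"


class MessageManager:
    def __init__(self, processor: VirtualAvatarProcessor) -> None:
        self.processor = processor

    def handle_message(self, sender_id: str, media_data: dict) -> bool:
        """Detect first-type image data and, if present, emit a state trigger event."""
        images = media_data.get("first_type_images", [])
        if not images:
            return False
        self.processor.on_state_trigger(StateTriggerEvent(sender_id, list(images)))
        return True


processor = VirtualAvatarProcessor()
manager = MessageManager(processor)
manager.handle_message("object_a", {"first_type_images": ["laughing emoticon"]})
print(processor.avatars)
```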
[0174] Optionally, when the first object media data contains the second type of image data, in response to a trigger operation on the first object media data, the first virtual avatar is updated based on the second type of image data to obtain a first updated virtual avatar that matches the second type of image data (e.g., the virtual avatar 903b in Figure 9 above). The first virtual avatar can then be updated to the first updated virtual avatar in the virtual avatar display area containing the second virtual avatar.
[0175] As described above, during real-time conversations via the conversation interface, the virtual avatars of both parties (i.e., the first object and the second object) can be displayed in the virtual avatar display area, and the conversation messages generated during the real-time conversation (i.e., historical conversation messages, such as the first conversation message) can be displayed in the message display area. This facilitates both parties in the conversation to review historical conversation messages without encountering a situation where historical conversation messages cannot be recorded and displayed normally. In other words, the embodiments of this application can maintain the normal recording and display of historical conversation messages in real-time conversation scenarios using virtual avatars. In addition, the first virtual avatar can be updated based on the first object media data carried by the first conversation message to present the object status of the first object in real time, thereby enriching the display methods of real-time conversations.
[0176] Please refer to Figure 10, which is a flowchart of a virtual avatar-based data processing method provided in an embodiment of this application. This data processing method can be executed by a computer device, which may include, for example, a device shown in Figure 1. For ease of understanding, this embodiment describes the method as being executed by a first terminal (e.g., user terminal 200a). The data processing method may include at least the following steps:
[0177] Step S201: Display the conversation interface when the first object and the second object are having an instant conversation. In the virtual avatar display area of the conversation interface, display the first virtual avatar of the first object and the second virtual avatar of the second object.
[0178] Specifically, the first terminal can first display the conversation interface when the first object and the second object are having a real-time conversation. Further, it can retrieve the corresponding virtual avatar based on the avatar resource identifier and business status of each object. The specific process is as follows: Retrieve the first avatar resource identifier and first business status corresponding to the first object, and retrieve the second avatar resource identifier and second business status corresponding to the second object. Search for avatar resources in the virtual avatar cache on the first terminal based on the first avatar resource identifier and first business status, and simultaneously search for avatar resources in the virtual avatar cache based on the second avatar resource identifier and second business status. The virtual avatar cache can be used to cache avatar resources locally. If the virtual avatar corresponding to the first object and the virtual avatar corresponding to the second object are not found in the virtual avatar cache, an avatar resource retrieval request can be generated based on the first avatar resource identifier, first business status, second avatar resource identifier, and second business status. This request is sent to the server. Upon receiving the avatar resource retrieval request, the server generates the first avatar resource address corresponding to the first virtual avatar based on the first avatar resource identifier and first business status, and generates the second avatar resource address corresponding to the second virtual avatar based on the second avatar resource identifier and second business status, and returns the first and second avatar resource addresses. 
The first avatar resource identifier and the first business status are obtained only after permission is granted by the first object, and the second avatar resource identifier and the second business status are likewise obtained only after permission is granted by the second object.
[0179] Furthermore, after receiving the first avatar resource address and the second avatar resource address returned by the server, the first terminal can obtain the first virtual avatar associated with the first object based on the first avatar resource address, and obtain the second virtual avatar associated with the second object based on the second avatar resource address. Subsequently, the first virtual avatar and the second virtual avatar can be displayed in the virtual avatar display area of the session interface.
[0180] Here, the first avatar resource address refers to the storage location of the first virtual avatar in the avatar resource library associated with the server, and the second avatar resource address refers to the storage location of the second virtual avatar in the avatar resource library. Therefore, the first terminal can read the first virtual avatar from the storage location indicated by the first avatar resource address in the avatar resource library, and can read the second virtual avatar from the storage location indicated by the second avatar resource address in the avatar resource library.
[0181] It should be understood that if a virtual avatar corresponding to an object (e.g., the first object) is found in the virtual avatar cache based on its avatar resource identifier and business status, there is no need to send an avatar resource retrieval request to the server. Conversely, if the virtual avatar corresponding to the object is not found in the virtual avatar cache, then an avatar resource retrieval request needs to be sent to the server.
[0182] Optionally, the first virtual avatar and the second virtual avatar can be obtained separately. That is, the first terminal can generate a first avatar resource acquisition request based on the first avatar resource identifier and the first business state, and then send the first avatar resource acquisition request to the server. Upon receiving the first avatar resource acquisition request, the server generates a first avatar resource address corresponding to the first virtual avatar based on the first avatar resource identifier and the first business state, and returns it to the first terminal. Simultaneously, the first terminal can generate a second avatar resource acquisition request based on the second avatar resource identifier and the second business state, and then send the second avatar resource acquisition request to the server. Upon receiving the second avatar resource acquisition request, the server generates a second avatar resource address corresponding to the second virtual avatar based on the second avatar resource identifier and the second business state, and returns it to the first terminal. Subsequently, the first terminal can obtain the first virtual avatar associated with the first object based on the first avatar resource address, and obtain the second virtual avatar associated with the second object based on the second avatar resource address.
[0183] To improve the efficiency of obtaining virtual avatars, after the first terminal obtains a virtual avatar from the avatar resource library associated with the server, it can store it in the local virtual avatar cache area.
[0184] It should be understood that the virtual image obtained by the first terminal can be a partial virtual image or a complete virtual image. Optionally, a partial virtual image and a complete virtual image of the same object can have the same image resource identifier. Therefore, the first terminal can first obtain the corresponding complete virtual image, and then extract the corresponding partial virtual image from the complete virtual image, and assign the same image resource identifier to the complete virtual image and the corresponding partial virtual image. Optionally, a partial virtual image and a complete virtual image of the same object can have different image resource identifiers. That is, different image resource identifiers can be assigned to a complete virtual image and the partial virtual image obtained by extracting the complete virtual image. The complete virtual image and the corresponding partial virtual image can be bound and stored based on their image resource identifiers. Therefore, the first terminal can quickly obtain the complete virtual image and the corresponding partial virtual image based on the image resource identifier of the complete virtual image, or it can quickly obtain the partial virtual image and the corresponding complete virtual image based on the image resource identifier of the partial virtual image, without needing to detect whether the virtual image belongs to a partial virtual image or a complete virtual image.
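The bound storage of a complete virtual image and its partial virtual image can be sketched as follows. This is only a minimal illustration under stated assumptions: the class name `AvatarBindings` and its methods are hypothetical and do not appear in this application, and the identifiers are plain strings. The same structure also covers the case where both images share one identifier.

```python
from typing import Dict, Tuple


class AvatarBindings:
    """Hypothetical store that binds a complete image to its partial image."""

    def __init__(self) -> None:
        # Maps either identifier to the pair (complete_id, partial_id).
        self._pairs: Dict[str, Tuple[str, str]] = {}

    def bind(self, complete_id: str, partial_id: str) -> None:
        pair = (complete_id, partial_id)
        self._pairs[complete_id] = pair
        self._pairs[partial_id] = pair

    def lookup(self, any_id: str) -> Tuple[str, str]:
        # Returns (complete_id, partial_id) no matter which identifier is
        # passed, so the caller does not need to know which kind it holds.
        return self._pairs[any_id]
```

A lookup with either identifier returns the same pair, which matches the statement that the terminal need not detect whether an image is partial or complete.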
[0185] For this feature, since the virtual avatars of both parties are added to the conversation interface, displaying the virtual avatar of the other party (i.e., a friend, such as the second object) requires adding an item to the basic information on that object's profile card to indicate the current avatar information. This item mainly consists of the avatar resource identifier of the virtual avatar (also known as the avatar resource ID). The application client can request to download the corresponding virtual avatar based on this avatar resource identifier and the object's business status. Part of the content of the basic information on the profile card is shown in Table 1.
[0186] Table 1
[0187]
[0188] As shown in Table 1 above, different state values can be used to represent different business states. For example, a state value of 1 indicates an online state; a state value of 2 indicates an offline state; a state value of 3 indicates a busy state; and a state value of 4 indicates an away state. This application embodiment does not limit the specific state value corresponding to each business state.
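As an illustration of how an application client might represent these values, the following minimal sketch uses the example state values given above (1 to 4). The names `BusinessState`, `ProfileCard`, `parse_profile_card`, and the field names are assumptions made for this sketch only; the application does not prescribe them, and any other mapping of values to states is equally valid.

```python
from dataclasses import dataclass
from enum import IntEnum


class BusinessState(IntEnum):
    """Example state values from the paragraph above; illustrative only."""
    ONLINE = 1
    OFFLINE = 2
    BUSY = 3
    AWAY = 4


@dataclass(frozen=True)
class ProfileCard:
    # Hypothetical field names for the avatar-related profile card entries.
    avatar_resource_id: str
    business_state: BusinessState


def parse_profile_card(raw: dict) -> ProfileCard:
    # BusinessState(...) raises ValueError for an unknown state value.
    return ProfileCard(
        avatar_resource_id=str(raw["avatar_resource_id"]),
        business_state=BusinessState(int(raw["business_state"])),
    )
```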
[0189] Please also refer to Figure 11, which is an interaction diagram illustrating the acquisition of a virtual avatar, provided in an embodiment of this application. The explanation takes the interaction between an application client on a first terminal and a server as an example. As shown in Figure 11, the application client on the first terminal has the ability to obtain virtual avatars. The application client has a dedicated virtual avatar manager that can return the corresponding virtual avatar based on the input avatar resource identifier and business status, which is then used for display. The specific interaction process may include the following steps:
[0190] In step S2011, the virtual avatar manager on the application client can obtain the avatar resource identifier (such as the second avatar resource identifier) and business status (such as the second business status) of the object (e.g., the second object) that is having a real-time conversation with the other party.
[0191] In step S2012, the virtual avatar manager can search in the virtual avatar cache for a virtual avatar that matches the avatar resource identifier and business status. If the search is successful, the matched virtual avatar is returned to the application client, and step S2018 is executed; otherwise, if the search fails, step S2013 is executed.
[0192] In step S2013, the virtual avatar manager can generate a corresponding avatar resource acquisition request (also known as a ticket, such as a second avatar resource acquisition request) based on the avatar resource identifier and business status, and can send the avatar resource acquisition request to the server.
[0193] Step S2014: After receiving the avatar resource acquisition request sent by the virtual avatar manager, the server can verify the request, that is, check whether the avatar resource acquisition request is legitimate. If verification succeeds, the request is legitimate and step S2015 is executed; otherwise, if verification fails, the request is illegitimate and step S2019 is executed.
[0194] In step S2015, the server can generate a corresponding avatar resource address (also known as a download address, such as a second avatar resource address) based on the avatar resource identifier and business status carried in the avatar resource acquisition request, and return it to the virtual avatar manager.
[0195] In step S2016, the virtual avatar manager can download the virtual avatar (such as the second virtual avatar) of the object based on the received avatar resource address.
[0196] Step S2017: If the virtual avatar manager downloads the virtual avatar successfully, proceed to step S2018; otherwise, if the download fails, proceed to step S2019.
[0197] Step S2018: The application client displays the obtained virtual avatar;
[0198] In step S2019, the virtual avatar manager receives the verification failure result returned by the server, generates request failure information based on the verification failure result, and reports the request failure information to the application client.
[0199] In step S2020, the application client performs virtual avatar failure processing based on the request failure information. For example, it can generate an avatar acquisition failure notification based on the request failure information and display it on the corresponding interface.
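Steps S2011 to S2020 can be sketched roughly as follows. This is a simplified illustration, not the implementation of this application: the class `AvatarManager`, the exception `AvatarRequestError`, and the two injected callables are hypothetical stand-ins for the server round trip and the download, and the avatar is modeled as raw bytes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

CacheKey = Tuple[str, int]  # (avatar resource identifier, business state value)


class AvatarRequestError(Exception):
    """Raised when verification or download fails (steps S2019 and S2020)."""


@dataclass
class AvatarManager:
    # Stand-in for S2013 to S2015: returns a download address, or None when
    # the server rejects the request during verification.
    request_address: Callable[[str, int], Optional[str]]
    # Stand-in for S2016: returns the avatar data, or None when download fails.
    download: Callable[[str], Optional[bytes]]
    cache: Dict[CacheKey, bytes] = field(default_factory=dict)

    def get_avatar(self, resource_id: str, state: int) -> bytes:
        key = (resource_id, state)
        cached = self.cache.get(key)  # S2012: look in the local cache first
        if cached is not None:
            return cached
        address = self.request_address(resource_id, state)
        if address is None:
            raise AvatarRequestError("request verification failed")
        avatar = self.download(address)
        if avatar is None:
            raise AvatarRequestError("download failed")
        self.cache[key] = avatar  # keep a local copy for later lookups
        return avatar
```

The caller (the application client) would display the returned avatar on success and show a failure notification when `AvatarRequestError` is raised.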
[0200] As can be seen from the above, the embodiments of this application can support application clients to obtain the virtual image of an object based on the object's image resource identifier, and can also display different forms of the virtual image according to the object's business status, thereby enriching the diversity of virtual images, and the current business status of the object can be understood intuitively and conveniently based on the form of the virtual image.
[0201] Step S202: In response to a trigger operation on the conversation interface, display the first conversation message sent by the first object to the second object in the message display area of the conversation interface;
[0202] In one implementation, the voice information sent by the first object can be transmitted to the second object in real time via real-time text conversion. For example, existing speech recognition interfaces can be used to translate the voice information recorded by the first object into text in real time. Therefore, when sending a conversation message generated in this way to the second object, not only the voice information but also the converted text information is sent to the second object. This method of generating conversation messages is simple and convenient, reduces usage costs, and improves the efficiency of conversation message generation. If the object image data (e.g., the first object's face) is captured during the first object's voice input, the image data based on the virtual avatar can also be sent to the second object. For example, using existing 3D depth reconstruction and application SDKs (Software Development Kits), the facial bones of the first object can be identified and applied to the first virtual avatar. This can then be recorded as a video and converted into GIF (Graphics Interchange Format) image data. The GIF can then display the behavior of the virtual avatar corresponding to the first object while it is recording voice information, making communication between objects more vivid.
[0203] Please also refer to Figure 12, which is a schematic diagram illustrating a process for recording voice information, provided in an embodiment of this application. As shown in Figure 12, this process can be implemented on the application client of the first terminal, and the specific process is as follows:
[0204] Step S2021: When the user enters voice information, the application client of the first terminal can call the voice recognition interface (i.e., voice recognition SDK) to process the voice information.
[0205] In step S2022, the application client converts the voice information into text (i.e., converted text information) in real time through the voice recognition interface, so that the first object can send a first conversation message carrying the converted text information when the voice input ends.
[0206] In step S2023, the application client will detect whether a face is recognized through the camera. If a face is recognized, then step S2025 will be executed; otherwise, if a face is not recognized, then step S2024 will be executed.
[0207] In step S2024, the application client can obtain the default image data (i.e., the third type of image data).
[0208] In step S2025, the application client can use the face 3D depth reconstruction interface (i.e., the face 3D depth reconstruction and application SDK) to identify skeletal changes, for example by continuously detecting key points on the first object (such as key points on the face or limbs) to obtain key point position data.
[0209] In step S2026, the application client can apply the skeletal changes identified through the face 3D depth reconstruction interface to the first virtual avatar, that is, update the first virtual avatar based on the acquired key point position data. For example, if the first object waves, the first virtual avatar will also wave.
[0210] In step S2027, the application client records the changes of the first virtual avatar as a video;
[0211] In step S2028, the application client can convert the recorded video into GIF (Graphics Interchange Format) format image data (i.e., second type of image data, such as animated images).
[0212] Finally, the application client can also integrate the obtained image data (second type image data or third type image data) with voice information to obtain first object media data associated with the first object, and then generate a first session message to be sent to the second object based on the first object media data.
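The branching in steps S2021 to S2028 and the final integration can be sketched as follows. This is a minimal sketch under stated assumptions: speech recognition, key point detection, and GIF rendering are passed in as hypothetical callables (no real SDK is called), a frame is `None` when no face was recognized, and the names `ObjectMediaData` and `build_media_data` are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class ObjectMediaData:
    voice: bytes
    text: str
    image_data: bytes
    image_kind: str  # "second_type" (animated) or "third_type" (default)


def build_media_data(
    voice: bytes,
    frames: Sequence[Optional[dict]],
    transcribe: Callable[[bytes], str],
    render_gif: Callable[[List[dict]], bytes],
    default_image: bytes,
) -> ObjectMediaData:
    text = transcribe(voice)  # S2021 and S2022: voice to text
    # S2023 and S2025: keep only the frames in which key points were detected.
    keypoints = [frame for frame in frames if frame is not None]
    if keypoints:
        # S2026 to S2028: drive the avatar, record it, convert to a GIF.
        image, kind = render_gif(keypoints), "second_type"
    else:
        # S2024: no face recognized, fall back to the default image data.
        image, kind = default_image, "third_type"
    return ObjectMediaData(voice, text, image, kind)
```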
[0213] Specifically, when the first object sends voice information, the voice information and the corresponding image data are also uploaded to the server. This allows the second object to download the corresponding resources after receiving the session message, reducing data transmission costs. Please also refer to Figure 13, which is a schematic diagram illustrating a process for sending session messages according to an embodiment of this application. As shown in Figure 13, when sending the first session message, the application client ensures that the voice information and the corresponding image data have been successfully uploaded to the server before triggering the sending of the first session message. In this embodiment, a GIF animation is used as an example of the second type of image data. The specific process can be as follows:
[0214] Step S1: The application client detects whether there is voice information. If it exists, proceed to step S2; otherwise, if it does not exist, proceed to step S10.
[0215] Step S2: The application client uploads the voice information to the server;
[0216] Step S3: The application client obtains the voice upload result returned by the server. If the voice upload result indicates that the voice information was uploaded successfully, then step S4 is executed; otherwise, if the voice upload result indicates that the voice information was uploaded unsuccessfully, then step S10 is executed. Optionally, when the voice information upload fails, the application client can re-upload the voice information to obtain a new voice upload result.
[0217] Step S4: The application client obtains the voice resource identifier corresponding to the voice information, wherein the voice resource identifier is generated by the application client.
[0218] Step S5: The application client detects whether an animated image exists (i.e., whether the object image data of the first object has been captured). If it exists, proceed to step S6; otherwise, if it does not exist, proceed to step S10.
[0219] Step S6: The application client uploads the animated image to the server;
[0220] Step S7: The application client obtains the image upload result returned by the server. If the image upload result indicates that the animated image was successfully uploaded, then proceed to step S8; otherwise, if the image upload result indicates that the animated image failed to upload, then proceed to step S10. Optionally, when the animated image upload fails, the application client can re-upload the animated image to obtain a new image upload result.
[0221] Step S8: The application client obtains the image resource identifier corresponding to the animated image, wherein the image resource identifier is generated by the application client.
[0222] In step S9, the application client converts the video recorded on the first virtual avatar into a GIF animation and uses the animation as the second type of image data. This reduces the transmission cost of transmitting video and improves the transmission efficiency of session messages.
[0223] In step S10, the application client checks whether the voice information and the animated image are both ready. If both are ready, step S11 is executed. Otherwise, because the voice information and the animated image are transmitted separately, one may be ready while the other is not, so the application client continues to wait.
[0224] Step S11: If the voice information upload fails or the animated image upload fails, proceed to step S13; otherwise, if both the voice information and the animated image are successfully uploaded, proceed to step S12.
[0225] Step S12: The application client sends the first session message generated based on voice information and animated image;
[0226] In step S13, if the application client fails to send a session message, it can send a failure message back to the first object.
[0227] Optionally, after the first session message is successfully uploaded, the second terminal can subsequently obtain the corresponding voice information and image data from the message resource library storing the session messages based on the image resource identifier and voice resource identifier, thereby improving resource transmission efficiency.
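The upload-then-send rule of steps S1 to S13 can be sketched as follows. This is a simplified illustration, not the flow of Figure 13 itself: the uploads run one after another instead of being transmitted separately and awaited in step S10, a missing part is simply skipped, and `upload`, `send`, `SendResult`, and the use of `uuid` for client-generated resource identifiers are all assumptions of this sketch.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Upload = Callable[[str, bytes], bool]    # (resource identifier, payload) -> ok
Send = Callable[[Dict[str, str]], None]  # receives the resource identifiers


@dataclass
class SendResult:
    sent: bool
    resource_ids: Dict[str, str]
    reason: Optional[str] = None


def upload_with_retry(upload: Upload, resource_id: str, payload: bytes,
                      retries: int) -> bool:
    # Optional re-upload after a failure, as allowed in steps S3 and S7.
    for _ in range(retries + 1):
        if upload(resource_id, payload):
            return True
    return False


def send_session_message(voice: Optional[bytes], gif: Optional[bytes],
                         upload: Upload, send: Send,
                         retries: int = 1) -> SendResult:
    ids: Dict[str, str] = {}
    for name, payload in (("voice", voice), ("image", gif)):
        if payload is None:
            continue  # S1 / S5: this part does not exist
        resource_id = uuid.uuid4().hex  # S4 / S8: generated by the client
        if not upload_with_retry(upload, resource_id, payload, retries):
            return SendResult(False, ids, f"{name} upload failed")  # S13
        ids[name] = resource_id
    if not ids:
        return SendResult(False, ids, "nothing to send")
    send(ids)  # S12: reached only after every upload has succeeded
    return SendResult(True, ids)
```

Sending only the identifiers lets the receiving terminal fetch the voice information and image data from the message resource library on its own.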
[0228] For the specific implementation of step S202, refer to step S102 in the embodiment corresponding to Figure 3 above; the same content is not repeated here.
[0229] Step S203: In the virtual avatar display area containing the second virtual avatar, update the first virtual avatar to the first updated virtual avatar;
[0230] In one implementation, to enrich the ways in which objects can interact, the emoji interactions between objects can be reflected in their virtual avatars during real-time conversations (i.e., the scenario described in the embodiment corresponding to Figure 3 in which the first object media data includes the first type of image data). It should be understood that the emoji interaction display is limited to the AIO of the current session (All-In-One, referring to a common chat window component that provides a unified interactive experience for objects); it is not synchronized through roaming, takes effect only within the current AIO, and requires processing only by the application client itself. In the component that manages the sending and receiving of session messages (i.e., the message manager), whenever a session message containing emojis (i.e., the first type of image data) is detected, the business logic can be notified and the emoji information passed to it. The business logic can then perform the corresponding processing upon receiving the notification.
[0231] Please also refer to Figure 14, which is a schematic diagram illustrating a process for updating a virtual avatar, as provided in an embodiment of this application. The application client integrates a message manager and a virtual avatar processor (also known as an AIO virtual avatar). The message manager is used for sending and receiving session messages, while the virtual avatar processor handles tasks such as displaying and updating the virtual avatar. As shown in Figure 14, the process may specifically include the following steps:
[0232] In step S2031, the message manager continuously sends and receives session messages;
[0233] In step S2032, the message manager can perform message filtering on all session messages (including session messages sent and received by the first terminal) obtained by the first terminal, that is, perform media data detection on all session messages and filter out session messages containing the first type of image data (such as emoticons) based on the detection results.
[0234] In step S2033, the message manager determines whether there is first type image data in the session message based on the detection result. If no first type image data is detected in a session message (e.g., session message 1), the session message can be discarded and step S2031 can be executed; otherwise, if the first type image data is detected in a session message (e.g., session message 2), step S2034 can be executed.
[0235] In step S2034, the message manager can generate a status trigger event (i.e., Fire Emoji Event) and send the status trigger event to the virtual avatar processor. The status trigger event contains the object identifier (i.e., user identification number, uin) of the message sender (such as the first object) and an image data list (emoji_list, also known as the emoticon list).
[0236] In step S2035, the virtual avatar processor can listen for state trigger events sent from other places through the mechanism of binding events. Therefore, when a state trigger event is received from the message manager, the virtual avatar processor can update the virtual avatar of the object indicated by the object identifier contained in the state trigger event (such as the first virtual avatar) based on the first type of image data contained in the image data list.
[0237] In step S2036, the virtual avatar processor may display the first type of image data in the vicinity of the updated virtual avatar (such as the first updated virtual avatar).
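The cooperation between the message manager and the virtual avatar processor in steps S2031 to S2036 can be sketched as follows. The field names `uin` and `emoji_list` follow the description above; everything else (`SessionMessage`, `FireEmojiEvent`, `bind`, `handle`, `on_fire_emoji`) is a hypothetical naming chosen for this sketch, and the avatar update is reduced to recording which emoji currently drives each object's avatar.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SessionMessage:
    uin: str  # object identifier of the message sender
    text: str = ""
    emoji_list: List[str] = field(default_factory=list)


@dataclass
class FireEmojiEvent:
    uin: str
    emoji_list: List[str]


class MessageManager:
    def __init__(self) -> None:
        self._listeners: List[Callable[[FireEmojiEvent], None]] = []

    def bind(self, listener: Callable[[FireEmojiEvent], None]) -> None:
        self._listeners.append(listener)

    def handle(self, message: SessionMessage) -> None:
        # S2032 and S2033: ignore messages without first type image data.
        if not message.emoji_list:
            return
        # S2034: build the status trigger event and notify listeners.
        event = FireEmojiEvent(message.uin, list(message.emoji_list))
        for listener in self._listeners:
            listener(event)


class AvatarProcessor:
    def __init__(self) -> None:
        self.active_emoji: Dict[str, str] = {}

    def on_fire_emoji(self, event: FireEmojiEvent) -> None:
        # S2035: update the avatar of the object named by the event. The
        # first emoji is used here purely for simplicity.
        self.active_emoji[event.uin] = event.emoji_list[0]
```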
[0238] Optionally, for the first virtual avatar, the virtual avatar processor can search in the avatar resource library or virtual avatar cache for a virtual avatar that matches the first type of image data contained in the image data list, and use it as the first updated virtual avatar.
[0239] Optionally, when a session message includes multiple first-type image data, the virtual avatar processor can select any one of the multiple first-type image data as the first-type target image data, and update the corresponding virtual avatar based on the first-type target image data. For example, the first first-type image data in the session message can be selected as the first-type target image data, or the most numerous first-type image data in the session message can be selected as the first-type target image data, or the corresponding virtual avatar can be updated sequentially according to the order of the first-type image data in the session message.
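The three selection strategies mentioned above can be written compactly. The function names below are hypothetical, and the first type image data are represented as plain strings for illustration.

```python
from collections import Counter
from typing import Iterator, List


def pick_first(emoji_list: List[str]) -> str:
    """Use the first item of first type image data in the message."""
    return emoji_list[0]


def pick_most_frequent(emoji_list: List[str]) -> str:
    """Use the item that occurs most often in the message."""
    return Counter(emoji_list).most_common(1)[0][0]


def in_order(emoji_list: List[str]) -> Iterator[str]:
    """Yield every item so the avatar can be updated sequentially."""
    yield from emoji_list
```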
[0240] For the specific implementation of this step, refer to step S103 in the embodiment corresponding to Figure 3 above; the same content is not repeated here.
[0241] Step S204: When background mapping text is detected in the first session message, the original virtual background is updated to a virtual updated background based on the background mapping text. Based on the first virtual updated image, the second virtual image, and the virtual updated background, a blended background virtual image is generated for display in the virtual image display area.
[0242] Specifically, in the virtual avatar display area, the virtual background associated with the first and second virtual avatars can be used as the original virtual background. The first terminal can perform text detection on the first session message. When background mapping text is detected in the first session message, the original virtual background can be updated to a virtual updated background based on the background mapping text. Then, the first virtual updated avatar, the second virtual avatar, and the virtual updated background are fused to obtain a blended background virtual avatar for display in the virtual avatar display area. The advantage of this is that it can improve the blending degree between the virtual background and the virtual avatar, thereby enhancing the sense of presence in the real-time session.
[0243] Optionally, a corresponding background display threshold can be set for the virtual update background. For example, when the display duration of the virtual update background is equal to the background display threshold, the virtual update background can be restored to the original virtual background, and correspondingly, the merged background virtual image can be restored to the first virtual update image.
[0244] Optionally, after the original virtual background is updated to a virtual updated background, the virtual updated background can be maintained for a long time until the next time the virtual updated background is updated.
[0245] Optionally, when a session message does not carry the first object media data but carries background mapping text, the virtual background can still be updated based on the background mapping text, and the updated virtual background can also be fused with the first virtual image and the second virtual image.
[0246] Please also refer to Figure 15, which is a schematic diagram of a scene for updating a virtual background provided in an embodiment of this application. As shown in Figure 15, object A and object B engage in an instant conversation through the conversation interface 150. Assuming that the conversation message 150c sent by object A carries background mapping text, such as "Happy New Year", the original virtual background associated with virtual images 150a and 150b in the virtual image display area 150d can be updated to the virtual updated background mapped by the background mapping text carried in the conversation message 150c. For example, it can be updated to the virtual fireworks background mapped by the background mapping text "Happy New Year". In addition, the virtual fireworks background can be merged with virtual images 150a and 150b. After successful merging, the virtual image finally displayed in the virtual image display area 150d (i.e., the merged background virtual image) is highly integrated with the virtual fireworks background. In other words, the virtual updated background affects the virtual image within it. For example, when the virtual fireworks are red, the light projected onto the merged background virtual image is also red.
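The detection of background mapping text and the optional restore after a background display threshold can be sketched as follows. The "Happy New Year" to fireworks mapping comes from the example above; the dictionary, the case-insensitive substring match, the class `BackgroundState`, and the explicit `now` timestamps are assumptions of this sketch. Passing no threshold corresponds to the variant in which the updated background is kept until the next update.

```python
from typing import Dict, Optional

# Illustrative mapping; only the "Happy New Year" entry comes from the text.
BACKGROUND_MAP: Dict[str, str] = {"happy new year": "fireworks"}


def detect_background(text: str) -> Optional[str]:
    lowered = text.lower()
    for phrase, background in BACKGROUND_MAP.items():
        if phrase in lowered:
            return background
    return None


class BackgroundState:
    def __init__(self, original: str,
                 threshold_s: Optional[float] = None) -> None:
        self.original = original
        self.threshold_s = threshold_s  # None keeps the update indefinitely
        self.current = original
        self._since = 0.0

    def on_message(self, text: str, now: float) -> None:
        background = detect_background(text)
        if background is not None:
            self.current, self._since = background, now

    def tick(self, now: float) -> str:
        # Restore the original background once the display duration
        # reaches the threshold.
        expired = (
            self.threshold_s is not None
            and self.current != self.original
            and now - self._since >= self.threshold_s
        )
        if expired:
            self.current = self.original
        return self.current
```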
[0247] In one optional implementation, when a wearable mapping keyword is detected in the first session message—for example, if the text information or voice information contained in the first session message contains a wearable mapping keyword—the first virtual avatar can be updated based on the detected wearable mapping keyword. For example, virtual wearable items (e.g., clothing, headwear, hats, glasses, backpacks, toys, etc.) matching the wearable mapping keyword can be added to the first virtual avatar. For instance, when session message X contains the wearable mapping keyword "hat," any hat can be randomly selected from the avatar resource library and added to the corresponding virtual avatar, thus obtaining a virtual avatar wearing a hat.
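A minimal sketch of the wearable keyword lookup follows. The item names, the `WEARABLES` table, and the function `pick_wearable` are hypothetical; the text is split into whole words so that, for example, "chat" does not match the keyword "hat", which is a design choice of this sketch rather than something stated in the application.

```python
import random
import re
from typing import Dict, List, Optional

# Hypothetical stand-in for wearable items in the avatar resource library.
WEARABLES: Dict[str, List[str]] = {
    "hat": ["hat_red", "hat_blue"],
    "glasses": ["glasses_round"],
}


def pick_wearable(text: str,
                  rng: Optional[random.Random] = None) -> Optional[str]:
    rng = rng or random.Random()
    words = set(re.findall(r"[a-z]+", text.lower()))
    for keyword, items in WEARABLES.items():
        if keyword in words and items:
            # Any matching item may be chosen at random.
            return rng.choice(items)
    return None
```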
[0248] Step S205: When the display duration of the first virtual updated image is equal to the display duration threshold, the first virtual updated image is restored to the first virtual image.
[0249] Specifically, in this embodiment, a display duration threshold can be set for the first virtual updated image. When the display duration of the first virtual updated image is equal to the display duration threshold, the first virtual updated image will be restored to its original state. This embodiment does not limit the specific value of the display duration threshold.
[0250] Step S206: When a third session message is received from the second object, the third session message is displayed in the message display area of the session interface, and the second virtual avatar is updated based on the second object media data associated with the second object carried in the third session message.
[0251] Specifically, when the first terminal receives a third session message from the second object, it can display the third session message in the message display area of the session interface. The third session message carries second object media data associated with the second object. Furthermore, in the virtual avatar display area containing the second virtual avatar, the second virtual avatar can be updated to a third updated virtual avatar, where the third updated virtual avatar is obtained by updating the second virtual avatar based on the second object media data. The specific process of this step is similar to the process of updating the first virtual image to the first virtual updated image based on the first object media data in the embodiment corresponding to Figure 3 above, and is not described again here.
[0252] In step S207, in response to the business state switching operation for the first object, the business state of the first object is updated to the business update state, and in the virtual image display area containing the second virtual image, the first virtual image is updated to the second virtual update image that matches the business update state.
[0253] Specifically, in response to the switching operation of the business status of the first object, the first terminal updates the business status of the first object to the business update status. Then, in the virtual image display area containing the second virtual image, the first virtual image is updated to the second virtual update image that matches the business update status. For example, when object A changes its business status from resting to online, the virtual image of object A is updated from the virtual image that matches the resting state (e.g., sleeping) to the virtual image that matches the online state (e.g., standing).
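The state switch in this step can be illustrated with a small lookup from business state to avatar form. The two entries below come from the example above (resting maps to sleeping, online maps to standing); the table `AVATAR_FORM_BY_STATE` and the class `ObjectProfile` are hypothetical names used only for this sketch.

```python
from typing import Dict

# Example forms from the paragraph above; keys and values are illustrative.
AVATAR_FORM_BY_STATE: Dict[str, str] = {
    "resting": "sleeping",
    "online": "standing",
}


class ObjectProfile:
    def __init__(self, state: str) -> None:
        self.state = state
        self.avatar_form = AVATAR_FORM_BY_STATE[state]

    def switch_state(self, new_state: str) -> str:
        """Update the business state and return the matching avatar form."""
        if new_state not in AVATAR_FORM_BY_STATE:
            raise ValueError(f"unknown business state: {new_state}")
        self.state = new_state
        self.avatar_form = AVATAR_FORM_BY_STATE[new_state]
        return self.avatar_form
```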
[0254] Please refer to Figure 16, which is a flowchart illustrating a data processing method based on a virtual avatar, as provided in an embodiment of this application. This data processing method can be executed by a computer device, which may include the user terminal or server described in Figure 1. For ease of understanding, this embodiment is described using the example of the method being executed by a first terminal (e.g., user terminal 200a). This embodiment can serve as one specific implementation of the embodiment corresponding to Figure 3 above. This data processing method may include at least the following steps:
[0255] Step S301: Display the conversation interface when the first object and the second object are having an instant conversation. In the virtual avatar display area of the conversation interface, display the first virtual avatar of the first object and the second virtual avatar of the second object.
[0256] For the specific implementation of this step, refer to step S101 in the embodiment corresponding to Figure 3 above; the same content is not repeated here.
[0257] Step S302: In response to a trigger operation on the conversation interface, output the voice control and the image capture area for capturing object image data of the first object. When the first object enters voice information through the voice control, display the conversation image data of the first object during the real-time conversation in the image capture area.
[0258] Specifically, in response to a trigger operation on the conversation interface, the first terminal can output a voice control (such as the voice control 901a shown in Figure 9 above) and an image capture area for capturing object image data of the first object (such as the image capture area 901b shown in Figure 9 above). When the first object records voice information through the voice control, the camera is invoked to capture the object image data of the first object. The voice control can be a control independent of the image capture area, or it can be a control within the image capture area.
[0259] Furthermore, upon capturing object image data, session image data of the first object during an instant session can be determined based on the object image data, and this session image data can be displayed in the image capture area. This session image data can be used to characterize the object state of the first object during an instant session.
[0260] Optionally, the session image data is generated based on the captured object image data of the first object. That is, a virtual image that fits the first object can be reconstructed based on the object image data, and the session image data of the first object during the instant session can be generated based on this reconstructed virtual image. For example, the rendering engine on the application client can perform image rendering based on the real object data of the first object (such as the first object's hairstyle, clothing, and location) extracted from the session image data to obtain the reconstructed virtual image of the first object. It can be understood that the reconstructed virtual image changes as the first object changes.
[0261] Optionally, the session image data is obtained by adjusting the image state of the first virtual avatar based on the captured object image data of the first object. In other words, the image state of the first virtual avatar can be adjusted based on the object image data, and the session image data of the first object during the instant session, i.e., the second type of image data (for example, the second type of image data 901c in the embodiment corresponding to Figure 9 above), can then be generated based on the first virtual avatar with the adjusted image state. For details, refer to the description of generating the second type of image data in step S102 of the embodiment corresponding to Figure 3 above.
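As a rough illustration of the second option above (adjusting the image state of the first virtual avatar from captured object image data), the following Python sketch applies per-frame detection results to a copy of the avatar. The `AvatarState` fields, the dict-based frame format, and the function names are hypothetical and are not taken from this application; a real client would run face or expression detection on camera frames instead of reading pre-labelled values.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AvatarState:
    """Hypothetical, simplified image state of a virtual avatar."""
    outfit: str
    expression: str = "neutral"
    mouth_open: bool = False


def adjust_avatar(avatar: AvatarState, frame: dict) -> AvatarState:
    """Return a copy of the avatar whose state follows one captured frame.

    `frame` stands in for the result of state detection on object image data.
    """
    return replace(
        avatar,
        expression=frame.get("expression", avatar.expression),
        mouth_open=frame.get("mouth_open", False),
    )


def session_image_data(avatar: AvatarState, frames: list[dict]) -> list[AvatarState]:
    """Build the frame sequence that plays the role of the second type of image data."""
    return [adjust_avatar(avatar, frame) for frame in frames]
```

The avatar's fixed attributes (here `outfit`) are preserved, and only the state that depends on the captured object changes from frame to frame.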
[0262] Step S303: Display the first session message sent by the first object to the second object in the message display area of the session interface;
[0263] Specifically, the first terminal can determine the first object media data associated with the first object based on the session image data and the voice information. The specific process can be as follows: the first terminal integrates the session image data with the voice information to obtain session image data carrying the voice information, and determines this session image data carrying the voice information as the first object media data associated with the first object. For details, refer to the description of generating the first object media data in step S102 of the embodiment corresponding to Figure 3 above.
[0264] Furthermore, a first session message to be sent from the first object to the second object can be generated based on the first object media data. Optionally, the first terminal can use the first object media data (e.g., the first object media data 902b shown in Figure 9 above) as the first session message to be sent to the second object. That is, the first session message in this case may contain one sub-session message.
[0265] Optionally, the first terminal may also integrate the converted text information, obtained by performing speech conversion processing on the voice information, with the first object media data to obtain the first session message for sending to the second object. That is, the first session message may contain two sub-session messages, for example, the first object media data 902b and the converted text information 902c shown in Figure 9 above. Optionally, the converted text information here can be the converted text information (i.e., the first converted text information) obtained by performing speech conversion processing, in the image capture area, on the voice information recorded by the first object through the voice control. This converted text information can be displayed in real time in the image capture area. The converted text information and the session image data of the first object during the instant session can be displayed together at different positions in the image capture area, as in the image capture area 901b shown in Figure 9 above.
[0266] Finally, after the first session message is sent, it can be displayed in the message display area of the session interface.
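The composition of the first session message in step S303 can be sketched as follows. This is only a minimal Python illustration under assumed data structures: `ObjectMediaData`, `SessionMessage`, and `build_first_session_message` are hypothetical names, and the media payloads are represented as raw bytes for simplicity.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ObjectMediaData:
    """Session image data integrated with the recorded voice information."""
    session_image_data: bytes
    voice_info: bytes


@dataclass
class SessionMessage:
    sender: str
    receiver: str
    sub_messages: list = field(default_factory=list)


def build_first_session_message(
    sender: str,
    receiver: str,
    session_image_data: bytes,
    voice_info: bytes,
    converted_text: Optional[str] = None,
) -> SessionMessage:
    """Build a message with one or two sub-session messages."""
    media = ObjectMediaData(session_image_data, voice_info)
    message = SessionMessage(sender, receiver, [media])
    # The converted text is optional; when present it becomes a second sub-message.
    if converted_text:
        message.sub_messages.append(converted_text)
    return message
```

With no converted text the message carries a single sub-session message (the media data); with converted text it carries two.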
[0267] Step S304: In the virtual avatar display area containing the second virtual avatar, update the first virtual avatar to the first updated virtual avatar;
[0268] For the specific implementation of this step, refer to step S103 in the embodiment corresponding to Figure 3 above; the same content is not repeated here.
[0269] Step S305: In response to the triggering operation for the first object media data carried in the first session message, play the voice information and display the audio-visual animation of the session image data carrying the voice information in the message display area.
[0270] Specifically, the first object media data carried in the first session message may include session image data carrying voice information. The session image data can be obtained by adjusting the image state of the first virtual avatar based on the captured object image data of the first object, or it can be generated based on the captured object image data of the first object. This embodiment of the application does not limit how the session image data is generated. The first terminal can respond to a trigger operation (e.g., a click operation) on the first object media data by playing the voice information and displaying, in the message display area, the audio-visual animation of the session image data carrying the voice information, similar to playing a video that integrates the voice information and the session image data. Here, the audio-visual animation can include an image animation corresponding to the session image data. Optionally, it can further include a pulse animation associated with the voice information, such as a randomly changing pulse animation or a pulse animation that changes according to the volume of the voice information. For an example scenario, refer to the embodiment corresponding to Figure 9 above.
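One possible way to drive a pulse animation from the volume of the voice information is sketched below in Python. The application does not specify how volume is measured; this sketch assumes normalized audio samples in the range [-1, 1] and uses the root-mean-square value of fixed-size windows, scaled so that the loudest window reaches `max_height`. The function name and parameters are hypothetical.

```python
from __future__ import annotations

import math


def pulse_heights(samples: list[float], window: int, max_height: float = 1.0) -> list[float]:
    """Return one pulse bar height per window of audio samples.

    Each height is the window's RMS volume, normalized to the loudest window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    volumes = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        volumes.append(math.sqrt(sum(s * s for s in chunk) / len(chunk)))
    peak = max(volumes, default=0.0)
    if peak == 0.0:
        # Silence (or no audio at all): every bar stays flat.
        return [0.0 for _ in volumes]
    return [max_height * v / peak for v in volumes]
```

A randomly changing pulse animation, the other variant mentioned above, would simply replace the RMS values with random heights.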
[0271] Step S306: When displaying the audio-visual animation of the conversation image data carrying voice information in the message display area, synchronously highlight the first converted text information in the message display area;
[0272] Specifically, the first session message also includes first converted text information (e.g., the converted text information 902c shown in Figure 9 above) obtained by performing speech conversion processing on the voice information in the image capture area. When the first terminal displays the audio-visual animation of the conversation image data carrying the voice information in the message display area, it can synchronously highlight the first converted text information in the message display area.
[0273] Step S307: When displaying the audio-visual animation of the conversation image data carrying voice information in the message display area, perform voice conversion processing on the voice information to obtain the second converted text information corresponding to the voice information, and synchronously highlight the second converted text information in the message display area.
[0274] Specifically, if the first session message contains only one sub-session message, namely the first object media data, then when the first terminal displays the audio-visual animation of the session image data carrying voice information in the message display area, it can also perform voice conversion processing on the voice information in real time to obtain the second converted text information corresponding to the voice information, and synchronously highlight the second converted text information in the message display area.
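The choice between steps S306 and S307 can be sketched in Python as follows. The message is modelled as a plain dict, `transcribe` is a hypothetical speech-to-text callback, and highlighting in proportion to playback progress is an assumption made for this illustration; the application only states that the text is highlighted synchronously.

```python
from typing import Callable


def text_to_highlight(message: dict, transcribe: Callable[[bytes], str]) -> str:
    """Pick the text to highlight during playback.

    S306: use the first converted text if the message already carries it.
    S307: otherwise convert the voice information to obtain the second converted text.
    """
    text = message.get("converted_text")
    if text is not None:
        return text
    return transcribe(message["voice_info"])


def highlighted_prefix(text: str, elapsed: float, duration: float) -> str:
    """Return the part of `text` to highlight after `elapsed` seconds of playback."""
    if duration <= 0:
        return text
    ratio = min(max(elapsed / duration, 0.0), 1.0)
    return text[: int(len(text) * ratio)]
```

A production client would more likely align the highlight with word-level timestamps from the speech recognizer rather than with a fixed proportion of the text.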
[0275] The embodiments of this application make full use of the virtual avatars of objects and provide a new way to conduct instant conversations in virtual social scenarios. First, in terms of the interface structure of the conversation interface, the virtual avatar display area is used to display the virtual avatars of both parties in the conversation. The virtual avatars can show the business status of each object, and they also change according to the object media data (such as emoticons) carried in the conversation messages sent by the corresponding objects, so that the emotions (i.e., object states) of the objects when sending conversation messages can be displayed more vividly. The message display area of the conversation interface is used to display historical conversation messages, making it convenient for objects to review past conversations. Second, when an object records voice information, once the camera captures the object image data (such as the object's face), the object's actions and expressions while speaking are recorded in the form of the virtual avatar to generate the second type of image data (such as an animated GIF of the virtual avatar). At the same time, the voice information can be converted into text in real time. The conversation message sent by the object also carries the second type of image data, which can show the object's expressions while speaking, and triggering a conversation message carrying the second type of image data plays the sender's voice. This approach reduces the cost of text input, and recording the speaker's facial expressions better conveys the object's emotions.
[0276] Please refer to Figure 17, which is a flowchart of a virtual avatar-based data processing method provided in an embodiment of this application. The data processing method can be executed by a computer device, which may include, for example, the user terminal or server shown in Figure 1. For ease of understanding, this embodiment is described using the example of the method being executed by a second terminal (e.g., user terminal 200b). The data processing method may include at least the following steps:
[0277] Step S401: Display the conversation interface when the first object and the second object are having an instant conversation. In the virtual avatar display area of the conversation interface, display the first virtual avatar of the first object and the second virtual avatar of the second object.
[0278] In this embodiment, the interface structure of the conversation interface on the second terminal is consistent with that on the first terminal. Therefore, to distinguish between the two conversation interfaces, the conversation interface on the first terminal can be referred to as the first conversation interface, and the conversation interface on the second terminal can be referred to as the second conversation interface. The second terminal can display the second conversation interface when the first object and the second object are having an instant conversation, and can use the partial virtual image used to represent the first object as the first virtual image of the first object, and the partial virtual image used to represent the second object as the second virtual image of the second object. Furthermore, the first virtual image and the second virtual image can be displayed in the virtual image display area of the second conversation interface.
[0279] For the specific implementation of this step, refer to step S101 in the embodiment corresponding to Figure 3 above, which is not described again here.
[0280] Step S402: Upon receiving a first session message sent by the first object, the first session message is displayed in the message display area of the session interface; the first session message carries media data of the first object associated with the first object.
[0281] Specifically, after the second terminal successfully receives the first session message sent by the first object, it can display the first session message in the message display area of the second session interface.
[0282] Step S403: In the virtual image display area containing the second virtual image, the first virtual image is updated to a first virtual updated image; the first virtual updated image is obtained by updating the first virtual image based on the first object media data.
[0283] For the specific implementation of this step, refer to step S103 in the embodiment corresponding to Figure 3 above, which is not described again here.
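The receiver-side flow of steps S401 to S403 can be sketched as a small Python class. The class name, the dict-based message format, and the `state` field are hypothetical; the sketch only shows that the received message is recorded in the message display area and that only the sender's avatar is updated.

```python
from __future__ import annotations


class SecondTerminal:
    """Hypothetical receiver-side state of the second conversation interface."""

    def __init__(self, first_avatar: str, second_avatar: str) -> None:
        # S401: both avatars are shown in the virtual avatar display area.
        self.avatar_area = {
            "first": {"avatar": first_avatar, "state": "default"},
            "second": {"avatar": second_avatar, "state": "default"},
        }
        self.message_area: list[dict] = []

    def on_first_session_message(self, message: dict) -> None:
        # S402: record and display the received message as chat history.
        self.message_area.append(message)
        # S403: update only the first (sending) object's avatar from the media data.
        media = message.get("first_object_media_data")
        if media is not None:
            current = self.avatar_area["first"]
            self.avatar_area["first"] = {**current, "state": media["state"]}
```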
[0284] Please refer to Figure 18, which is a schematic structural diagram of a virtual avatar-based data processing apparatus provided in an embodiment of this application. The virtual avatar-based data processing apparatus can be a computer program (including program code) running on a computer device; for example, the apparatus can be application software. The apparatus can be used to execute the corresponding steps of the virtual avatar-based data processing method provided in the embodiments of this application. As shown in Figure 18, the virtual avatar-based data processing apparatus 1 may include: a first display module 11, a second display module 12, a first update module 13, an area hiding module 14, a second update module 15, a state switching module 16, a message generation module 17, a voice playback module 18, a third display module 19, a background update module 20, a fourth display module 21, and a third update module 22.
[0285] The first display module 11 is used to display the conversation interface when the first object and the second object are having an instant conversation. In the virtual image display area of the conversation interface, the first virtual image of the first object and the second virtual image of the second object are displayed.
[0286] The first display module 11 may include: a first display unit 111, an image determination unit 112, a second display unit 113, a resource request unit 114, an image acquisition unit 115, and a third display unit 116.
[0287] The first display unit 111 is used to display the conversation interface when the first object and the second object are having an instant conversation.
[0288] The image determination unit 112 is used to take the partial virtual image used to represent the first object as the first virtual image of the first object, and the partial virtual image used to represent the second object as the second virtual image of the second object.
[0289] The second display unit 113 is used to display the first virtual image and the second virtual image in the virtual image display area of the conversation interface;
[0290] The resource request unit 114 is used to obtain the first image resource identifier and the first business status corresponding to the first object, and to obtain the second image resource identifier and the second business status corresponding to the second object, generate an image resource acquisition request based on the first image resource identifier, the first business status, the second image resource identifier, and the second business status, and send the image resource acquisition request to the server; the server is used to generate the first image resource address corresponding to the first virtual image based on the first image resource identifier and the first business status when it receives the image resource acquisition request, and generate the second image resource address corresponding to the second virtual image based on the second image resource identifier and the second business status, and return the first image resource address and the second image resource address;
[0291] The image acquisition unit 115 is used to receive the first image resource address and the second image resource address returned by the server, acquire the first virtual image associated with the first object based on the first image resource address, and acquire the second virtual image associated with the second object based on the second image resource address.
[0292] The third display unit 116 is used to display the first virtual image and the second virtual image in the virtual image display area of the session interface.
[0293] For the specific implementations of the first display unit 111, the image determination unit 112, the second display unit 113, the resource request unit 114, the image acquisition unit 115, and the third display unit 116, refer to the description of step S101 in the embodiment corresponding to Figure 3 above, or to the description of step S201 in the embodiment corresponding to Figure 10 above, which are not repeated here.
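The exchange between the resource request unit 114 and the server can be sketched in Python as follows. The request layout, the `RESOURCE_BASE` URL, and the query-string address format are all assumptions made for illustration; the application only states that each image resource address is generated from an image resource identifier and a business status.

```python
from urllib.parse import urlencode

# Hypothetical base address; not taken from the application.
RESOURCE_BASE = "https://avatar.example.invalid/resource"


def build_acquisition_request(first_id: str, first_status: str,
                              second_id: str, second_status: str) -> dict:
    """Client side: bundle both identifiers and business statuses into one request."""
    return {
        "first": {"id": first_id, "status": first_status},
        "second": {"id": second_id, "status": second_status},
    }


def resolve_addresses(request: dict) -> dict:
    """Server side: derive one image resource address per object."""
    return {
        role: f"{RESOURCE_BASE}?{urlencode(entry)}"
        for role, entry in request.items()
    }
```

The image acquisition unit 115 would then fetch each virtual image from the returned address.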
[0294] The second display module 12 is used to respond to a trigger operation on the session interface and display a first session message sent by the first object to the second object in the message display area of the session interface; the first session message carries media data of the first object associated with the first object.
[0295] The second display module 12 may include: a first data determination unit 121, a first message generation unit 122, a data capture unit 123, a second data determination unit 124, and a second message generation unit 125;
[0296] The first data determination unit 121 is used to determine a first type of image data that characterizes the object state of the first object when it is conducting an instant conversation in response to a trigger operation on the conversation interface.
[0297] The first data determination unit 121 may include: a text mapping subunit 1211, a data selection subunit 1212, and a data determination subunit 1213;
[0298] The text mapping subunit 1211 is used to respond to a trigger operation on the text input control in the conversation interface to display the text information entered through the text input control; when it is detected that the text information carries state mapping text, it displays the first type of image data mapped by the state mapping text to represent the object state of the first object when conducting an instant conversation.
[0299] The data selection subunit 1212 is used to output an image selection panel associated with the status display control in response to a trigger operation on the status display control in the session interface; and to use the image data corresponding to the selection operation as a first type of image data to characterize the object status of the first object when conducting an instant session in response to a selection operation on the image selection panel.
[0300] The data determination subunit 1213 is used to, in response to a determination operation of target image data in the session interface, use the target image data as a first type of image data to characterize the object state of the first object during an instant session.
[0301] For the specific implementations of the text mapping subunit 1211, the data selection subunit 1212, and the data determination subunit 1213, refer to the description of step S102 in the embodiment corresponding to Figure 3 above, which is not repeated here.
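The behaviour of the text mapping subunit 1211 can be illustrated with a small lookup in Python. The mapping table, its keywords, and the image names are invented for this sketch, and plain substring matching is an assumption; the application does not specify how state mapping text is detected.

```python
from __future__ import annotations

# Hypothetical table from state mapping text to first-type image data.
STATE_MAPPING = {
    "haha": "laugh",
    "sob": "cry",
}


def map_state_text(text: str) -> list[str]:
    """Return the first-type image data mapped by any state mapping text in `text`.

    Results are ordered by where each keyword first appears in the input.
    """
    lowered = text.lower()
    hits = [
        (lowered.find(keyword), image)
        for keyword, image in STATE_MAPPING.items()
        if keyword in lowered
    ]
    return [image for _, image in sorted(hits)]
```

Substring matching is deliberately naive here (for instance, "sober" would match "sob"); a real client would likely tokenize the input first.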
[0302] The first message generation unit 122 is used to take the first type of image data as first object media data associated with the first object, generate a first session message to be sent to the second object based on the first object media data, and display the first session message in the message display area of the session interface.
[0303] The data capture unit 123 is used to respond to a trigger operation on the conversation interface and, when the first object records voice information through the voice control, call the camera to capture the object image data of the first object.
[0304] The second data determination unit 124 is used to adjust the image state of the first virtual image based on the object image data when the object image data is captured, and to generate a second type of image data to represent the object state of the first object when conducting an instant conversation based on the first virtual image after the image state is adjusted.
[0305] The second data determination unit 124 may include: a state detection subunit 1241 and a state adjustment subunit 1242;
[0306] The state detection subunit 1241 is used to perform state detection on the object image data when the object image data is captured, and to use the detected state as the object state used to characterize the first object when it is conducting an instant session.
[0307] The state adjustment subunit 1242 is used to acquire a first virtual image, adjust the image state of the first virtual image based on the object state, and generate a second type of image data to represent the object state based on the first virtual image after adjusting the image state.
[0308] For the specific implementations of the state detection subunit 1241 and the state adjustment subunit 1242, refer to the description of step S102 in the embodiment corresponding to Figure 3 above, which is not repeated here.
[0309] The second message generation unit 125 is used to integrate the second type of image data with voice information to obtain the first object media data associated with the first object, generate a first conversation message to be sent to the second object based on the first object media data, and display the first conversation message in the message display area of the conversation interface.
[0310] The second message generation unit 125 may include: a first upload subunit 1251, a second upload subunit 1252, a first integration subunit 1253, a voice conversion subunit 1254, and a second integration subunit 1255.
[0311] The first upload subunit 1251 is used to upload the second type of image data to the server, and when the second type of image data is successfully uploaded, it obtains the image resource identifier corresponding to the second type of image data.
[0312] The second uploading subunit 1252 is used to upload voice information to the server and obtain the voice resource identifier corresponding to the voice information when the voice information is successfully uploaded.
[0313] The first integration subunit 1253 is used to integrate the second type of image data carrying image resource identifiers and the voice information carrying voice resource identifiers to obtain the first object media data associated with the first object.
[0314] The speech conversion subunit 1254 is used to perform speech conversion processing on speech information to obtain the converted text information corresponding to the speech information, and to display the converted text information in the image capture area used to capture image data of the object.
[0315] The second integration subunit 1255 is used to integrate the converted text information with the media data of the first object to obtain a first session message for sending to the second object;
[0316] For the specific implementations of the first upload subunit 1251, the second upload subunit 1252, the first integration subunit 1253, the speech conversion subunit 1254, and the second integration subunit 1255, refer to the description of step S102 in the embodiment corresponding to Figure 3 above, which is not repeated here.
[0317] For the specific implementations of the first data determination unit 121, the first message generation unit 122, the data capture unit 123, the second data determination unit 124, and the second message generation unit 125, refer to the description of step S102 in the embodiment corresponding to Figure 3 above, which is not repeated here.
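The upload-then-integrate flow of the first upload subunit 1251, the second upload subunit 1252, and the first integration subunit 1253 can be sketched in Python as follows. `InMemoryServer` is a stand-in for the real server, and deriving resource identifiers from a content hash is an assumption made only so the example is self-contained.

```python
from __future__ import annotations

import hashlib


class InMemoryServer:
    """Hypothetical server that stores uploads and returns a resource identifier."""

    def __init__(self) -> None:
        self.resources: dict[str, bytes] = {}

    def upload(self, data: bytes) -> str:
        # Assumed scheme: identifier is a truncated SHA-256 digest of the content.
        resource_id = hashlib.sha256(data).hexdigest()[:16]
        self.resources[resource_id] = data
        return resource_id


def build_first_object_media_data(server: InMemoryServer,
                                  image_data: bytes,
                                  voice_info: bytes) -> dict:
    """Upload both parts, then integrate them by their resource identifiers."""
    image_resource_id = server.upload(image_data)
    voice_resource_id = server.upload(voice_info)
    return {
        "image": {"resource_id": image_resource_id},
        "voice": {"resource_id": voice_resource_id},
    }
```

Carrying identifiers instead of raw payloads keeps the session message small; the receiving terminal can resolve the identifiers when the message is triggered.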
[0318] The first update module 13 is used to update the first virtual image to a first virtual updated image in the virtual image display area containing the second virtual image; the first virtual updated image is obtained by updating the first virtual image based on the first object media data.
[0319] The first update module 13 is specifically used to update the first virtual image based on the first object media data to obtain a first virtual updated image that matches the first object media data, and to update the first virtual image to the first virtual updated image in the virtual image display area containing the second virtual image.
[0320] The first update module 13 may include: a first update unit 131, a second update unit 132, a third update unit 133, and a fourth update unit 134;
[0321] The first updating unit 131 is used to update the first virtual image based on the first type of image data when the first object media data contains the first type of image data, so as to obtain a first virtual updated image that matches the first type of image data.
[0322] The first update unit 131 may include: a data detection subunit 1311 and an update subunit 1312;
[0323] The data detection subunit 1311 is used to detect media data in the first session message through the message manager. If the first object media data carried in the first session message contains the first type of image data, a state trigger event is generated and sent to the virtual avatar processor. The state trigger event includes the object identifier of the first object and the image data list. The image data list is used to record the first type of image data contained in the first object media data.
[0324] The update subunit 1312 is used to update the first virtual image associated with the object identifier based on the first type of image data in the image data list when the virtual image processor receives a state trigger event, so as to obtain a first virtual updated image that matches the first type of image data.
[0325] For the specific implementations of the data detection subunit 1311 and the update subunit 1312, refer to the description of step S103 in the embodiment corresponding to Figure 3 above, which is not repeated here.
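The interaction between the message manager and the virtual avatar processor can be sketched as a simple event dispatch in Python. The class names, the dict-based message format, and the rule that the last entry of the image data list determines the updated avatar are assumptions for this illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class StateTriggerEvent:
    object_id: str
    image_data_list: list


class AvatarProcessor:
    """Hypothetical virtual avatar processor keyed by object identifier."""

    def __init__(self, avatars: dict) -> None:
        self.avatars = dict(avatars)

    def on_state_trigger(self, event: StateTriggerEvent) -> None:
        # Assumption: the most recent first-type image data wins.
        if event.image_data_list:
            self.avatars[event.object_id] = event.image_data_list[-1]


class MessageManager:
    def __init__(self, processor: AvatarProcessor) -> None:
        self.processor = processor

    def handle(self, message: dict) -> Optional[StateTriggerEvent]:
        """Detect first-type image data and, if present, emit a state trigger event."""
        images = [m["name"] for m in message.get("media", [])
                  if m.get("type") == "first_type"]
        if not images:
            return None
        event = StateTriggerEvent(message["sender"], images)
        self.processor.on_state_trigger(event)
        return event
```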
[0326] The second updating unit 132 is used to update the first virtual image to the first updated virtual image in the virtual image display area containing the second virtual image;
[0327] The third updating unit 133 is used to update the first virtual image based on the second type of image data in response to a triggering operation on the first object media data when the first object media data contains the second type of image data, so as to obtain a first virtual updated image that matches the second type of image data.
[0328] The fourth update unit 134 is used to update the first virtual image to the first virtual update image in the virtual image display area containing the second virtual image.
[0329] For the specific implementations of the first update unit 131, the second update unit 132, the third update unit 133, and the fourth update unit 134, refer to the description of step S103 in the embodiment corresponding to Figure 3 above, which is not repeated here.
[0330] The conversation interface includes a message display area for showing historical conversation messages; historical conversation messages are the conversation messages recorded when the first object and the second object have an instant conversation.
[0331] The area hiding module 14 is used to respond to the hiding operation of the message display area, hide the message display area in the session interface, and use the display interface where the virtual image display area is located as the session update interface.
[0332] The second update module 15 is used to update the first virtual image from a partial virtual image of the first object to a complete virtual image of the first object in the session update interface, and to update the second virtual image from a partial virtual image of the second object to a complete virtual image of the second object; when the first object and the second object are having an instant conversation, the second session message sent by the first object to the second object is displayed in the session update interface.
[0333] The state switching module 16 is used to respond to the business state switching operation for the first object, update the business state of the first object to the business update state, and update the first virtual image to the second virtual update image that matches the business update state in the virtual image display area containing the second virtual image.
[0334] The message generation module 17 is used to determine a third type of image data based on the first virtual image when no object image data is captured, integrate the third type of image data with voice information to obtain the first object media data associated with the first object, and generate a first session message to be sent to the second object based on the first object media data.
[0335] The voice playback module 18 is used to respond to a trigger operation on the first object media data carried in the first session message, play voice information, display sound effects and animations associated with the voice information in the message display area, and synchronously highlight the converted text information contained in the first session message.
[0336] The third display module 19 is used to display the first type of image data in the virtual image display area.
[0337] The background update module 20 is used to take the virtual background associated with the first virtual image and the second virtual image as the original virtual background in the virtual image display area; when background mapping text is detected in the first session message, update the original virtual background to a virtual update background based on the background mapping text; and fuse the first virtual update image, the second virtual image, and the virtual update background to obtain a fused background virtual image for display in the virtual image display area.
[0338] The fourth display module 21 is used to display the third session message in the message display area of the session interface when the third session message sent by the second object is received; the third session message carries the media data of the second object associated with the second object.
[0339] The third update module 22 is used to update the second virtual image to a third virtual updated image in the virtual image display area containing the second virtual image; the third virtual updated image is obtained by updating the second virtual image based on the second object media data.
[0340] For the specific implementations of the first display module 11, the second display module 12, the first update module 13, the area hiding module 14, the second update module 15, the state switching module 16, the message generation module 17, the voice playback module 18, the third display module 19, the background update module 20, the fourth display module 21, and the third update module 22, refer to the description of steps S101-S103 in the embodiment corresponding to Figure 3 above, or to the description of steps S201-S207 in the embodiment corresponding to Figure 10 above, which are not repeated here. Likewise, the beneficial effects of using the same method are not described again.
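As one illustration of the background update module 20 described above, the following Python sketch replaces the original virtual background when background mapping text is detected and then combines the layers for display. The mapping table, its single entry, and the dict-based "fused" result are hypothetical; actual fusion would be an image compositing step in the rendering engine.

```python
# Hypothetical table from background mapping text to a virtual background.
BACKGROUND_MAPPING = {
    "happy new year": "fireworks",
}


def update_background(original_background: str, message_text: str) -> str:
    """Return the virtual update background, or the original if nothing matches."""
    lowered = message_text.lower()
    for phrase, background in BACKGROUND_MAPPING.items():
        if phrase in lowered:
            return background
    return original_background


def fuse(first_updated_avatar: str, second_avatar: str, background: str) -> dict:
    """Stand-in for fusing both avatars with the background into one scene."""
    return {
        "background": background,
        "avatars": [first_updated_avatar, second_avatar],
    }
```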
[0341] Please refer to Figure 19, which is a schematic structural diagram of a virtual avatar-based data processing apparatus provided in an embodiment of this application. The virtual avatar-based data processing apparatus can be a computer program (including program code) running on a computer device; for example, the apparatus can be application software. The apparatus can be used to execute the corresponding steps of the virtual avatar-based data processing method provided in the embodiments of this application. As shown in Figure 19, the virtual avatar-based data processing apparatus 2 may include: a first display module 21, a voice input display module 22, a second display module 23, an avatar update module 24, a third display module 25, a fourth display module 26, and a fifth display module 27;
[0342] The first display module 21 is used to display the conversation interface when the first object and the second object are having an instant conversation. In the virtual image display area of the conversation interface, the first virtual image of the first object and the second virtual image of the second object are displayed.
[0343] The voice input display module 22 is used to, in response to a trigger operation on the conversation interface, output the image capture area for capturing object image data of the first object, and, when the first object enters voice information through the voice control in the image capture area, display the conversation image data of the first object during the instant conversation.
[0344] The second display module 23 is used to display a first session message sent by the first object to the second object in the message display area of the session interface; the first session message carries first object media data associated with the first object; the first object media data is determined based on session image data and voice information;
[0345] The image update module 24 is used to update the first virtual image to a first virtual updated image in the virtual image display area containing the second virtual image; the first virtual updated image is obtained by updating the first virtual image based on the first object media data;
[0346] The first object media data includes conversational image data carrying voice information; the conversational image data is generated based on the captured object image data of the first object, or the conversational image data is obtained by adjusting the image state of the first virtual avatar based on the captured object image data of the first object;
[0347] The third display module 25 is used to play the voice information in response to a trigger operation on the first object media data carried by the first session message, and to display, in the message display area, the audio-visual animation of the session image data carrying the voice information.
[0348] The first session message also includes first converted text information obtained by performing speech conversion processing on the voice information in the image capture area;
[0349] The fourth display module 26 is used to synchronously highlight the first converted text information in the message display area when the audio-visual animation of the conversation image data carrying the voice information is displayed in the message display area;
[0350] The fifth display module 27 is used to perform voice conversion processing on the voice information when displaying the audio-visual animation of the conversation image data carrying voice information in the message display area, to obtain the second converted text information corresponding to the voice information, and to synchronously highlight the second converted text information in the message display area.
[0351] For the specific implementation of the first display module 21, the voice input display module 22, the second display module 23, the image update module 24, the third display module 25, the fourth display module 26, and the fifth display module 27, refer to the descriptions of steps S301-S307 in the embodiment corresponding to Figure 16 above; they are not repeated here. The beneficial effects of the same method are likewise not repeated.
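The playback behavior of the third, fourth, and fifth display modules can be illustrated with a minimal sketch. It is not the implementation of this application: the class names `ObjectMediaData` and `SessionMessage`, the function `on_media_triggered`, the action strings, and the `speech_to_text` callback are all hypothetical names chosen for illustration. The sketch assumes that a message either already carries the first converted text information or has its voice information converted at playback time.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ObjectMediaData:
    """Session image data carrying voice information (hypothetical layout)."""
    session_image_frames: List[str]  # stand-ins for dynamic image frames
    voice_clip: bytes                # stand-in for the recorded voice information


@dataclass
class SessionMessage:
    """A session message carrying object media data (hypothetical layout)."""
    sender_id: str
    media: ObjectMediaData
    converted_text: Optional[str] = None  # first converted text, if produced at capture time


def on_media_triggered(message: SessionMessage,
                       speech_to_text: Callable[[bytes], str]) -> List[str]:
    """Return the display actions taken when the media data in a message is triggered."""
    actions = ["play_voice", "show_audio_visual_animation"]
    # Use the text carried by the message when present (fourth display module);
    # otherwise convert the voice at playback time (fifth display module).
    text = message.converted_text
    if text is None:
        text = speech_to_text(message.media.voice_clip)
    actions.append(f"highlight_text:{text}")
    return actions
```

In this sketch the caller supplies `speech_to_text`, so any speech conversion service could stand behind it; the application itself does not name one.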
[0352] Please refer to Figure 20, which is a schematic structural diagram of a computer device provided in an embodiment of this application. As shown in Figure 20, the computer device 1000 may include a processor 1001, a network interface 1004, and a memory 1005. Furthermore, the computer device 1000 may also include a user interface 1003 and at least one communication bus 1002. The communication bus 1002 is used to enable communication between these components. The user interface 1003 may include a display screen and a keyboard; optionally, the user interface 1003 may also include a standard wired interface or a wireless interface. The network interface 1004 may optionally include a standard wired interface or a wireless interface (such as a Wi-Fi interface). The memory 1005 may be high-speed RAM or non-volatile memory, such as at least one disk drive. The memory 1005 may optionally also be at least one storage device located remotely from the processor 1001. As shown in Figure 20, the memory 1005, which is a computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application.
[0353] In the computer device 1000 shown in Figure 20, the network interface 1004 provides network communication functionality; the user interface 1003 is mainly used to provide an input interface for the user; and the processor 1001 can be used to call the device control application stored in the memory 1005 to execute the data processing method based on a virtual avatar described in any of the embodiments corresponding to Figure 3, Figure 10, Figure 16, and Figure 17 above, which is not repeated here. The beneficial effects of the same method are likewise not repeated.
[0354] Furthermore, it should be noted that the embodiments of this application also provide a computer-readable storage medium, which stores the computer program executed by the aforementioned data processing device 1 and data processing device 2 based on virtual avatars. The computer program includes program instructions; when the processor executes the program instructions, it can execute the data processing method based on virtual avatars described in any of the embodiments corresponding to Figure 3, Figure 10, Figure 16, and Figure 17 above, which is therefore not repeated here. The beneficial effects of the same method are likewise not repeated. For technical details not disclosed in the computer-readable storage medium embodiments of this application, please refer to the description of the method embodiments of this application.
[0355] The aforementioned computer-readable storage medium can be an internal storage unit of the data processing device based on a virtual avatar provided in any of the foregoing embodiments or of the aforementioned computer device, such as the hard drive or memory of the computer device. The computer-readable storage medium can also be an external storage device of the computer device, such as a plug-in hard drive, smart media card (SMC), secure digital (SD) card, or flash card provided on the computer device. Furthermore, the computer-readable storage medium can include both an internal storage unit and an external storage device of the computer device. The computer-readable storage medium is used to store the computer program and other programs and data required by the computer device. The computer-readable storage medium can also be used to temporarily store data that has been output or will be output.
[0356] Furthermore, it should be noted that this application also provides a computer program product or computer program, which includes computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to perform the method provided in any of the embodiments corresponding to Figure 3, Figure 10, Figure 16, and Figure 17 above. The beneficial effects of the same method are not repeated here. For technical details not disclosed in the computer program product or computer program embodiments of this application, please refer to the description of the method embodiments of this application.
[0357] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
[0358] The above-disclosed embodiments are merely preferred embodiments of this application and should not be construed as limiting the scope of this application. Therefore, any equivalent variations made in accordance with the claims of this application shall still fall within the scope of this application.
Claims
1. A data processing method based on virtual avatars, characterized in that it includes: displaying a session interface when a first object and a second object are having an instant conversation, wherein the session interface includes a virtual avatar display area and a message display area, and the message display area is used to display historical conversation messages, which are the conversation messages recorded when the first object and the second object have an instant conversation; displaying, in the virtual avatar display area of the session interface, a first virtual avatar of the first object and a second virtual avatar of the second object; in response to a trigger operation on the session interface, displaying, in the message display area of the session interface, a first session message sent by the first object to the second object, the first session message carrying first object media data associated with the first object; and, in the virtual avatar display area containing the second virtual avatar, updating the first virtual avatar to a first updated virtual avatar, the first updated virtual avatar being obtained by updating the first virtual avatar based on the first object media data; wherein the step of displaying, in response to a trigger operation on the session interface, a first session message sent by the first object to the second object in the message display area of the session interface includes: in response to a trigger operation on the session interface, when the first object records voice information through a voice control, invoking a camera to capture object image data of the first object; when the object image data is captured, adjusting the image state of the first virtual avatar based on the object image data; generating, based on the first virtual avatar with the adjusted image state, a second type of image data representing the object state of the first object during the instant conversation, the second type of image data including dynamic image data showing the image state change process of the first virtual avatar; integrating the second type of image data with the voice information to obtain the first object media data associated with the first object; generating, based on the first object media data, the first session message for sending to the second object, and displaying the first session message in the message display area of the session interface; and, in response to a trigger operation on the first object media data carried in the first session message, playing the voice information and displaying, in the message display area, a sound effects animation associated with the voice information.
2. The method according to claim 1, characterized in that the step of displaying the session interface when the first object and the second object are having an instant conversation, and displaying the first virtual avatar of the first object and the second virtual avatar of the second object in the virtual avatar display area of the session interface, includes: displaying the session interface when the first object and the second object are having an instant conversation; using a partial virtual image representing the first object as the first virtual avatar of the first object, and using a partial virtual image representing the second object as the second virtual avatar of the second object; and displaying the first virtual avatar and the second virtual avatar in the virtual avatar display area of the session interface.
3. The method according to claim 2, characterized in that the method further includes: in response to an operation of hiding the message display area, hiding the message display area in the conversation interface, and using the display interface where the virtual image display area is located as a conversation update interface; in the conversation update interface, updating the first virtual image from a partial virtual image of the first object to a complete virtual image of the first object, and updating the second virtual image from a partial virtual image of the second object to a complete virtual image of the second object; and, when the first object and the second object are having an instant conversation, displaying, in the conversation update interface, a second conversation message sent by the first object to the second object.
4. The method according to claim 1, characterized in that the method further includes: in response to a business status switching operation for the first object, updating the business status of the first object to a business update status, and, in the virtual image display area containing the second virtual image, updating the first virtual image to a second virtual updated image that matches the business update status.
5. The method according to claim 1, characterized in that, The step of displaying a first session message sent by the first object to the second object in the message display area of the session interface in response to a trigger operation on the session interface includes: In response to a trigger operation on the session interface, a first type of image data is determined to characterize the object state of the first object during an instant session. The first type of image data is used as first object media data associated with the first object. A first session message is generated based on the first object media data for sending to the second object, and the first session message is displayed in the message display area of the session interface.
6. The method according to claim 5, characterized in that the step of determining, in response to a trigger operation on the session interface, a first type of image data characterizing the object state of the first object during an instant session includes: in response to a trigger operation on the text input control in the session interface, displaying the text information entered through the text input control; and, when the text information is detected to carry state mapping text, displaying the first type of image data that is mapped by the state mapping text and characterizes the object state of the first object during the instant session.
7. The method according to claim 5, characterized in that the step of determining, in response to a trigger operation on the session interface, a first type of image data characterizing the object state of the first object during an instant session includes: in response to a trigger operation on the status display control in the session interface, outputting an image selection panel associated with the status display control; and, in response to a selection operation on the image selection panel, using the image data corresponding to the selection operation as the first type of image data characterizing the object state of the first object during the instant session.
8. The method according to claim 5, characterized in that the step of determining, in response to a trigger operation on the session interface, a first type of image data characterizing the object state of the first object during an instant session includes: in response to a determination operation on target image data in the session interface, using the target image data as the first type of image data characterizing the object state of the first object during the instant session.
9. The method according to claim 1, characterized in that the step of adjusting, when the object image data is captured, the image state of the first virtual avatar based on the object image data, and generating, based on the first virtual avatar with the adjusted image state, a second type of image data characterizing the object state of the first object during an instant conversation, includes: when the object image data is captured, performing state detection on the object image data, and using the detected state as the object state characterizing the first object during the instant conversation; and obtaining the first virtual avatar, adjusting the image state of the first virtual avatar based on the object state, and generating, based on the first virtual avatar with the adjusted image state, the second type of image data characterizing the object state.
10. The method according to claim 1, characterized in that the step of integrating the second type of image data with the voice information to obtain first object media data associated with the first object includes: uploading the second type of image data to a server, and, when the second type of image data is successfully uploaded, obtaining the image resource identifier corresponding to the second type of image data; uploading the voice information to the server, and, when the voice information is successfully uploaded, obtaining the voice resource identifier corresponding to the voice information; and integrating the second type of image data carrying the image resource identifier with the voice information carrying the voice resource identifier to obtain the first object media data associated with the first object.
11. The method according to claim 1, characterized in that the method further includes: if the object image data is not captured, determining a third type of image data based on the first virtual image, and integrating the third type of image data with the voice information to obtain first object media data associated with the first object; and generating, based on the first object media data, a first session message for sending to the second object.
12. The method according to claim 1, characterized in that, The step of generating a first session message for sending to the second object based on the media data of the first object includes: The speech information is processed by speech conversion to obtain the converted text information corresponding to the speech information, and the converted text information is displayed in the image capture area used to capture the image data of the object. The converted text information is integrated with the media data of the first object to obtain a first session message for sending to the second object.
13. The method according to claim 12, characterized in that, When playing the voice information and displaying sound effects animation associated with the voice information in the message display area in response to a triggering operation for the first object media data carried in the first session message, the method further includes: The converted text information contained in the first session message is highlighted synchronously.
14. The method according to claim 1, characterized in that, The step of updating the first virtual image to a first updated virtual image in the virtual image display area containing the second virtual image includes: The first virtual avatar is updated based on the first object media data to obtain a first virtual updated avatar that matches the first object media data. In the virtual avatar display area containing the second virtual avatar, the first virtual avatar is updated to the first virtual updated avatar.
15. The method according to claim 14, characterized in that, The step of updating the first virtual avatar based on the first object media data to obtain a first updated virtual avatar matching the first object media data, and updating the first virtual avatar to the first updated virtual avatar in the virtual avatar display area containing the second virtual avatar, includes: When the first object media data contains the first type of image data, the first virtual image is updated based on the first type of image data to obtain a first virtual updated image that matches the first type of image data; In the virtual image display area containing the second virtual image, the first virtual image is updated to the first updated virtual image; The method further includes: The first type of image data is displayed in the virtual avatar display area.
16. The method according to claim 15, characterized in that, When the first object media data contains a first type of image data, updating the first virtual image based on the first type of image data to obtain a first updated virtual image that matches the first type of image data includes: The message manager performs media data detection on the first session message. If it detects that the first object media data carried in the first session message contains first type of image data, a state trigger event is generated and the state trigger event is sent to the virtual avatar processor. The state trigger event includes the object identifier of the first object and an image data list. The image data list is used to record the first type of image data contained in the first object media data. When the virtual avatar processor receives the state trigger event, it updates the first virtual avatar associated with the object identifier based on the first type of image data in the image data list, thereby obtaining a first updated virtual avatar that matches the first type of image data.
17. The method according to claim 15, characterized in that, The step of updating the first virtual avatar based on the first object media data to obtain a first updated virtual avatar matching the first object media data, and updating the first virtual avatar to the first updated virtual avatar in the virtual avatar display area containing the second virtual avatar, includes: When the first object media data contains the second type of image data, in response to the triggering operation for the first object media data, the first virtual image is updated based on the second type of image data to obtain a first virtual updated image that matches the second type of image data; In the virtual image display area containing the second virtual image, the first virtual image is updated to the first updated virtual image.
18. The method according to claim 1, characterized in that the method further includes: in the virtual avatar display area, using the virtual background associated with the first virtual avatar and the second virtual avatar as the original virtual background; when background mapping text is detected in the first session message, updating the original virtual background to a virtual updated background based on the background mapping text; and fusing the first virtual updated image, the second virtual image, and the virtual updated background to obtain a background-fused virtual image for display in the virtual image display area.
19. The method according to claim 1, characterized in that the method further includes: upon receiving a third session message sent by the second object, displaying the third session message in the message display area of the session interface, the third session message carrying second object media data associated with the second object; and, in the virtual avatar display area containing the second virtual avatar, updating the second virtual avatar to a third virtual updated avatar, the third virtual updated avatar being obtained by updating the second virtual avatar based on the second object media data.
20. The method according to claim 1, characterized in that, The step of displaying the first virtual avatar of the first object and the second virtual avatar of the second object in the virtual avatar display area of the session interface includes: The system obtains a first image resource identifier and a first business status corresponding to the first object, and obtains a second image resource identifier and a second business status corresponding to the second object. Based on the first image resource identifier, the first business status, the second image resource identifier, and the second business status, it generates an image resource acquisition request and sends the image resource acquisition request to the server. Upon receiving the image resource acquisition request, the server generates a first image resource address corresponding to the first virtual image based on the first image resource identifier and the first business status, and generates a second image resource address corresponding to the second virtual image based on the second image resource identifier and the second business status. The server then returns the first image resource address and the second image resource address. Receive the first image resource address and the second image resource address returned by the server, obtain the first virtual image associated with the first object based on the first image resource address, and obtain the second virtual image associated with the second object based on the second image resource address; The first virtual avatar and the second virtual avatar are displayed in the virtual avatar display area of the session interface.
21. A data processing method based on virtual avatars, characterized in that it includes: displaying a session interface when a first object and a second object are having an instant conversation, wherein the session interface includes a virtual avatar display area and a message display area, and the message display area is used to display historical conversation messages, which are the conversation messages recorded when the first object and the second object have an instant conversation; displaying, in the virtual avatar display area of the session interface, a first virtual avatar of the first object and a second virtual avatar of the second object; in response to a trigger operation on the session interface, outputting a voice control and an image capture area for capturing object image data of the first object, and, when the first object records voice information through the voice control, displaying, in the image capture area, session image data of the first object during the instant conversation, wherein the session image data is generated based on the image state of the first virtual avatar reconstructed from the captured object image data of the first object, or is obtained by adjusting the image state of the first virtual avatar based on the captured object image data of the first object, and the session image data includes dynamic image data for displaying the image state change process of the first virtual avatar; displaying, in the message display area of the session interface, a first session message sent by the first object to the second object, the first session message carrying first object media data associated with the first object, and the first object media data including the session image data carrying the voice information; and, in response to a trigger operation on the first object media data carried in the first session message, playing the voice information and displaying, in the message display area, an audio-visual effects animation of the session image data carrying the voice information.
22. The method according to claim 21, characterized in that the first session message also includes first converted text information obtained by performing speech conversion processing on the voice information in the image capture area; and the method further includes: when the audio-visual effects animation of the session image data carrying the voice information is displayed in the message display area, synchronously highlighting the first converted text information in the message display area.
23. A computer device, characterized in that it includes a processor and a memory; the processor is connected to the memory, wherein the memory is used to store a computer program, and the processor is used to invoke the computer program to cause the computer device to perform the method according to any one of claims 1-22.
24. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores a computer program adapted to be loaded and executed by a processor to cause a computer device having the processor to perform the method of any one of claims 1-22.
25. A computer program product, characterized in that, The computer program product includes computer instructions stored in a computer-readable storage medium, the computer instructions being adapted to be read and executed by a processor to cause a computer device having the processor to perform the method of any one of claims 1-22.
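The event flow recited in claim 16 (media data detection by a message manager, followed by a state trigger event sent to a virtual avatar processor) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the class and field names, the `kind` tag used to mark first type image data, and the string-based representation of avatars are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StateTriggerEvent:
    """Carries the object identifier and the image data list (hypothetical layout)."""
    object_id: str
    image_data_list: List[str]


@dataclass
class AvatarProcessor:
    """Holds one avatar per object identifier; avatars are plain strings here."""
    avatars: Dict[str, str] = field(default_factory=dict)

    def handle(self, event: StateTriggerEvent) -> str:
        # Update the avatar associated with the object identifier using
        # the first type image data recorded in the image data list.
        current = self.avatars[event.object_id]
        updated = current + "+" + "+".join(event.image_data_list)
        self.avatars[event.object_id] = updated
        return updated


class MessageManager:
    def __init__(self, processor: AvatarProcessor) -> None:
        self.processor = processor

    def on_message(self, sender_id: str, media_items: List[dict]) -> Optional[str]:
        # Media data detection: collect first type image data, if any.
        first_type = [item["data"] for item in media_items
                      if item.get("kind") == "first_type_image"]
        if not first_type:
            return None  # no state trigger event is generated
        event = StateTriggerEvent(sender_id, first_type)
        return self.processor.handle(event)
```

Separating detection from avatar updating in this way means the message manager needs no knowledge of how an avatar is rendered; it only forwards the object identifier and the image data list.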