Live broadcast information processing method and device and electronic equipment

By translating interactive information in real time during live video streaming and delivering the results through a CDN and messaging channels, the interaction barriers caused by language differences in cross-border scenarios are removed, enabling effective interaction in cross-border live streaming.

CN115802068B · Active · Publication Date: 2026-06-12 · ALIBABA SINGAPORE HLDG PTE LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
ALIBABA SINGAPORE HLDG PTE LTD
Filing Date
2020-12-29
Publication Date
2026-06-12


Abstract

Embodiments of the present application disclose a live broadcast information processing method and device and electronic equipment, wherein one method comprises: after a video live broadcast session is created, obtaining interactive information content to be displayed in the video live broadcast session; obtaining at least one translation result corresponding to the interactive information content; according to a regional attribute associated with a service node in a content distribution network (CDN) and language information associated with the regional attribute, sending the translation result for the corresponding language to the service node; wherein if the translation result for the language required by a client does not exist in the service node in the CDN, that translation result is obtained by backtracking to a superior service node and provided to the client. The embodiments of the present application make video live broadcast interaction possible in cross-border scenarios.

Description

Technical Field

[0001] This application relates to the field of live streaming technology, and in particular to live streaming information processing methods, devices and electronic devices.

Background Technology

[0002] The emergence of live video streaming has changed the relationship between people, goods, and place in traditional product information systems. Hosts introduce product information and interact with consumers in real time through live streaming, thereby shortening the decision-making path for consumers and improving their shopping experience.

[0003] However, live video streaming struggles in cross-border scenarios of "global buying and global selling." This is because in live video streaming, the host primarily describes products through images and spoken language, and interacts via text entered in the interactive area. In such cross-border scenarios, buyers and sellers may come from multiple countries and speak various languages, so users may not understand each other's languages. For example, if a Chinese seller wants to introduce their product via live streaming, the host typically speaks Chinese. However, viewers may come from the US, Europe, Russia, Japan, and other countries, and buyers from these countries may not understand the host's Chinese. Furthermore, hosts are rarely fluent in multiple languages, making it difficult for them to understand messages that viewers send in the viewers' own native languages, which hinders interaction. Additionally, buyers may not understand each other's messages, preventing true interaction and rendering live video streaming ineffective.

[0004] In the era of text, images, and short videos, users could translate comments they wanted to see into their native language using platform translation tools. However, in the era of live streaming and real-time interaction, by the time a user completes the process of "selecting a comment → choosing their native language → starting translation," viewers in the live stream may have already moved on to the next topic. Therefore, even translation capabilities cannot solve the problem of real-time interaction in live streaming.

[0005] Therefore, how to achieve interaction during live video streaming in cross-border scenarios has become a technical problem that needs to be solved by those skilled in the art.

Summary of the Invention

[0006] This application provides a method, apparatus, and electronic device for processing live streaming information, which enables interactive live video streaming in cross-border scenarios.

[0007] A method for processing live streaming information, comprising:

[0008] After the video live streaming session is created, obtain the interactive information content to be displayed in the video live streaming session;

[0009] Obtain at least one translation result corresponding to the interactive information content;

[0010] Based on the geographic attributes associated with service nodes in the Content Delivery Network (CDN) and the language information associated with those geographic attributes, translation results for the corresponding language are sent to the service nodes. This allows the clients of the participants in the live video session to retrieve the translation results for the required language from the associated service nodes and display the interactive information content based on the translation results. If the service nodes in the CDN do not have the translation results for the client's required language, the translation results for the client's required language are retrieved by backtracking to a higher-level service node and then provided to the client.
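The distribution and backtracking behavior described in this first method can be illustrated with a minimal sketch. It is not the implementation of this application: the `ServiceNode` class, the `distribute` function, the region names, and the region-to-language table are hypothetical, and an in-memory dictionary stands in for each node's storage. The sketch also assumes that the higher-level node holds every translation so that backtracking can succeed.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical region-to-language table; the text only states that regions
# are associated with language information.
REGION_LANGUAGES = {"eu-west": {"en", "fr"}, "ru": {"ru"}, "jp": {"ja"}}


@dataclass
class ServiceNode:
    name: str
    region: Optional[str] = None             # None: higher-level node, no region
    parent: Optional["ServiceNode"] = None   # node to backtrack to on a miss
    store: dict = field(default_factory=dict)

    def fetch(self, message_id: str, language: str) -> Optional[str]:
        """Return a translation, backtracking to the parent node on a miss."""
        key = (message_id, language)
        if key in self.store:
            return self.store[key]
        if self.parent is None:
            return None
        return self.parent.fetch(message_id, language)


def distribute(translations: dict, message_id: str,
               origin: ServiceNode, edges: list) -> None:
    """Send each edge node only the languages tied to its region."""
    # Assumption: the higher-level node keeps all languages.
    for language, text in translations.items():
        origin.store[(message_id, language)] = text
    for node in edges:
        for language in REGION_LANGUAGES.get(node.region, set()):
            if language in translations:
                node.store[(message_id, language)] = translations[language]


origin = ServiceNode("origin")
edge_ru = ServiceNode("edge-ru", region="ru", parent=origin)
distribute({"en": "hello (en)", "ru": "hello (ru)", "ja": "hello (ja)"},
           "m1", origin, [edge_ru])
local_hit = edge_ru.fetch("m1", "ru")     # served by the edge node itself
backtracked = edge_ru.fetch("m1", "ja")   # served by the higher-level node
```

In this sketch a client in the Russian region that requires Japanese still obtains its translation, because the edge node forwards the miss to its parent.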

[0011] A method for processing live streaming information, comprising:

[0012] During the process of participating in a live video session through a client, determine the target language required by the associated participant users;

[0013] The system obtains the translation results of the interactive information content to be displayed in the live video session, and the translation results correspond to the target language. Specifically, the server sends the translation results for the corresponding language to the service nodes in the Content Delivery Network (CDN) based on the regional attributes associated with those service nodes and the language information associated with those regional attributes. This allows the client to retrieve the translation results for the desired target language from the associated service nodes. If the service nodes in the CDN do not have the translation results for the client's desired language, the system retrieves the translation results for the client's desired language by backtracking to a higher-level service node and provides them to the client.

[0014] The translation results of the interactive information content are displayed on the interface associated with the live video session.

[0015] A method for processing live streaming information, comprising:

[0016] During the live video session, receive the original text of interactive information content input in a first language;

[0017] The original text of the interactive information content is submitted to the server for translation, obtaining a translation result in at least one second language. This translation result is then provided to other participating user clients who require the corresponding second language. When providing the translation result to the client, the translation result for the corresponding language is sent to the service node based on the regional attribute associated with the service node in the Content Delivery Network (CDN) and the language information associated with that regional attribute. This allows the client corresponding to the participating user in the live video session to retrieve the translation result for the required language from the associated service node. If the service node in the CDN does not have the translation result for the client's required language, the translation result is retrieved by backtracking to a higher-level service node and then provided to the client.

[0018] A video information processing method, comprising:

[0019] During the playback of the target video, obtain the commentary subtitles associated with the target video that are to be displayed;

[0020] Obtain at least one translation result corresponding to the commentary caption content;

[0021] Based on the geographic attributes associated with service nodes in the Content Delivery Network (CDN) and the language information associated with those geographic attributes, the translation results for the corresponding language are sent to the service nodes. This allows the client associated with the viewer of the target video to retrieve the translation results for the required language from the associated service nodes and display the translated commentary subtitles. If the CDN service nodes do not have the translation results for the client's required language, the translation results for the client's required language are retrieved by backtracking to a higher-level service node and then provided to the client.

[0022] According to the specific embodiments provided in this application, the following technical effects are disclosed:

[0023] Through the embodiments of this application, interactive information content generated in a live video session can be translated in real time, and the translation results can be provided to the user client participating in the live session for display, so that the user can see interactive information content expressed in their desired language, thereby making live video interaction in cross-border scenarios possible.

[0024] Of course, any product implementing this application does not necessarily need to achieve all of the advantages described above at the same time.

Attached Figure Description

[0025] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0026] Figure 1 is a schematic diagram of the system architecture provided in the embodiments of this application;

[0027] Figure 2 is a flowchart of the first method provided in the embodiments of this application;

[0028] Figure 3 is a flowchart of the second method provided in the embodiments of this application;

[0029] Figure 4 is a flowchart of the third method provided in the embodiments of this application;

[0030] Figure 5 is a flowchart of the fourth method provided in the embodiments of this application;

[0031] Figure 6 is a schematic diagram of the user interface provided in an embodiment of this application;

[0032] Figure 7 is a flowchart of the fifth method provided in the embodiments of this application;

[0033] Figure 8 is a flowchart of the sixth method provided in the embodiments of this application;

[0034] Figure 9 is a schematic diagram of the first device provided in the embodiments of this application;

[0035] Figure 10 is a schematic diagram of the second device provided in the embodiments of this application;

[0036] Figure 11 is a schematic diagram of the third device provided in the embodiments of this application;

[0037] Figure 12 is a schematic diagram of the fourth device provided in the embodiments of this application;

[0038] Figure 13 is a schematic diagram of the fifth device provided in the embodiments of this application;

[0039] Figure 14 is a schematic diagram of the sixth device provided in the embodiments of this application;

[0040] Figure 15 is a schematic diagram of the electronic device provided in the embodiments of this application.

Detailed Implementation

[0041] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of this application are within the scope of protection of this application.

[0042] In this embodiment, to enable interaction during live video streaming in cross-border scenarios, a real-time translation service can be provided for the interactive information generated during the live stream. This service can translate specific interactive information into multiple target languages according to actual needs, and provide the translation results to each client according to the language required by that client's user. In this way, the client can display the translated interactive information, allowing users to directly see the interactive information expressed in their desired language. For example, if the users associated with a live stream (including the "host" user and the viewers watching the live stream) are from multiple countries, and multiple users send comments in their respective native languages, the comments can first be received by the server. The server can then translate the comments into multiple languages and provide the corresponding translation results according to the language required by each client user. For example, if user A, who is watching the live stream, is a native English speaker, and a comment was written in Russian by user B, the interactive information ultimately displayed on user A's interface can be the translation result expressed in English. In this way, user A can understand the interactive content sent by user B; correspondingly, if user A sends a comment in English, the comment will also be translated into multiple target languages, so that users whose native language is Russian, Japanese, etc., can also understand the comment sent by user A, and so on.

[0043] In implementing the above solution, the real-time translation of interactive information and the transmission of translation results can put significant pressure on the server. Therefore, ensuring that the server can effectively support the solution under this pressure is crucial. Furthermore, the amount of information received by the client can increase significantly, leading to increased client-side workload, so ensuring good information display on the client is also a critical consideration. This application provides solutions for both of these issues. For example, on the server side, translation results can be provided to the client via message channels or a CDN (Content Delivery Network). Additionally, translation can be limited to the languages actually required by the users currently online in the specific "live stream," rather than covering all languages, thus reducing server load. On the client side, received interactive information can be merged, deleted, or otherwise processed to reduce the amount of information the client needs to render and display.

[0044] From a system architecture perspective, referring to Figure 1, this application embodiment may involve a video live streaming system. This system can be a system dedicated to providing video live streaming services, or it can be a system associated with a product information service system, providing video live streaming services for a related application system, and so on. Specifically, the video live streaming system may include a server and clients. The clients may include a broadcaster client and a viewer client. The broadcaster can create live streams, and the viewer can select a specific "live room" to watch and participate in interactions, etc. The server mainly provides backend data services. In this application embodiment, it can provide translation services for interactive information content and provide the translation results to the client through various methods. For example, it can establish a long connection with the client through a message channel to push the translation results to the client, or distribute the translation results through a CDN (Content Delivery Network) so that the client can obtain them by pulling, etc. Through these methods, users associated with the client can understand interactive information content initiated by users from multiple other countries, thereby realizing live streaming interaction in cross-border scenarios.

[0045] It should be noted that in practical applications, during live streaming in cross-border scenarios, in addition to translating interactive information content, real-time translation of audio content in the live stream (mainly including what the host says) can also be performed. This part will not be emphasized in this application embodiment. This application embodiment mainly focuses on the real-time translation of interactive information content during the live stream, including announcements posted by the host, comments sent by viewers, system push notifications, etc.

[0046] The specific technical solutions provided in the embodiments of this application will be described in detail below.

[0047] Example 1

[0048] This first embodiment provides a method for processing interactive information during live video streaming from the server's perspective. Referring to Figure 2, the method may specifically include:

[0049] S201: After the video live streaming session is created, obtain the interactive information content to be displayed in the video live streaming session.

[0050] When a user goes live and starts a video live stream, the server can create a video live stream session for them (specifically, a "live room"). Each live stream session can have a unique ID to distinguish it from other live stream sessions. Multiple users can participate in the video live stream session, with at least two participants speaking different languages. For example, users from multiple countries may be watching the live stream simultaneously. Participants in the video live stream session can be categorized into two types: live streamers and viewers. The live streamer can vary depending on the application scenario. For instance, in a product information system, the live streamer might typically be a user associated with a specific store.

[0051] Once a specific live video session is created, participating users can interact. Viewers can send comments or ask questions of the streamer, while the streamer can send announcements, reply to other users' inquiries, or initiate interactive games. Additionally, the system typically pushes notifications, including the live stream status, user login/logout notifications, the number of online viewers, and the number of likes. Furthermore, in scenarios such as product object information systems, the system can also include status information related to product objects associated with the specific live stream session, such as adding a product object, or highlighting/unhighlighting a product object.

[0052] In summary, live video sessions typically generate a large amount of interactive content, encompassing various types and initiators. In cross-border scenarios, this interactive content is often edited by the initiator user in their native language; therefore, viewers or broadcasters speaking other languages may not understand it. To address this, this application provides a real-time translation solution, enabling the translated interactive content to be directly displayed on the interfaces associated with participants in the live video session, facilitating user comprehension and achieving genuine interaction.

[0053] It's important to note that in practical implementations, in cross-border scenarios, the servers for a specific live video streaming system may be distributed across multiple countries or regions. However, the servers providing translation services may not be the same system as the live streaming system's servers, and these servers may only be deployed within a single country. For example, the main operator and maintainer of a specific product information service system might be located in a city in China. While the product information service and live streaming servers may be deployed globally, the servers providing translation services might only be deployed in that same city in China. Since the participants in a live video session come from multiple countries or regions, in practice, after an interactive message is generated during the live video session, the interactive information can be routed back to the main server in the city where the translation server is located. Each time the main server receives an interactive message, it can interact with the server providing translation services to obtain the translation result, which is then provided to the client participating in the current live session. Of course, if the servers providing translation services are also deployed globally, translation can be performed directly through the nearest server; this is not a limitation here.

[0054] S202: Obtain at least one translation result corresponding to the interactive information content.

[0055] After obtaining the specific interactive information, the server can retrieve at least one corresponding translation result. In practice, the server can obtain the translation result from a server providing translation services. For example, it can send the interactive information and the required target language information to the server, and then retrieve the translation result from that server. Alternatively, if the server has a built-in translation module or system, it can also retrieve the translation result independently.

[0056] In cross-border scenarios, multiple countries may be involved. Therefore, each received interactive message can be translated into multiple target languages. For example, if there are 18 commonly used languages, each interactive message can be translated into all 18, allowing participating clients to choose the desired language for display. Of course, the specific number of target languages can be determined based on information such as the countries or regions supported by the application system. For instance, if an application system only covers Europe, the content only needs to be translated into languages such as English, French, and Russian, without needing to translate into Japanese, Korean, etc.

[0057] The above approach, which indiscriminately translates all interactive information into every commonly used language, allows a large number of clients to access the translation results. Users from virtually every country or region participating in the live stream can directly obtain the translation in their desired language. However, this method consumes significant server resources, including translation resources. Furthermore, a specific live video session may not necessarily involve all commonly used languages; the languages actually required depend on the participants. For example, an application system might cover most countries and regions globally, using 18 common languages. However, a live stream might only have users speaking English, Russian, and Japanese. Therefore, if all interactive information generated in the live stream is uniformly translated into the 18 target languages, the translations for languages other than these three will never be used, resulting in a waste of translation and other resources.

[0058] Therefore, in a preferred embodiment of this application, the set of target languages into which the interactive information content in the video live stream session is to be translated can be determined based on the online users associated with the video live stream session and their respective required languages. Then, according to the target language set corresponding to the video live stream session identifier, at least one translation result corresponding to the interactive information content is obtained. That is, for a specific live stream session, the languages to translate into can be dynamically determined based on the users who are online in the live stream session in real time. In specific implementation, when a user joins a live stream session, the client can submit the user's required language and other information to the server, and the server can tally the languages required by the online users in the live stream session. Based on these statistics, the server can dynamically determine which target languages the interactive information content generated in the live stream session should be translated into.
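The per-session language statistics described above can be sketched as follows. This is only an illustration under stated assumptions: the `SessionLanguages` class and its method names are hypothetical, and the choice to drop a language from the set once its last user leaves is an assumption, since the text only says that the set follows the users who are online.

```python
from collections import Counter


class SessionLanguages:
    """Tracks which languages the online users of one live session require."""

    def __init__(self) -> None:
        self._user_language = {}   # user id -> required language
        self._counts = Counter()   # language -> number of online users

    def join(self, user_id: str, language: str) -> None:
        # A repeated join replaces the language previously reported.
        self.leave(user_id)
        self._user_language[user_id] = language
        self._counts[language] += 1

    def leave(self, user_id: str) -> None:
        language = self._user_language.pop(user_id, None)
        if language is None:
            return
        self._counts[language] -= 1
        if self._counts[language] == 0:
            # Assumption: a language with no online users is no longer a target.
            del self._counts[language]

    def target_languages(self) -> set:
        """Languages that new interactive content should be translated into."""
        return set(self._counts)


session = SessionLanguages()
session.join("u1", "en")
session.join("u2", "ru")
session.join("u3", "en")
targets = session.target_languages()
```

With three users and two distinct languages, only two translations per message are requested instead of one per commonly used language.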

[0059] Of course, new users may join a live stream session at any time, and the language required by a new user may not be included in the current set of target languages. For example, the previous statistics for a live stream session covered N languages, and the interactive information content was translated into those N languages, but the language required by a newly joined user is an (N+1)th language. This application also provides a solution for this situation. Specifically, when a new user joins the video live stream session, if the language required by the new user is not included in the aforementioned target language set, a target number of historical interactive information messages can be obtained, for example the 10 most recent interactive information messages (here referring to the original text of the interactive information content before translation; in specific implementation, the original text of all or part of the historical interactive information content can be saved on the server side). Then, the translation results corresponding to that historical interactive information content can be obtained according to the language required by the new user and provided to the client associated with the new user. Simultaneously, the language required by the new user can be added to the target language set. In other words, once the new user has joined the live stream session, the interactive information generated in that session will be translated into N+1 languages.
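The handling of a newly required language can be sketched as below. The `LiveSession` class, the `fake_translate` stand-in, and the method names are hypothetical; a real system would call the translation service instead. The sketch keeps only the most recent original texts in a bounded queue, matching the "10 most recent messages" example.

```python
from collections import deque


def fake_translate(text: str, language: str) -> str:
    """Stand-in for the translation service (hypothetical)."""
    return f"[{language}] {text}"


class LiveSession:
    def __init__(self, translate, backfill_count: int = 10) -> None:
        self.translate = translate
        self.target_languages = set()
        # Original (untranslated) texts of the most recent messages.
        self.history = deque(maxlen=backfill_count)

    def on_message(self, original: str) -> dict:
        """Translate a new message into every current target language."""
        self.history.append(original)
        return {lang: self.translate(original, lang)
                for lang in sorted(self.target_languages)}

    def on_join(self, language: str) -> list:
        """Return backfilled translations if the language is new to the session."""
        if language in self.target_languages:
            return []  # this language is already being translated
        backfill = [self.translate(text, language) for text in self.history]
        self.target_languages.add(language)
        return backfill


live = LiveSession(fake_translate, backfill_count=2)
live.on_join("en")
live.on_message("a")
live.on_message("b")
live.on_message("c")
backfill_ru = live.on_join("ru")   # only the 2 most recent originals are kept
next_message = live.on_message("d")
```

After the Russian-speaking user joins, every later message is translated into both languages, which corresponds to the move from N to N+1 target languages.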

[0060] S203: Provide the translation results in the required language to the client associated with the participant user, wherein the client is used to display the translation results of the interactive information content according to the language required by the associated user.

[0061] After obtaining the specific translation results, these results can be provided to the client associated with the participating user. Once the client receives the translation of the interactive information content, it can display the translation according to the language required by the associated user. Specifically, an interactive area can be created above the live stream interface of the live session, where the translation results of the received interactive information content can be displayed.

[0062] In practical implementation, since the server uniformly processes various interactive information generated during the live session to obtain translation results, how to provide the translation results to multiple clients participating in the live session is a problem that needs to be considered. Specifically, this application provides two specific implementation schemes. In one implementation, the translation results can be provided to the client through a message channel, which is used to establish a long connection with the client. Specifically, the server providing the channel service can complete the creation of the specific message channel and the long connection. After obtaining the translation results, they can be pushed to the client through this message channel. There can be multiple message channels, each used to push translation results in different languages. The client can choose to subscribe to one of the channels to receive the translation results pushed by that channel. In another approach, the translation results can also be provided to the clients participating in the live session through CDN (Content Delivery Network).

[0063] Furthermore, since both message channels and CDN distribution methods have their advantages and disadvantages, they can be combined. For example, message channels offer the advantage of low latency, but the message delivery rate is relatively low; while CDN distribution may introduce latency, but the message delivery rate is relatively high. The timeliness requirements for interactive content generated in specific live sessions may also vary. Therefore, in another approach, the priority of the interactive content can be determined first, based on its timeliness requirements. Then, different methods can be used to provide the client with interactive content of different priorities. For example, for first-priority (e.g., higher priority) interactive content, the translation result can be provided to the client via message channels, while for second-priority (e.g., lower priority) interactive content, the translation result can be provided to the client via CDN distribution.

[0064] Of course, since the delivery rate of specific interactive information content may be relatively low under the message channel method, in order to improve the delivery rate, for the first priority interactive information content, in addition to providing the translation result to the client through the message channel, it can also be distributed through the CDN. This allows clients that fail to receive the translation result through the message channel to obtain the corresponding translation result by pulling it from the associated CDN service node. In other words, for the first priority interactive information content, it can be provided to the client using both the message channel method and the CDN method simultaneously. This ensures timeliness while also supplementing the delivery rate through the CDN method. It should be noted that since the same interactive message may be delivered to the client through both methods, the client can perform deduplication after receiving it.
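The client-side deduplication mentioned above can be sketched as follows. It assumes that each interactive message carries a unique identifier, which the text does not specify; the `Deduplicator` class and its bounded capacity are likewise hypothetical choices for illustration.

```python
from collections import OrderedDict


class Deduplicator:
    """Remembers recently seen message ids so duplicates are displayed once."""

    def __init__(self, capacity: int = 1000) -> None:
        self._seen = OrderedDict()
        self._capacity = capacity

    def accept(self, message_id: str) -> bool:
        """Return True the first time an id is seen, False for a duplicate."""
        if message_id in self._seen:
            return False
        self._seen[message_id] = None
        if len(self._seen) > self._capacity:
            # Forget the oldest id so memory use stays bounded.
            self._seen.popitem(last=False)
        return True


dedup = Deduplicator(capacity=2)
first = dedup.accept("m1")    # arrives through the message channel
second = dedup.accept("m1")   # same message pulled again from the CDN
```

The bounded capacity is a design choice of this sketch: an id evicted from the window would be accepted again, so the capacity should exceed the number of messages that can plausibly arrive between the two deliveries.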

[0065] Specifically, when determining the priority of the interactive information content, the priority can be determined based on the content type and/or sender identity information of the interactive information content. For example, the interactive information content with the first priority may include one or more of the following: status change information of product objects associated with the live video session (e.g., adding a product object to the live video session, highlighting a product object, de-highlighting a product object, etc.), and interactive information content sent by the host user associated with the live video session (e.g., announcement messages, or interactive game information such as "red envelopes," etc.). The interactive information content with the second priority may include one or more of the following: status information related to the live video session sent by the server (e.g., a user entering the live video room, leaving the live video room, etc.), statistical information (e.g., number of online users, number of likes, etc.), notification information, and interactive information content sent by the viewer user associated with the live video session (e.g., user comments, or questions asked to the host user, etc.). Of course, in specific implementations, the priority of specific interactive information content can also be divided in other ways, depending on the needs of the actual application scenario, and is not limited here.
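One possible way to express this priority rule and the resulting delivery paths is sketched below. The type names, role names, and function names are hypothetical labels chosen for illustration; only the division itself (product status changes and host messages first, everything else second) follows the examples in the text.

```python
from dataclasses import dataclass
from enum import Enum


class Priority(Enum):
    FIRST = 1    # time-sensitive content
    SECOND = 2   # content that tolerates some delay


# Hypothetical content-type label for product object status changes.
FIRST_PRIORITY_TYPES = {"product_status"}


@dataclass(frozen=True)
class Interaction:
    content_type: str   # e.g. "product_status", "comment", "stats"
    sender_role: str    # "host", "viewer", or "system"


def classify(message: Interaction) -> Priority:
    """Decide priority from the content type and/or the sender identity."""
    if message.content_type in FIRST_PRIORITY_TYPES or message.sender_role == "host":
        return Priority.FIRST
    return Priority.SECOND


def delivery_paths(priority: Priority) -> list:
    """First priority uses both paths; second priority uses the CDN only."""
    if priority is Priority.FIRST:
        return ["message_channel", "cdn"]
    return ["cdn"]


announcement = classify(Interaction("announcement", "host"))
comment = classify(Interaction("comment", "viewer"))
```

Sending first-priority content over both paths follows the combined approach in which the CDN supplements the delivery rate of the message channel.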

[0066] The following describes the specific implementation methods for providing translation results to the client via message channels and CDN. Specifically, when providing the translation results to the client via message channels, multiple message channels can be created when the video live streaming session is created, with each message channel corresponding to a language. For example, in one implementation, 18 message channels can be created for 18 commonly used languages, and so on.

[0067] It should be noted that the specific message channel in this embodiment may be a different message channel from the video live streaming session channel. That is, the live content (live video stream, etc.) of the specific video live streaming session can also be pushed to the participant's client via a message channel. However, to avoid affecting the video live streaming data stream, the message channel used to push the translation results of the specific interactive information content in this embodiment may be different from the channel specifically used to push the video live stream. Furthermore, the specific message channel may be a simplex communication channel, used for one-way message pushing from the server to the client.

[0068] After the translation results are obtained, they can be pushed to the corresponding message channels according to their language. The client can then establish a long-lived connection with the server by subscribing to one of these message channels and obtain the translation results from that channel. It should be noted that, in a practical implementation, if the target language set for translation is determined dynamically based on the online users in the live stream session, message channels for all commonly used languages can still be established when the live stream session is created; for example, there could still be 18 channels. After the target language set is dynamically determined and the corresponding translation results are obtained, the translation results can be pushed to the corresponding message channels. For example, if the current online users in a live stream session mainly speak three languages, the translation results for these three languages can be obtained and pushed to the corresponding three message channels. At this time, the other message channels are empty and provide no interactive information. If a new user who requires a fourth language joins the live stream, a target number of historical interaction messages can be retrieved, translated into that fourth language, and then pushed to the corresponding message channel. This allows the new user to access the translation results of the most recent historical interaction messages. Thereafter, newly received interaction messages can be translated into all four languages, and the corresponding translation results can be pushed to the four message channels, and so on.
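As a non-limiting illustration, the per-language message channels and the dynamically determined target language set could be sketched as follows. The class `ChannelHub`, the `translate` callable, and the in-memory lists standing in for channels are assumptions made for the sketch; a real deployment would push to channels provided by a channel server.

```python
class ChannelHub:
    """One message channel per language, with a dynamic target language set."""

    def __init__(self, languages, translate, history_size=3):
        self.channels = {lang: [] for lang in languages}  # lang -> pushed items
        self.translate = translate        # callable(text, lang) -> str (assumed)
        self.target_languages = set()     # languages required by online users
        self.history = []                 # original texts, oldest first
        self.history_size = history_size  # the "target number" of messages

    def on_user_join(self, lang):
        """Backfill recent history when a user requires a new language."""
        if lang not in self.target_languages:
            n = self.history_size
            recent = self.history[-n:] if n > 0 else []
            for text in recent:
                self.channels[lang].append(self.translate(text, lang))
            self.target_languages.add(lang)

    def on_message(self, text):
        """Translate a new message into every target language and push it."""
        self.history.append(text)
        for lang in self.target_languages:
            self.channels[lang].append(self.translate(text, lang))
```

Channels for languages that no online user requires stay empty, which matches the behavior described above.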

[0069] It should be noted that, in practice, the specific message channel resources can be provided by the channel server. That is, the live streaming server can request the creation of a message channel from the channel server. In this case, the channel server can establish a long-lived connection between itself and the participant's client based on the participant's client's subscription request for the target message channel corresponding to the desired language. This allows the translation results to be pushed to the target message channel and then, via the long-lived connection, to be pushed to the participant's client. After the live video session concludes, the channel server can also be notified to destroy or reclaim the specific message channel resources.

[0070] In the case of distribution via CDN, CDN is an intelligent virtual network built on the existing network infrastructure. Relying on edge servers deployed in various locations, and through the load balancing, content distribution, and scheduling modules of the central platform, it allows users to obtain the content they need from the nearest location, reducing network congestion and improving user access response speed and hit rate. Since a CDN network includes multiple service nodes, which are typically deployed in multiple different countries or regions, translation results can be pushed to specific service nodes. Then, the client can retrieve the translation results in the desired language from the nearest service node. In practice, since the URLs and other information of specific CDN service nodes are usually fixed or relatively fixed, and are often associated with geographical attributes—for example, CDN service nodes deployed in a certain country or region typically provide services to users within that country or region—in this embodiment, the translation results in the corresponding language can also be sent to the service node based on the geographical attributes associated with the service node in the CDN and the language information associated with those geographical attributes. This allows the client to retrieve the translation results in the desired language from the associated service node.
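As a non-limiting illustration, selecting which translation results to send to each service node according to its geographical attribute could be sketched as follows. The region codes, the mapping `REGION_LANGUAGES`, and the function `plan_pushes` are hypothetical and serve only to show the idea.

```python
# Hypothetical mapping from a node's regional attribute to its languages.
REGION_LANGUAGES = {
    "RU": ["ru"],
    "EU": ["en", "fr", "de", "ru"],
}


def plan_pushes(nodes, translations):
    """Return {node_id: {lang: text}} holding only each region's languages.

    nodes: {node_id: region}; translations: {lang: translated text}.
    """
    plan = {}
    for node_id, region in nodes.items():
        langs = REGION_LANGUAGES.get(region, [])
        plan[node_id] = {l: translations[l] for l in langs if l in translations}
    return plan
```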

[0071] For example, if a service node is deployed in Russia, and users in that country primarily speak Russian as their native language, then only the Russian translation results need to be sent to that service node; translation results for other languages do not need to be pushed to it. However, in practical applications, the following situation may arise: while users in a particular country or region mostly use the same native language, there may also be users from other countries or regions located there. For instance, a British person traveling or working in Russia might actually need an English translation. In this case, while watching a live stream, the British user's client might specify "English" as the target language when requesting the translation results for interactive content from a CDN service node in Russia. Upon receiving this request, the Russian CDN service node might find that it does not hold the translation results for that language. In this situation, the translation results for the language required by the client can be obtained by backtracking to a higher-level service node and then provided to the client.

[0072] In other words, in a CDN network, service nodes are typically deployed in a tiered manner. For a service node targeting Russia, its upstream node might be a service node targeting Europe. Since the European service node covers a wider range of users, a greater variety of translation results is usually pushed to it, including, for example, English, French, and German in addition to Russian. Therefore, when the Russian service node receives a request from an English-speaking user, it can trace back to its upstream node to obtain the English translation result and then provide it to that user. If the translation result for the required language is still not found in the direct upstream node of a service node, the service node can continue to trace back to even higher-level service nodes, and so on.
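As a non-limiting illustration, backtracking through tiered service nodes could be sketched as follows. The class `ServiceNode` and the `(message_id, lang)` key are hypothetical; caching the result on the requesting node is an added assumption, since the description above only covers the lookup itself.

```python
class ServiceNode:
    """A CDN service node with an optional upstream (superior) node."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.store = {}  # (message_id, lang) -> translated text

    def fetch(self, message_id, lang):
        """Return the translation, backtracking upstream on a miss."""
        node = self
        while node is not None:
            text = node.store.get((message_id, lang))
            if text is not None:
                # Cache locally so later requests are served by this node
                # (caching is an assumption of this sketch).
                self.store[(message_id, lang)] = text
                return text
            node = node.parent
        return None
```

For example, with nodes chained as Russia, then Europe, then an origin node, a request for English at the Russian node would be answered from the European node.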

[0073] It should be noted that, in specific implementations, the interactive information content to be displayed in a specific video live stream session in this embodiment may include: text-based interactive information content, in which case the specific translation result may also be a text-based translation result. Alternatively, the interactive information content to be displayed in a video live stream session may also include: audio-based interactive information content, in which case the specific translation result may be a text-based translation result, or it may also be an audio-based translation result.

[0074] In practical implementation, besides providing the corresponding translation results according to the language required by the specific participant's client, the original text of the interactive information can also be provided to the participant's client. This allows the client to display the original text and the translation for comparison. That is, assuming the original text of an interactive message is Chinese, the server translates it into multiple languages such as English, Japanese, and Russian. If a participant's target language is English, the server can provide the user with both the Chinese and English versions of the interactive message. The client can then display the two versions together, for example, Chinese at the top and English at the bottom. In this way, each interactive message can be displayed in the client's interface as a comparison of the original and translated text. For example, if the interface displays five interactive messages at a certain moment, whose original texts are in Chinese, Japanese, Russian, English, and Korean respectively, and the current user's target language is English, then the five interactive messages can be displayed as: Chinese-English, Japanese-English, Russian-English, English, Korean-English, etc.
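As a non-limiting illustration, the comparison display of original text and translation could be sketched as follows. The function `render_message` and its parameters are hypothetical.

```python
def render_message(original_text, original_lang, translations, target_lang):
    """Return the lines to display for one interactive message."""
    if original_lang == target_lang:
        return [original_text]          # no translation needed
    translated = translations.get(target_lang)
    if translated is None:
        return [original_text]          # fall back to the original only
    return [original_text, translated]  # original on top, translation below
```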

[0075] In addition, specific live video sessions can include: live video sessions within a product information system. For example, a live video session initiated by a user associated with a specific store to introduce a product. Alternatively, specific live video sessions can also include: live video sessions in video conferencing scenarios. For example, this could include live streaming of online courses, online lectures, etc. During such video conferences, participants can also make comments and interact with the specific meeting content. For users participating in multiple languages, a service can be provided to translate the interactive information and display it on the client side of the participating users, and so on.

[0076] In summary, through the embodiments of this application, interactive information content generated in a live video session can be translated in real time, and the translation results can be provided to the user client participating in the live session for display, so that the user can see interactive information content expressed in their desired language, thereby making live video interaction in cross-border scenarios possible.

[0077] Embodiment 2

[0078] Embodiment 1 above describes a specific method for processing live streaming information. In that solution, after the translation result of the interactive information content is obtained, it can be provided to the client through a message channel or a CDN. Embodiment 2 separately protects the message channel implementation. Specifically, Embodiment 2 provides a method for processing live streaming information. Referring to Figure 3, the method may include:

[0079] S301: After the video live stream session is created, multiple message channels are created; these multiple message channels correspond to the different languages required by the participating users.

[0080] In other words, a live video session can have multiple participants, at least two of whom require different languages for interaction. To accommodate the diverse language needs of multiple participants, the message channels can each be associated with a different language, so as to push translation results to the clients of participants with different language requirements. Specifically, the message channels can establish long-lived connections with the clients to push translation results to them.

[0081] S302: Obtain the interactive information content to be displayed in the video live broadcast session;

[0082] S303: Obtain at least one translation result corresponding to the interactive information content;

[0083] S304: Push the translation result to the corresponding message channel, and provide the translation result in the required language to the participant user client through the message channel, so as to display the interactive information content according to the translation result.

[0084] The message channel is distinct from the video live streaming session channel. This message channel can be a simplex communication channel used for one-way message pushing from the server to the client.

[0085] Before the interactive information content to be displayed in the live video session is acquired, comment messages from at least two participant users can also be acquired; in this case, the interactive information content to be displayed includes interactive information content determined based on the comment messages from the at least two participant users.

[0086] In practice, the target language set required for the interactive information content in the video live stream session can be determined based on the participating users associated with the video live stream session and their respective required languages. At this time, at least one translation result corresponding to the interactive information content can be obtained according to the target language set.

[0087] When a new user joins the live video session, if the language required by the new user is not included in the target language set, a target number of historical interaction information entries associated with the live video session can be retrieved. Then, the translation results corresponding to those historical entries can be obtained according to the language required by the new user and pushed to the corresponding message channel, so that the client associated with the new user can obtain them through the message channel. Afterwards, the language required by the new user can be added to the target language set so that the new user can obtain the translation results of new interaction information content through the corresponding channel.

[0088] The message channel can be provided by a channel server. The channel server is used to establish a long-lived connection between itself and the participant's client based on the participant's client's subscription request for the target message channel corresponding to the desired language. This long-lived connection is used to push the translation result to the participant's client after the translation result has been pushed to the target message channel.

[0089] In addition, to improve the delivery rate of information, the translation results can be distributed to the clients of the participating users through a Content Delivery Network (CDN). This allows clients that fail to receive the message through the message channel to obtain the translation results for the required language by pulling them from the service nodes associated with the CDN.

[0090] Embodiment 3

[0091] This third embodiment addresses the CDN implementation and provides a method for processing live streaming information. Referring to Figure 4, the method may specifically include:

[0092] S401: After the video live streaming session is created, obtain the interactive information content to be displayed in the video live streaming session;

[0093] S402: Obtain at least one translation result corresponding to the interactive information content;

[0094] S403: Distribute the translation results through a Content Delivery Network (CDN) to provide the clients of the participants in the live video session with the translation results in the required languages, so as to display the interactive information content based on the translation results.

[0095] In practice, the target language set required for the interactive information content in the video live stream session can be determined based on the participating users associated with the video live stream session and their respective required languages. At this time, at least one translation result corresponding to the interactive information content can be obtained according to the target language set.

[0096] When a new user joins the live video session, if the language requested by the new user is not included in the target language set, the historical interaction information content of the target number of entries associated with the live video session can be obtained. Then, the translation result corresponding to the historical interaction information content of the target number of entries is obtained according to the language requested by the new user, and provided to the client associated with the new user through the CDN network; subsequently, the language requested by the new user can be added to the target language set.

[0097] In practice, the translation results for the corresponding language can be sent to the service node based on the regional attributes associated with the service node in the CDN and the language information associated with the regional attributes, so that the client can obtain the translation results for the required language by pulling them from the associated service node.

[0098] If the service nodes in the CDN do not have the translation result corresponding to the language required by the client, the translation result corresponding to the language required by the client can be obtained by backtracking to the superior service node and then provided to the client.

[0099] Furthermore, the priority of the interactive information content can be determined based on the timeliness requirements of the interactive information content; then, for interactive information content whose priority meets the target conditions, the translation result can be provided to the client corresponding to the participant user through a message channel.

[0100] Embodiment 4

[0101] This fourth embodiment corresponds to the first embodiment and provides a live streaming information processing method from the client's perspective. The client here can include both the broadcaster's client and the viewer's client; that is, regardless of the user's identity, client-side processing can be performed in a similar manner. Referring to Figure 5, the method may include:

[0102] S501: During the process of participating in a live video session, determine the target language required by the associated participant users;

[0103] S502: Obtain the translation result of the interactive information content to be displayed in the video live session, wherein the translation result corresponds to the target language;

[0104] In practice, a long-lived connection can be established with the channel server by subscribing to the target message channel that the server created in advance for the target language. The translation results in the target language that are pushed to the target message channel can then be obtained through the long-lived connection. Alternatively, the translation results can be retrieved from the associated CDN service node.

[0105] S503: Display the translation results of the interactive information content based on the interface associated with the video live stream session.

[0106] Specifically, after the interactive information is received, it can be displayed in the live stream interface. For example, as shown at 61 in Figure 6, assuming the current participant's required language is English, the interactive information displayed in the comment section of that user's live stream interface can all be translated into English, allowing the user to understand it. Furthermore, interactive information sent by the user in English can also be translated into multiple other languages and delivered to users in other countries.

[0107] In practical implementation, compared to ordinary live streaming scenarios, the number of long-lived connections may increase, and the amount of received interactive information may also be very large, for example, hundreds of messages per second. Such a large amount of interactive information puts pressure on the client's rendering process. Furthermore, displaying all of it is usually meaningless, as users typically cannot browse all the content in such a short time, and information that only flashes by is even harder to read accurately. Therefore, in practical implementation, when the number of concurrently received interactive messages exceeds a threshold, the received interactive information can be optimized to determine which interactive information meets the display conditions. Then, only the interactive information that meets the display conditions is sent to the rendering layer so that it can be displayed in the interface associated with the live video session.

[0108] Specifically, when optimizing the received interactive information content, the priority of the interactive information content can be determined first. Then, based on the priority of the interactive information content, it can be determined whether the interactive information content meets the display conditions. The determination of the priority of interactive information content during display processing may differ from the server-side method, but it can still be determined based on the type of interactive information content and / or the sender's identity. For example, interactive information content sent by a broadcaster user can also have a higher priority, and so on.

[0109] Specifically, when determining whether interactive information content meets display conditions based on its priority, first-priority interactive information content can be directly identified as meeting the display conditions. For example, higher-priority interactive information content can be directly displayed. For second-priority interactive information content, a specific strategy can be used to determine which information content meets the display conditions and which does not. To support this determination process, second-priority interactive information content can first be temporarily stored in a first message queue. Then, according to the target optimization strategy, it can be determined whether the interactive information content meets the display conditions. Afterward, interactive information content meeting the display conditions can be added to a second message queue so that the optimized interactive information content can be sent to the rendering layer according to the order in the second message queue. In other words, since specific interactive information content is associated with timestamp information, and timestamps are important information, the received interactive information content can first be stored in the first message queue during the determination process. Then, the messages in the first message queue can be consumed according to the specific optimization strategy. Afterward, content meeting the display conditions can enter the second message queue, waiting to be consumed by the specific rendering and display module, while other content not meeting the display conditions can be discarded. In other words, specific optimization processes can be achieved through multi-level message queues (also known as multi-level message pools).
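As a non-limiting illustration, the two-level message queue described above could be sketched as follows. The message fields `priority` and `ts`, and the `meets_display_condition` callable standing in for the target optimization strategy, are hypothetical. Ordering the output by timestamp is an assumption of this sketch.

```python
from collections import deque


def process_incoming(messages, meets_display_condition):
    """Pass messages through a two-level queue before rendering.

    messages: iterable of dicts with "priority" (1 or 2) and "ts" keys.
    meets_display_condition: callable(message) -> bool, applied to
    second-priority messages only (assumed interface).
    """
    first_queue = deque()   # temporary store for second-priority messages
    second_queue = deque()  # messages waiting for the rendering layer

    for msg in messages:
        if msg["priority"] == 1:
            second_queue.append(msg)   # first priority is always displayed
        else:
            first_queue.append(msg)

    while first_queue:
        msg = first_queue.popleft()
        if meets_display_condition(msg):
            second_queue.append(msg)
        # otherwise the message is discarded

    # Hand messages to the rendering layer in timestamp order.
    return sorted(second_queue, key=lambda m: m["ts"])
```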

[0110] Specifically, when determining whether second-priority interactive information content meets the display conditions according to the target optimization strategy, the same target optimization strategy can be applied to all second-priority interactive information content. Alternatively, the second-priority interactive information content can be further divided into multiple secondary priorities, and different optimization strategies can be applied to each secondary priority. In other words, second-priority interactive information content can further include multiple different priorities, thus forming a multi-level priority system. Then, based on the secondary priority to which the interactive information content belongs, the corresponding target optimization strategy can be determined, and then the target optimization strategy corresponding to the interactive information content can be used to determine whether the interactive information content meets the display conditions.

[0111] There are various specific optimization strategies. These can be determined based on the type of interactive information content and / or the sender's identity information, establishing a specific secondary priority and corresponding optimization strategy. For example, one optimization strategy could be to merge identical interactive information received at different times. Specifically, this strategy can be used to optimize statistical messages sent by the system. For instance, if the number of online users or likes in a live stream session remains unchanged between two consecutive messages, these two messages can be merged, allowing only one message to be displayed.
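As a non-limiting illustration, the strategy of merging unchanged statistical messages could be sketched as follows. Representing each message as a `(kind, value)` tuple is an assumption made for the sketch.

```python
def merge_unchanged_stats(messages):
    """Drop a statistics message whose value equals the previous one
    of the same kind.

    messages: list of (kind, value) tuples in arrival order.
    """
    last_value = {}
    merged = []
    for kind, value in messages:
        if kind in last_value and last_value[kind] == value:
            continue  # unchanged since the last message: merge into it
        last_value[kind] = value
        merged.append((kind, value))
    return merged
```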

[0112] Alternatively, specific optimization strategies could include discarding system notification messages related to user status changes or timestamped interactive information. For example, this strategy can be used to optimize system-sent status change notification messages. Specifically, this includes notification messages about a user entering or leaving a live stream session; since these messages are relatively low in importance, they can be discarded when the number of concurrent interactive messages is high.

[0113] In addition, specific target optimization strategies can also include peak shaving strategies. That is, if the number of interactive information received within a certain time period (e.g., one second) is very large and a peak occurs, in addition to the aforementioned merging or discarding strategies, if there are multiple interactive messages of the same type within that time period, the latest one or a few interactive information messages within that time period can be displayed, while the others can be discarded, and so on.
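As a non-limiting illustration, the peak shaving strategy could be sketched as follows. The message fields `type` and `ts`, the one-second default window, and the parameter `keep` (assumed to be at least 1) are hypothetical. The caller is assumed to apply the function only when a peak has been detected.

```python
from collections import defaultdict


def peak_shave(messages, window=1.0, keep=1):
    """Keep only the latest `keep` messages per type in each time window.

    messages: list of dicts with "type" and "ts" (seconds), sorted by "ts".
    """
    buckets = defaultdict(list)
    for msg in messages:
        bucket_key = (int(msg["ts"] // window), msg["type"])
        buckets[bucket_key].append(msg)

    kept = []
    for group in buckets.values():
        kept.extend(group[-keep:])  # the latest ones in this window
    return sorted(kept, key=lambda m: m["ts"])
```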

[0114] In addition, in a specific implementation, the original text of the interactive information content to be displayed can also be obtained. At this time, the original text and translation results of the interactive information content can be displayed based on the interface associated with the video live broadcast session.

[0115] In practical implementation, a switch mechanism can be provided on the client, allowing users to decide whether to enable the real-time translation function according to actual conditions. Specifically, an operation option for enabling or disabling the translation function can be provided, to control whether the translation results of the interactive information content are displayed during the live video broadcast; when the translation function is disabled, the original text of the interactive information content is displayed. For example, as shown at 62 in Figure 6, this operation option can be provided in the upper right corner of the live streaming interface. The text displayed for this operation option can also be localized to the language required by the user; in the example shown in Figure 6, since the language required by the user is English, the operation option is displayed as "Auto-Translated," "automatic translation," or the like. This switch mechanism is provided because a live broadcast session may involve many interactive activities, and enabling translation could affect the interactive experience. For example, if participation in an activity requires users to type "superdeal" in the live room and then take a screenshot as proof, then with the translation function enabled, a Russian user's "superdeal" would be automatically translated as "супер сделка," making it uncertain whether they can participate in the activity. Therefore, users are given the option to choose.
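As a non-limiting illustration, the effect of the translation switch on the displayed text could be sketched as follows. The message structure and the function `text_to_display` are hypothetical.

```python
def text_to_display(message, target_lang, translation_enabled):
    """Pick the text shown for a message given the user's switch setting.

    message: dict with "original" text and a "translations" mapping.
    """
    if not translation_enabled:
        return message["original"]
    # Fall back to the original when no translation is available.
    return message["translations"].get(target_lang, message["original"])
```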

[0116] Embodiment 5

[0117] This fifth embodiment provides a live streaming information processing method from the perspective of a participant who sends interactive information content in a live video broadcast. Referring to Figure 7, the method may specifically include:

[0118] S701: During a live video session, receive the original text of interactive information input in the first language;

[0119] S702: Submit the original text of the interactive information content to the server for translation to obtain at least one translation result in a second language. The translation result is provided to other participating user clients who require the corresponding second language.

[0120] In a specific implementation, the translation results of the interactive information content sent by other participating users can also be obtained, and the translation results correspond to the first language; then, the translation results of the interactive information content can be displayed based on the interface associated with the video live broadcast session.

[0121] Embodiment 6

[0122] The foregoing embodiments primarily address translation solutions for interactive information generated in live streaming scenarios. However, similar needs exist in other scenarios in practical applications. For example, in typical online video playback scenarios, users are often allowed to send "bullet comments" during playback. These "bullet comments" are displayed as commentary subtitles on the playback interface (when a large amount of commentary text floats across the screen, the effect resembles a barrage of bullets in a shooting game, hence the term "bullet comments"). If multiple users speaking different languages are watching the video simultaneously, commentary subtitles sent in one language may not be understood by users speaking other languages. Therefore, the solution provided in this application can also be used to provide translation services. Specifically, the commentary subtitles can be received on the video playback server and then translated into multiple different languages by a translation server. The translation results can then be provided to the client of the user watching the video, so that although the original text of the commentary subtitles may be in multiple different languages, they are all displayed in the user's required language, allowing the user to understand the content.

[0123] Specifically, referring to Figure 8, this application provides a method for processing video information. The method may include:

[0124] S801: During the playback of the target video, obtain the commentary subtitles associated with the target video to be displayed;

[0125] S802: Obtain at least one translation result corresponding to the commentary caption content;

[0126] S803: Provide the translation results in the required language to the client associated with the viewer user for displaying the translation results of the commentary subtitle content.

[0127] In a specific implementation, the original text of the commentary subtitles can also be provided to the client associated with the viewer, so that the client can display the original text and translation results of the commentary subtitles.

[0128] Furthermore, whether to provide the original text of the commentary subtitles to the client associated with the viewer can be determined based on the number of commentary subtitles generated within a target time period. For example, the scenario can be categorized into different states, such as network congested or idle, based on the number of users watching the video online or the number of commentary subtitles generated per second. Different approaches can be used to provide information to the client in different states. For instance, if the network is congested, only the translation result needed by the corresponding user is provided to the client; if the network is idle, the original text of the commentary subtitles can also be provided to the client, allowing the original text and translation result to be displayed together, and so on.
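As a non-limiting illustration, deciding whether to include the original text according to the current load could be sketched as follows. The function `build_payload` and the threshold value are hypothetical; the application does not specify a particular threshold.

```python
def build_payload(original, translation, subtitles_per_second, busy_threshold=100):
    """Include the original text only when the subtitle rate is below
    the (assumed) congestion threshold."""
    payload = {"translation": translation}
    if subtitles_per_second < busy_threshold:
        payload["original"] = original
    return payload
```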

[0129] It should be noted that, in this sixth embodiment, the translation results can likewise be provided to the client by pushing them through message channels, by having the client pull them through a CDN network, or by a combination of the two methods. Furthermore, during the translation process, the number of languages into which the commentary subtitles are translated can be selected based on the actual online viewers of the video. For specific implementation details, please refer to the description in Embodiment 1. Additionally, for other parts of Embodiments 2 to 6 that are not detailed above, please refer to the description in Embodiment 1; they will not be repeated here.

[0130] It should also be noted that the embodiments of this application may involve the use of user data. In practical applications, user-specific personal data may be used in the scheme described herein only within the scope permitted by, and in compliance with, the applicable laws and regulations of the relevant country (e.g., with the user's explicit consent, after the user has been properly notified, etc.).

[0131] Corresponding to Embodiment 1, this application also provides a live broadcast information processing device. Referring to Figure 9, the device may include:

[0132] The interactive information content acquisition unit 901 is used to acquire the interactive information content to be displayed in the video live streaming session after the video live streaming session is created.

[0133] Translation result acquisition unit 902 is used to acquire at least one translation result corresponding to the interactive information content;

[0134] The translation result providing unit 903 is used to provide translation results in the required language to the client associated with the participant user. The client is used to display the translation results of the interactive information content according to the language required by the associated user.

[0135] In a specific implementation, the device may further include:

[0136] The target language set determination unit is used to determine the target language set required for the interactive information content in the video live broadcast session based on the participating users associated with the video live broadcast session and their respective required languages.

[0137] The translation result acquisition unit can be specifically used for:

[0138] According to the set of target languages required for translation corresponding to the video live session identifier, obtain at least one translation result corresponding to the interactive information content.
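As a rough sketch of how the target language set could drive translation, the following example collects the distinct languages required by the participants and translates one piece of content into each of them. The names `determine_target_languages`, `translate_for_session`, and `fake_translate` are hypothetical, and `fake_translate` is only a stand-in for whatever translation service an implementation would actually call.

```python
def determine_target_languages(participant_languages):
    """Collect the distinct languages required by the session's participants."""
    return set(participant_languages.values())


def translate_for_session(text, target_languages, translate):
    """Translate one piece of interactive content into every target language."""
    return {lang: translate(text, lang) for lang in sorted(target_languages)}


def fake_translate(text, lang):
    """Stand-in translator used only to make the sketch runnable."""
    return f"[{lang}] {text}"


# Hypothetical session: user identifier -> required language.
participants = {"u1": "en", "u2": "es", "u3": "en"}
targets = determine_target_languages(participants)
results = translate_for_session("欢迎", targets, fake_translate)
```

Because the set holds each language once, content is translated once per language rather than once per participant.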

[0139] Additionally, the device may also include:

[0140] The historical interaction information content acquisition unit is used to acquire the target number of historical interaction information contents associated with the video live broadcast session when a new user joins the video live broadcast session, if the language required by the new user is not included in the target language set.

[0141] The historical content translation result providing unit is used to obtain the translation results corresponding to the target number of historical interaction information contents according to the language required by the new user, and provide them to the client associated with the new user;

[0142] The set update unit is used to add the language required by the new user to the target language set.
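The handling of a newly joined user whose language is not yet in the target language set might look like the sketch below. The function name `on_user_join`, the default target number of 20, and the in-memory `history` list are illustrative assumptions; the `translate` argument stands in for the real translation service.

```python
def on_user_join(required_language, target_languages, history, translate,
                 target_number=20):
    """Return translated history for a new user and update the target language set.

    An empty list is returned when the language is already covered, since
    translations in that language are already being produced for the session.
    """
    if required_language in target_languages:
        return []
    # Take the most recent `target_number` historical items (none if zero).
    recent = history[-target_number:] if target_number > 0 else []
    backfill = [translate(text, required_language) for text in recent]
    # Subsequent content will also be translated into the new language.
    target_languages.add(required_language)
    return backfill
```

The returned list would then be provided to the client associated with the new user by whichever delivery path the implementation uses.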

[0143] Specifically, the translation result providing unit can be used for:

[0144] The translation results are provided to the client via a message channel, which is used to establish a long-lived connection with the client.

[0145] Alternatively, the translation result providing unit can be specifically used for:

[0146] The translation results are provided to the client through a Content Delivery Network (CDN).

[0147] Alternatively, the translation result providing unit may specifically include:

[0148] The priority determination subunit is used to determine the priority of the interactive information content; the priority is determined based on the timeliness requirements of the interactive information content.

[0149] The translation result provision subunit is used to provide the translation results to the client in different ways for interactive information content of different priorities.

[0150] Specifically, the translation result providing subunit can be used for:

[0151] For interactive information content of the highest priority, the translation result is provided to the client through a message channel, which is used to establish a long connection with the client;

[0152] For interactive information content of the second priority, the translation results are provided to the client through CDN distribution.

[0153] The translation result providing subunit can also be used for:

[0154] For interactive information content of the first priority, after the translation result is provided to the client through the message channel, it is also distributed through the CDN so that for clients that fail to receive through the message channel, the corresponding translation result can be obtained by pulling from the associated CDN service node.

[0155] Specifically, the priority determination subunit can be used for:

[0156] The priority of the interactive information content is determined based on the content type and / or sender identity information of the interactive information content.

[0157] The interactive information content of the first priority includes one or more of the following: product object status change information associated with the video live broadcast session, and interactive information content sent by the anchor user associated with the video live broadcast session.

[0158] The second priority interactive information content includes one or more of the following: status information, statistical information, notification information sent by the server related to the video live session, and interactive information content sent by the viewer users associated with the video live session.
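The priority-based delivery described above can be illustrated with the following sketch. The content type string `"product_status_change"`, the sender role `"anchor"`, and the route labels are hypothetical names chosen for the example; only the two-level split and the use of the CDN as a fallback for first-priority content follow the description.

```python
from enum import Enum


class Priority(Enum):
    FIRST = 1   # high timeliness requirement
    SECOND = 2  # lower timeliness requirement


# Hypothetical content types treated as first priority.
FIRST_PRIORITY_TYPES = {"product_status_change"}


def determine_priority(content_type, sender_role):
    """Derive the priority from the content type and/or the sender identity."""
    if content_type in FIRST_PRIORITY_TYPES or sender_role == "anchor":
        return Priority.FIRST
    return Priority.SECOND


def delivery_routes(priority):
    """List the delivery paths used for a translation result of the given priority."""
    if priority is Priority.FIRST:
        # Pushed over the message channel, then also distributed through the CDN
        # so that clients that missed the push can pull it.
        return ["message_channel", "cdn"]
    return ["cdn"]
```

For example, a comment from a viewer would be routed through the CDN only, while a comment from the anchor would be pushed and also made available for pulling.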

[0159] Specifically, the translation result providing unit may include:

[0160] The message channel creation subunit is used to create multiple message channels when the video live streaming session is created, with each message channel corresponding to a language;

[0161] The translation result push subunit is used to push the translation result to the corresponding message channel according to the language of the translation result after obtaining the translation result, so that the client can establish a long connection with the server by subscribing to one of the message channels, and obtain the translation result on the corresponding message channel through the long connection.
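A minimal in-memory sketch of the per-language message channels is given below. It assumes that a subscription can be modelled as a callback; in a real deployment the callback would be replaced by a long connection between the client and the channel server, whose protocol this application does not prescribe. The class name `ChannelServer` and its methods are hypothetical.

```python
class ChannelServer:
    """Holds one message channel per language, created when the session is created."""

    def __init__(self, languages):
        # Channel registry: language -> list of subscriber callbacks.
        self._subscribers = {lang: [] for lang in languages}

    def subscribe(self, language, callback):
        """Subscribe a client to the channel of its required language."""
        if language not in self._subscribers:
            raise KeyError(f"no message channel for language {language!r}")
        self._subscribers[language].append(callback)

    def push(self, language, translation):
        """Push a translation result to its language channel; return deliveries made."""
        subscribers = self._subscribers.get(language, [])
        for deliver in subscribers:
            deliver(translation)
        return len(subscribers)
```

A client that requires English subscribes only to the English channel and therefore never receives results in other languages.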

[0162] Alternatively, the translation result providing unit can be specifically used for:

[0163] Based on the regional attributes associated with the service nodes in the CDN, and the language information associated with the regional attributes, the translation results for the corresponding language are sent to the service nodes, so that the client can obtain the translation results for the required language by pulling them from the associated service nodes.

[0164] The device may further include:

[0165] The backtracking unit is used to obtain the translation result corresponding to the language required by the client by backtracking to the superior service node if the service node in the CDN does not have the translation result corresponding to the language required by the client, and then provide it to the client.
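The region-aware distribution and the backtracking to a superior service node can be sketched as follows. The `ServiceNode` class, the `distribute` function, and the region codes are assumptions made for illustration; the sketch also assumes that the top-level node is associated with every language, which the description does not state explicitly.

```python
class ServiceNode:
    """A CDN service node with a regional attribute and its associated languages."""

    def __init__(self, region, languages, parent=None):
        self.region = region
        self.languages = set(languages)
        self.parent = parent  # superior service node, or None at the top level
        self.store = {}       # (message_id, language) -> translation result

    def get(self, message_id, language):
        """Look up locally, then backtrack through superior nodes until found."""
        node = self
        while node is not None:
            if (message_id, language) in node.store:
                return node.store[(message_id, language)]
            node = node.parent
        return None


def distribute(nodes, message_id, translations):
    """Send each translation only to nodes whose region is associated with its language."""
    for node in nodes:
        for language, text in translations.items():
            if language in node.languages:
                node.store[(message_id, language)] = text
```

With this layout, a node serving a Portuguese-speaking region stores only Portuguese results, and an English-speaking client attached to that node is still served through the superior node.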

[0166] The interactive information content to be displayed in the video live stream session includes: text-based interactive information content, and the translation result is a text-based translation result.

[0167] Alternatively, the interactive information content to be displayed in the video live stream session may include: audio interactive information content, and the translation result may be a text translation result or an audio translation result.

[0168] Additionally, the device may also include:

[0169] The original text providing unit is used to provide the original text of the interactive information content to the client of the participant user for displaying the original text and translation results of the interactive information content.

[0170] The video live streaming sessions include: video live streaming sessions in a product object information system, or video live streaming sessions in a video conferencing scenario.

[0171] Corresponding to Embodiment 2, this application also provides a live broadcast information processing device. Referring to Figure 10, the device may include:

[0172] The channel creation unit 1001 is used to create multiple message channels after the video live streaming session is created; the multiple message channels correspond to multiple different languages required by the participating users;

[0173] Interactive information content acquisition unit 1002 is used to acquire interactive information content to be displayed in the video live broadcast session;

[0174] The translation result acquisition unit 1003 is used to acquire at least one translation result corresponding to the interactive information content;

[0175] The push unit 1004 is used to push the translation result to the corresponding language message channel, and provide the translation result in the required language to the participant user client through the message channel, so as to display the interactive information content according to the translation result.

[0176] The message channel is different from the video live streaming session channel.

[0177] The message channel is a simplex communication channel, used for one-way message pushing from the server to the client.

[0178] Additionally, the device may also include:

[0179] The comment message receiving unit is used to obtain comment messages from at least one participant user before obtaining the interactive information content to be displayed in the video live broadcast session;

[0180] The interactive message content to be displayed includes interactive information content determined based on the comment messages of the at least two participating users.

[0181] Additionally, the device may also include:

[0182] The target language set determination unit is used to determine the target language set required for the interactive information content in the video live broadcast session based on the participating users associated with the video live broadcast session and their respective required languages.

[0183] The translation result acquisition unit can be specifically used for:

[0184] According to the set of target languages, obtain at least one translation result corresponding to the interactive information content.

[0185] Additionally, the device may also include:

[0186] The historical information content acquisition unit is used to acquire the target number of historical interaction information contents associated with the video live broadcast session when a new user joins the video live broadcast session, if the language required by the new user is not included in the target language set.

[0187] The historical content translation result providing unit is used to obtain the translation results corresponding to the target number of historical interaction information contents according to the language required by the new user, and push the translation results corresponding to the historical interaction information contents to the corresponding message channel so that the client associated with the new user can obtain the translation results corresponding to the historical interaction information contents through the message channel;

[0188] The set update unit is used to add the language required by the new user to the target language set.

[0189] The message channel can be provided by a channel server. The channel server is used to establish a long connection between the channel server and the participant client based on the participant user client's subscription request for the target message channel corresponding to the required language, so that after the translation result is pushed to the target message channel, the translation result is pushed to the participant client through the long connection.

[0190] Additionally, the device may also include:

[0191] The CDN distribution unit is used to provide the translation results to the clients corresponding to the participants' users through a content delivery network (CDN). For clients that fail to receive the translation through the message channel, the translation results for the required language can be obtained by pulling them from the service nodes associated with the CDN.
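One way a client could combine the two delivery paths is sketched below. The sketch assumes that every translation result carries a message identifier and that the client can list what its associated CDN service node holds for the required language; neither detail is specified in this application, and the function name `reconcile` is hypothetical.

```python
def reconcile(received, cdn_results):
    """Merge push-delivered results with those pulled from the CDN service node.

    Both arguments map message identifiers to translation results in the
    client's required language. Results already received through the message
    channel are kept; only the missing ones are taken from the CDN.
    """
    pulled = {mid: text for mid, text in cdn_results.items() if mid not in received}
    return {**received, **pulled}
```

A client that missed a push would thus still obtain the corresponding translation result on its next pull.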

[0192] Corresponding to Embodiment 3, this application also provides a live broadcast information processing device. Referring to Figure 11, the device may include:

[0193] The interactive information content acquisition unit 1101 is used to acquire the interactive information content to be displayed in the video live streaming session after the video live streaming session is created.

[0194] Translation result acquisition unit 1102 is used to acquire at least one translation result corresponding to the interactive information content;

[0195] The distribution unit 1103 is used to distribute the translation results through a content delivery network (CDN) and provide the translation results in the required language to the clients of the participants in the live video session, so as to display the interactive information content based on the translation results.

[0196] The device may further include:

[0197] The target language set determination unit is used to determine the target language set required for the interactive information content in the video live broadcast session based on the participating users associated with the video live broadcast session and their respective required languages.

[0198] The translation result acquisition unit can be specifically used for:

[0199] According to the set of target languages, obtain at least one translation result corresponding to the interactive information content.

[0200] Additionally, the device may also include:

[0201] The historical information content acquisition unit is used to acquire the target number of historical interaction information contents associated with the video live broadcast session when a new user joins the video live broadcast session, if the language required by the new user is not included in the target language set.

[0202] The historical content translation result providing unit is used to obtain the translation results corresponding to the target number of historical interaction information contents according to the language required by the new user, and provide them to the client associated with the new user through the CDN network;

[0203] The set update unit is used to add the language required by the new user to the target language set.

[0204] Specifically, the distribution unit can be used for:

[0205] Based on the regional attributes associated with the service nodes in the CDN, and the language information associated with the regional attributes, the translation results for the corresponding language are sent to the service nodes, so that the client can obtain the translation results for the required language by pulling them from the associated service nodes.

[0206] Additionally, the device may also include:

[0207] The backtracking unit is used to obtain the translation result corresponding to the language required by the client by backtracking to the superior service node if the service node in the CDN does not have the translation result corresponding to the language required by the client, and then provide it to the client.

[0208] Corresponding to Embodiment 4, this application also provides a live broadcast information processing device. Referring to Figure 12, the device may include:

[0209] The target language determination unit 1201 is used to determine the target language required by the associated participant user during the process of participating in a live video session.

[0210] The translation result acquisition unit 1202 is used to acquire the translation result of the interactive information content to be displayed in the video live broadcast session, and the translation result corresponds to the target language;

[0211] The translation result display unit 1203 is used to display the translation results of the interactive information content based on the interface associated with the video live broadcast session.

[0212] Specifically, the translation result acquisition unit can be used for:

[0213] By subscribing to a target message channel corresponding to the target language created by the server in advance, a long connection is established with the channel server so as to obtain the translation results in the target language pushed through the target message channel through the long connection.

[0214] Alternatively, the translation result acquisition unit can be specifically used for:

[0215] The translation results of the interactive information content are obtained by fetching the translation results corresponding to the target language from the associated CDN service node.

[0216] In one embodiment, the device may further include:

[0217] An optimization processing unit is used to optimize the translation results of the received interactive information content when the concurrent number of received translation results exceeds a first threshold, so as to reduce the amount of interactive information content to be displayed.

[0218] The translation result display unit can be specifically used for:

[0219] The translation results of interactive information content that meets the display conditions are sent to the rendering layer so that the translation results of the interactive information content can be displayed on the interface associated with the video live broadcast session.

[0220] Specifically, the optimization processing unit may include:

[0221] The priority determination subunit is used to determine the priority of the interactive information content;

[0222] The display condition judgment subunit is used to determine whether the translation result of the interactive information content meets the display conditions based on the priority of the interactive information content.

[0223] Specifically, the display condition judgment subunit is used for:

[0224] The interactive information content with the highest priority will be directly selected as the interactive information content that meets the display conditions.

[0225] Alternatively, the display condition judgment subunit may include:

[0226] The first message queue storage sub-unit is used to temporarily store the translation results of the second priority interactive information content in the first message queue;

[0227] The condition judgment subunit is used to determine whether the translation result of the interactive information content meets the display conditions according to the target optimization strategy.

[0228] The second message queue storage unit is used to add the translation results of interactive information content that meets the display conditions to the second message queue, so that the translation results of the optimized interactive information content can be sent to the rendering layer according to the order in the second message queue.
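The two-queue arrangement can be sketched as below. The dictionary layout of an item, the numeric priority values, and the `meets_condition` callback (which stands in for the target optimization strategy) are assumptions for the example. Placing first-priority items ahead of the staged second-priority items within a batch is also a choice of this sketch, not something the description requires.

```python
from collections import deque


def select_for_display(incoming, meets_condition):
    """Return translation results in the order they would be sent to the rendering layer."""
    first_queue = deque()   # temporary storage for second-priority results
    second_queue = deque()  # results that meet the display conditions
    for item in incoming:
        if item["priority"] == 1:
            # Highest priority: directly treated as meeting the display conditions.
            second_queue.append(item)
        else:
            first_queue.append(item)
    while first_queue:
        item = first_queue.popleft()
        if meets_condition(item):
            second_queue.append(item)
    return list(second_queue)
```

Only the contents of the second queue reach the rendering layer, which reduces the amount of interactive information content displayed when many results arrive concurrently.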

[0229] Specifically, the condition judgment subunit can be used for:

[0230] The interactive information content of the second priority is divided into multiple secondary priorities;

[0231] Based on the secondary priority level to which the interactive information content belongs, the corresponding target optimization strategies are determined respectively;

[0232] Using the target optimization strategy corresponding to the interactive information content, determine whether the translation result of the interactive information content meets the display conditions.

[0233] The target optimization strategy includes merging identical interactive information received at different time points.

[0234] Alternatively, the target optimization strategy may include discarding system notification messages related to changes in user status, or interactive information content whose timestamps have expired.

[0235] Alternatively, the target optimization strategy includes: if the number of interactive information contents received within a target time period exceeds a second threshold, and includes multiple interactive information contents of the same type, then the translation result of the latest one or more interactive information contents within that time period is determined as the content that meets the display conditions, and the translation results of other interactive information contents are discarded.
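The three target optimization strategies described above can be illustrated with the following sketch. Each item is assumed to be a dictionary with `text`, `type`, and `timestamp` keys; the type label `"user_status"`, the `count` field added when merging, and all function names are hypothetical.

```python
def merge_identical(items):
    """Merge items with identical text received at different times into one entry."""
    merged = {}
    for item in items:
        key = item["text"]
        if key in merged:
            merged[key]["count"] += 1
        else:
            merged[key] = {**item, "count": 1}
    return list(merged.values())


def drop_expired(items, now, max_age):
    """Discard user-status notifications and items whose timestamp has expired."""
    return [
        item for item in items
        if item["type"] != "user_status" and now - item["timestamp"] <= max_age
    ]


def keep_latest_per_type(items, second_threshold, keep=1):
    """If the window holds more than `second_threshold` items, keep the latest per type."""
    if len(items) <= second_threshold:
        return list(items)
    by_type = {}
    for item in items:
        by_type.setdefault(item["type"], []).append(item)
    kept = []
    for group in by_type.values():
        group.sort(key=lambda entry: entry["timestamp"])
        kept.extend(group[-keep:])
    kept.sort(key=lambda entry: entry["timestamp"])
    return kept
```

Which strategy applies to a given item would depend on the secondary priority level to which the item belongs.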

[0236] Additionally, the device may also include:

[0237] The original text acquisition unit is used to acquire the original text of the interactive information content to be displayed, so as to display the original text and translation results of the interactive information content based on the interface associated with the video live broadcast session.

[0238] Additionally, the device may also include:

[0239] A switch option providing unit is used to provide operation options for turning the translation function on or off, so that during the live video broadcast, the operation options can be used to control whether the translation result of the interactive information content is displayed; wherein, when the translation function is off, the original text of the interactive information content is displayed.
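The effect of the switch option on what is displayed can be sketched in a few lines. The function name `text_to_display` is hypothetical, and falling back to the original text when no translation is available is an assumption of this sketch rather than a behavior stated in the description.

```python
def text_to_display(original, translation, translation_enabled):
    """Choose the text shown for one piece of interactive information content."""
    if translation_enabled and translation is not None:
        return translation
    # Translation function off (or no translation available): show the original text.
    return original
```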

[0240] Corresponding to Embodiment 5, this application also provides a live broadcast information processing device. Referring to Figure 13, the device may include:

[0241] The interactive information content original text receiving unit 1301 is used to receive the original text of interactive information content input in the first language during the process of participating in a live video session.

[0242] The interactive information content original text submission unit 1302 is used to submit the original text of the interactive information content to the server for translation, to obtain translation results in at least one second language, and the translation results are used to provide to other participating user clients that need the corresponding second language.

[0243] Additionally, the device may also include:

[0244] The translation result acquisition unit is used to acquire the translation result of the interactive information content sent by the other participating users, and the translation result corresponds to the first language;

[0245] The translation result display unit is used to display the translation results of the interactive information content based on the interface associated with the video live broadcast session.

[0246] Corresponding to Embodiment Six, this application also provides a video information processing apparatus. Referring to Figure 14, the device may include:

[0247] The commentary caption content acquisition unit 1401 is used to acquire the commentary caption content to be displayed associated with the target video during the playback of the target video;

[0248] The translation result acquisition unit 1402 is used to acquire at least one translation result corresponding to the commentary subtitle content;

[0249] The translation result providing unit 1403 is used to provide the translation result in the required language to the client associated with the viewer user, so as to display the translation result of the commentary subtitle content.

[0250] Additionally, the device may also include:

[0251] The original text providing unit is used to provide the original text of the commentary captions to the client associated with the viewer user, so that the client can display the original text and translation results of the commentary captions.

[0252] The judgment unit is used to determine whether to provide the original text of the commentary captions to the client associated with the viewer user based on the number of commentary captions generated within the target time period.

[0253] In addition, embodiments of this application also provide a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the steps of the method described in any of the foregoing method embodiments.

[0254] And an electronic device, comprising:

[0255] One or more processors; and

[0256] A memory associated with the one or more processors, the memory being used to store program instructions that, when read and executed by the one or more processors, perform the steps of the method described in any of the foregoing method embodiments.

[0257] Figure 15 illustrates, by way of example, the architecture of an electronic device. For instance, device 1500 may be a mobile phone, computer, digital broadcasting terminal, messaging device, game console, tablet device, medical device, fitness equipment, personal digital assistant, aircraft, etc.

[0258] Referring to Figure 15, the device 1500 may include one or more of the following components: a processing component 1502, a memory 1504, a power supply component 1506, a multimedia component 1508, an audio component 1510, an input / output (I / O) interface 1512, a sensor component 1514, and a communication component 1516.

[0259] Processing component 1502 typically controls the overall operation of device 1500, such as operations associated with display, telephone calls, data communication, camera operation, and recording operations. Processing component 1502 may include one or more processors 1520 to execute instructions to perform all or part of the steps of the methods provided in this disclosure. Furthermore, processing component 1502 may include one or more modules to facilitate interaction between processing component 1502 and other components. For example, processing component 1502 may include a multimedia module to facilitate interaction between multimedia component 1508 and processing component 1502.

[0260] Memory 1504 is configured to store various types of data to support the operation of device 1500. Examples of this data include instructions for any application or method operating on device 1500, contact data, phonebook data, messages, pictures, videos, etc. Memory 1504 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk.

[0261] Power supply component 1506 provides power to various components of device 1500. Power supply component 1506 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to device 1500.

[0262] Multimedia component 1508 includes a screen that provides an output interface between device 1500 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touchscreen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundaries of touch or swipe actions but also the duration and pressure associated with the touch or swipe operation. In some embodiments, multimedia component 1508 includes a front-facing camera and / or a rear-facing camera. When device 1500 is in an operating mode, such as a shooting mode or a video mode, the front-facing camera and / or rear-facing camera may receive external multimedia data. Each front-facing camera and rear-facing camera may be a fixed optical lens system or have focal length and optical zoom capabilities.

[0263] Audio component 1510 is configured to output and / or input audio signals. For example, audio component 1510 includes a microphone (MIC) configured to receive external audio signals when device 1500 is in an operating mode, such as call mode, recording mode, and voice recognition mode. The received audio signals may be further stored in memory 1504 or transmitted via communication component 1516. In some embodiments, audio component 1510 also includes a speaker for outputting audio signals.

[0264] I / O interface 1512 provides an interface between processing component 1502 and peripheral interface modules, such as keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to, home buttons, volume buttons, power buttons, and lock buttons.

[0265] Sensor assembly 1514 includes one or more sensors for providing status assessments of various aspects of device 1500. For example, sensor assembly 1514 may detect the on / off state of device 1500, the relative positioning of components such as the display and keypad of device 1500, changes in the position of device 1500 or a component of device 1500, the presence or absence of user contact with device 1500, the orientation or acceleration / deceleration of device 1500, and temperature changes of device 1500. Sensor assembly 1514 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. Sensor assembly 1514 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, sensor assembly 1514 may also include an accelerometer, a gyroscope, a magnetometer, a pressure sensor, or a temperature sensor.

[0266] Communication component 1516 is configured to facilitate wired or wireless communication between device 1500 and other devices. Device 1500 can access wireless networks based on communication standards, such as WiFi, or mobile communication networks such as 2G, 3G, 4G / LTE, and 5G. In one exemplary embodiment, communication component 1516 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, communication component 1516 also includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

[0267] In an exemplary embodiment, device 1500 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components to perform the methods described above.

[0268] In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as a memory 1504 including instructions, which can be executed by a processor 1520 of device 1500 to perform the method provided by the present disclosure. For example, the non-transitory computer-readable storage medium may be a ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, and optical data storage device, etc.

[0269] As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that this application can be implemented by means of software plus necessary general-purpose hardware platforms. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product can be stored in a storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in various embodiments or some parts of the embodiments of this application.

[0270] The various embodiments in this specification are described in a progressive manner. Similar or identical parts between embodiments can be referred to mutually, and each embodiment focuses on its differences from the other embodiments. In particular, since the device or system embodiments are basically similar to the method embodiments, their description is relatively brief, and for the relevant parts reference can be made to the descriptions in the method embodiments. The device and system embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those skilled in the art can understand and implement this without creative effort.

[0271] The live streaming information processing method, apparatus, and electronic device provided in this application have been described in detail above. Specific examples have been used to illustrate the principles and implementation methods of this application. The descriptions of the above embodiments are only for the purpose of helping to understand the method and its core ideas. Furthermore, those skilled in the art will recognize that, based on the ideas of this application, there will be changes in the specific implementation methods and application scope. Therefore, the content of this specification should not be construed as a limitation of this application.

Claims

1. A method for processing live broadcast information, characterized in that the method is applied to the server side of a cross-border video live streaming system, wherein participants in the live streaming session include users from multiple countries / regions, and the server providing interactive information translation services for the cross-border video live streaming system is deployed only in a single country / region; the method comprises: after a video live stream session is created, creating multiple message channels and obtaining the interactive information content to be displayed in the video live stream session, wherein the multiple message channels correspond to multiple different languages required by the participating users; obtaining at least one translation result corresponding to the interactive information content; pushing the translation results to the multiple message channels and providing the translation results to clients via a content delivery network (CDN), so that a participating user's client receives the translation result in the required language through the corresponding message channel, and a client that fails to receive the translation result through the message channel obtains the corresponding translation result from its associated CDN service node; wherein the CDN comprises multiple service nodes deployed in multiple different countries / regions in a tiered manner; when the translation results are distributed through the CDN, the translation result in the corresponding language is sent to each service node according to the regional attribute associated with the service node and the language information associated with the regional attribute, so that the client of a participating user in the video live stream session obtains the translation result in the required language from its associated service node and displays the interactive information content based on the translation result; and if the service node in the CDN does not have the translation result in the language required by the client, the translation result in the language required by the client is obtained by backtracking to a superior service node and provided to the client.
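
By way of non-limiting illustration, the region-based distribution and the backtracking to a superior service node recited above could be sketched as follows. The class `ServiceNode`, the function `distribute`, the region-to-language mapping, and the choice to cache a result on the way back from the superior node are all assumptions made for this sketch and are not specified by this application.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set, Tuple


@dataclass
class ServiceNode:
    """Hypothetical CDN service node in a tiered deployment."""
    name: str
    region: str
    parent: Optional[ServiceNode] = None  # superior service node, if any
    cache: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def get(self, message_id: str, language: str) -> Optional[str]:
        """Return a translation, backtracking to the superior node on a miss."""
        key = (message_id, language)
        if key in self.cache:
            return self.cache[key]
        if self.parent is None:
            return None  # top of the hierarchy and still not found
        result = self.parent.get(message_id, language)
        if result is not None:
            # Assumption: keep a local copy so later requests are served locally.
            self.cache[key] = result
        return result


def distribute(
    nodes: Iterable[ServiceNode],
    region_languages: Dict[str, Set[str]],
    message_id: str,
    translations: Dict[str, str],
) -> None:
    """Send each node only the languages associated with its region."""
    for node in nodes:
        for language in region_languages.get(node.region, set()):
            if language in translations:
                node.cache[(message_id, language)] = translations[language]


# Illustrative data: one superior node holding every translation, one edge node.
origin = ServiceNode("origin", region="origin")
edge_th = ServiceNode("edge-th", region="TH", parent=origin)
translations = {"en": "Welcome", "th": "[th] Welcome"}
origin.cache.update({("m1", lang): text for lang, text in translations.items()})
distribute([edge_th], {"TH": {"th"}}, "m1", translations)
```

In this sketch the edge node for region `TH` initially holds only the Thai result; a request for English misses locally and is satisfied from the superior node.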

2. The method according to claim 1, characterized in that the method further comprises: determining, based on the participating users associated with the video live stream session and their respective required languages, the set of target languages required for the interactive information content in the video live stream session; wherein the step of obtaining at least one translation result corresponding to the interactive information content comprises: obtaining, according to the set of target languages, at least one translation result corresponding to the interactive information content.

3. The method according to claim 2, characterized in that the method further comprises: when a new user joins the video live stream session, if the language required by the new user is not included in the set of target languages, obtaining a target number of items of historical interactive information content associated with the video live stream session; obtaining, according to the language required by the new user, the translation results corresponding to the target number of items of historical interactive information content, and providing them to the client associated with the new user through the CDN; and adding the language required by the new user to the set of target languages.
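
As a non-limiting illustration of the handling of a newly joined user described above, the following sketch translates a target number of recent history items only when the new user's language is not yet in the target language set, and then adds that language to the set. The function name `on_user_join`, the parameter `backlog_size`, and the `translate` callable are hypothetical stand-ins; the actual translation service and the delivery through the CDN are not modeled.

```python
from typing import Callable, List, Set


def on_user_join(
    required_language: str,
    target_languages: Set[str],
    history: List[str],
    translate: Callable[[str, str], str],
    backlog_size: int = 3,
) -> List[str]:
    """Return translated history for a new language and register that language.

    An empty list is returned when the language is already in the target set,
    since its translation results are assumed to exist already.
    """
    if required_language in target_languages:
        return []
    # Take the most recent `backlog_size` items (the "target number" of items).
    recent = history[-backlog_size:] if backlog_size > 0 else []
    backlog = [translate(text, required_language) for text in recent]
    target_languages.add(required_language)
    return backlog


def fake_translate(text: str, language: str) -> str:
    """Placeholder for a real translation service (assumption)."""
    return f"[{language}] {text}"


languages = {"en", "th"}
history = ["hello", "price drop", "new item", "thanks"]
backlog = on_user_join("ms", languages, history, fake_translate, backlog_size=2)
```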

4. The method according to claim 1, characterized in that the method further comprises: determining the priority of the interactive information content, wherein the priority is determined based on the timeliness requirements of the interactive information content; for interactive information content of a first priority, the translation result is provided to the client through a message channel and is then distributed through the CDN; and for interactive information content of a second priority, the translation result is provided to the client through CDN distribution.

5. The method according to claim 4, characterized in that the interactive information content of the first priority includes one or more of the following: product object status change information associated with the video live stream session, and interactive information content sent by the anchor user associated with the video live stream session; and the interactive information content of the second priority includes one or more of the following: status information, statistical information, and notification information sent by the server and related to the video live stream session, and interactive information content sent by the viewer users associated with the video live stream session.
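
The priority-dependent delivery described in the two preceding claims could be sketched, purely as an illustration, as a lookup from the kind of interactive information content to a delivery path. The kind labels (for example `"anchor_message"`) and the function names are assumptions introduced for this sketch; only the two priorities and their delivery paths come from the text.

```python
from enum import Enum
from typing import List


class Priority(Enum):
    FIRST = 1   # time-sensitive content
    SECOND = 2  # content with looser timeliness requirements


# Hypothetical labels for the first-priority categories named in the text.
FIRST_PRIORITY_KINDS = {"product_status_change", "anchor_message"}


def classify(kind: str) -> Priority:
    """Map a content kind to a priority; unknown kinds default to SECOND."""
    return Priority.FIRST if kind in FIRST_PRIORITY_KINDS else Priority.SECOND


def delivery_path(kind: str) -> List[str]:
    """Return the ordered delivery steps for a translation result."""
    if classify(kind) is Priority.FIRST:
        # Pushed through the message channel first, then distributed via CDN.
        return ["message_channel", "cdn"]
    return ["cdn"]
```

For example, `delivery_path("anchor_message")` yields the message channel followed by the CDN, while `delivery_path("viewer_message")` yields the CDN only.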

6. A method for processing live broadcast information, characterized in that the method is applied to a client of a cross-border video live streaming system and comprises: in the process of participating in a live video session through the client, determining the target language required by the associated participating user; receiving, through the message channel corresponding to the target language, the translation results of the interactive information content to be displayed in the live video session, and, if receiving fails, obtaining the translation results of the interactive information content to be displayed in the live video session from the CDN service node associated with the participating user's country / region, wherein the translation results correspond to the target language; the message channels are created after the live video session is created, and the multiple message channels correspond to multiple different languages required by the participating users; the CDN comprises multiple service nodes deployed in multiple different countries / regions in a tiered manner; the translation results are provided by the server of the cross-border video live streaming system, and the server providing interactive information translation services for the cross-border video live streaming system is deployed only in a single country / region; the server of the cross-border video live streaming system pushes the translation results to the message channel of the corresponding language and sends the translation result in the corresponding language to each service node in the CDN according to the regional attribute associated with the service node and the language information associated with the regional attribute; if the service node in the CDN associated with the participating user's country / region does not have the translation result corresponding to the language required by the client, the translation result corresponding to the language required by the client is obtained by backtracking to the superior service node and provided to the client; and displaying the translation results of the interactive information content on the interface associated with the live video session.
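
On the client side, the fallback from the message channel to the CDN service node could look roughly like the following non-limiting sketch. Both callables are hypothetical: `channel_receive` stands for reading from the message channel of the target language, and `cdn_fetch` stands for a request to the service node associated with the user's country / region. Treating a `TimeoutError` or a `None` return value as a failed receive is an assumption of this sketch.

```python
from typing import Callable, Optional

Fetcher = Callable[[str, str], Optional[str]]


def receive_translation(
    message_id: str,
    target_language: str,
    channel_receive: Fetcher,
    cdn_fetch: Fetcher,
) -> Optional[str]:
    """Try the message channel first; fall back to the CDN if receiving fails."""
    try:
        result = channel_receive(message_id, target_language)
    except TimeoutError:
        result = None  # assumption: a timeout counts as a failed receive
    if result is None:
        result = cdn_fetch(message_id, target_language)
    return result
```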

7. The method according to claim 6, characterized in that the method further comprises: when the number of concurrently received translation results of interactive information content exceeds a first threshold, performing optimization processing on the received translation results of the interactive information content to reduce the amount of interactive information content to be displayed; wherein displaying the translation results of the interactive information content on the interface associated with the live video session comprises: sending the translation results of the interactive information content that meets the display conditions to the rendering layer, so that the translation results of the interactive information content are displayed on the interface associated with the live video session.

8. The method according to claim 7, characterized in that the optimization processing of the received translation results of the interactive information content comprises: determining the priority of the interactive information content; and determining, based on the priority of the interactive information content, whether the translation result of the interactive information content meets the display conditions.

9. The method according to claim 8, characterized in that the step of determining, based on the priority of the interactive information content, whether the translation result of the interactive information content meets the display conditions comprises: directly determining interactive information content of the first priority as interactive information content that meets the display conditions; temporarily storing the translation results of interactive information content of the second priority in a first message queue; determining, based on a target optimization strategy, whether the translation results of the interactive information content meet the display conditions; and adding the translation results of the interactive information content that meets the display conditions to a second message queue, so that the optimized translation results of the interactive information content are sent to the rendering layer in the order of the second message queue.
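
The two-queue handling described above could be sketched, as a non-limiting illustration, as follows. The class `DisplayOptimizer` and its method names are hypothetical. The text does not define the target optimization strategy, so this sketch assumes a simple one: when more than `max_pending` second-priority items are waiting, only the newest `max_pending` items are kept.

```python
from collections import deque
from typing import Deque, List


class DisplayOptimizer:
    """Hypothetical client-side filter placed in front of the rendering layer."""

    def __init__(self, max_pending: int = 2) -> None:
        self.first_queue: Deque[str] = deque()   # buffers second-priority items
        self.second_queue: Deque[str] = deque()  # items cleared for rendering
        self.max_pending = max_pending

    def receive(self, text: str, first_priority: bool) -> None:
        if first_priority:
            # First-priority content meets the display conditions directly.
            self.second_queue.append(text)
        else:
            self.first_queue.append(text)

    def apply_strategy(self) -> None:
        """Assumed strategy: drop the oldest pending items beyond max_pending."""
        while len(self.first_queue) > self.max_pending:
            self.first_queue.popleft()
        self.second_queue.extend(self.first_queue)
        self.first_queue.clear()

    def drain(self) -> List[str]:
        """Hand items to the rendering layer in second-queue order."""
        items = list(self.second_queue)
        self.second_queue.clear()
        return items


optimizer = DisplayOptimizer(max_pending=2)
optimizer.receive("viewer: a", first_priority=False)
optimizer.receive("viewer: b", first_priority=False)
optimizer.receive("anchor: sale starts", first_priority=True)
optimizer.receive("viewer: c", first_priority=False)
optimizer.apply_strategy()
rendered = optimizer.drain()
```

Other strategies, such as random sampling or merging similar messages, would fit the same structure by replacing `apply_strategy`.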

10. A method for processing live broadcast information, characterized in that the method is applied to a client of a cross-border video live streaming system and comprises: during a live video session, receiving the original text of interactive information content input in a first language; and submitting the original text of the interactive information content to the server of the cross-border video live streaming system, so that the server providing interactive information translation services for the cross-border video live streaming system creates multiple message channels for the multiple different languages required by the participating users, translates the original text of the interactive information content to obtain a translation result in at least one second language, and provides the translation result to the clients of other participating users who require the corresponding second language; wherein the server of the cross-border video live streaming system pushes the translation results to the multiple message channels and provides the translation results to the clients via a content delivery network (CDN); the CDN comprises multiple service nodes deployed in multiple different countries, is used to distribute the translation results across regions, and employs a tiered deployment; when the translation results are distributed via the CDN, the translation result in the corresponding language is sent to each service node according to the regional attribute and language information associated with the service node, so that the clients of the participating users in the live video session obtain the translation result in the required language from the service node associated with their country / region; and if the service node in the CDN does not have the translation result in the required language, the translation result is obtained by backtracking to a superior service node and provided to the client.

11. A video information processing method, characterized in that the method is applied to the server side of a cross-border video live streaming system, wherein the participants in the live streaming session of the cross-border video live streaming system include users from multiple countries / regions, and the server providing subtitle translation services for the cross-border video live streaming system is deployed only in a single country / region; the method comprises: during playback of a target video, creating multiple message channels and obtaining the commentary subtitles to be displayed that are associated with the target video, wherein the multiple message channels correspond to multiple different languages required by the participating users; obtaining at least one translation result corresponding to the commentary subtitles; pushing the translation results to the multiple message channels and providing the translation results to clients via a content delivery network (CDN), so that a participating user's client receives the translation result in the required language through the corresponding message channel, and a client that fails to receive the translation result through the message channel obtains the corresponding translation result from its associated CDN service node; wherein the CDN comprises multiple service nodes deployed in multiple different countries / regions in a tiered manner; when the translation results are distributed through the CDN, the translation result in the corresponding language is sent to each service node according to the regional attribute and language information associated with the service node, so that the client associated with a viewer of the target video obtains the translation result in the required language from its associated service node and displays the translated commentary subtitles; and if the service node in the CDN does not have the translation result in the language required by the client, the translation result is obtained by backtracking to a superior service node and provided to the client.

12. A computer-readable storage medium having a computer program stored thereon, characterized in that, when executed by a processor, the program performs the steps of the method described in any one of claims 1 to 11.

13. An electronic device, characterized in that it comprises: one or more processors; and a memory associated with the one or more processors, the memory being used to store program instructions that, when read and executed by the one or more processors, perform the steps of the method according to any one of claims 1 to 11.