Information processing method and apparatus, and device and storage medium

By constructing and updating knowledge graphs, detecting and updating the attributes and entity information of media content, and generating knowledge graphs of standard entities, the efficiency and accuracy issues of users obtaining media content of interest are solved, and more efficient and accurate media content delivery is achieved.

WO2026123251A1 · PCT designated stage · Publication Date: 2026-06-18 · BEIJING ZITIAO NETWORK TECH CO LTD

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: BEIJING ZITIAO NETWORK TECH CO LTD
Filing Date: 2024-12-11
Publication Date: 2026-06-18


Abstract

According to the embodiments of the present disclosure, provided are an information processing method and apparatus, and a device and a storage medium. The method comprises: for first media content to be processed, detecting, from a plurality of pieces of media content associated with a first knowledge graph, media content matching the first media content, wherein the first knowledge graph represents a plurality of content entities and relationships between the plurality of content entities, and is determined on the basis of the plurality of pieces of media content; in response to having detected first target media content matching the first media content, updating the attribute of the first target media content in the first knowledge graph; and in response to having not detected any media content matching the first media content, updating content entity information of the first knowledge graph on the basis of the first media content. In this way, there is no duplicate media content among media content associated with a first knowledge graph. Therefore, the duplicate media content is prevented from being provided to a user, thereby improving the efficiency of the user viewing the media content.

Description

Methods, apparatus, devices and storage media for information processing

Technical Field

[0001] The exemplary embodiments disclosed herein generally relate to the field of computers, and more particularly to methods, apparatus, devices and computer-readable storage media for information processing.

Background

[0002] With the development of information technology, various terminal devices can provide people with a variety of services in work and life. Applications providing these services can be deployed on these terminal devices. The terminal devices present relevant content and interact with users through the application's user interface, meeting various user needs. Terminal devices or applications can provide users with a variety of media content. However, the amount of media content that users can access is constantly increasing. Therefore, users expect to be able to conveniently and quickly access media content that interests them, avoiding being disturbed by too much uninteresting content.

Summary of the Invention

[0003] In a first aspect of this disclosure, an information processing method is provided. The method includes: for first media content to be processed, detecting media content matching the first media content from a plurality of media content associated with a first knowledge graph, the first knowledge graph representing a plurality of content entities and relationships between the plurality of content entities and determined based on the plurality of media content; updating attributes of the first target media content in the first knowledge graph in response to detecting first target media content matching the first media content; and updating content entity information of the first knowledge graph based on the first media content in response to no media content matching the first media content being detected.

[0004] In a second aspect of this disclosure, an apparatus for information processing is provided. The apparatus includes: a detection module configured to, for first media content to be processed, detect media content matching the first media content from a plurality of media content associated with a first knowledge graph, the first knowledge graph representing a plurality of content entities and relationships between the plurality of content entities and determined based on the plurality of media content; a first update module configured to, in response to detecting first target media content matching the first media content, update attributes of the first target media content in the first knowledge graph; and a second update module configured to, in response to not detecting media content matching the first media content, update content entity information of the first knowledge graph based on the first media content.

[0005] In a third aspect of this disclosure, an electronic device is provided. The device includes at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit. When executed by the at least one processing unit, the instructions cause the device to perform the method of the first aspect.

[0006] In a fourth aspect of this disclosure, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program that can be executed by a processor to implement the method of the first aspect.

[0007] It should be understood that the content described in this Summary section is not intended to identify key or essential features of the embodiments of this disclosure, nor is it intended to limit the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description.

Brief Description of the Drawings

[0008] The above and other features, advantages, and aspects of the embodiments of this disclosure will become more apparent from the accompanying drawings and the following detailed description. In the drawings, the same or similar reference numerals denote the same or similar elements, wherein:

[0009] Figure 1 shows a schematic diagram of an example environment in which embodiments of the present disclosure may be implemented;

[0010] Figure 2 shows a schematic diagram of an example system for information processing according to some embodiments of the present disclosure;

[0011] Figure 3 illustrates an example architecture of an example system for extracting entity information from media content according to some embodiments of the present disclosure;

[0012] Figure 4 illustrates a schematic diagram of at least a portion of an example first knowledge graph provided according to embodiments of the present disclosure;

[0013] Figure 5 shows a schematic diagram of an example system for presenting media content according to some embodiments of the present disclosure;

[0014] Figure 6 illustrates a schematic diagram of an example architecture of an information processing platform according to some embodiments of the present disclosure;

[0015] Figure 7 shows a flowchart of an example process for information processing according to some embodiments of the present disclosure;

[0016] Figure 8 shows a schematic structural block diagram of an example apparatus for information processing according to some embodiments of the present disclosure; and

[0017] Figure 9 shows a block diagram of an electronic device capable of implementing several embodiments of the present disclosure.

Detailed Description

[0018] Embodiments of this disclosure will now be described in more detail with reference to the accompanying drawings. While some embodiments of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided to provide a more thorough and complete understanding of this disclosure. It should be understood that the accompanying drawings and embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.

[0019] It should be noted that the headings of any section / subsection provided herein are not limiting. Various embodiments are described throughout this document, and embodiments of any type may be included under any section / subsection. Furthermore, embodiments described in any section / subsection may be combined in any way with any other embodiments described in the same section / subsection and / or different sections / subsections.

[0020] In the description of embodiments of this disclosure, the term "comprising" and similar terms should be understood as open-ended inclusion, i.e., "including but not limited to". The term "based on" should be understood as "at least partially based on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The term "some embodiments" should be understood as "at least some embodiments". The terms "first", "second", etc., may refer to different or the same objects. Other explicit and implicit definitions may also be included below.

[0021] The embodiments of this disclosure may involve user data, data acquisition, and / or use. All of these aspects comply with applicable laws, regulations, and relevant provisions. In the embodiments of this disclosure, all data collection, acquisition, processing, manipulation, forwarding, and use are conducted with the user's knowledge and confirmation. Accordingly, in implementing the embodiments of this disclosure, the type, scope of use, and usage scenarios of any data or information that may be involved should be communicated to the user and their authorization obtained in accordance with relevant laws and regulations through appropriate means. The specific methods of notification and / or authorization may vary depending on the actual situation and application scenario, and the scope of this disclosure is not limited in this respect.

[0022] In this specification and the embodiments, any processing of personal information will be carried out only under the premise of legality (such as obtaining the consent of the personal information subject, or being necessary for the performance of a contract), and will only be carried out within the scope stipulated or agreed upon. A user's refusal to process personal information other than that necessary for basic functions will not affect the user's use of basic functions.

[0023] As used herein, the term media content can be any suitable form of content capable of providing information. For example, media content can be news reports in the form of text, images, audio, video, or a combination thereof. Media content can be media content obtained from various platforms (e.g., news platforms). For text-based media content, the desired information can be extracted directly from the text. For media content in the form of images, videos, audio, etc., any known or future-developed technology can be used to extract the desired information from the images, audio, or video. For example, relevant information can be extracted from image, video, or audio formats based on image recognition or speech recognition technologies.

[0024] Figure 1 illustrates a schematic diagram of an example environment 100 in which embodiments of the present disclosure can be implemented. As shown in Figure 1, the example environment 100 may include an electronic device 110 and a server 130.

[0025] In this example environment 100, electronic device 110 may run an application 120 that supports information retrieval services. Application 120 may be any suitable type of application for information retrieval services, examples of which may include, but are not limited to, news applications or other suitable applications. User 140 may interact with application 120 via electronic device 110 and / or its attached devices.

[0026] In environment 100 of Figure 1, if application 120 is active, electronic device 110 can use application 120 to present interface 150 for supporting information query services. User 140 can view media content provided by application 120 based on interface 150.

[0027] In some embodiments, electronic device 110 communicates with server 130 to provide services to application 120. For example, application 120 can obtain, through interface 150, user input indicating an information query request and provide the user input to server 130. Server 130 provides media content corresponding to the user input to application 120, which presents the corresponding media content to the user. Server 130 can provide media content to user 140 based on a pre-determined media content library 160 (e.g., a knowledge graph), or obtain media content corresponding to the user input from publicly available media content sources. Media content library 160 can be implemented or included in server 130. Server 130 can store the collected media content in media content library 160 to provide services to application 120.

[0028] Electronic device 110 can be any type of mobile terminal, fixed terminal, or portable terminal, including mobile phones, desktop computers, laptop computers, notebook computers, netbook computers, tablet computers, media computers, multimedia tablets, handheld computers, portable gaming terminals, VR / AR devices, personal communication system (PCS) devices, personal navigation devices, personal digital assistants (PDAs), audio / video players, digital cameras / camcorders, positioning devices, television receivers, radio receivers, e-book devices, gaming devices, or any combination thereof, including accessories and peripherals of these devices or any combination thereof. In some embodiments, electronic device 110 can also support any type of user-facing interface (such as "wearable" circuitry).

[0029] Server 130 can be a standalone physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks, and big data and artificial intelligence platforms. Server 130 may include, for example, computing systems / servers such as mainframes, edge computing nodes, computing devices in a cloud environment, etc. Server 130 can provide backend services for applications 120 in electronic devices 110 that support information query services.

[0030] A communication connection can be established between server 130 and electronic device 110. This communication connection can be established via wired or wireless means. The communication connection may include, but is not limited to, Bluetooth, mobile network, Universal Serial Bus (USB), and Wireless Fidelity (WiFi) connections; the embodiments of this disclosure are not limited in this respect. In the embodiments of this disclosure, server 130 and electronic device 110 can achieve signaling interaction through the communication connection between them.

[0031] It should be understood that the structure and function of the various elements in environment 100 are described for illustrative purposes only and do not imply any limitation on the scope of this disclosure.

[0032] As mentioned earlier, users expect to easily and quickly access media content that interests them, avoiding the distraction of too much uninteresting content. To this end, some applications offer media content subscription services. Applications can provide users with relevant media content from their library based on the users' subscription information. For example, an application can provide news items corresponding to subscribed keywords, ordered chronologically or by content relevance. However, the number of media items in a library grows rapidly every day, and a large amount of duplicate content exists. In this situation, the media content provided to users may contain duplicates, reducing the efficiency with which users view media content.

[0033] In view of this, embodiments of the present disclosure provide an information processing scheme. In this scheme, for a first media content to be processed, media content matching the first media content is detected from multiple media content associated with a first knowledge graph. The first knowledge graph represents multiple content entities and the relationships between these entities and is determined based on the multiple media content. In response to detecting a first target media content matching the first media content, the attributes of the first target media content in the first knowledge graph are updated. In response to not detecting any media content matching the first media content, the content entity information of the first knowledge graph is updated based on the first media content.

[0034] As will be more clearly understood from the following description, according to the scheme disclosed herein, the first step is to determine whether the first knowledge graph used for storing media content includes media content that matches the first media content to be processed. Only when no media content matches the first media content to be processed is the content entity information of the first knowledge graph updated, associating the first media content with the first knowledge graph. In this way, duplicate media content is avoided among the media content associated with the first knowledge graph. This prevents duplicate media content from being provided to the user and improves the efficiency of the user's media content viewing.
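The two branches described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `KnowledgeGraph` layout, the `occurrences` attribute, and the `is_match` callback are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    # media_id -> attribute dict (hypothetical layout, not from the disclosure)
    media: dict = field(default_factory=dict)
    # names of the content entities known to the graph
    entities: set = field(default_factory=set)


def process(graph, media_id, text, entities, is_match):
    """Update attributes on a match; otherwise add the new content to the graph."""
    for existing_id, attrs in graph.media.items():
        if is_match(text, attrs["text"]):
            attrs["occurrences"] += 1  # attribute-update branch
            return existing_id
    # No match: associate the new media content and its entities with the graph.
    graph.media[media_id] = {"text": text, "occurrences": 1}
    graph.entities.update(entities)
    return media_id


graph = KnowledgeGraph()
exact = lambda a, b: a == b  # stand-in matcher, assumed for the example
first = process(graph, "m1", "Alice wins award", {"Alice"}, exact)
second = process(graph, "m2", "Alice wins award", {"Alice"}, exact)
```

In this sketch the second submission is folded into the existing entry, so only one media content remains associated with the graph.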

[0035] The following section provides a detailed description of various example implementations of this scheme, with reference to the accompanying drawings.

[0036] Figure 2 illustrates a schematic diagram of an example system 200 for information processing according to some embodiments of the present disclosure. As shown in Figure 2, the example system 200 may be implemented or included at server 130. Alternatively, the example system 200 may be implemented collaboratively by server 130 and electronic device 110.

[0037] In some embodiments, server 130 may obtain media content from a media content source (such as a news website or a live streaming platform). For example, server 130 may obtain media content provided by the media content source at fixed time intervals. Server 130 may also obtain updated media content from the media content source when an update is detected. Server 130 then stores the obtained media content in media content library 160.

[0038] In some embodiments, the server can store media content in the form of a knowledge graph. A knowledge graph is a structured semantic knowledge base used to store information about relationships between entities. It organizes knowledge in a graphical way, where each node represents an entity (e.g., person, place, item, etc.), and edges represent relationships between entities (e.g., "belongs to," "located in," etc.). The purpose of knowledge graphs is to enable machines to understand, process, and utilize the knowledge stored within them, thereby supporting applications such as semantic search in search engines, recommendation systems, and natural language processing.

[0039] Figure 3 illustrates an example architecture of an example system 300 for extracting entity information from media content according to some embodiments of the present disclosure. System 300 may be implemented or included in server 130. Multiple media contents to be processed exist in system 300 (e.g., media content in media content library 160 or media content acquired in real time by server 130). As shown in Figure 3, system 300 includes media content 310-1, media content 310-2, and media content 310-3, which may be individually or collectively referred to as media content 310. The number of media contents 310 shown in Figure 3 is merely exemplary and is not intended to be any limitation. For a given media content 310, key information of the media content 310 (e.g., news text, etc.) is first determined, and the key information is processed using language model 330 to obtain one or more content entities included in the media content 310. As shown in Figure 3, the content entities obtained by the language model 330 include a first content entity 360-1, a second content entity 360-2, and a third content entity 360-3, which can be individually or collectively referred to as content entities 360. The number of content entities 360 shown in Figure 3 is merely exemplary and is not intended to be a limitation. For example, to prevent an excessive number of content entities from being processed, during the processing of any media content 310, the language model 330 may first determine the topic of the media content 310. Subsequently, only content entities 360 associated with that topic are extracted. In some embodiments, the language model 330 may extract only content entities 360 of a predetermined type. The predetermined type may be a user-specified type or a popular type in various fields (entity types mentioned multiple times in different media content).

[0040] Furthermore, the language model 330 (e.g., a large language model) can be used to obtain the attributes 340 of content entities (e.g., the frequency of occurrence of content entities, the weight of content entities, etc.) and the relationships 350 between content entities. Subsequently, the server 130 generates a first knowledge graph associated with each media content 310 based on the obtained content entities 360 and the relationships 350 between content entities. In some embodiments, the language model 330 can be used to obtain the content information of the media content. The content information of the media content and the attributes 340 of the content entities can be stored in the media content library 160 in the form of metadata.

[0041] Figure 4 illustrates a schematic diagram of at least a portion of an example first knowledge graph 400 provided according to embodiments of the present disclosure. As shown in Figure 4, the first knowledge graph 400 includes a first content node 410-1, a second content node 410-2, a third content node 410-3, a fourth content node 410-4, and a fifth content node 410-5, which may be individually or collectively referred to as content nodes 410. The number of content nodes 410 shown in Figure 4 is merely exemplary and is not intended to be any limitation. Each content node 410 corresponds to a content entity (e.g., name, event, etc.) in each media content 310. In some embodiments, the content nodes 410 and media content 310 are not in a one-to-one correspondence; a media content 310 may include multiple content entities 360, and therefore a media content 310 may correspond to multiple content nodes 410. For example, media content 310-1 corresponds to the third content node 410-3 and the fourth content node 410-4. Media content 310-2 corresponds to the first content node 410-1. Media content 310-3 corresponds to the second content node 410-2 and the third content node 410-3. In some embodiments, different media content 310 may include the same content entity 360, so multiple media content 310 may correspond to the same content node 410. For example, media content 310-1 and media content 310-3 both correspond to the third content node 410-3.

[0042] Furthermore, the first knowledge graph 400 also includes a first content directed edge 420-1, a second content directed edge 420-2, and a third content directed edge 420-3, which can be individually or collectively referred to as content directed edges 420. The number of content directed edges 420 shown in Figure 4 is merely exemplary and is not intended to be a limitation. Each content directed edge 420 represents a relationship between content nodes 410. For example, the second content directed edge 420-2 indicates that the content entity represented by the first content node 410-1 is included in the content entity represented by the third content node 410-3.
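The node, edge, and media-to-node correspondences described for Figure 4 can be written down as plain data structures. The sketch below is illustrative only: the string identifiers reuse the reference numerals, the relation label `included_in` is a hypothetical name, and only edge 420-2 is encoded because it is the only edge whose endpoints are stated in the text.

```python
from collections import defaultdict

# Media content -> content nodes, as listed in the text for Figure 4.
media_to_nodes = {
    "310-1": {"410-3", "410-4"},
    "310-2": {"410-1"},
    "310-3": {"410-2", "410-3"},
}

# Directed edges as (source node, relation, target node).
# Only edge 420-2 is described in the text; the others are omitted here.
edges = [("410-1", "included_in", "410-3")]

# Invert the mapping to find which media contents share a content node.
node_to_media = defaultdict(set)
for media_id, nodes in media_to_nodes.items():
    for node in nodes:
        node_to_media[node].add(media_id)

shared = sorted(node_to_media["410-3"])
```

Inverting the mapping makes the many-to-many correspondence explicit: `shared` holds the media contents that both refer to the third content node.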

[0043] Referring again to Figure 2, server 130 stores multiple media contents in the form of a knowledge graph to provide services to application 120. If there is unprocessed first media content 210, server 130 stores the first media content 210 in database 220, which is, for example, a database used to store media content details. To ensure that the media content in database 220 meets the requirements of system 200, inspection unit 230 can be used to verify each media content 310 in database 220. For example, inspection unit 230 can perform any appropriate type of inspection on media content 310. After the first media content 210 verified by inspection unit 230 is vectorized, it is stored in vector library 240 in the form of vectorized tags. Subsequently, matching unit 250 detects media content that matches the first media content 210 from the multiple media contents 310 associated with the first knowledge graph 400.

[0044] Vector library 240 includes multiple vectorized representations. Each vectorized representation corresponds to a media content 310 associated with the first knowledge graph 400. In some embodiments, matching unit 250 may perform a matching operation based on the tags stored in vector library 240. For example, the first media content 210 can be compared character by character with each of the multiple media contents 310; if the number of identical characters between the first media content 210 and a given media content 310 is greater than a threshold, it can be determined that the first media content 210 matches that media content 310.
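One possible reading of the character-by-character comparison is a position-wise count of identical characters checked against a threshold. The sketch below assumes that reading; the function names and the threshold values are hypothetical and are not specified by the disclosure.

```python
def count_identical_chars(a: str, b: str) -> int:
    """Count positions at which both strings hold the same character."""
    return sum(1 for x, y in zip(a, b) if x == y)


def is_match(a: str, b: str, threshold: int) -> bool:
    """Treat two texts as matching when the count exceeds the threshold."""
    return count_identical_chars(a, b) > threshold


same_count = count_identical_chars("abcdef", "abcxef")
near_duplicate = is_match("abcdef", "abcxef", threshold=4)
unrelated = is_match("abc", "xyz", threshold=0)
```

A position-wise count is sensitive to insertions and deletions, so a practical system would likely rely on a more robust similarity measure over the vectorized representations; that choice is outside what the text states.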

[0045] In some embodiments, multiple media contents may be stored in the vector library 240 simultaneously. Since the matching unit 250 performs the matching operation based on the vectorized representations in the vector library 240, it cannot detect whether other media contents being stored at the same time duplicate the first media content 210. To solve this problem, the received first media content 210 can first be stored in the vector library 240, and after a predetermined delay, media content matching the first media content 210 is detected from the multiple media contents. In some embodiments, for multiple simultaneously received first media contents 210, a different predetermined delay can be determined for each first media content 210, so that the matching operation for each first media content 210 is performed at a different time.
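The store-first, match-later idea can be sketched as follows. The sketch simulates the delays by ordering rather than using real timers, and it assumes a policy the text does not state: an item that finds a match at its turn is removed from the library. The names `dedupe_batch`, `base_delay`, and `step` are hypothetical.

```python
def dedupe_batch(library, batch, is_match, base_delay=1.0, step=0.5):
    """Store the whole batch first, then match each item at its own delay."""
    library.update(batch)  # every item is visible before any matching runs
    # Give each simultaneously received item a distinct delay.
    delays = {mid: base_delay + i * step for i, mid in enumerate(batch)}
    kept = []
    for mid in sorted(batch, key=delays.get):  # simulated passage of time
        text = library[mid]
        duplicate = any(
            is_match(text, other)
            for other_id, other in library.items()
            if other_id != mid
        )
        if duplicate:
            del library[mid]  # assumed policy: drop the item that found a match
        else:
            kept.append(mid)
    return delays, kept


library = {}
batch = {"a": "same story", "b": "same story", "c": "other story"}
delays, kept = dedupe_batch(library, batch, lambda x, y: x == y)
```

Because each item is checked at a different time, exactly one copy of the duplicated story survives in this sketch.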

[0046] In some embodiments, if a first target media content matching the first media content 210 is detected, the attributes of the first target media content in the first knowledge graph are updated. For example, the attributes of media content in the first knowledge graph 400 may indicate information such as the number of times the media content appears, its influence, or its storage time. In some embodiments, if multiple media contents match the first media content 210, the media content with the highest weight can be selected as the first target media content. The weight of a media content indicates the degree of attention it receives (e.g., reader reach, credibility, professionalism). In some embodiments, the media content with the most citations can be selected as the first target media content. If media content matching the first media content 210 is detected, the attribute update unit 270 updates the attributes of the first target media content based on the first media content 210. For example, the attribute update unit 270 may increase the weight or citation count of the first target media content.
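Selecting the first target media content among several matches and then updating its attributes might look like the sketch below. The field names `weight` and `citations` and the increment value are assumptions made for illustration.

```python
def pick_target(candidates):
    """Choose the matching media content with the highest weight."""
    return max(candidates, key=lambda c: c["weight"])


def update_attributes(target, weight_increment=0.1):
    """Raise the weight and citation count of the chosen target in place."""
    target["weight"] += weight_increment
    target["citations"] += 1


candidates = [
    {"id": "m1", "weight": 0.4, "citations": 3},
    {"id": "m2", "weight": 0.9, "citations": 1},
]
target = pick_target(candidates)
update_attributes(target)
```

Replacing the key function with `lambda c: c["citations"]` would give the alternative selection by citation count that the text also mentions.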

[0047] In some embodiments, if no media content matching the first media content 210 is detected, the content entity information of the first knowledge graph is updated based on the first media content 210. For example, updating the content entity information of the first knowledge graph includes adding new content entities 360 and adding relationships 350 between the content entities in the first knowledge graph.

[0048] In some embodiments, if no media content matching the first media content 210 is detected, the entity update unit 260 can be used to perform entity recognition on the first media content 210 to obtain the identified content entities in the first media content 210. If an identified content entity does not match any of the content entities in the first knowledge graph 400, a first node representing the identified content entity is added to the first knowledge graph 400. Subsequently, a first relationship between the identified content entity and a first content entity represented by the first knowledge graph 400 is determined based on the first media content 210, and a first directed edge representing the first relationship is added to the first knowledge graph 400.

[0049] In some embodiments, the identified content entities may already be represented in the first knowledge graph 400, but the first knowledge graph 400 does not yet represent the relationships between content entities that are indicated by the first media content 210. In this case, the entity update unit 260 can use the first media content 210 to determine a second relationship between the identified content entities and other content entities, and add a second directed edge representing the second relationship to the first knowledge graph 400.
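The two update cases (a new node with its edge, and a new edge between existing nodes) can be sketched with sets of nodes and edge triples. The function name, the triple layout, and the example relations are hypothetical; extracting the relations from the media content is assumed to have happened elsewhere.

```python
def update_entities(nodes, edges, relations):
    """Add unseen entities as nodes and unseen relations as directed edges.

    relations: iterable of (source, relation, target) triples taken from
    the new media content.
    """
    added_nodes, added_edges = [], []
    for src, rel, dst in relations:
        for entity in (src, dst):
            if entity not in nodes:  # entity not yet in the graph
                nodes.add(entity)
                added_nodes.append(entity)
        if (src, rel, dst) not in edges:  # relation not yet in the graph
            edges.add((src, rel, dst))
            added_edges.append((src, rel, dst))
    return added_nodes, added_edges


nodes = {"Alice", "Bob"}
edges = {("Alice", "knows", "Bob")}
relations = [("Carol", "works_with", "Alice"), ("Bob", "knows", "Alice")]
added_nodes, added_edges = update_entities(nodes, edges, relations)
```

The first triple adds a node and an edge; the second adds only an edge, since both of its entities are already present.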

[0050] In this way, the media content associated with the first knowledge graph does not contain duplicates, so that media content push services can be provided to users using the first knowledge graph.

[0051] As shown in Figure 2, system 200 further includes a subscription service unit 280. The subscription service unit 280 is used to obtain a user's subscription instruction and provide the user with media content corresponding to the subscription instruction. In some embodiments, the subscription service unit 280 can determine the media content corresponding to the subscription instruction based on a first knowledge graph. For example, server 130 can provide media content to the user based on the content entity 360 represented by the first knowledge graph. For instance, based on the content entity 360 included in the user's subscription information (or query instruction), media content 310 corresponding to that content entity 360 is provided to the user. In some embodiments, after the first knowledge graph is updated (e.g., the attributes of the media content are updated or the content entity information is updated), the corresponding media content can be provided to the user based on the updated first knowledge graph.
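Resolving a subscription to media content through the content entities can be sketched as a lookup in an entity-to-media index. The index layout and the function name are assumptions for illustration.

```python
def media_for_subscription(entity_to_media, subscribed_entities):
    """Collect media content for the subscribed entities, without repeats."""
    seen, result = set(), []
    for entity in subscribed_entities:
        for media_id in entity_to_media.get(entity, []):
            if media_id not in seen:
                seen.add(media_id)
                result.append(media_id)
    return result


entity_to_media = {
    "Alice": ["310-1", "310-3"],
    "Bob": ["310-3", "310-2"],
}
feed = media_for_subscription(entity_to_media, ["Alice", "Bob", "Carol"])
```

Entities with no associated media content, such as "Carol" here, simply contribute nothing to the result.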

[0052] However, since content entities are directly determined based on media content, they may be inaccurate or incomplete due to limitations in how the media content is described. For example, the same entity may be expressed differently in different media content (e.g., the Chinese name and English name of the same person). Furthermore, the same entity may have different meanings in different application scenarios. For instance, both "My name is Alice" and "Alice Street" include "Alice," but "Alice" in the two sentences cannot be considered the same content entity. Therefore, if media content is provided to users directly based on content entities, the provided media content may not meet the user's needs. In some embodiments, standard entities that are not strongly related to the news and are more objective than content entities can be generated based on the content entities represented by the first knowledge graph. Media content push services can then be provided to users based on a second knowledge graph representing these standard entities.

[0053] Referring to Figure 3, clustering disambiguation unit 370 performs a clustering operation on the multiple content entities represented by the first knowledge graph, based on the attribute information of those content entities, to determine multiple standard entities. The standard entities include a first standard entity 380-1, a second standard entity 380-2, and a third standard entity 380-3, which can be individually or collectively referred to as standard entities 380. The number of standard entities 380 shown in Figure 3 is merely exemplary and not intended to be a limitation. For example, standard entities 380 can be determined using a language model: the name, alias, attributes, and description of each content entity 360 can be provided to the language model, and the standard entity 380 corresponding to each content entity 360 can be determined based on the output of the language model. In some embodiments, standard entities 380 can be determined manually or by importing from an external entity library. Compared to content entities 360, standard entities 380 are not directly related to media content, and there is no overlap between standard entities 380.
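As a rough stand-in for the clustering and disambiguation step, the sketch below merges content entities that share an alias and have the same type, using a union-find structure. This is a deliberately simple substitute for the attribute-based clustering or language-model approach described above; the `type` and `aliases` fields are assumed for the example.

```python
def cluster_entities(entities):
    """Group content entities that share an alias within the same type."""
    parent = list(range(len(entities)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    owner = {}  # (type, alias) -> index of the first entity seen with it
    for i, entity in enumerate(entities):
        for alias in {entity["name"], *entity.get("aliases", [])}:
            key = (entity["type"], alias.casefold())
            if key in owner:
                parent[find(i)] = find(owner[key])
            else:
                owner[key] = i

    clusters = {}
    for i, entity in enumerate(entities):
        clusters.setdefault(find(i), []).append(entity["name"])
    return list(clusters.values())


entities = [
    {"name": "Alice", "type": "person", "aliases": ["爱丽丝"]},
    {"name": "爱丽丝", "type": "person"},
    {"name": "Alice", "type": "place"},  # e.g. the "Alice" in "Alice Street"
]
clusters = cluster_entities(entities)
```

Each resulting cluster would correspond to one standard entity: the two names of the same person are merged, while the place named "Alice" stays separate.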

[0054] Subsequently, for each given standard entity among the multiple standard entities 380, the target content entities corresponding to that standard entity and the relationships between those target content entities are determined. Then, based on the relationships between the target content entities, the relationships between the given standard entity and each of the other standard entities are determined. Based on the standard entities 380 and the relationships between them, a second knowledge graph representing the multiple standard entities 380 and the relationships between them is generated. In this way, media content can be presented to users based on standard entities, thereby improving the accuracy of the presented media content.
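One possible way to derive relationships between standard entities from relationships between content entities is sketched below in Python. The triple representation `(source, relation, destination)`, the mapping `content_to_standard`, and the decision to drop self-loops are assumptions of this example, not features stated in the disclosure.

```python
def build_second_graph(content_edges, content_to_standard):
    """Project directed content-entity edges onto their standard entities."""
    standard_edges = set()
    for src, relation, dst in content_edges:
        s_src = content_to_standard.get(src)
        s_dst = content_to_standard.get(dst)
        # Skip unmapped entities and self-loops created by merging entities.
        if s_src is None or s_dst is None or s_src == s_dst:
            continue
        standard_edges.add((s_src, relation, s_dst))
    return {"nodes": set(content_to_standard.values()), "edges": standard_edges}

# Hypothetical data: c1 and c2 were clustered into standard entity s1.
content_edges = [
    ("c1", "works_with", "c4"),
    ("c2", "works_with", "c4"),
    ("c1", "same_as", "c2"),
]
content_to_standard = {"c1": "s1", "c2": "s1", "c4": "s2"}
second_graph = build_second_graph(content_edges, content_to_standard)
```

Here the two `works_with` edges collapse into a single edge between `s1` and `s2`, and the edge between the two merged content entities disappears.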

[0055] In some embodiments, corresponding media content can be presented to the user based on a query instruction indicated by user input. For example, a query entity corresponding to the received user input can be determined. Subsequently, based on a second knowledge graph, one or more target standard entities matching the query entity are determined from a plurality of standard entities. Based on the one or more target standard entities, one or more second media contents corresponding to the query entity are determined. One or more second media contents are presented as at least part of a response to the user input.

[0056] Figure 5 illustrates a schematic diagram of an example system 500 for presenting media content according to some embodiments of the present disclosure. As shown in Figure 5, user input 510 can be content obtained by application 120 through interface 150. User input 510 indicates the media content the user wants to query (i.e., the user's query intent). User input 510 can be in text or voice form. Upon receiving user input 510, user input entity 512 is first determined. Then, based on contextual information related to user input 510, query entity 520 corresponding to the user's query intent is determined from user input entity 512. Next, target entity 530 corresponding to query entity 520 is determined from the plurality of standard entities represented by second knowledge graph 550. Second media content 540 corresponding to target entity 530 is determined from the media content associated with the first knowledge graph and the second knowledge graph. In some embodiments, user input 510 may include a plurality of query entities 520. For example, if user input 510 is "media content about Alice and Bob", then the query entities include "Alice" and "Bob".
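The matching of query entities against standard entities might be sketched as follows. This Python example assumes that the query entities have already been extracted from the user input, and it matches them by case-insensitive comparison with names and aliases; both the matching rule and the field names are hypothetical.

```python
def find_target_entities(query_entities, standard_entities):
    """Map each query entity to the ids of standard entities whose labels match."""
    targets = {}
    for query in query_entities:
        key = query.casefold()
        targets[query] = [
            s["id"]
            for s in standard_entities
            if key in {label.casefold() for label in (s["name"], *s["aliases"])}
        ]
    return targets

# Hypothetical standard entities represented by the second knowledge graph.
standard_entities = [
    {"id": "s1", "name": "Alice", "aliases": ["Alice L."]},
    {"id": "s2", "name": "Bob", "aliases": []},
]
# Query entities for the input "media content about Alice and Bob".
targets = find_target_entities(["Alice", "Bob"], standard_entities)
```

A query entity that maps to more than one identifier would be a candidate for the disambiguation step, and an empty list indicates that no standard entity matched.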

[0057] In some embodiments, if multiple target entities 530 correspond to a single query entity 520, a language model can be used to disambiguate the multiple target entities 530. Subsequently, based on the output of the language model, the second media content 540 corresponding to the user input 510 is determined.

[0058] As shown in Figure 5, the first knowledge graph 400 further includes a graph index 580 and a media content index 570. The media content index 570 identifies each of the multiple media contents in the first knowledge graph 400 and indicates the position of each media content in the first knowledge graph 400. The graph index 580 identifies each of the multiple content entities in the first knowledge graph 400 and indicates, for each content entity, information about that entity or about its corresponding media content. In some embodiments, the second media content 540 can be obtained based on the graph index 580 and the media content index 570. For example, after the target entity 530 corresponding to the query entity 520 is determined, one or more target content entities corresponding to the target entity 530 in the first knowledge graph 400 are determined based on the graph index 580 and on the correspondence between nodes of the first knowledge graph 400 and nodes of the second knowledge graph 550. Subsequently, the second media content 540 corresponding to the target content entities is determined based on the media content index 570.
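The two-step lookup described above can be illustrated with the following Python sketch. The dictionary layouts chosen for the node correspondence (`standard_to_content`), the graph index, and the media content index are assumptions for illustration; the disclosure does not prescribe a concrete index format.

```python
def retrieve_media(target_standard_ids, standard_to_content, graph_index, media_index):
    """Resolve standard entities to content entities, then to media content."""
    results = []
    for standard_id in target_standard_ids:
        # Node correspondence between the second and first knowledge graphs.
        for content_id in standard_to_content.get(standard_id, []):
            # The graph index records which media content mentions each entity.
            for media_id in graph_index.get(content_id, {}).get("media_ids", []):
                item = media_index.get(media_id)
                if item is not None and item not in results:
                    results.append(item)
    return results

# Hypothetical index contents.
standard_to_content = {"s1": ["c1", "c2"]}
graph_index = {"c1": {"media_ids": ["m1", "m2"]}, "c2": {"media_ids": ["m2"]}}
media_index = {
    "m1": {"id": "m1", "title": "Interview with Alice"},
    "m2": {"id": "m2", "title": "Alice wins award"},
}
second_media = retrieve_media(["s1"], standard_to_content, graph_index, media_index)
```

Because `m2` is reachable through both content entities, the sketch returns it only once.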

[0059] In some embodiments, if the second knowledge graph does not include a standard entity that matches the query entity, the content entity that matches the query entity in the first knowledge graph can be used as the target entity 530, thereby providing the user with the second media content 540 corresponding to the user input 510.
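The fallback from the second knowledge graph to the first knowledge graph might look like the following Python sketch. The lookup tables keyed by normalized name and the returned `(source, ids)` tuple are hypothetical conveniences of this example.

```python
def resolve_target_entities(query_entity, standard_by_name, content_by_name):
    """Prefer standard entities; fall back to content entities if none match."""
    key = query_entity.casefold()
    if key in standard_by_name:
        return ("standard", standard_by_name[key])
    if key in content_by_name:
        return ("content", content_by_name[key])
    return ("none", [])

# Hypothetical lookups: "Carol" has not yet been promoted to a standard entity.
standard_by_name = {"alice": ["s1"]}
content_by_name = {"alice": ["c1", "c2"], "carol": ["c9"]}
```

With these tables, `resolve_target_entities("Carol", standard_by_name, content_by_name)` returns the content entity `c9` as the target entity.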

[0060] In some embodiments, to better provide media content to users, the second knowledge graph 550 can be updated based on new media content. Taking the first media content 210 as an example, the update process of the second knowledge graph 550 is described below. For multiple pieces of first media content 210 to be processed, the multiple content entities included in the multiple pieces of first media content 210 are first determined. Then, based on the relationships between content entities or based on content entity vectors, a clustering operation is performed on the multiple content entities 360 associated with the pieces of first media content 210. Based on the clustering results, a candidate standard entity 560 is determined for each cluster of content entities 360 produced by the clustering operation. It is then determined whether any of the standard entities 380 represented by the second knowledge graph matches the candidate standard entity 560. If no standard entity matches the candidate standard entity 560, a language model is used to perform a normalization and disambiguation operation on the candidate standard entity to generate a new standard entity. Subsequently, nodes representing the newly generated standard entities and directed edges representing the relationships between the newly generated standard entities and other standard entities are added to the second knowledge graph.
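The update of the second knowledge graph can be sketched in Python as shown below. The callable `normalize` stands in for the language-model normalization and disambiguation step and is purely a placeholder; the graph layout and the candidate fields (`name`, `members`, `relations`) are likewise assumptions of this example.

```python
def update_second_graph(second_graph, candidates, normalize):
    """Add a node and directed edges for each candidate with no matching node."""
    added = []
    known = {name.casefold() for name in second_graph["nodes"]}
    for candidate in candidates:
        if candidate["name"].casefold() in known:
            continue  # A matching standard entity already exists.
        new_name = normalize(candidate)  # Placeholder for the language model.
        second_graph["nodes"][new_name] = {"members": list(candidate["members"])}
        for relation, other in candidate.get("relations", []):
            if other in second_graph["nodes"]:
                second_graph["edges"].add((new_name, relation, other))
        known.add(new_name.casefold())
        added.append(new_name)
    return added

# Hypothetical graph and candidates.
second_graph = {"nodes": {"Alice": {"members": ["c1"]}}, "edges": set()}
candidates = [
    {"name": "alice", "members": ["c7"], "relations": []},
    {"name": "bob", "members": ["c8"], "relations": [("knows", "Alice")]},
]
added = update_second_graph(second_graph, candidates, lambda c: c["name"].title())
```

Only the candidate "bob" produces a new node, because "alice" already matches an existing standard entity.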

[0061] In some embodiments, nodes corresponding to some content entities 360 can be removed from the first knowledge graph 400 to better utilize the first knowledge graph 400 to provide media content to users. In some embodiments, the number of second target media contents corresponding to a given content entity can be determined from multiple media contents 310 associated with the first knowledge graph, and content entities 360 whose number of corresponding second target media contents is less than a threshold can be designated as target content entities. Subsequently, nodes corresponding to the target content entities are removed from the first knowledge graph 400. In some embodiments, the target content entities can be determined based on the weight of each content entity 360 corresponding to the second target media content, or they can be determined based on the number of times each content entity 360 is used to provide media content to users (e.g., the number of times it is used to obtain second media content).
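The count-based removal of nodes can be illustrated as follows. In this Python sketch the mapping `media_by_entity` and the removal of edges incident to a removed node are assumptions; the description above only states that the nodes themselves are removed.

```python
def prune_sparse_entities(graph, media_by_entity, threshold):
    """Remove content entities associated with fewer media items than threshold."""
    to_remove = {
        entity
        for entity in graph["nodes"]
        if len(media_by_entity.get(entity, ())) < threshold
    }
    graph["nodes"] -= to_remove
    # Assumption: edges touching a removed node are dropped as well.
    graph["edges"] = {
        (src, rel, dst)
        for (src, rel, dst) in graph["edges"]
        if src not in to_remove and dst not in to_remove
    }
    return to_remove

# Hypothetical first knowledge graph and per-entity media lists.
first_graph = {
    "nodes": {"c1", "c2", "c3"},
    "edges": {("c1", "knows", "c2"), ("c2", "visited", "c3")},
}
media_by_entity = {"c1": ["m1", "m2"], "c2": ["m1", "m3"], "c3": ["m3"]}
removed = prune_sparse_entities(first_graph, media_by_entity, threshold=2)
```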

[0062] Figure 6 illustrates a schematic diagram of an example architecture of an information processing platform 600 according to some embodiments of the present disclosure. As shown in Figure 6, the information processing platform 600 includes a media content acquisition system 610, an information processing system 620, and a media content library 630. The media content acquisition system 610 is used to acquire media content from media content sources (e.g., internet pages, etc.) and provide the acquired media content to the information processing system 620. The information processing system 620 is used to process the acquired media content. For example, the information processing system 620 can delete duplicate media content, store media content, process media content, and verify media content. The media content processed by the information processing system 620 is stored in the media content library 630 in the form of content entities, so that the information processing platform 600 can provide corresponding media content to users through the media content library 630. The modules in the information processing platform 600 communicate with each other through message queues to achieve decoupling between modules. In this way, the reliability of the information processing platform is improved.
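The decoupling of modules through message queues might be sketched as follows, using Python's standard `queue` and `threading` modules. The function names, the `fingerprint` field used for duplicate detection, and the `None` sentinel are assumptions of this example and do not describe the actual implementation of platform 600.

```python
import queue
import threading

def acquisition(out_queue, items):
    """Stand-in for acquisition system 610: publish acquired media content."""
    for item in items:
        out_queue.put(item)
    out_queue.put(None)  # Sentinel marking the end of the stream.

def processing(in_queue, library):
    """Stand-in for processing system 620: drop duplicates, store the rest."""
    seen = set()
    while True:
        item = in_queue.get()
        if item is None:
            break
        if item["fingerprint"] in seen:
            continue
        seen.add(item["fingerprint"])
        library.append(item["id"])

messages = queue.Queue()
library = []  # Stand-in for media content library 630.
worker = threading.Thread(target=processing, args=(messages, library))
worker.start()
acquisition(messages, [
    {"id": "m1", "fingerprint": "a"},
    {"id": "m2", "fingerprint": "b"},
    {"id": "m3", "fingerprint": "a"},  # Duplicate of m1.
])
worker.join()
```

Because the two sides share only the queue, either one can be replaced or restarted without changes to the other.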

[0063] Therefore, the solution disclosed in this application updates the content entity information of the first knowledge graph only when no matching media content exists, thus associating the first media content with the first knowledge graph. This ensures that there is no duplicate media content among the media content associated with the first knowledge graph. This prevents the provision of duplicate media content to users and improves the efficiency of users viewing media content. Furthermore, based on the first knowledge graph, a second knowledge graph representing standard entities is generated. Compared to content entities, standard entities are of higher quality and more objective. Media content is provided to users based on the second knowledge graph, thereby providing users with more accurate, comprehensive, and user-relevant media content.

[0064] Figure 7 shows a flowchart of an example process 700 for information processing according to some embodiments of the present disclosure. Process 700 may be implemented at server 130. Process 700 will now be described with reference to Figure 1.

[0065] As shown in Figure 7, at block 710, for first media content to be processed, server 130 detects, from multiple pieces of media content associated with the first knowledge graph, media content that matches the first media content. The first knowledge graph represents multiple content entities and the relationships between the multiple content entities, and is determined based on the multiple pieces of media content.

[0066] In some embodiments, detecting media content matching the first media content from a plurality of media content associated with the first knowledge graph includes: storing the first media content in response to receiving the first media content; and detecting media content matching the first media content from the plurality of media content in response to a predetermined delay period elapsing after the first media content is received.
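A minimal Python sketch of storing media content on receipt and deferring the matching step is given below. The class name `DelayedMatcher`, the heap of due times, and the explicit `now` argument (used here instead of a real clock so that the behavior is easy to follow) are all assumptions of this example.

```python
import heapq

class DelayedMatcher:
    """Store media content immediately and release it for matching later."""

    def __init__(self, delay_seconds):
        self.delay = delay_seconds
        self.store = {}  # media id -> stored content
        self._due = []   # min-heap of (due time, media id)

    def receive(self, media_id, content, now):
        # Storing happens in response to receipt.
        self.store[media_id] = content
        heapq.heappush(self._due, (now + self.delay, media_id))

    def pop_due(self, now):
        # Return the ids whose predetermined delay period has elapsed.
        ready = []
        while self._due and self._due[0][0] <= now:
            ready.append(heapq.heappop(self._due)[1])
        return ready

matcher = DelayedMatcher(delay_seconds=60)
matcher.receive("m1", "first article", now=0)
matcher.receive("m2", "second article", now=30)
```

Matching would then be run only on the identifiers returned by `pop_due`, for example `matcher.pop_due(now=60)` for the first item.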

[0067] At block 720, server 130 updates the attributes of the first target media content in the first knowledge graph in response to detecting first target media content that matches the first media content.

[0068] In some embodiments, updating the attributes of the first target media content in the first knowledge graph includes: in response to detecting the first target media content, updating the weight of the first target media content in the first knowledge graph, wherein the weight indicates the degree of attention the first target media content receives.

[0069] At block 730, server 130 updates the content entity information of the first knowledge graph based on the first media content in response to detecting no media content matching the first media content.
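The branching of blocks 710 to 730 can be sketched in Python as follows. Matching by a SHA-256 fingerprint of whitespace-normalized, case-folded text is an assumption chosen for simplicity, since the passage above does not fix a matching technique; incrementing the weight by one per duplicate and the graph layout are also hypothetical.

```python
import hashlib

def fingerprint(text):
    """Hash of case-folded, whitespace-normalized text (assumed matching rule)."""
    normalized = " ".join(text.casefold().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def process_media(graph, media_id, text, entities):
    fp = fingerprint(text)
    match = graph["media_by_fp"].get(fp)       # Block 710: detect a match.
    if match is not None:
        graph["media"][match]["weight"] += 1   # Block 720: update an attribute.
        return ("updated_weight", match)
    # Block 730: no match, so associate the new content and its entities.
    graph["media_by_fp"][fp] = media_id
    graph["media"][media_id] = {"weight": 1, "entities": list(entities)}
    graph["entities"].update(entities)
    return ("added", media_id)

graph = {"media_by_fp": {}, "media": {}, "entities": set()}
first = process_media(graph, "m1", "Alice wins award", ["Alice"])
second = process_media(graph, "m2", "alice  wins  AWARD", ["Alice"])
```

The second call is treated as a duplicate of `m1`, so no second media item is associated with the graph and only the weight of `m1` increases.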

[0070] In some embodiments, updating the content entity information of the first knowledge graph based on the first media content includes: for a content entity identified in the first media content, detecting, from the plurality of content entities, a content entity that matches the identified content entity; in response to detecting no content entity matching the identified content entity, adding a first node representing the identified content entity to the first knowledge graph; determining, based on the first media content, a first relationship between the identified content entity and a first content entity represented by the first knowledge graph; and adding a first directed edge representing the first relationship to the first knowledge graph.

[0071] In some embodiments, process 700 further includes: in response to detecting a target content entity that matches the identified content entity, determining, based on the first media content, a second relationship between the target content entity and a second content entity represented by the first knowledge graph; and adding a second directed edge representing the second relationship to the first knowledge graph.
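The two cases above, an identified content entity with no match and one with a match, can be handled by a single routine, as in the following Python sketch. Exact-name matching, the triple form of directed edges, and the assumption that relationships have already been extracted from the first media content are simplifications made for this example.

```python
def merge_identified_entity(graph, entity, relations):
    """Add a node if the entity is new, then add its directed edges."""
    is_new = entity not in graph["nodes"]
    if is_new:
        graph["nodes"].add(entity)  # First node for an unmatched entity.
    for relation, other in relations:
        # Only link to entities the first knowledge graph already represents.
        if other in graph["nodes"] and other != entity:
            graph["edges"].add((entity, relation, other))
    return is_new

# Hypothetical first knowledge graph and extracted relationships.
kg = {"nodes": {"Alice"}, "edges": set()}
bob_is_new = merge_identified_entity(kg, "Bob", [("colleague_of", "Alice")])
alice_is_new = merge_identified_entity(kg, "Alice", [("mentor_of", "Bob")])
```

The first call adds a new node and a first directed edge; the second call finds an existing node and adds only a second directed edge.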

[0072] In some embodiments, process 700 further includes: performing a clustering operation on the multiple content entities represented by the first knowledge graph based on the attribute information of the multiple content entities to determine multiple standard entities; and generating a second knowledge graph representing the multiple standard entities and the relationships between the multiple standard entities.

[0073] In some embodiments, process 700 further includes: determining, based on received user input, a query entity corresponding to the user input; determining, based on the second knowledge graph, one or more target standard entities matching the query entity from the plurality of standard entities; determining one or more second media contents corresponding to the query entity based on the one or more target standard entities; and presenting the one or more second media contents as at least part of a response to the user input.

[0074] In some embodiments, the first knowledge graph further includes a graph index and a media content index, wherein the media content index is used to identify multiple media contents in the first knowledge graph, the graph index is used to identify multiple content entities in the first knowledge graph, and wherein determining one or more second media contents includes: obtaining one or more second media contents determined for one or more target standard entities based on the graph index and the media content index.

[0075] In some embodiments, process 700 further includes: for a given content entity represented by a first knowledge graph, determining the number of second target media contents corresponding to the given content entity from a plurality of media contents associated with the first knowledge graph; and in response to the number of second target media contents being less than a number threshold, removing nodes representing the given content entity from the first knowledge graph.

[0076] Embodiments of this disclosure also provide corresponding apparatus for implementing the methods or processes described above. Figure 8 shows a schematic structural block diagram of an example apparatus 800 for information processing according to certain embodiments of this disclosure. Apparatus 800 may be implemented as or included in server 130. The various modules / components in apparatus 800 may be implemented by hardware, software, firmware, or any combination thereof.

[0077] As shown in Figure 8, the apparatus 800 includes a detection module 810 configured to detect, for first media content to be processed, media content matching the first media content from multiple media contents associated with a first knowledge graph. The first knowledge graph represents multiple content entities and the relationships between them, and is determined based on the multiple media contents. The apparatus 800 also includes a first update module 820 configured to update the attributes of first target media content in the first knowledge graph in response to detecting the first target media content matching the first media content. The apparatus 800 also includes a second update module configured to update the content entity information of the first knowledge graph based on the first media content in response to not detecting any media content matching the first media content.

[0078] In some embodiments, the detection module 810 is further configured to: store the first media content in response to receiving the first media content; and detect media content that matches the first media content from the plurality of media contents in response to a predetermined delay period elapsing after the first media content is received.

[0079] In some embodiments, the first update module 820 is further configured to update the weight of the first target media content in the first knowledge graph in response to detecting the first target media content, wherein the weight indicates the degree of attention the first target media content receives.

[0080] In some embodiments, the second update module is further configured to: for a content entity identified in the first media content, detect, from the plurality of content entities, a content entity that matches the identified content entity; in response to detecting no content entity matching the identified content entity, add a first node representing the identified content entity to the first knowledge graph; determine, based on the first media content, a first relationship between the identified content entity and a first content entity represented by the first knowledge graph; and add a first directed edge representing the first relationship to the first knowledge graph.

[0081] In some embodiments, the apparatus 800 further includes a second relationship generation module configured to: in response to detecting a target content entity that matches the identified content entity, determine, based on the first media content, a second relationship between the target content entity and a second content entity represented by the first knowledge graph; and add a second directed edge representing the second relationship to the first knowledge graph.

[0082] In some embodiments, the apparatus 800 further includes a clustering module configured to perform a clustering operation on the multiple content entities represented by the first knowledge graph, based on the attribute information of the multiple content entities, to determine multiple standard entities; and to generate a second knowledge graph representing the multiple standard entities and the relationships between the multiple standard entities.

[0083] In some embodiments, the apparatus 800 further includes a query module configured to: determine a query entity corresponding to a received user input; determine one or more target standard entities matching the query entity from a plurality of standard entities based on a second knowledge graph; determine one or more second media contents corresponding to the query entity based on the one or more target standard entities; and present one or more second media contents as at least part of a response to the user input.

[0084] In some embodiments, the first knowledge graph further includes a graph index and a media content index. The media content index is used to identify multiple media contents in the first knowledge graph, and the graph index is used to identify multiple content entities in the first knowledge graph. The query module is also configured to obtain one or more second media contents determined for one or more target standard entities based on the graph index and the media content index.

[0085] In some embodiments, the apparatus 800 further includes a removal module configured to, for a given content entity represented by a first knowledge graph, determine the number of second target media contents corresponding to the given content entity from a plurality of media contents associated with the first knowledge graph; and, in response to the number of second target media contents being less than a number threshold, remove nodes representing the given content entity from the first knowledge graph.

[0086] As shown in Figure 9, the electronic device 900 is in the form of a general-purpose electronic device. Components of the electronic device 900 may include, but are not limited to, one or more processors or processing units 910, a memory 920, a storage device 930, one or more communication units 940, one or more input devices 950, and one or more output devices 960. The processing unit 910 may be a physical or virtual processor and is capable of performing various processes according to the program stored in the memory 920. In a multiprocessor system, multiple processing units execute computer-executable instructions in parallel to improve the parallel processing capability of the electronic device 900.

[0087] Electronic device 900 typically includes multiple computer storage media. Such media can be any available media accessible to electronic device 900, including but not limited to volatile and non-volatile media, and removable and non-removable media. Memory 920 can be volatile memory (e.g., registers, cache, random access memory (RAM)), non-volatile memory (e.g., read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory), or some combination thereof. Storage device 930 can be removable or non-removable media and can include machine-readable media, such as flash drives, disks, or any other media that can be used to store information and / or data and can be accessed within electronic device 900.

[0088] Electronic device 900 may further include additional removable / non-removable, volatile / non-volatile storage media. Although not shown in Figure 9, disk drives for reading from or writing to removable, non-volatile disks (e.g., "floppy disks") and optical disk drives for reading from or writing to removable, non-volatile optical disks may be provided. In these cases, each drive may be connected to a bus (not shown) via one or more data media interfaces. Memory 920 may include computer program product 925 having one or more program modules configured to perform various methods or actions of various embodiments of the present disclosure.

[0089] The communication unit 940 enables communication with other electronic devices via a communication medium. Additionally, the functionality of the components of the electronic device 900 can be implemented using a single computing cluster or multiple computing machines capable of communicating via communication connections. Therefore, the electronic device 900 can operate in a networked environment using logical connections to one or more other servers, network personal computers (PCs), or another network node.

[0090] Input device 950 can be one or more input devices, such as a mouse, keyboard, trackball, etc. Output device 960 can be one or more output devices, such as a monitor, speaker, printer, etc. As needed, electronic device 900 can also communicate via communication unit 940 with one or more external devices (not shown), such as storage devices and display devices, with one or more devices that enable user interaction with electronic device 900, or with any device (e.g., a network card, a modem, etc.) that enables electronic device 900 to communicate with one or more other electronic devices. Such communication can be performed via an input / output (I / O) interface (not shown).

[0091] According to an exemplary implementation of this disclosure, a computer-readable storage medium is provided that stores computer-executable instructions thereon, wherein the computer-executable instructions are executed by a processor to implement the methods described above. According to an exemplary implementation of this disclosure, a computer program product is also provided, which is tangibly stored on a non-transitory computer-readable medium and includes computer-executable instructions, which are executed by a processor to implement the methods described above.

[0092] Various aspects of this disclosure are described herein with reference to flowchart illustrations and / or block diagrams of methods, apparatuses, devices, and computer program products implemented according to this disclosure. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer-readable program instructions.

[0093] These computer-readable program instructions can be provided to a processing unit of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine such that, when executed by the processing unit of the computer or other programmable data processing apparatus, they create means for implementing the functions / actions specified in one or more blocks of the flowchart and / or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium that causes a computer, programmable data processing apparatus, and / or other device to operate in a particular manner. Thus, the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing aspects of the functions / actions specified in one or more blocks of the flowchart and / or block diagram.

[0094] Computer-readable program instructions can be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions that execute on the computer, other programmable data processing apparatus, or other device implement the functions / actions specified in one or more blocks of the flowchart and / or block diagram.

[0095] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of an instruction, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.

[0096] Various implementations of this disclosure have been described above. These descriptions are exemplary rather than exhaustive, and are not limited to the disclosed implementations. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described implementations. The terminology used herein is chosen to best explain the principles of the implementations, their practical applications, or their improvements over technology in the market, or to enable others skilled in the art to understand the various implementations disclosed herein.

Claims

1. An information processing method, comprising: for first media content to be processed, detecting, from a plurality of media contents associated with a first knowledge graph, media content matching the first media content, wherein the first knowledge graph represents a plurality of content entities and relationships between the plurality of content entities, and is determined based on the plurality of media contents; in response to detecting first target media content matching the first media content, updating attributes of the first target media content in the first knowledge graph; and in response to detecting no media content matching the first media content, updating content entity information of the first knowledge graph based on the first media content.

2. The method according to claim 1, wherein updating the content entity information of the first knowledge graph based on the first media content comprises: for a content entity identified in the first media content, detecting, from the plurality of content entities, a content entity matching the identified content entity; in response to detecting no content entity matching the identified content entity, adding a first node representing the identified content entity to the first knowledge graph; determining, based on the first media content, a first relationship between the identified content entity and a first content entity represented by the first knowledge graph; and adding a first directed edge representing the first relationship to the first knowledge graph.

3. The method according to claim 2, further comprising: in response to detecting a target content entity matching the identified content entity, determining, based on the first media content, a second relationship between the target content entity and a second content entity represented by the first knowledge graph; and adding a second directed edge representing the second relationship to the first knowledge graph.

4. The method according to claim 1, wherein detecting media content matching the first media content from the plurality of media contents associated with the first knowledge graph comprises: in response to receiving the first media content, storing the first media content; and in response to a predetermined delay period elapsing after the first media content is received, detecting, from the plurality of media contents, media content matching the first media content.

5. The method according to claim 1, further comprising: for the plurality of content entities represented by the first knowledge graph, performing a clustering operation on the plurality of content entities based on attribute information of the plurality of content entities to determine a plurality of standard entities; and generating a second knowledge graph representing the plurality of standard entities and relationships between the plurality of standard entities.

6. The method according to claim 5, further comprising: determining, based on received user input, a query entity corresponding to the user input; determining, based on the second knowledge graph, one or more target standard entities matching the query entity from the plurality of standard entities; determining, based on the one or more target standard entities, one or more second media contents corresponding to the query entity; and presenting the one or more second media contents as at least part of a response to the user input.

7. The method according to claim 6, wherein the first knowledge graph further includes a graph index and a media content index, the media content index being used to identify each of the plurality of media contents in the first knowledge graph, and the graph index being used to identify each of the plurality of content entities in the first knowledge graph, and wherein determining the one or more second media contents comprises: obtaining, based on the graph index and the media content index, the one or more second media contents determined for each of the one or more target standard entities.

8. The method according to claim 1, further comprising: for a given content entity represented by the first knowledge graph, determining, from the plurality of media contents associated with the first knowledge graph, a number of second target media contents corresponding to the given content entity; and in response to the number of second target media contents being less than a number threshold, removing a node representing the given content entity from the first knowledge graph.

9. The method according to claim 1, wherein updating the attributes of the first target media content in the first knowledge graph comprises: in response to detecting the first target media content, updating a weight of the first target media content in the first knowledge graph, the weight indicating a degree of attention that the first target media content receives.

10. An apparatus for information processing, comprising: a detection module configured to detect, for first media content to be processed, media content matching the first media content from a plurality of media contents associated with a first knowledge graph, wherein the first knowledge graph represents a plurality of content entities and relationships between the plurality of content entities, and is determined based on the plurality of media contents; a first update module configured to update attributes of first target media content in the first knowledge graph in response to detecting the first target media content matching the first media content; and a second update module configured to update content entity information of the first knowledge graph based on the first media content in response to detecting no media content matching the first media content.

11. An electronic device, comprising: at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit, the instructions, when executed by the at least one processing unit, causing the electronic device to perform the method according to any one of claims 1 to 9.

12. A computer-readable storage medium having a computer program stored thereon, the computer program being executable by a processor to implement the method according to any one of claims 1 to 9.