Time-aware intelligent data processing method, apparatus, device, and storage medium

By constructing a dynamic knowledge graph and using multi-agent systems for collaborative processing, the problem of temporal conflicts in dynamic data processing in traditional RAG systems is solved, enabling real-time information updates and accurate responses.

CN122309527A · Pending · Publication Date: 2026-06-30 · SHENZHEN COOCAA NETWORK TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SHENZHEN COOCAA NETWORK TECH CO LTD
Filing Date
2026-02-28
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Traditional RAG systems cannot automatically identify and handle timing conflicts when processing dynamically updated data, resulting in the generation of misleading hallucination responses and affecting the timeliness and accuracy of the output information.

Method used

The time frame identifier of each document is acquired and the document is semantically segmented; factual data and timestamps are extracted; a dynamic knowledge graph is constructed; and a multi-agent system performs collaborative processing, with temporal conflicts detected and the knowledge base dynamically updated to keep information timely and accurate.

Benefits of technology

It significantly improves the response timeliness and accuracy of the RAG system, solves the problems of entity redundancy and fact conflict, and provides deeper query support.

✦ Generated by Eureka AI based on patent content.

Figure CN122309527A_ABST

Abstract

This application relates to the field of big data processing technology and discloses a time-aware intelligent data processing method, apparatus, device, and storage medium. It addresses the problem of misleading hallucinatory responses in traditional RAG systems. The method includes: acquiring a document to be processed and extracting the time frame identifier corresponding to the document; semantically segmenting the document to be processed to generate text units carrying time frame identifiers; constructing associated data containing factual data, timestamps, and related entities based on the text units; performing time-series conflict detection based on the timestamps corresponding to each factual data in the associated data, and determining the effective status of each factual data in a dynamic knowledge graph based on the detection results; constructing a dynamic knowledge graph based on the factual data, its corresponding effective status, and timestamps; and generating a query response by coordinating processing based on the dynamic knowledge graph through a multi-agent system composed of multiple functional agents in response to a target query.

Description

Technical Field

[0001] This application relates to the field of big data processing technology, and in particular to a time-aware intelligent data processing method, apparatus, device and storage medium.

Background Technology

[0002] With the rapid development of artificial intelligence technology, retrieval-augmented generation (RAG) systems have been widely used in fields such as intelligent question answering, decision support, and knowledge management. Traditional RAG systems typically rely on static knowledge bases: they generate responses by retrieving relevant facts from a pre-built knowledge base and combining them with a large language model.

[0003] However, existing RAG systems struggle to handle information conflicts when processing dynamically updated data such as financial reports, technical documents, or policy guidelines. When newly entered data contradicts older information in the knowledge base (e.g., annual updates to company financial statements or iterations of technical parameters), the system cannot automatically identify and resolve these temporal conflicts. As a result, the system often relies on outdated historical data for retrieval and reasoning when responding to queries, producing misleading hallucinated responses that severely impair the timeliness and accuracy of the output information.

Summary of the Invention

[0004] This application provides a time-aware intelligent data processing method, apparatus, device, and storage medium to solve the technical problem that traditional RAG systems generate misleading hallucinated responses, which seriously impairs the timeliness and accuracy of the output information.

[0005] A time-aware intelligent data processing method, the method comprising: obtaining a document to be processed and extracting the time frame identifier corresponding to the document to be processed; performing semantic segmentation on the document to be processed to generate text units carrying the time frame identifier; performing fact extraction and entity parsing based on the text units to obtain fact data, the timestamp corresponding to the fact data, and the associated entity, and constructing associated data containing the fact data, the timestamp, and the associated entity; performing time-series conflict detection based on the timestamps corresponding to each fact data in the associated data, and determining the effective status of each fact data in the dynamic knowledge graph based on the detection results; constructing the dynamic knowledge graph based on the fact data, its corresponding effective status, and the timestamp; and, in response to a target query, generating a query response through collaborative processing based on the dynamic knowledge graph by a multi-agent system composed of multiple functional agents.

[0006] In one implementation, the step of performing time-series conflict detection based on the timestamps corresponding to each fact data in the associated data, and determining the effective status of each fact data in the dynamic knowledge graph based on the detection results, includes: grouping the associated data by associated entity, and sorting the fact data within the same group by timestamp; detecting whether new fact data conflicts with existing fact data; and, if a conflict exists, marking the existing fact data as inactive and the new fact data as active.

[0007] In one implementation, constructing the dynamic knowledge graph based on the factual data, its corresponding effective status, and the timestamp includes: assembling the fact data in the active state into the dynamic knowledge graph, wherein the dynamic knowledge graph includes nodes representing entities and edges representing relationships between entities; and recording the corresponding effective status and the timestamp on the node or the edge.

[0008] In one implementation, the collaborative processing based on the dynamic knowledge graph by a multi-agent system composed of multiple functional agents includes: using a fact agent to retrieve active fact data related to the target query from the dynamic knowledge graph; using an analysis agent to combine the entity's current fact data with historical timeline data constructed from the timestamps, perform inference, and generate analysis results; and using a time-series agent to retrieve, from the dynamic knowledge graph, entity fact data within the date range specified in the target query.

[0009] In one implementation, the collaborative processing based on the dynamic knowledge graph by a multi-agent system composed of multiple functional agents further includes: receiving the target query through an agent coordinator, and distributing the target query to the fact agent, the analysis agent, or the time-series agent according to the query type; and, for analytical queries, having the agent coordinator obtain the retrieval results from the fact agent and provide them to the analysis agent.

[0010] In one implementation, the step of performing fact extraction and entity parsing based on the text unit to obtain fact data, the timestamp corresponding to the fact data, and the associated entity, and constructing associated data containing the fact data, the timestamp, and the associated entity, includes: Atomic fact data is extracted from the text unit as the fact data, and the corresponding timestamp and associated entity are extracted; the extracted associated entities are normalized to merge duplicate entities and generate the associated data.

[0011] In one implementation, after generating the query response, the method further includes: monitoring updates to the data source, and triggering semantic segmentation, fact extraction, entity parsing, and time sequence conflict detection for the updated data, so as to update the effective status in the dynamic knowledge graph in real time.

[0012] A time-aware intelligent data processing device, the device comprising: an acquisition module, used to acquire the document to be processed and extract the time frame identifier corresponding to the document to be processed; a segmentation module, used to semantically segment the document to be processed and generate text units carrying the time frame identifier; a fact extraction and entity parsing module, used to extract facts and parse entities based on the text units to obtain fact data, timestamps corresponding to the fact data, and associated entities, and to construct associated data containing the fact data, the timestamps, and the associated entities; a temporal conflict detection module, used to perform temporal conflict detection based on the timestamps corresponding to each fact data in the associated data, and to determine the effective status of each fact data in the dynamic knowledge graph based on the detection results; a knowledge graph construction module, used to construct the dynamic knowledge graph based on the fact data, its corresponding effective status, and the timestamp; and a query response module, used to respond to the target query by generating a query response through collaborative processing based on the dynamic knowledge graph by a multi-agent system composed of multiple functional agents.

[0013] A computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, performs the steps of the method as described in any of the preceding claims.

[0014] A computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the method as described in any of the preceding claims.

[0015] In the solutions provided above, this application realizes automatic processing of dynamic data and real-time updating of the knowledge base. Its technical principle is to use a time-series conflict detection mechanism to dynamically manage the state of atomic facts and combine it with multi-agent collaborative retrieval, which significantly improves the timeliness and accuracy of the responses generated by the RAG system, effectively solves the problems of entity redundancy and fact conflict in traditional systems, and provides more in-depth and accurate support for query needs of different dimensions.

Description of the Drawings

[0016] To more clearly illustrate the technical solutions of the embodiments of this application, the drawings used in the description of the embodiments of this application will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0017] Figure 1 is a schematic diagram of an application environment of a time-aware intelligent data processing method according to an embodiment of this application; Figure 2 is a flowchart of a time-aware intelligent data processing method according to an embodiment of this application; Figure 3 is a schematic block diagram of a time-aware intelligent data processing device according to an embodiment of this application; Figure 4 is a schematic diagram of the structure of a computer device according to an embodiment of this application.

Detailed Implementation

[0018] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0019] The time-aware intelligent data processing method provided in this embodiment can be applied in an application environment such as that shown in Figure 1, in which the terminal device communicates with the server via a network. The server obtains the document to be processed and extracts the time frame identifier corresponding to the document; it performs semantic segmentation on the document to be processed to generate text units carrying the time frame identifier; based on the text units, it performs fact extraction and entity parsing to obtain fact data, the timestamp corresponding to the fact data, and associated entities, and constructs associated data containing the fact data, the timestamp, and the associated entities. Based on the timestamps corresponding to each fact data in the associated data, the server performs time-series conflict detection and determines the effective status of each fact data in the dynamic knowledge graph based on the detection results; it constructs the dynamic knowledge graph based on the fact data, its corresponding effective status, and the timestamps; and, in response to a target query, a multi-agent system composed of multiple functional agents performs collaborative processing based on the dynamic knowledge graph to generate a query response.

[0020] The terminal devices can be, but are not limited to, various personal computers, laptops, smartphones, tablets, and portable wearable devices. Servers can be implemented using standalone servers or server clusters consisting of multiple servers.

[0021] In one embodiment, this application provides a time-aware intelligent data processing method, mainly applied to dynamic knowledge base management in retrieval-augmented generation (RAG) systems. By introducing a time dimension and a multi-agent collaboration mechanism, it aims to solve the pain points of traditional RAG systems when processing dynamically updated documents, such as information lag, entity conflicts, and single-dimensional retrieval responses.

[0022] In one embodiment, as shown in Figure 2, the method specifically includes the following processing steps: S101: Obtain the document to be processed and extract the time frame identifier corresponding to the document to be processed.

[0023] The document to be processed refers to textual material containing dynamic information about a specific industry, field, or company, such as financial reports, technology evolution documents, or medical diagnostic guidelines. A time frame identifier is a feature label that characterizes the document's position in time. In actual processing, the system can load a dynamic dataset and use preprocessing techniques such as regular expressions to extract quarter and year information from the document as its time frame identifier. For example, the extracted identifier might be "2023-Q3," providing a basic temporal coordinate for all subsequent fact extraction and knowledge evolution. By statistically analyzing metadata such as the number of documents, company distribution, and vocabulary size, the system can establish a macro-level understanding of the dataset to be processed, ensuring that the subsequent chunking logic is suitably adapted.
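The regular-expression extraction described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the two patterns, the function name `extract_time_frame`, and the accepted spellings ("Q3 2023", "Q3 of 2023", "2023-Q3") are assumptions chosen for the example.

```python
import re
from typing import Optional

# Hypothetical patterns: a quarter label (Q1-Q4) next to a four-digit year,
# in either order. Real documents may need additional formats.
_YEAR_QUARTER = re.compile(r"\b((?:19|20)\d{2})[\s\-_/]*Q([1-4])\b", re.IGNORECASE)
_QUARTER_YEAR = re.compile(r"\bQ([1-4])\s*(?:of\s+)?((?:19|20)\d{2})\b", re.IGNORECASE)


def extract_time_frame(text: str) -> Optional[str]:
    """Return an identifier such as '2023-Q3', or None if nothing matches."""
    match = _YEAR_QUARTER.search(text)
    if match:
        return f"{match.group(1)}-Q{match.group(2)}"
    match = _QUARTER_YEAR.search(text)
    if match:
        return f"{match.group(2)}-Q{match.group(1)}"
    return None


print(extract_time_frame("Quarterly report for Q3 2023"))
```

Returning `None` rather than guessing keeps documents without a recognizable quarter and year visible to whatever fallback the pipeline uses.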

[0024] In one example implementation, while acquiring the documents to be processed and extracting time frame identifiers, the system also performs multi-dimensional preprocessing and statistical analysis of the dynamic data. Specifically, the system computes statistics over the dataset, including the total number of documents, the number of unique companies, the distribution of documents across companies, and the average vocabulary size per document. These statistics give the system a macro-level profile of the data to be processed, and provide a reasonable basis for choosing the parameters of the semantic segmentation strategy in subsequent steps according to the density or length of the documents.
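The dataset statistics listed above can be sketched in a few lines. The record layout (dictionaries with `company` and `text` keys) is hypothetical, and "vocabulary size" is interpreted here as a whitespace-separated word count, which is an assumption; a distinct-word count would be an equally plausible reading.

```python
from collections import Counter
from statistics import mean


def dataset_profile(docs):
    """Summarize a list of dicts with hypothetical keys 'company' and 'text'."""
    per_company = Counter(doc["company"] for doc in docs)
    # Assumption: "vocabulary size" is approximated by the word count.
    word_counts = [len(doc["text"].split()) for doc in docs]
    return {
        "total_documents": len(docs),
        "unique_companies": len(per_company),
        "documents_per_company": dict(per_company),
        "avg_words_per_document": mean(word_counts) if word_counts else 0.0,
    }
```

A profile like this could then inform chunking parameters, for example by choosing a different percentile cut-off for very long documents.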

[0025] S102: Semantically segment the document to be processed to generate text units carrying the time frame identifier.

[0026] Semantic chunking means moving away from traditional physical segmentation based on character length or simple punctuation, and instead dividing long documents into contextually meaningful text segments according to semantic logic. As an example, this embodiment employs a percentile-based chunking method: by analyzing the distribution of semantic relevance within the text, it dynamically determines the segmentation boundaries and thereby obtains semantically coherent text units. Each text unit contains not only the chunk content but also the original document's metadata, such as company name, date, and quarter, which includes the aforementioned time frame identifier. This approach reduces the fragmentation of factual information caused by physical chunking and preserves a complete context for subsequent fact extraction.
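One way to read the percentile-based method is sketched below: measure the semantic distance between each pair of adjacent sentences and start a new chunk wherever the distance exceeds a chosen percentile of all distances. The sketch substitutes a bag-of-words cosine distance for a real embedding model so that it stays self-contained; that substitution, the function names, and the default of the 90th percentile are all assumptions.

```python
import math
import re
from collections import Counter


def _cosine_distance(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - (dot / norm if norm else 0.0)


def _percentile(values, pct):
    """Linear-interpolation percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def semantic_chunks(text, metadata, pct=90.0):
    """Split text at large semantic jumps; every chunk carries the metadata."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if len(sentences) < 2:
        return [{"text": s, **metadata} for s in sentences]
    # Stand-in for sentence embeddings: simple word-count vectors.
    vectors = [Counter(re.findall(r"\w+", s.lower())) for s in sentences]
    gaps = [_cosine_distance(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]
    threshold = _percentile(gaps, pct)
    chunks, current = [], [sentences[0]]
    for sentence, gap in zip(sentences[1:], gaps):
        if gap > threshold:  # jump larger than the percentile cut-off: new chunk
            chunks.append(current)
            current = []
        current.append(sentence)
    chunks.append(current)
    return [{"text": " ".join(chunk), **metadata} for chunk in chunks]
```

Because the threshold is derived from the document's own distance distribution, the same percentile yields more or fewer boundaries depending on how varied the text is.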

[0027] S103: Based on the text unit, perform fact extraction and entity parsing to obtain fact data, the timestamp corresponding to the fact data, and the associated entity, and construct associated data containing the fact data, the timestamp, and the associated entity.

[0028] Fact data refers to indivisible objective statements extracted from text, i.e., atomic facts. The system uses a large language model combined with structured data models such as Pydantic to automatically parse atomic facts, their specific timestamps, and the associated entities from each text unit. Entity parsing is executed by the thinking engine unit, which first extracts and classifies entities from each statement in the fact list and then uses an entity normalization algorithm to identify and merge duplicate entities. For example, different mentions of Apple Inc. are identified as the same entity. Through this parsing, the system constructs clean associated data made up of entity-fact-time triples. This atomization and normalization eliminates entity redundancy in the knowledge base and lays the foundation for efficient retrieval.
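The shape of the resulting entity-fact-time triples, and a very simple form of entity normalization, might look like the sketch below. It uses a standard-library dataclass in place of the Pydantic model mentioned above so that the example has no third-party dependency, and it omits the language-model extraction step entirely. The suffix list and the names `AtomicFact`, `normalize_entity`, and `build_associated_data` are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical list of corporate suffixes to strip when comparing names.
_SUFFIX = re.compile(r"[\s,]+(inc|corp|corporation|co|ltd|llc)\.?$", re.IGNORECASE)


def normalize_entity(name: str) -> str:
    """Collapse whitespace, drop one trailing corporate suffix, lowercase."""
    cleaned = re.sub(r"\s+", " ", name).strip()
    cleaned = _SUFFIX.sub("", cleaned)
    return cleaned.lower()


@dataclass(frozen=True)
class AtomicFact:
    entity: str     # canonical entity key after normalization
    statement: str  # the indivisible factual statement
    timestamp: str  # e.g. "2023-Q3"


def build_associated_data(raw_facts):
    """raw_facts: iterable of (entity_name, statement, timestamp) tuples."""
    return [AtomicFact(normalize_entity(e), s, t) for e, s, t in raw_facts]
```

With this rule, "Apple Inc." and "apple" map to the same key, so their facts land in one group during the later conflict check. A production system would likely need alias tables or model-based matching instead of suffix stripping.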

[0029] S104: Perform time sequence conflict detection based on the timestamps corresponding to each fact data in the associated data, and determine the effective status of each fact data in the dynamic knowledge graph based on the detection results.

[0030] Temporal conflict detection refers to the process of identifying whether different facts concerning the same entity contradict one another over time. The system groups the cleaned associated data by entity and sorts the facts within each group by timestamp. Through logical comparison, the system detects conflicts between newly entered facts and existing facts in the knowledge graph, for example when a financial indicator has been updated. If a conflict is detected, the system automatically marks the existing facts as inactive and the latest fact data as the currently active fact data. Through this temporal invalidation mechanism, the system dynamically maintains the timeliness of its knowledge, ensuring that the most accurate and valid information is always retained in the knowledge graph.

[0031] S105: Construct the dynamic knowledge graph based on the factual data, its corresponding effective status, and the timestamp.

[0032] A dynamic knowledge graph is a multi-dimensional graph structure capable of recording the evolution of knowledge. The system assembles all extracted entities as nodes and the relationships between entities as edges into a dynamic knowledge graph. During this process, the system records the activation or deactivation status of facts and timestamps within the knowledge graph. The completed dynamic knowledge graph can be stored in a scalable cloud database to achieve reliable storage and real-time updates of the knowledge base. Through this graph structure, the system not only stores current facts but also preserves the historical timeline of entities, enabling the system to trace the evolution of knowledge at different points in time.

[0033] S106: In response to the target query, a multi-agent system consisting of multiple functional agents performs collaborative processing based on the dynamic knowledge graph to generate a query response.

[0034] A multi-agent system refers to an architecture in which multiple AI agents with clearly defined roles collaborate to complete complex tasks. For example, the system in this embodiment includes a fact agent, an analysis agent, a time-series agent, and an agent coordinator. When a target query is received from a terminal device, the agent coordinator distributes tasks according to the query type: the fact agent extracts entities from the query, quickly retrieves the relevant active facts from the dynamic knowledge graph, and returns a direct answer.

[0035] The analysis agent combines the current facts of the entity with the historical timeline to perform reasoning processes such as trend analysis and comparative reasoning, and generates analysis results.

[0036] The time-series agent extracts a date range from the query, retrieves entity facts within that range, and returns a time-specific response. Furthermore, for analytical queries, the agent coordinator passes the results from the fact agent to the analysis agent to improve the accuracy of the analysis response. This collaborative mechanism achieves precise responses for different query types.
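The coordinator's routing can be sketched as below. The text does not say how the query type is determined, so the keyword and date heuristics in `classify_query` are purely illustrative assumptions, as are the class and parameter names; the three agents are passed in as plain callables.

```python
import re

# Hypothetical heuristics for telling query types apart.
_DATE_HINT = re.compile(r"\b(?:19|20)\d{2}\b|\bQ[1-4]\b", re.IGNORECASE)
_ANALYSIS_HINTS = ("trend", "compare", "analy", "evolution")


def classify_query(query: str) -> str:
    lowered = query.lower()
    if any(hint in lowered for hint in _ANALYSIS_HINTS):
        return "analytical"
    if _DATE_HINT.search(query):
        return "temporal"
    return "factual"


class AgentCoordinator:
    def __init__(self, fact_agent, analysis_agent, temporal_agent):
        self.fact_agent = fact_agent
        self.analysis_agent = analysis_agent
        self.temporal_agent = temporal_agent

    def handle(self, query: str):
        kind = classify_query(query)
        if kind == "analytical":
            # Fetch the active facts first, then hand them to the analysis agent.
            facts = self.fact_agent(query)
            return self.analysis_agent(query, facts)
        if kind == "temporal":
            return self.temporal_agent(query)
        return self.fact_agent(query)
```

Keeping classification separate from dispatch means the heuristic could be replaced by a model-based intent classifier without touching the routing logic.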

[0037] As can be seen, in this embodiment, through the above steps S101 to S106, this application realizes automatic processing of dynamic data and real-time updating of the knowledge base. Its technical principle lies in using a time-series conflict detection mechanism to dynamically manage the state of atomic facts, and combining multi-agent collaborative retrieval, which significantly improves the timeliness and accuracy of the response generated by the RAG system, effectively solves the problems of entity redundancy and fact conflict in traditional systems, and provides more in-depth and accurate support for query needs of different dimensions.

[0038] In one embodiment, the timeliness management of knowledge is further refined logically, ensuring that the information in the dynamic knowledge base is always up-to-date through a time-series invalidation mechanism. Specifically, in step S104, the step of performing time-series conflict detection based on the timestamps corresponding to each fact data in the associated data, and determining the effective status of each fact data in the dynamic knowledge graph based on the detection results, specifically includes the following sub-steps: S1041: Group the associated data according to the dimensions of the associated entities, and sort the fact data in the same group according to the timestamp.

[0039] An associated entity refers to a unique object identifier that has been normalized during data processing, such as a specific "company name," "product model," or "diagnostic indicator." Grouping means aggregating all scattered atomic facts that involve the same entity into a dedicated fact list for that entity. Sorting then arranges these facts in chronological order by timestamp. As an example, if the system obtains multiple descriptions of "a company's revenue" at different points in time (such as Q1, Q2, and Q3 of 2023), this step allows the system to construct a clear timeline of how the entity's attribute evolved.

[0040] S1042: Detect whether there is a conflict between new factual data and existing factual data.

[0041] A temporal conflict refers to a situation where newly entered information and older information stored in the knowledge graph are logically mutually exclusive or have numerical differences regarding the same attribute of the same entity. In practice, the system compares adjacent facts after sorting. As an example, in a financial audit scenario, a company's "third-quarter forecast revenue" and its subsequent "third-quarter formal audit revenue" would conflict; similarly, in a technical documentation update scenario, the interface definition of software version 1.0 might conflict with that of version 2.0. Through a temporal detection engine, the system can automatically identify such factual changes caused by the passage of time.

[0042] S1043: If a conflict exists, the existing fact data is marked as invalid and the new fact data is marked as active.

[0043] The effective status refers to a logical label used to identify whether factual data can be accepted by the retrieval system, typically including two states: active and inactive. When a conflict is detected, the system follows the principle that later information is preferred over earlier information, marking outdated facts with earlier timestamps as inactive, thus filtering them out in normal knowledge retrieval; while marking new factual data representing the latest reality as active, serving as the system's primary basis for responding to queries. As an example, although outdated information is marked as inactive, it is still retained in the knowledge graph for historical trend analysis by the time-series agent.
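Sub-steps S1041 to S1043 can be sketched together as follows. The sketch assumes that a conflict means "same entity, same attribute, different value" and that timestamps are ISO date strings, so string order equals chronological order; the `Fact` fields and the function name are hypothetical. Grouping by the (entity, attribute) pair is a simplification of grouping by entity and then comparing per attribute.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Fact:
    entity: str
    attribute: str
    value: str
    timestamp: str          # ISO date, e.g. "2023-06-30"
    status: str = "active"  # "active" or "inactive"


def resolve_conflicts(facts):
    """Group, sort by time, and mark superseded facts inactive (in place)."""
    groups = defaultdict(list)
    for fact in facts:
        groups[(fact.entity, fact.attribute)].append(fact)
    for group in groups.values():
        group.sort(key=lambda f: f.timestamp)
        active = []
        for new in group:
            for old in active:
                if old.value != new.value:  # later information wins
                    old.status = "inactive"
            new.status = "active"
            active = [f for f in active if f.status == "active"] + [new]
    return facts
```

Superseded facts are only relabeled, never deleted, which matches the requirement that inactive facts remain available for historical trend analysis.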

[0044] By grouping associated data by entity, sorting it chronologically, and detecting conflicts, the system establishes a time-weighted logical consistency verification mechanism at the physical level. This mechanism proactively identifies outdated and superseded knowledge using causal chains, moving away from the passive approach of traditional knowledge bases, which either accept all information or rely solely on manual updates. It significantly reduces information redundancy and logical inconsistencies in the knowledge base, ensuring that every response generated by the RAG system is based on the most current and accurate facts. This eliminates at its source the AI hallucination problem caused by referencing outdated information, greatly enhancing the system's decision-making value when processing dynamically updated documents.

[0045] In one embodiment, the time-aware intelligent data processing method achieves structured accumulation and precise management of massive dynamic facts by constructing a dynamic knowledge graph with multi-dimensional time-series attribute recording capabilities. Specifically, the construction of the dynamic knowledge graph based on the factual data, its corresponding effective status, and the timestamp in step S105 includes the following sub-steps: S1051: Assemble the fact data that is in an active state into the dynamic knowledge graph.

[0046] After completing temporal conflict detection and determining the factual state, the system enters the dynamic knowledge graph assembly stage. This process is not a simple data accumulation, but rather the establishment of a structured relational network through a large language model or specialized logic. The system mainly selects factual data that is currently in a valid state for assembly. The dynamic knowledge graph, at the physical level, consists of nodes representing entities and edges representing relationships between entities. As an example, when processing technical documents, the system uses identified specific technical solutions and corresponding technical features as nodes, and defines the subordinate or application relationship between them as the edges connecting them. This graph structure can intuitively map the complex relationships between entities in the real world and provide semantic paths for subsequent retrieval.

[0047] S1052: Record the corresponding effective status and the timestamp on the node or the edge.

[0048] To support full lifecycle knowledge traceability, the system adds metadata information to the basic units in the graph while constructing topological connections. Specifically, the system synchronously records the corresponding active status (i.e., whether it is active or inactive) and the timestamp of the fact for each node or each relation edge. This means that the knowledge graph not only records the latest facts but also completely preserves the evolutionary traces of historical information. The completed dynamic knowledge graph is then stored in a scalable cloud database to achieve reliable storage and support real-time high-concurrency queries for multi-agent systems.
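A minimal in-memory sketch of such a graph is shown below: entities are nodes, each fact is an edge, and every edge carries its own status and timestamp. The class and method names are hypothetical, and cloud database persistence is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    source: str
    relation: str
    target: str
    timestamp: str
    status: str = "active"  # "active" or "inactive"


@dataclass
class DynamicKnowledgeGraph:
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_fact(self, source, relation, target, timestamp, status="active"):
        """Register both endpoints as nodes and record the fact as an edge."""
        self.nodes.update((source, target))
        self.edges.append(Edge(source, relation, target, timestamp, status))

    def active_edges(self, entity):
        """Current facts only: what a direct fact lookup should see."""
        return [e for e in self.edges if e.source == entity and e.status == "active"]

    def timeline(self, entity):
        """All facts, active or not, in timestamp order (historical view)."""
        return sorted((e for e in self.edges if e.source == entity),
                      key=lambda e: e.timestamp)
```

Storing status and timestamp on each edge lets the same structure answer both "what is true now" and "how did this change" without a separate history store.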

[0049] As an example, the completed dynamic knowledge graph is ultimately stored in a highly reliable and scalable cloud database. This choice of physical storage not only ensures reliable storage and real-time updates of the knowledge base in the context of massive amounts of data, but more importantly, through the elastic scaling capabilities of the cloud, it can effectively support the concurrent retrieval needs of subsequent multi-agent systems when facing complex target queries, ensuring the knowledge base's responsiveness to dynamic data.

[0050] In this embodiment, through steps S1051 to S1052, this application achieves deep integration of temporal attributes and graph structure at the data storage layer, directly embedding dynamically changing active states and time information into the attributes of knowledge nodes. This not only ensures that the retrieval system has extremely high accuracy and timeliness in responding to fact queries, but also provides a complete data foundation for subsequent cross-time trend analysis and comparative reasoning, effectively solving the problem of the lack of temporal awareness in the knowledge base of traditional RAG systems.

[0051] In one embodiment, the time-aware intelligent data processing method achieves efficient scheduling and precise utilization of multi-dimensional data in dynamic knowledge graphs by constructing a multi-agent system composed of multiple functional agents. Specifically, the collaborative processing based on the dynamic knowledge graph by the multi-agent system composed of multiple functional agents in step S106 includes the following sub-steps: S1061: Use a fact agent to retrieve active fact data related to the target query from the dynamic knowledge graph.

[0052] The fact agent is primarily responsible for handling direct fact retrieval requests. Upon receiving a target query, the agent first extracts the associated entities from the query statement and accesses the dynamic knowledge graph. By checking the effective status labels on nodes or edges, the agent automatically filters out outdated information marked as inactive and retrieves only currently active fact data. For example, if the query is "the latest version number of a certain product," the agent retrieves the active facts under that product entity and returns a direct answer, ensuring the factual accuracy and real-time nature of the response.
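The status filter at the heart of the fact agent can be sketched as follows. Entity extraction is reduced here to a case-insensitive substring match against a list of known entity names, which is an assumption made to keep the example short; the fact records are hypothetical dictionaries with `entity` and `status` keys.

```python
def fact_agent(query, facts, known_entities):
    """Return the active facts of every known entity mentioned in the query."""
    lowered = query.lower()
    # Assumption: an entity is "mentioned" if its name appears in the query.
    mentioned = {name for name in known_entities if name.lower() in lowered}
    return [
        fact for fact in facts
        if fact["entity"] in mentioned and fact["status"] == "active"
    ]
```

Inactive facts never reach the answer path of this agent, so a superseded version number cannot leak into a "latest version" response.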

[0053] S1062: The analytical agent combines the entity's current factual data with historical timeline data constructed based on the timestamps to perform reasoning and generate analysis results.

[0054] The analysis agent handles complex queries involving logical reasoning, trend analysis, or comparative analysis. Rather than being limited to the current facts, the agent combines the entity's latest active data with the historical records in the knowledge graph to reconstruct the entity's historical timeline. As an example, for the query "Analyze the evolution trend of a certain technology over the past two years," the analysis agent can trace the successive update records of the technology entity along the time axis and perform trend analysis and reasoning across time periods to generate insightful analytical results.
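As a rough illustration of combining current and historical facts, the sketch below builds a timeline from all facts of one entity attribute, reads the current value from the active fact, and derives a coarse direction from the first and last values. Real analysis would presumably involve a language model; the numeric comparison, the record layout, and the function name are stand-in assumptions.

```python
def analyze_trend(facts):
    """facts: dicts with 'timestamp', numeric 'value', and 'status' keys."""
    timeline = sorted(facts, key=lambda f: f["timestamp"])
    active = [f for f in timeline if f["status"] == "active"]
    current = active[-1]["value"] if active else None
    if len(timeline) < 2:
        direction = "insufficient history"
    else:
        delta = timeline[-1]["value"] - timeline[0]["value"]
        direction = "rising" if delta > 0 else "falling" if delta < 0 else "flat"
    return {
        "current": current,
        "history": [(f["timestamp"], f["value"]) for f in timeline],
        "direction": direction,
    }
```

Note that inactive facts are used on purpose here: they are excluded from direct answers but are exactly what a trend needs.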

[0055] S1063: Use a time-series agent to retrieve, from the dynamic knowledge graph, entity fact data that falls within the date range specified in the target query.

[0056] The time-series agent is specifically designed to handle queries with explicit time window constraints. When responding to a query, the agent first extracts the specific date range (e.g., the first quarter of 2023) contained in the target query. The agent then traverses the attributes of the relevant entities in the dynamic knowledge graph, retrieves all entity fact data whose timestamps fall within the extracted date range, and returns it as a time-bounded response. This approach enables the system to reconstruct the state of information at a specific point in the past, meeting the needs of time-sensitive businesses.
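The date-window retrieval of S1063 can be illustrated as follows. This sketch assumes the date range has already been extracted from the query as a year and quarter; `quarter_range`, `facts_in_range`, and the `Fact` record are illustrative names, not part of the original disclosure.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Fact:
    """Hypothetical timestamped fact."""
    entity: str
    value: str
    timestamp: date


def quarter_range(year, quarter):
    """Return the inclusive (start, end) dates of a calendar quarter."""
    start = date(year, 3 * (quarter - 1) + 1, 1)
    if quarter == 4:
        next_start = date(year + 1, 1, 1)
    else:
        next_start = date(year, 3 * quarter + 1, 1)
    return start, next_start - timedelta(days=1)


def facts_in_range(facts, entity, start, end):
    """Select facts of the entity whose timestamp lies in [start, end]."""
    return [f for f in facts
            if f.entity == entity and start <= f.timestamp <= end]


FACTS = [
    Fact("AcmeCo", "plan announced", date(2022, 12, 31)),
    Fact("AcmeCo", "pilot started", date(2023, 2, 14)),
    Fact("AcmeCo", "pilot completed", date(2023, 3, 31)),
    Fact("AcmeCo", "rollout started", date(2023, 4, 1)),
]

start, end = quarter_range(2023, 1)
print([f.value for f in facts_in_range(FACTS, "AcmeCo", start, end)])
```

Both bounds are treated as inclusive here; whether the boundaries are inclusive is not stated in the text and is a choice made for the example.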

[0057] As can be seen, in this embodiment, steps S1061 to S1063 allow this application to analyze the information in the dynamic knowledge base along several dimensions through a division of labor among multiple agents. The technical principle is to call a specialized agent with the corresponding processing logic according to the query intent (factual, analytical, or temporal) and have it perform targeted retrieval. This enhances the RAG system's ability to understand and respond to complex queries: it overcomes the single response mode of traditional systems and, when processing dynamically updated data, provides support that covers current facts, historical evolution, and information from specific time periods.

[0058] In one embodiment, the time-aware intelligent data processing method achieves accurate identification of complex query intents and close collaboration among multiple agents through the central scheduling role of an agent coordinator. Specifically, the collaborative processing performed on the dynamic knowledge graph by the multi-agent system in step S106 further includes the following sub-steps:

S1064: Receive the target query using the agent coordinator, and distribute the target query to the fact agent, the analytical agent, or the time-series agent according to the query type.

[0059] The agent coordinator, acting as the command center of the multi-agent system, is responsible for the initial analysis of received user queries. In actual processing, the agent coordinator uses intent recognition logic to determine whether the query is a fact-retrieval query, an analytical reasoning query, or a temporal query. For example, if the query involves direct confirmation of atomic facts, it is classified as a fact query and distributed to the fact agent; if the query involves trend analysis or comparative reasoning, it is distributed to the analytical agent; and if the query is restricted to a specific date range, it is distributed to the time-series agent. This query-type-based distribution mechanism ensures that each kind of query need is matched with the corresponding specialized agent, so that different types of queries receive accurate responses.
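The text does not specify how intent recognition is implemented. The sketch below uses keyword and pattern rules purely as an illustration of the three-way routing; the rule set, the precedence (explicit dates first, then analytical cues, otherwise fact), and the names `classify_query` and `route` are assumptions.

```python
import re

# Hypothetical cues; a production system would likely use a learned classifier.
DATE_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b|\bQ[1-4]\b", re.IGNORECASE)
ANALYTIC_KEYWORDS = ("trend", "compare", "evolution", "analy")


def classify_query(query):
    """Return 'temporal', 'analytical', or 'fact' for a query string."""
    lowered = query.lower()
    if DATE_PATTERN.search(query):
        return "temporal"
    if any(keyword in lowered for keyword in ANALYTIC_KEYWORDS):
        return "analytical"
    return "fact"


def route(query, agents):
    """Dispatch the query to the agent registered for its type."""
    return agents[classify_query(query)](query)


AGENTS = {
    "fact": lambda q: "fact agent handled: " + q,
    "analytical": lambda q: "analytical agent handled: " + q,
    "temporal": lambda q: "time-series agent handled: " + q,
}

print(route("What is the latest version number of ProductX?", AGENTS))
```

Checking for an explicit date before analytical cues is one possible tie-break for queries that match both; the source does not say which should win.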

[0060] S1065: For analytical queries, the agent coordinator obtains the retrieval results from the fact agent and provides them to the analytical agent.

[0061] When handling analytical queries requiring in-depth insights, the system does not rely solely on the analytical agent operating independently; instead, it optimizes response quality through a collaborative mechanism. For such queries, the agent coordinator first directs the fact agent to retrieve relevant, currently active, and valid factual data from the dynamic knowledge graph. Subsequently, the agent coordinator provides these deterministic results returned by the fact agent as foundational supporting information to the analytical agent. The analytical agent then uses this information, combined with the entity's historical timeline data, for comprehensive analysis. As an example, when analyzing a company's financial trends, the analytical agent not only references historical data but also directly utilizes the latest audit facts provided by the fact agent as a benchmark, thereby generating more accurate analytical conclusions.

[0062] When executing analytical queries, the agent coordinator does more than distribute the query and cascade the results: it also optimizes the accuracy of the analytical response based on the results returned by the fact agent.

[0063] Specifically, once the coordinator receives the currently valid factual results returned by the fact agent, it provides them to the analytical agent as a benchmark. The agent coordinator then uses these deterministic retrieval results to optimize the accuracy of the trend or comparative responses generated by the analytical agent. This cascaded optimization strategy ensures that the generated analysis results have both logical depth and strong factual support.
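The cascade described for analytical queries (fact agent first, its result handed to the analytical agent as a benchmark) might look like the following. The two demo agents, their data, and the function `coordinate_analytical_query` are hypothetical stand-ins; the percentage calculation replaces whatever reasoning the analytical agent actually performs.

```python
# Illustrative data: the active fact and the older, invalidated history.
ACTIVE = {"AcmeCo": (2024, 120.0)}
HISTORY = {"AcmeCo": [(2022, 90.0), (2023, 100.0)]}


def demo_fact_agent(entity):
    """Return the currently active (year, value) fact for the entity."""
    return ACTIVE[entity]


def demo_analytical_agent(entity, baseline):
    """Append the baseline to the history and report the overall change."""
    series = HISTORY[entity] + [baseline]
    first_year, first_value = series[0]
    last_year, last_value = series[-1]
    change = (last_value - first_value) / first_value * 100
    return f"{entity}: {change:+.1f}% from {first_year} to {last_year}"


def coordinate_analytical_query(entity, fact_agent, analytical_agent):
    """Coordinator step: fetch active facts first, then cascade them."""
    baseline = fact_agent(entity)
    return analytical_agent(entity, baseline)


report = coordinate_analytical_query("AcmeCo", demo_fact_agent,
                                     demo_analytical_agent)
print(report)
```

Passing the agents in as callables keeps the coordinator independent of how each agent is built, which is one way to read the "result cascading" design.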

[0064] In this embodiment, through steps S1064 to S1065, this application utilizes an agent coordinator to achieve efficient collaboration and result optimization within a multi-agent system. Its technical principle lies in establishing a processing chain of intent classification, targeted distribution, and result cascading, which significantly improves the accuracy of the RAG system in handling complex analytical queries. By using facts provided by fact agents to calibrate the analysis process in real time, it avoids logical fallacies caused by missing or biased basic facts in the reasoning process of a single model, ensuring that the generated response possesses both a deep analytical perspective and high factual accuracy.

[0065] In one embodiment, the time-aware intelligent data processing method provides standardized input for constructing a high-precision dynamic knowledge graph through atomic fact extraction and standardized entity parsing. Specifically, step S103 (performing fact extraction and entity parsing based on the text units to obtain fact data, the timestamps corresponding to the fact data, and associated entities, and constructing associated data containing the fact data, the timestamps, and the associated entities) includes the following sub-steps:

S1031: Extract atomic fact data from the text unit as the fact data, and extract the corresponding timestamp and associated entity.

[0066] After acquiring semantically segmented text units, the system utilizes a large language model combined with a Pydantic structured model to perform deep analysis of the information within each text unit. This process aims to break down complex natural language descriptions into indivisible atomic facts. As an example, the system identifies and extracts specific factual statements contained in the text, while also identifying the corresponding timestamps and relevant core entities (such as company names, technical terms, or people). Through the semantic understanding capabilities of the large language model, the system can accurately extract key metadata from unstructured text, forming a preliminary list of facts.
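To illustrate what a structured extraction target might look like, the sketch below validates model output against a fixed schema. The embodiment names a Pydantic structured model; this example substitutes a standard-library dataclass so that it runs without third-party packages. The field names, the JSON layout, and `parse_extraction` are assumptions, and the JSON string stands in for a real model response.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class AtomicFact:
    """Hypothetical schema for one extracted atomic fact."""
    statement: str
    timestamp: date
    entities: Tuple[str, ...]


def parse_extraction(raw_json: str) -> List[AtomicFact]:
    """Validate a JSON list of facts and convert it to AtomicFact records."""
    facts = []
    for item in json.loads(raw_json):
        statement = item["statement"].strip()
        entities = tuple(name.strip() for name in item["entities"])
        if not statement or not entities:
            raise ValueError("a fact needs a statement and at least one entity")
        # fromisoformat raises ValueError for malformed dates.
        stamp = date.fromisoformat(item["timestamp"])
        facts.append(AtomicFact(statement, stamp, entities))
    return facts


# Stand-in for the structured output of a large language model.
RAW = json.dumps([
    {"statement": "ProductX version 2.0 was released.",
     "timestamp": "2024-03-05",
     "entities": ["ProductX"]},
])

facts = parse_extraction(RAW)
print(facts[0].entities, facts[0].timestamp.isoformat())
```

Rejecting records with no entity or an unparseable date at this stage keeps malformed facts out of the later conflict detection.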

[0067] S1032: Normalize the extracted associated entities to merge duplicate entities and generate the associated data.

[0068] To eliminate data redundancy in the knowledge base, the system activates the thinking engine unit to classify and parse the initially extracted entities. Since the same entity may have different names in different documents or contexts, the system uses an entity normalization algorithm to semantically align these entities. For example, the system identifies Apple Inc. and Apple Corporation as the same entity and merges them into one standardized entity node. After merging duplicate entities, the system re-maps the cleaned entities to their corresponding atomic facts and timestamps, thereby generating the final associated data. This normalization process ensures that different factual descriptions of the same real-world entity are grouped together, providing a unique entity reference for subsequent temporal conflict detection.

[0069] In the entity parsing process for generating associated data, the thinking engine unit introduces a hierarchical processing mechanism. The system first extracts the entities contained within each statement from the initially extracted fact list and classifies them according to their actual semantics. Only after entity classification is completed does the system call the entity normalization algorithm to merge entities with the same meaning but different names. This path of classification followed by normalization makes entity parsing finer-grained, resulting in cleaner and logically consistent entity-fact association data.
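The classify-then-normalize order can be sketched with a small alias table keyed by entity type. Everything here is illustrative: the suffix-based `classify_entity` rule, the `ALIASES` table, and the canonical name "Apple" are assumptions, and the text does not disclose the actual normalization algorithm.

```python
# Hypothetical alias table keyed by (entity type, lowercased surface name).
ALIASES = {
    ("organization", "apple inc."): "Apple",
    ("organization", "apple corporation"): "Apple",
}

ORG_SUFFIXES = ("inc.", "corporation", "ltd")


def classify_entity(name):
    """Very rough type guess based on a company-style suffix."""
    if name.strip().lower().endswith(ORG_SUFFIXES):
        return "organization"
    return "other"


def normalize_entity(name):
    """Classify first, then map known aliases of that type to one name."""
    cleaned = name.strip()
    key = (classify_entity(cleaned), cleaned.lower())
    return ALIASES.get(key, cleaned)


def merge_entities(names):
    """Return the deduplicated canonical names in first-seen order."""
    seen = []
    for name in names:
        canonical = normalize_entity(name)
        if canonical not in seen:
            seen.append(canonical)
    return seen


print(merge_entities(["Apple Inc.", "Apple Corporation", "apple"]))
```

Because the lookup key includes the type, the lowercase "apple" (classified as "other") is not merged into the company node, which is the kind of error that classifying before normalizing is meant to prevent.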

[0070] In this embodiment, through the aforementioned steps S1031 to S1032, this application transforms raw text into clean associated data. The principle lies in using structured extraction to ensure the granularity of facts and combining it with a normalization algorithm to give each entity a unique reference, which significantly reduces entity redundancy in the knowledge base and improves the efficiency and accuracy of data retrieval. At the same time, extracting atomic facts avoids contextual entanglement between pieces of information, laying a solid data foundation for accurate temporal invalidation management and multi-dimensional agent queries.

[0071] In one embodiment, the time-aware intelligent data processing method is also self-updating, ensuring that the knowledge base stays synchronized with external data sources and that its information remains current. Specifically, after the query response is generated in step S106, the method further includes the following step:

S107: Monitor updates to the data source and trigger semantic segmentation, fact extraction, entity parsing, and temporal conflict detection for the updated data, so as to update the effective status in the dynamic knowledge graph in real time.

[0072] In this embodiment, monitoring refers to the continuous or periodic inspection that the system performs on the raw data input. As an example, the data source could be a cloud database storing financial reports, technical documents, or medical diagnostic guidelines. By listening for file system change events, database triggers, or calls to specific APIs, the system can detect the arrival of new documents or the revision of old documents in real time. Once an update to the data source is detected, the system automatically triggers the aforementioned processing pipeline for the newly added or changed data, sequentially executing semantic segmentation, atomic fact extraction, entity normalization parsing, and temporal conflict detection.
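The paragraph above mentions file system events, database triggers, and API calls as detection mechanisms. As a simpler stand-in for the periodic inspection case, the sketch below polls a set of documents and compares content hashes; this technique, the in-memory `documents` mapping, and the `detect_changes` name are assumptions made for illustration.

```python
import hashlib


def detect_changes(previous_digests, documents):
    """Return ids of new or modified documents and remember their digests.

    previous_digests maps document id -> SHA-256 hex digest from the last
    poll and is updated in place; documents maps document id -> current text.
    """
    changed = []
    for doc_id, text in documents.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if previous_digests.get(doc_id) != digest:
            changed.append(doc_id)
            previous_digests[doc_id] = digest
    return changed


seen = {}
first_poll = detect_changes(seen, {"report": "v1 text", "guide": "stable"})
second_poll = detect_changes(seen, {"report": "v2 text", "guide": "stable"})
third_poll = detect_changes(seen, {"report": "v2 text", "guide": "stable"})
print(first_poll, second_poll, third_poll)
```

Each id returned by a poll would then be fed through segmentation, extraction, entity parsing, and conflict detection; only changed documents are reprocessed.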

[0073] In practice, the core objective of the update operation is to dynamically adjust the effective status of existing facts in the knowledge graph. For example, when a newly added document contains the latest description of an entity attribute, the system performs temporal conflict detection, automatically marks the outdated data in the dynamic knowledge graph that conflicts with the new fact as invalid, and records the fact with the latest timestamp as active. This process updates the dynamic knowledge graph in real time at the storage layer, ensuring that subsequent retrieval and queries are always based on the currently valid facts.
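The status update described above can be sketched as follows. The conflict rule used here (same entity and attribute, different value) is an assumption, since the text does not define what counts as a conflict; the `Fact` record and `apply_update` are likewise hypothetical. The handling of a late-arriving older fact, which is stored as invalid, is also a choice made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Fact:
    """Hypothetical mutable fact record with an effective status flag."""
    entity: str
    attribute: str
    value: str
    timestamp: date
    active: bool = True


def apply_update(graph, new_fact):
    """Insert new_fact and invalidate whichever conflicting fact is older."""
    for old in graph:
        same_slot = (old.entity == new_fact.entity
                     and old.attribute == new_fact.attribute)
        if same_slot and old.active and old.value != new_fact.value:
            if old.timestamp <= new_fact.timestamp:
                old.active = False       # existing fact is outdated
            else:
                new_fact.active = False  # incoming fact is already stale
    graph.append(new_fact)


graph = [Fact("ProductX", "version", "1.0", date(2023, 1, 10))]
apply_update(graph, Fact("ProductX", "version", "2.0", date(2024, 3, 5)))
apply_update(graph, Fact("ProductX", "version", "0.5", date(2022, 6, 1)))
print([(f.value, f.active) for f in graph])
```

Invalid facts are kept rather than deleted, so the historical timeline used by the analytical and time-series agents remains available.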

[0074] Through step S107 above, this application moves the knowledge base from static storage to dynamic adaptive updating. Its technical principle lies in establishing a closed-loop automated mechanism of monitoring, triggering, stream processing, and status updating. This addresses core pain points of traditional RAG systems, such as poor information timeliness and outdated responses caused by reliance on manual intervention for knowledge base updates. By updating the effective status in the dynamic knowledge graph in real time, the system ensures that the knowledge base can automatically identify and resolve conflicts between old and new information, effectively suppressing the hallucinations that arise when the AI references outdated information while generating a response. This automated time-aware capability improves the decision-making accuracy and application reliability of the system in fields that are highly sensitive to dynamic data, such as finance and healthcare.

[0075] It should be understood that the sequence number of each step in the above embodiments does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.

[0076] In one embodiment, a time-aware intelligent data processing device is provided, which corresponds one-to-one with the time-aware intelligent data processing method in the above embodiments. As shown in Figure 3, the time-aware intelligent data processing device includes an acquisition module 101, a segmentation module 102, a fact extraction and entity parsing module 103, a temporal conflict detection module 104, a knowledge graph construction module 105, and a query response module 106. Detailed descriptions of each functional module are as follows: the acquisition module 101 is used to acquire the document to be processed and extract the time frame identifier corresponding to the document to be processed; the segmentation module 102 is used to semantically segment the document to be processed and generate text units carrying the time frame identifier; the fact extraction and entity parsing module 103 is used to perform fact extraction and entity parsing based on the text units to obtain fact data, the timestamp corresponding to the fact data, and the associated entity, and to construct associated data containing the fact data, the timestamp, and the associated entity; the temporal conflict detection module 104 is used to perform temporal conflict detection based on the timestamps corresponding to each item of fact data in the associated data, and to determine the effective status of each item of fact data in the dynamic knowledge graph based on the detection results; the knowledge graph construction module 105 is used to construct the dynamic knowledge graph based on the fact data, its corresponding effective status, and the timestamp; and the query response module 106 is used to respond to the target query by generating a query response through collaborative processing performed on the dynamic knowledge graph by a multi-agent system composed of multiple functional agents.

[0077] Specific limitations regarding the time-aware intelligent data processing device can be found in the limitations of the time-aware intelligent data processing method described above, and will not be repeated here. Each module in the aforementioned time-aware intelligent data processing device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device in hardware form, or stored in the memory of a computer device in software form, so that the processor can call and execute the operations corresponding to each module.

[0078] In one embodiment, a computer device is provided, which may be a server, and its internal structure may be as shown in Figure 4. The computer device includes a processor, memory, network interface, and database connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system, computer programs, and database. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The network interface is used for communication with external terminals via a network connection. When the computer program is executed by the processor, it implements a time-aware intelligent data processing method.

[0079] In one embodiment, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, performs the following steps: obtaining the document to be processed and extracting the time frame identifier corresponding to the document to be processed; performing semantic segmentation on the document to be processed to generate text units carrying the time frame identifier; performing fact extraction and entity parsing based on the text units to obtain fact data, the timestamp corresponding to the fact data, and the associated entity, and constructing associated data containing the fact data, the timestamp, and the associated entity; performing temporal conflict detection based on the timestamps corresponding to each item of fact data in the associated data, and determining the effective status of each item of fact data in the dynamic knowledge graph based on the detection results; constructing the dynamic knowledge graph based on the fact data, its corresponding effective status, and the timestamp; and, in response to a target query, generating a query response through collaborative processing performed on the dynamic knowledge graph by a multi-agent system composed of multiple functional agents.

[0080] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, the computer program performing the following steps when executed by a processor: obtaining the document to be processed and extracting the time frame identifier corresponding to the document to be processed; performing semantic segmentation on the document to be processed to generate text units carrying the time frame identifier; performing fact extraction and entity parsing based on the text units to obtain fact data, the timestamp corresponding to the fact data, and the associated entity, and constructing associated data containing the fact data, the timestamp, and the associated entity; performing temporal conflict detection based on the timestamps corresponding to each item of fact data in the associated data, and determining the effective status of each item of fact data in the dynamic knowledge graph based on the detection results; constructing the dynamic knowledge graph based on the fact data, its corresponding effective status, and the timestamp; and, in response to a target query, generating a query response through collaborative processing performed on the dynamic knowledge graph by a multi-agent system composed of multiple functional agents.

[0081] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, storage, databases, or other media used in the embodiments provided in this application can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

[0082] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the above-described division of functional units and modules is used as an example. In practical applications, the above functions can be assigned to different functional units and modules as needed, that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above.

[0083] The above-described embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application, and should all be included within the protection scope of this application.

Claims

1. A time-aware intelligent data processing method, characterized in that the method includes: obtaining the document to be processed and extracting the time frame identifier corresponding to the document to be processed; performing semantic segmentation on the document to be processed to generate text units carrying the time frame identifier; performing fact extraction and entity parsing based on the text units to obtain fact data, the timestamp corresponding to the fact data, and the associated entity, and constructing associated data containing the fact data, the timestamp, and the associated entity; performing temporal conflict detection based on the timestamps corresponding to each item of fact data in the associated data, and determining the effective status of each item of fact data in the dynamic knowledge graph based on the detection results; constructing the dynamic knowledge graph based on the fact data, its corresponding effective status, and the timestamp; and, in response to a target query, generating a query response through collaborative processing performed on the dynamic knowledge graph by a multi-agent system composed of multiple functional agents.

2. The method according to claim 1, characterized in that the step of performing temporal conflict detection based on the timestamps corresponding to each item of fact data in the associated data, and determining the effective status of each item of fact data in the dynamic knowledge graph based on the detection results, includes: grouping the associated data by associated entity, and sorting the fact data within the same group according to the timestamp, so as to detect whether new fact data conflicts with existing fact data; and, if a conflict exists, marking the existing fact data as invalid and marking the new fact data as active.

3. The method according to claim 1, characterized in that the construction of the dynamic knowledge graph based on the fact data, its corresponding effective status, and the timestamp includes: assembling the fact data in the active state into the dynamic knowledge graph, wherein the dynamic knowledge graph includes nodes representing entities and edges representing relationships between entities; and recording the corresponding effective status and the timestamp on the node or the edge.

4. The method according to claim 1, characterized in that the collaborative processing performed on the dynamic knowledge graph by a multi-agent system composed of multiple functional agents includes: using a fact agent to retrieve active fact data related to the target query from the dynamic knowledge graph; using an analytical agent to combine the entity's current fact data with historical timeline data constructed based on the timestamps, so as to perform reasoning and generate analysis results; and using a time-series agent to retrieve, from the dynamic knowledge graph, entity fact data within the date range specified in the target query.

5. The method according to claim 4, characterized in that the collaborative processing performed on the dynamic knowledge graph by a multi-agent system composed of multiple functional agents further includes: receiving the target query using an agent coordinator, and distributing the target query to the fact agent, the analytical agent, or the time-series agent according to the query type; and, for analytical queries, obtaining the retrieval results of the fact agent through the agent coordinator and providing them to the analytical agent.

6. The method according to claim 1, characterized in that, The step of extracting facts and parsing entities based on the text units to obtain fact data, the timestamps corresponding to the fact data, and associated entities, and constructing associated data containing the fact data, the timestamps, and the associated entities, includes: Atomic fact data is extracted from the text unit as the fact data, and the corresponding timestamp and associated entity are extracted; the extracted associated entities are normalized to merge duplicate entities and generate the associated data.

7. The method according to claim 1, characterized in that, After generating the query response, the method further includes: monitoring updates to the data source, and triggering semantic segmentation, fact extraction, entity parsing, and time sequence conflict detection for the updated data, so as to update the effective status in the dynamic knowledge graph in real time.

8. A time-aware intelligent data processing device, characterized in that the device includes: an acquisition module, used to acquire the document to be processed and extract the time frame identifier corresponding to the document to be processed; a segmentation module, used to semantically segment the document to be processed and generate text units carrying the time frame identifier; a fact extraction and entity parsing module, used to perform fact extraction and entity parsing based on the text units to obtain fact data, timestamps corresponding to the fact data, and associated entities, and to construct associated data containing the fact data, the timestamps, and the associated entities; a temporal conflict detection module, used to perform temporal conflict detection based on the timestamps corresponding to each item of fact data in the associated data, and to determine the effective status of each item of fact data in the dynamic knowledge graph based on the detection results; a knowledge graph construction module, used to construct the dynamic knowledge graph based on the fact data, its corresponding effective status, and the timestamp; and a query response module, used to respond to the target query by generating a query response through collaborative processing performed on the dynamic knowledge graph by a multi-agent system composed of multiple functional agents.

9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, When the processor executes the computer program, it implements the steps of the method as described in any one of claims 1 to 7.

10. A computer-readable storage medium storing a computer program, characterized in that, When the computer program is executed by a processor, it implements the steps of the method as described in any one of claims 1 to 7.