Conference key information real-time extraction and knowledge pushing method based on large language model

By using real-time speech data processing and knowledge graph construction based on a large language model, the problems of information fragmentation and inaccurate push in the conference system are solved, enabling real-time and personalized key information extraction and push, thereby improving conference efficiency and participant experience.

CN121722908B · Active · Publication Date: 2026-06-19 · BEIJING YIZHUANG INTELLIGENT CITY RES INST GRP CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BEIJING YIZHUANG INTELLIGENT CITY RES INST GRP CO LTD
Filing Date
2026-02-11
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing meeting support systems lack real-time information processing capabilities, are unable to deeply understand the semantic structure and logical connections of meetings, and lack personalized knowledge services, resulting in fragmented information and inaccurate delivery.

Method used

Based on a large language model, real-time speech data stream is transcribed and semantically parsed. Semantic segments are divided by semantic integrity and time window constraints. Combined with predefined information extraction templates and dynamically learned meeting topic models, key information is identified and a knowledge graph of meeting discussion evolution is constructed to generate personalized knowledge push instructions.

Benefits of technology

It enables real-time, accurate extraction and personalized delivery of meeting information, improving meeting efficiency and participant experience, and ensuring the integrity and relevance of information.


Figure CN121722908B_ABST

Abstract

This invention provides a method for real-time extraction and knowledge delivery of key meeting information based on a large language model, relating to the fields of artificial intelligence and meeting management technology. The method includes: acquiring real-time voice data streams and performing transcription and semantic parsing; segmenting semantic segments based on semantic integrity constraints; identifying key information using information extraction templates and meeting topic models; constructing a knowledge graph of the meeting discussion evolution; and accurately delivering knowledge based on the graph and meeting flow. This invention can improve meeting efficiency, enhance knowledge sharing, and optimize the decision-making process.

Description

Technical Field

[0001] This invention relates to the fields of artificial intelligence and meeting management technology, and in particular to a method for real-time extraction and knowledge push of key meeting information based on a large language model.

Background Technology

[0002] With the evolution of enterprise collaboration models and the widespread adoption of remote work, the efficiency and quality of meetings, as a primary means of team communication and decision-making, are receiving increasing attention. In traditional meetings, participants often need to manually record meeting content and summarize key points, which not only distracts them and affects participation but also makes it difficult to ensure the completeness and accuracy of the records. In recent years, with the rapid development of artificial intelligence technology, especially large language models, meeting assistance systems based on speech recognition and natural language processing have begun to be applied in business environments, aiming to improve meeting efficiency and enhance knowledge sharing.

[0003] Current meeting support systems on the market mainly provide basic functions such as meeting minutes and content summaries, but they still have many shortcomings in real-time information processing and intelligent knowledge services. On the one hand, existing technologies mostly adopt a post-processing model, lacking the ability to analyze real-time audio streams during meetings, and thus failing to provide timely information support and knowledge delivery during the meeting. On the other hand, traditional meeting content processing methods often remain at the level of surface text analysis, making it difficult to deeply understand the semantic structure and logical connections of the meeting discussions, resulting in fragmented information extraction lacking contextual relevance. In addition, existing systems generally lack personalized knowledge service mechanisms for different participant roles and needs, failing to provide accurate knowledge delivery based on the dynamic progress of the meeting and the differences in participants' background knowledge, thus reducing the efficiency and value of knowledge sharing.

Summary of the Invention

[0004] This invention provides a method for real-time extraction and knowledge push of key meeting information based on a large language model, which can solve the problems in the prior art.

[0005] A first aspect of this invention provides a method for real-time extraction and knowledge push of key meeting information based on a large language model, comprising:

[0006] Acquire real-time audio data streams generated during the meeting;

[0007] The real-time voice data stream is transcribed and semantically parsed in real time, and divided into multiple semantic segments based on semantic integrity constraints and time window constraints;

[0008] Based on a predefined information extraction template and a dynamically learned conference topic model, key information is identified and extracted from each semantic fragment, the importance score of each information element is calculated, and a structured key information fragment is constructed.

[0009] The key information fragments are associated and aggregated, and the argumentation and citation relationships between the key information fragments are deeply reasoned to construct a knowledge graph of the evolution of meeting discussions that includes node importance weights and edge semantic type annotations.

[0010] Based on the knowledge graph of the meeting discussion evolution and the preset meeting process, the knowledge content to be pushed and the timing of its push are determined, knowledge push instructions for different participants are generated, and the knowledge push instructions are executed to push the knowledge content to be pushed to the participants' terminal devices.

[0011] The real-time speech data stream is transcribed and semantically parsed in real time, and divided into multiple semantic segments based on semantic integrity constraints and time window constraints, including:

[0012] The real-time speech data stream is subjected to streaming speech recognition, and text segments with timestamps are output in real time. The text segments are then spliced and the sentence structure is adjusted to form a continuous text transcription sequence.

[0013] Real-time semantic analysis is performed on the text transcription sequence to identify topic transition markers and semantic boundary markers. The topic transition markers include topic introduction words and transition words, and the semantic boundary markers include semantic closure markers and logical termination markers. Text intervals that simultaneously satisfy the condition of no topic transition and semantic closure are selected as candidate segments under semantic integrity constraints.

[0014] A sliding time window is established based on a preset duration. The sliding time window is moved in chronological order on the text transcription sequence. When the text interval covered by the time window crosses the boundary of the candidate segment, it is determined whether there is a complete candidate segment within the time window. If there is, segmentation is performed at the boundary position of the candidate segment. If there is no complete candidate segment, the sliding continues until the next candidate segment boundary is encountered, resulting in multiple semantic segments.

[0015] Based on predefined information extraction templates and a dynamically learned conference topic model, key information is identified and extracted from each semantic fragment. Importance scores for each information element are calculated, and structured key information fragments are constructed, including:

[0016] Construct a predefined information extraction template library, which contains multiple information extraction templates. Each information extraction template defines the structured field composition of a specific type of key information, the dependencies between fields, and the semantic constraint rules for field values.

[0017] Extract the topic feature vector of the semantic fragment and input it into the dynamically learned conference topic model. The model updates the topic cluster center and probability distribution parameters in real time based on the topic feature distribution of the processed semantic fragments, and outputs the topic category of the semantic fragment and its belonging confidence.

[0018] Information extraction templates are retrieved based on the topic category of semantic fragments. Candidate information elements are extracted based on the field composition and semantic constraint rules defined in the template. The logical consistency of the candidate information elements is verified through the dependency relationship between fields in the template.

[0019] For candidate information elements that pass the logical consistency verification, the preset weights of their corresponding fields in the template and the topic affiliation confidence of their semantic segments are combined to calculate an importance score. Candidate information elements with importance scores exceeding the importance threshold are organized into structured fields and bound to the timestamp and participant identifier of their semantic segments to form key information segments.
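The dynamically learned topic model of step [0017] can be illustrated with a minimal sketch. It assumes an online nearest-center clustering scheme with a running-mean update and a softmax-style confidence; the class name `OnlineTopicModel`, the Euclidean distance, and the confidence formula are illustrative assumptions and are not specified by the patent.

```python
import math

class OnlineTopicModel:
    """Hypothetical online clustering: one center per topic, updated per fragment."""

    def __init__(self, centers):
        self.centers = [list(c) for c in centers]
        self.counts = [1] * len(centers)  # fragments seen per topic (seed counts as one)

    def assign(self, vector):
        # Distance from the fragment's topic feature vector to every topic center.
        dists = [math.dist(vector, c) for c in self.centers]
        topic = min(range(len(dists)), key=dists.__getitem__)
        # Assumed confidence: normalized exp(-distance) of the winning topic.
        weights = [math.exp(-d) for d in dists]
        confidence = weights[topic] / sum(weights)
        # Running-mean update of the winning center (simplified parameter update).
        self.counts[topic] += 1
        rate = 1.0 / self.counts[topic]
        self.centers[topic] = [
            c + rate * (x - c) for c, x in zip(self.centers[topic], vector)
        ]
        return topic, confidence
```

Calling `assign` once per semantic fragment returns the topic category and its confidence while shifting the matched center toward the new fragment. A fuller model would also track per-topic variance or mixture weights.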

[0020] Information extraction templates are retrieved based on the topic category of semantic fragments. Candidate information elements are extracted based on the field composition and semantic constraint rules defined in the templates. The logical consistency of the candidate information elements is verified through the dependencies between fields in the templates, including:

[0021] Using the topic category of the semantic fragment as the index key, locate the associated candidate information extraction template, extract the semantic feature vector of the semantic fragment to calculate the semantic similarity, and select the candidate information extraction template with the highest semantic similarity as the matching template.

[0022] The structured field composition is obtained from the matching template, the semantic constraint rules associated with each field are extracted, and the semantic constraint rules are parsed into semantic pattern descriptors and constraint condition expressions. The semantic pattern descriptor defines the textual representation of the field, and the constraint condition expression defines the semantic boundary conditions for the value.

[0023] Semantic matching is performed in the semantic fragment to identify text intervals that conform to the semantic pattern descriptor, extract entity references and attribute values, determine whether the constraint condition expression is satisfied, and use the values that meet the conditions as candidate field values to form candidate information elements.

[0024] Based on the dependencies between fields in the matching template, a dependency verification graph for the candidate information elements is established. Nodes represent candidate field values, and directed edges represent dependencies and are labeled with constraints. The verification graph is traversed in dependency order to check the dependency constraints on each edge. The source node field values are substituted into the constraints as a premise to determine whether the target node field values meet the constraints. When all constraints are met, the candidate information elements are determined to have logical consistency.
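The dependency verification of step [0024] can be sketched as a topological traversal. This is a minimal illustration under stated assumptions: each constraint is modeled as a Python callable taking the source and target values, and the function name `verify_dependencies` and the field names in the usage note are hypothetical.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# (source_field, target_field, constraint(source_value, target_value) -> bool)
Edge = Tuple[str, str, Callable[[object, object], bool]]

def verify_dependencies(values: Dict[str, object], edges: List[Edge]) -> bool:
    """Return True when every dependency constraint holds, checked in dependency order."""
    indegree = {field: 0 for field in values}
    outgoing = {field: [] for field in values}
    for src, dst, check in edges:
        if src not in values or dst not in values:
            return False  # a dependency refers to a field that was not extracted
        outgoing[src].append((dst, check))
        indegree[dst] += 1

    # Kahn's algorithm: start from fields that depend on nothing.
    queue = deque(f for f, d in indegree.items() if d == 0)
    visited = 0
    while queue:
        src = queue.popleft()
        visited += 1
        for dst, check in outgoing[src]:
            # The source value is the premise; the target value must satisfy the constraint.
            if not check(values[src], values[dst]):
                return False
            indegree[dst] -= 1
            if indegree[dst] == 0:
                queue.append(dst)
    return visited == len(values)  # False when the dependencies are cyclic
```

For example, a task-allocation element with a hypothetical `start_day` and `due_day` would pass only if the edge constraint `lambda s, t: t >= s` holds.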

[0025] The key information fragments are associated and aggregated, and deep reasoning is performed on the argumentation and citation relationships between them to construct a knowledge graph of the meeting discussion evolution that includes node importance weights and edge semantic type annotations, including:

[0026] Extract information elements, timestamp information and participant identification from the key information fragments, calculate the time interval decay factor and semantic association strength between the preceding and subsequent fragments, identify causal triggering relationships, establish causal triggering edges and calculate triggering strength weights;

[0027] Extract the claims and supporting evidence from the key information segments, identify the supporting and refuting relationships between the segments, establish argument relationship edges, and label the argument direction type;

[0028] Based on participant identification, the system tracks response sequences and opposing viewpoint sequences between key information fragments, identifies collaborative support patterns and adversarial debate patterns among participants, and calculates the influence weight of each participant.

[0029] Using the key information fragments as nodes and causal triggering edges and argumentation relationship edges as edges, a heterogeneous knowledge graph is constructed. The triggering intensity weight of the causal triggering edge and the influence weight of the participant are integrated as dual adjustment factors for node importance. The node importance weight is obtained through graph propagation and iterative update.

[0030] The trigger strength weight, argument direction type, and node importance weight are integrated into a heterogeneous knowledge graph to generate a knowledge graph of the evolution of the meeting discussion.
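The graph propagation of step [0029] is not specified in detail. One possible reading, sketched below, is a PageRank-style iteration in which the trigger strength weights the edges and the participant influence forms the base distribution. The function name `propagate_importance`, the damping factor, and the iteration count are assumptions for illustration only.

```python
def propagate_importance(influence, edges, damping=0.85, iterations=50):
    """influence: {node: participant influence (> 0)}; edges: [(src, dst, trigger_strength)]."""
    edges = [(s, d, w) for s, d, w in edges if w > 0]  # ignore non-positive strengths
    total = sum(influence.values())
    base = {n: v / total for n, v in influence.items()}  # influence as prior weight

    out_strength = {n: 0.0 for n in influence}
    for src, _, w in edges:
        out_strength[src] += w

    score = dict(base)
    for _ in range(iterations):
        # Importance held by nodes without outgoing edges is redistributed by the prior.
        dangling = sum(score[n] for n in influence if out_strength[n] == 0.0)
        nxt = {n: (1 - damping + damping * dangling) * base[n] for n in influence}
        for src, dst, w in edges:
            # Each node passes importance along its edges in proportion to trigger strength.
            nxt[dst] += damping * score[src] * w / out_strength[src]
        score = nxt
    return score
```

The returned scores sum to 1, so they can be compared directly as node importance weights within one meeting graph.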

[0031] Based on participant identification, the system tracks response sequences and opposing viewpoint sequences between key information fragments to identify collaborative support patterns and adversarial debate patterns among participants, and calculates the influence weight of each participant, including:

[0032] A speaking sequence index is established based on the participant's identity. Key information fragments and timestamps of each participant are extracted. By calculating the temporal interval density and topic continuity of adjacent key information fragments of each participant, immediate response fragments and delayed response fragments are identified.

[0033] Extract the position semantic features and argument direction features from key information fragments, calculate the position difference degree and argument conflict degree between key information fragments of different participants, and mark the corresponding information fragment sequence as a viewpoint opposition sequence when both meet the opposition threshold.

[0034] For immediate response segments and delayed response segments, the cumulative support response value and response duration level of the participants are calculated. When both meet the collaboration threshold, it is identified as a collaborative support mode. For opposing viewpoint sequences, the rounds of argumentation and defense and the degree of stance persistence of the opposing sides are extracted. When both meet the adversarial threshold, it is identified as an adversarial debate mode.

[0035] The cumulative value of support responses is used as the contribution of collaboration, the number of rounds of argumentation and defense is used as the debate activity, and the frequency of citation of immediate response fragments is used as the appeal of viewpoints. The influence weight of the participants is obtained by weighting and summing the three components.
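The weighted sum of step [0035] can be sketched as follows. The component weights `(0.4, 0.3, 0.3)` and the per-component normalization by the maximum across participants are assumptions; the patent states only that the three components are weighted and summed.

```python
def influence_weights(stats, weights=(0.4, 0.3, 0.3)):
    """stats: {participant: (support_total, debate_rounds, citation_frequency)}."""
    w_collab, w_debate, w_appeal = weights
    # Normalize each component by its maximum so the three scales are comparable.
    maxima = [max(s[i] for s in stats.values()) or 1.0 for i in range(3)]
    result = {}
    for participant, (support, rounds, citations) in stats.items():
        collaboration = support / maxima[0]   # contribution of collaboration
        activity = rounds / maxima[1]         # debate activity
        appeal = citations / maxima[2]        # appeal of viewpoints
        result[participant] = (
            w_collab * collaboration + w_debate * activity + w_appeal * appeal
        )
    return result
```

With `{"A": (6.0, 2, 4), "B": (3.0, 4, 2)}`, participant A scores about 0.85 and participant B about 0.65 under these assumed weights.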

[0036] Based on the knowledge graph of the meeting discussion evolution and the preset meeting process, the knowledge content to be pushed and the timing of its push are determined, and knowledge push instructions for different participants are generated, including:

[0037] Semantic mapping is used to associate and label the historical speaking content of each participant with knowledge nodes in the knowledge graph of the conference discussion evolution, thereby identifying the set of knowledge nodes that each participant has covered and the set of knowledge nodes that have not been covered.

[0038] Based on the preset meeting process and the knowledge graph of the meeting discussion evolution, the predicted meeting process that is about to enter the discussion is determined as the prediction node. Precursor dependency analysis is performed on the predicted meeting process to obtain the necessary prerequisite nodes. The difference operation is performed with the knowledge node set already covered by each participant to obtain the knowledge gap node of each participant for the predicted node.

[0039] Based on the knowledge gaps of each participant, key information fragments and their argumentation links are extracted and packaged into differentiated knowledge content to be pushed out, including supplementary prior knowledge and logical connection explanations.

[0040] When a new node pointing to a predicted node is detected in the knowledge graph of the meeting discussion evolution, or when a participant's speech involves a precursor topic of a knowledge gap node, the timing for pushing is determined, and a knowledge push instruction is generated in combination with the differentiated knowledge content to be pushed.
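The knowledge-gap computation of steps [0037] and [0038] reduces to a predecessor search followed by a set difference. The sketch below assumes the graph is given as a mapping from each node to its direct prerequisite nodes; the function names and the node labels in the usage note are hypothetical.

```python
def prerequisite_nodes(predecessors, target):
    """Collect all direct and transitive prerequisite nodes of the predicted node."""
    seen = set()
    stack = list(predecessors.get(target, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(predecessors.get(node, []))
    return seen

def knowledge_gaps(predecessors, target, covered_by_participant):
    """Per participant: required prerequisite nodes minus the nodes already covered."""
    required = prerequisite_nodes(predecessors, target)
    return {
        participant: required - covered
        for participant, covered in covered_by_participant.items()
    }
```

For a predicted node `budget_vote` that depends on `cost_estimate` and `risk_review`, a participant who has already covered `cost_estimate` and its own prerequisites is left with `risk_review` as the only gap node.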

[0041] A second aspect of the present invention provides a real-time extraction and knowledge push system for key meeting information based on a large language model, comprising:

[0042] The first unit is used to acquire real-time audio data streams generated during the meeting;

[0043] The second unit is used to perform real-time transcription and semantic parsing of the real-time voice data stream, and to divide it into multiple semantic segments based on semantic integrity constraints and time window constraints.

[0044] The third unit is used to identify and extract key information from each semantic fragment based on a predefined information extraction template and a dynamically learned conference topic model, calculate the importance score of each information element, and construct a structured key information fragment.

[0045] The fourth unit is used to associate and aggregate the key information fragments, perform deep reasoning on the argumentation and citation relationships between the key information fragments, and construct a knowledge graph of the evolution of meeting discussions that includes node importance weights and edge semantic type annotations.

[0046] The fifth unit is used to determine the knowledge content to be pushed and the timing of its push based on the knowledge graph of the meeting discussion evolution and the preset meeting process, generate knowledge push instructions for different participants, execute the knowledge push instructions, and push the knowledge content to be pushed to the participants' terminal devices.

[0047] A third aspect of the present invention,

[0048] An electronic device is provided, comprising:

[0049] processor;

[0050] Memory used to store processor-executable instructions;

[0051] The processor is configured to invoke instructions stored in the memory to execute the aforementioned method.

[0052] A fourth aspect of the present invention,

[0053] A computer-readable storage medium is provided, having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.

[0054] The beneficial effects of this application are as follows:

[0055] By dividing semantic segments based on semantic integrity constraints and time window constraints, the problem of semantic fragmentation in real-time transcription of conference speech is solved, improving the accuracy and real-time performance of information extraction, and enabling key information to be processed in a timely manner during the conference.

[0056] This invention combines a predefined information extraction template with a dynamically learned meeting topic model, enabling accurate identification and extraction of key meeting information. It can adapt to the information characteristics of different types of meetings and improve information quality by filtering irrelevant information through an importance score calculation mechanism.

[0057] By conducting in-depth reasoning on the argumentation and citation relationships between key information fragments, a knowledge graph of the evolution of meeting discussions with node importance weights and edge semantic type annotations is constructed. This enables the structured expression of meeting content and the mining of semantic associations, so that meeting information no longer consists of isolated fragments but forms an organically connected knowledge network.

[0058] Based on the knowledge graph of meeting discussion evolution and preset meeting process, the system realizes intelligent determination of knowledge content and push timing, and can generate personalized knowledge push instructions for different participants. This solves the problems of blind and delayed information push in traditional meeting systems, and improves meeting efficiency and participant experience.

Attached Figure Description

[0059] Figure 1 is a flowchart illustrating the method for real-time extraction and knowledge push of key meeting information based on a large language model, according to an embodiment of the present invention.

[0060] Figure 2 is a flowchart illustrating a method for constructing structured key information fragments according to an embodiment of the present invention.

Detailed Implementation

[0061] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0062] The technical solution of the present invention will be described in detail below with reference to specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments.

[0063] Figure 1 is a flowchart illustrating the method for real-time extraction and knowledge push of key meeting information based on a large language model, according to an embodiment of the present invention. As shown in Figure 1, the method includes:

[0064] Acquire real-time audio data streams generated during the meeting;

[0065] The real-time voice data stream is transcribed and semantically parsed in real time, and divided into multiple semantic segments based on semantic integrity constraints and time window constraints;

[0066] Based on a predefined information extraction template and a dynamically learned conference topic model, key information is identified and extracted from each semantic fragment, the importance score of each information element is calculated, and a structured key information fragment is constructed.

[0067] The key information fragments are associated and aggregated, and the argumentation and citation relationships between the key information fragments are deeply reasoned to construct a knowledge graph of the evolution of meeting discussions that includes node importance weights and edge semantic type annotations.

[0068] Based on the knowledge graph of the meeting discussion evolution and the preset meeting process, the knowledge content to be pushed and the timing of its push are determined, knowledge push instructions for different participants are generated, and the knowledge push instructions are executed to push the knowledge content to be pushed to the participants' terminal devices.

[0069] In one optional implementation, the real-time speech data stream is transcribed and semantically parsed in real time, and divided into multiple semantic segments based on semantic integrity constraints and time window constraints, including:

[0070] The real-time speech data stream is subjected to streaming speech recognition, and text segments with timestamps are output in real time. The text segments are then spliced and the sentence structure is adjusted to form a continuous text transcription sequence.

[0071] Real-time semantic analysis is performed on the text transcription sequence to identify topic transition markers and semantic boundary markers. The topic transition markers include topic introduction words and transition words, and the semantic boundary markers include semantic closure markers and logical termination markers. Text intervals that simultaneously satisfy the condition of no topic transition and semantic closure are selected as candidate segments under semantic integrity constraints.

[0072] A sliding time window is established based on a preset duration. The sliding time window is moved in chronological order on the text transcription sequence. When the text interval covered by the time window crosses the boundary of the candidate segment, it is determined whether there is a complete candidate segment within the time window. If there is, segmentation is performed at the boundary position of the candidate segment. If there is no complete candidate segment, the sliding continues until the next candidate segment boundary is encountered, resulting in multiple semantic segments.

[0073] In this specific embodiment, streaming speech recognition is performed on the real-time voice data stream. A deep learning-based speech recognition model processes the input audio data. This model employs a hybrid attention mechanism and generates text segments with start and end timestamps in real time. For example, when a participant says "The weather is really nice today," the recognition engine outputs a segment containing that text and marks its time position in the original audio, such as [00:02.351-00:05.725]. These timestamped text segments are temporarily stored in a buffer for subsequent processing.

[0074] The identified text segments are spliced and formatted. The splicing process uses the continuity of timestamps to determine the relationship between segments; segments with similar timestamps (intervals less than a preset threshold, such as 200 milliseconds) are spliced directly. The formatting stage addresses common issues in speech recognition, such as removing duplicate words, correcting punctuation errors, and standardizing informal expressions. This process preserves the original timestamp information to ensure a temporal correspondence between the generated continuous text transcription sequence and the original speech data. Through this step, the previous example will be formatted as "The weather is really nice today," retaining the time information [00:02.351-00:05.725].
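The timestamp-based splicing described above can be sketched as follows. The sketch assumes times are expressed in seconds and that adjacent segments are joined with a single space; the names `TextSegment` and `splice` are hypothetical, and the formatting stage (duplicate removal, punctuation correction) is not covered.

```python
from dataclasses import dataclass

@dataclass
class TextSegment:
    text: str
    start: float  # seconds from the start of the audio
    end: float

def splice(segments, max_gap=0.2):
    """Join segments whose gap to the previous segment is below max_gap (200 ms)."""
    merged = []
    for seg in sorted(segments, key=lambda s: s.start):
        if merged and seg.start - merged[-1].end < max_gap:
            last = merged[-1]
            # Keep the earliest start and the latest end so timestamps stay aligned.
            merged[-1] = TextSegment(
                f"{last.text} {seg.text}", last.start, max(last.end, seg.end)
            )
        else:
            merged.append(TextSegment(seg.text, seg.start, seg.end))
    return merged
```

Splitting the example utterance into two recognizer outputs 100 ms apart yields one spliced segment that still spans 2.351 s to 5.725 s.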

[0075] Real-time semantic parsing of the transcribed text sequence involves two core tasks: identifying topic transition markers and identifying semantic boundary markers. The identification of topic transition markers adopts a combination of rule-based and statistical methods, mainly detecting two types of markers: topic introduction words (such as "speaking of," "about," "talk about," etc.) and transition conjunctions (such as "but," "however," etc.). Meanwhile, the identification of semantic boundary markers focuses on semantic closure markers (such as the completion of a question-and-answer pair, the end of the expression of an opinion) and logical termination markers (such as summarizing statements, concluding expressions).

[0076] During semantic parsing, a sliding window mechanism is used for dynamic analysis. The window covers 5 lexical units before and 5 lexical units after the current position. For each window position, the topic coherence score and semantic integrity score are calculated. Topic coherence is measured by word embedding similarity. When the cosine similarity between the word embedding vectors in the window falls below the threshold of 0.4, the position is marked as a potential topic change point. Semantic integrity is determined by syntactic structure analysis and semantic relations. When the syntactic tree structure is complete and there are no dangling dependencies, the position is marked as a potential semantic closure point. Combining these two indicators, text intervals that simultaneously satisfy "no topic transition" (i.e., no topic transition markers are detected) and "have semantic closure" (i.e., there are semantic boundary markers) are marked as candidate segments under semantic integrity constraints.
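The topic coherence check can be illustrated with a short sketch. It assumes that the embeddings of the 5 lexical units before a position and the 5 units after it are each averaged and then compared by cosine similarity; the averaging step and the function names are assumptions, since the patent does not state how the vectors in the window are combined.

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as unrelated
    return dot / (norm_u * norm_v)

def mean_vector(vectors):
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def topic_change_points(embeddings, window=5, threshold=0.4):
    """Return positions where the left and right context embeddings diverge."""
    points = []
    for i in range(window, len(embeddings) - window + 1):
        left = mean_vector(embeddings[i - window:i])
        right = mean_vector(embeddings[i:i + window])
        if cosine_similarity(left, right) < threshold:
            points.append(i)  # potential topic change before lexical unit i
    return points
```

Positions closer than `window` units to either end of the sequence are skipped because they lack a full context on one side.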

[0077] A sliding time window is established based on a preset duration, which is usually set to 10-30 seconds and can be flexibly adjusted according to the application scenario. The time window moves in chronological order on the text transcription sequence, with each movement step being a preset value (e.g., 2 seconds). When the text interval covered by the time window crosses the boundary of a candidate segment, the segmentation judgment logic is triggered. Specifically, it is determined whether there is a complete candidate segment within the time window. If there is, segmentation is performed at the boundary of the candidate segment. If not, the sliding continues until the next candidate segment boundary is encountered.

[0078] In the segmentation process, a priority mechanism is introduced to handle complex situations. When multiple candidate segments are contained within the time window, the segment with the highest semantic integrity score is selected as the segmentation point. When the time window is about to exceed the maximum preset length (e.g., 60 seconds) and no ideal segmentation point has been found, a forced segmentation mechanism is triggered, which selects the position with the least semantic loss within the current window for segmentation.
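The segmentation logic of the two paragraphs above can be sketched as follows. The sketch assumes candidate segments arrive as non-overlapping `(start, end, integrity_score)` tuples in seconds, and it simplifies the forced segmentation: instead of searching for the position with the least semantic loss, it cuts at the end of the maximum-length window. The function name `segment` and its parameters are hypothetical.

```python
def segment(candidates, total_duration, window=20.0, step=2.0, max_length=60.0):
    """Return cut times (seconds) dividing the transcript into semantic segments."""
    cuts = []
    cursor = 0.0
    while cursor < total_duration:
        span = window
        cut = None
        while cut is None:
            limit = min(cursor + span, total_duration)
            # Candidate segments lying completely inside the current window.
            complete = [
                c for c in candidates
                if c[0] >= cursor and cursor < c[1] <= limit
            ]
            if complete:
                # Priority rule: cut at the candidate with the highest integrity score.
                cut = max(complete, key=lambda c: c[2])[1]
            elif span >= max_length or limit >= total_duration:
                cut = limit  # forced segmentation (simplified to the window end)
            else:
                span = min(span + step, max_length)  # keep sliding forward
        cuts.append(cut)
        cursor = cut
    return cuts
```

With candidates ending at 8 s, 15 s, and 70 s in a 90 s recording, the first cut falls at 15 s (the higher-scoring candidate), the window then extends until the 70 s boundary, and the remainder is closed at 90 s.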

[0079] Through the above steps, the resulting semantic fragments satisfy both semantic integrity requirements and time length constraints. Each fragment contains relatively complete semantic information and has a suitable duration, facilitating subsequent processing and analysis. These fragments are stored in a data structure of {text content, start time, end time, semantic type}, where the semantic type is categorized into "statement," "question," and "response" based on content characteristics.
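The storage structure described above maps naturally onto a small record type. The sketch below is one possible representation; the class and attribute names are hypothetical, and times are assumed to be in seconds.

```python
from dataclasses import dataclass
from enum import Enum

class SemanticType(Enum):
    STATEMENT = "statement"
    QUESTION = "question"
    RESPONSE = "response"

@dataclass(frozen=True)
class SemanticFragment:
    text: str
    start_time: float  # seconds
    end_time: float
    semantic_type: SemanticType

    def __post_init__(self):
        # Reject fragments whose time range is inverted.
        if self.end_time < self.start_time:
            raise ValueError("end_time must not precede start_time")

    @property
    def duration(self) -> float:
        return self.end_time - self.start_time
```

Making the record immutable keeps the timestamps bound to the text for all later extraction steps.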

[0080] In practical applications, such as online meeting recording systems, this method can effectively divide lengthy meeting audio into multiple semantically independent segments in real time. For example, when a participant transitions from product presentation to market analysis, the method can accurately segment the audio at the topic change point and add appropriate tags to each segment, greatly improving the efficiency of subsequent meeting content retrieval and understanding.

[0081] Figure 2 is a flowchart illustrating a method for constructing structured key information fragments according to an embodiment of the present invention. In one optional implementation, based on a predefined information extraction template and a dynamically learned conference topic model, key information is identified and extracted from each semantic fragment, the importance score of each information element is calculated, and a structured key information fragment is constructed, including:

[0082] Construct a predefined information extraction template library, which contains multiple information extraction templates. Each information extraction template defines the structured field composition of a specific type of key information, the dependencies between fields, and the semantic constraint rules for field values.

[0083] Extract the topic feature vector of the semantic fragment and input it into the dynamically learned conference topic model. The model updates the topic cluster centers and probability distribution parameters in real time based on the topic feature distribution of the processed semantic fragments, and outputs the topic category of the semantic fragment and its affiliation confidence.

[0084] Information extraction templates are retrieved based on the topic category of semantic fragments. Candidate information elements are extracted based on the field composition and semantic constraint rules defined in the template. The logical consistency of the candidate information elements is verified through the dependency relationship between fields in the template.

[0085] For candidate information elements that pass the logical consistency verification, the preset weights of their corresponding fields in the template and the topic affiliation confidence of their semantic segments are combined to calculate an importance score. Candidate information elements with importance scores exceeding the importance threshold are organized into structured fields and bound to the timestamp and participant identifier of their semantic segments to form key information segments.

[0086] In this specific embodiment, constructing a predefined information extraction template library is a fundamental step in achieving efficient meeting information extraction. This template library contains multiple information extraction templates, each specifically defining the structured field composition of a particular type of key information, the dependencies between fields, and the semantic constraints on field values. The template library design process requires analyzing common key information types in meeting scenarios, such as decision-making, task allocation, risk warnings, and technical solutions. For each information type, its structured representation is defined in detail, including core fields, optional fields, and their attribute definitions.

[0087] Taking task assignment information as an example, its template includes fields such as "task name", "responsible person", "collaborator", "deadline", "priority", "prerequisites", and "acceptance criteria". Each field has a set data type, value range and format requirements. For example, the "deadline" field is of date and time type and the format can include absolute date and relative time expressions; the "priority" field is of enumeration type and the value range is limited to three levels: "high", "medium" and "low".

[0088] The template design defines the dependencies between fields and builds a complete constraint network. Dependencies are divided into two categories: necessary dependencies and conditional dependencies. Necessary dependencies mean that a certain field must appear together with another field. For example, if the "Collaborator" field appears, the "Responsible Person" field must exist. Conditional dependencies mean that the value of a field is affected by the value of other fields. For example, when "Priority" is "High", "Deadline" must specify a specific date rather than a vague expression. These dependencies are encoded through a rule description language and stored in the metadata part of the template.

[0089] To ensure the semantic correctness of the extracted information, semantic constraint rules are formulated for each field. The rules include value range validation, format validation, and semantic consistency checks. Value range validation ensures that the field value is within a predetermined range, such as "priority" cannot have a value outside the predefined range; format validation ensures that the field value conforms to specific format requirements, such as date and time standardization; semantic consistency checks ensure that the logic between related fields is reasonable, such as "deadline" should not be earlier than the current date. These rules are described in a declarative language, which facilitates subsequent expansion and maintenance.
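
The task assignment rules described above (value range, necessary dependency, conditional dependency, and semantic consistency) can be sketched as a single validation function. The function name, the dictionary keys, and the error messages are hypothetical; the specification encodes such rules in a declarative rule language rather than in code.

```python
from datetime import date
from typing import Dict, List

PRIORITY_LEVELS = {"high", "medium", "low"}


def validate_task(fields: Dict[str, object], today: date) -> List[str]:
    """Return the violated rules; an empty list means the record is valid."""
    errors: List[str] = []
    priority = fields.get("priority")
    deadline = fields.get("deadline")  # a date, a vague string, or None

    # Value range validation: priority is limited to three levels.
    if priority is not None and priority not in PRIORITY_LEVELS:
        errors.append("priority outside predefined range")
    # Necessary dependency: a collaborator implies a responsible person.
    if "collaborator" in fields and "responsible_person" not in fields:
        errors.append("collaborator requires responsible_person")
    # Conditional dependency: high priority needs a specific date.
    if priority == "high" and not isinstance(deadline, date):
        errors.append("high priority requires a specific deadline date")
    # Semantic consistency: the deadline must not lie in the past.
    if isinstance(deadline, date) and deadline < today:
        errors.append("deadline earlier than current date")
    return errors
```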

[0090] The template library is organized in a hierarchical structure. The top layer distinguishes the main information categories, and the second layer subdivides the specific template types. It supports inheritance and combination between templates. Through the template inheritance mechanism, common fields and rules can be defined by the parent template. Child templates in specific fields inherit these definitions and add exclusive content. For example, R&D meetings and marketing meetings share the basic task template, but the R&D meeting task template additionally includes the "technology dependency" field, while the marketing meeting task template adds the "target customer" field.
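
A minimal sketch of the inheritance mechanism, assuming templates are reduced to ordered field lists. The `Template` class and its `all_fields` method are illustrative names, and the base field set shown is an abbreviated assumption rather than the full template.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Template:
    name: str
    own_fields: Tuple[str, ...]
    parent: Optional["Template"] = None

    def all_fields(self) -> Tuple[str, ...]:
        # Inherited fields come first; the child then adds its exclusive ones.
        inherited = self.parent.all_fields() if self.parent else ()
        return inherited + tuple(f for f in self.own_fields if f not in inherited)


base_task = Template("task", ("task name", "responsible person", "deadline", "priority"))
rd_task = Template("R&D task", ("technology dependency",), parent=base_task)
marketing_task = Template("marketing task", ("target customer",), parent=base_task)
```

Here `rd_task.all_fields()` yields the four shared fields followed by "technology dependency", while `marketing_task` ends with "target customer" instead.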

[0091] After the template library is built, the real-time audio stream of the conference is connected and converted into a text stream through speech recognition technology. The text stream undergoes preprocessing, including noise filtering, sentence segmentation, and pronoun resolution, to form semantically complete text segments. Each semantic segment is analyzed for its semantic features through a large language model to extract topic feature vectors. The topic feature vectors are low-dimensional representations of semantic segments in the topic space, reflecting the distribution of topic information contained in the segments.

[0092] The process of extracting topic feature vectors utilizes the deep representation capabilities of a pre-trained large language model. Semantic segments are input into the large language model to obtain their hidden layer representations. Through a feature projection layer, the high-dimensional hidden layer representations are mapped to the topic feature space to form fixed-dimensional feature vectors. To enhance feature representation capabilities, contextual information is integrated during the feature extraction stage, so that the feature vectors of each segment not only reflect their own semantics but also contain contextual information during the meeting.

[0093] The extracted topic feature vectors are input into a dynamically learned conference topic model. This model is an online topic model that can update topic cluster centers and probability distribution parameters in real time based on the topic feature distribution of processed semantic fragments. During the model initialization phase, the initial number of topics and their distribution parameters are preset based on historical conference data. During operation, these parameters are continuously optimized through incremental learning to adapt to the evolution of topics in the current conference.

[0094] The conference topic model uses a Gaussian mixture distribution to represent the topic space, with each topic corresponding to a Gaussian distribution. The mean vector represents the topic center, and the covariance matrix represents the dispersion of the topics. When a new feature vector is input, the Mahalanobis distance between it and each topic center is calculated to determine the most likely topic affiliation and confidence level. At the same time, a Bayesian update rule is applied to adjust the topic distribution parameters according to the new samples to achieve dynamic optimization of the model.
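
The assignment and update steps can be sketched under simplifying assumptions: each topic uses a diagonal covariance, the affiliation confidence is taken as the normalized value of exp(-d²/2) over all topics, and a running-mean update stands in for the Bayesian update rule. None of these specific choices is stated in the specification, and all names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Topic:
    name: str
    mean: List[float]
    var: List[float]  # diagonal of the covariance matrix (simplifying assumption)
    count: int = 1    # number of samples absorbed so far


def mahalanobis(x: Sequence[float], topic: Topic) -> float:
    # With a diagonal covariance the Mahalanobis distance reduces to a
    # variance-scaled Euclidean distance.
    return math.sqrt(sum((xi - mi) ** 2 / vi
                         for xi, mi, vi in zip(x, topic.mean, topic.var)))


def assign(x: Sequence[float], topics: List[Topic]) -> Tuple[Topic, float]:
    """Return the nearest topic and an affiliation confidence in [0, 1]."""
    dists = [mahalanobis(x, t) for t in topics]
    weights = [math.exp(-0.5 * d * d) for d in dists]
    total = sum(weights)
    best = min(range(len(topics)), key=lambda i: dists[i])
    confidence = weights[best] / total if total > 0 else 0.0
    return topics[best], confidence


def update(topic: Topic, x: Sequence[float]) -> None:
    # Incremental running-mean update of the topic center.
    topic.count += 1
    topic.mean = [m + (xi - m) / topic.count for m, xi in zip(topic.mean, x)]
```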

[0095] The model has the ability to adapt to the number of topics. By monitoring the topic cohesion and dispersion indices, it dynamically determines whether to merge similar topics or create new topics. When the samples within a topic are too scattered or there are difficult-to-classify sample clusters, a topic splitting mechanism is triggered to create a new topic. When the centers of two topics are too close, a topic merging mechanism is triggered to reduce redundant representations. This dynamic adjustment ensures that the topic model accurately reflects the semantic structure of the conference content.

[0096] The topic model outputs the topic category of each semantic fragment and its affiliation confidence, which serve as an important basis for subsequent information extraction. Based on the topic category, the corresponding information extraction template is selected for refined processing. For example, fragments identified as "task allocation" are extracted using a task allocation template; fragments identified as "decision achievement" are extracted using a decision template.

[0097] In a practical application, during a weekly meeting of a research and development project to discuss the progress of feature development, the real-time transcribed semantic fragment was: "The interface module needs to be completed by next Wednesday. Engineer Zhang is in charge. The quality must meet the design specifications. It has a high priority." Through feature extraction, the topic feature vector of this fragment was obtained. After inputting it into the dynamically learned meeting topic model, its topic category was determined to be "task allocation", with a confidence level of 0.92.

[0098] Based on the topic category of the semantic fragment, matching information extraction templates are retrieved from the predefined template library. The matching process employs a two-stage strategy: first, the candidate template set is looked up directly by topic category; second, the similarity between the semantic fragment's content features and each candidate template is calculated. The template with the highest similarity is selected, and information extraction is performed on the semantic fragment using the selected template. Candidate information elements that match the fields defined in the template are identified using techniques such as named entity recognition and relation extraction. For example, from the sentence "Xiao Li is responsible for the market research report, which needs to be completed by next Friday," information elements such as "Xiao Li" (responsible person), "market research report" (task content), and "next Friday" (deadline) are extracted according to the task allocation template.

[0099] Based on the field dependencies defined in the template, the logical consistency of candidate information elements is verified. This includes verifying whether the extracted time expression conforms to chronological constraints and whether the responsible person appears in the participant list. For candidate information elements that fail verification, attempts are made to supplement information from the context or mark them as uncertain. For candidate information elements that pass verification, an importance score is calculated based on the preset weight of the corresponding field in the template and the topic affiliation confidence of the semantic segment. The calculation method is: Importance Score = Preset Field Weight × Topic Affiliation Confidence × Extraction Confidence. Here, the preset field weight reflects the relative importance of the field in a specific template, the topic affiliation confidence indicates the probability that the semantic segment belongs to the current topic, and the extraction confidence indicates the accuracy of the information element matching the corresponding field.
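
The importance score formula and the subsequent threshold filter can be written out directly. In this sketch the function names and the dictionary layout of a candidate are hypothetical, and the example field weight and extraction confidence are illustrative values; only the topic affiliation confidence of 0.92 comes from the earlier example.

```python
from typing import Dict, List


def importance_score(field_weight: float, topic_confidence: float,
                     extraction_confidence: float) -> float:
    # Importance Score = Preset Field Weight x Topic Affiliation Confidence
    #                    x Extraction Confidence
    return field_weight * topic_confidence * extraction_confidence


def select_key_elements(candidates: List[Dict[str, object]],
                        topic_confidence: float,
                        threshold: float) -> List[Dict[str, object]]:
    """Keep candidates whose importance score exceeds the threshold."""
    kept = []
    for cand in candidates:
        score = importance_score(cand["weight"], topic_confidence,
                                 cand["extraction_confidence"])
        if score > threshold:
            kept.append({**cand, "importance": score})
    return kept


# Illustrative values: 0.9 x 0.92 x 0.95
example = importance_score(0.9, 0.92, 0.95)
```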

[0100] Candidate information elements whose importance scores exceed a predefined importance threshold are retained and organized into structured fields according to the template's defined structure. These structured fields are then bound to the timestamp and participant identifier of their respective semantic segments to form complete key information segments. For example, a structured information segment could be represented as: {Timestamp: "00:15:32", Speaker: "General Manager Zhang", Topic Category: "Task Assignment", Structured Fields: {Task: "Market Research Report", Responsible Person: "Xiao Li", Deadline: "July 15, 2023"}}.

[0101] During the information extraction process, the topic model parameters are dynamically updated to adapt to changes in the meeting content. When a new topic cluster or an existing topic evolution is detected, the topic model parameters are dynamically adjusted based on the new semantic features, triggering the activation or weight adjustment of the corresponding templates in the template library. Simultaneously, based on the extracted set of key information fragments, a structured representation of the meeting content is constructed to support subsequent applications such as automatic generation of meeting minutes and decision tracking.

[0102] In addition, an adaptive importance threshold adjustment mechanism has been implemented, which dynamically adjusts the importance threshold based on factors such as meeting progress and topic density. This ensures that more key information is extracted in information-dense sections and unnecessary information is filtered out in redundant discussion sections. In this way, the technical solution can efficiently extract structured key information from lengthy meeting content and maintain the logical connection between information, greatly improving the efficiency of meeting content understanding and knowledge accumulation.

[0103] In one optional implementation, retrieving a matching information extraction template based on the topic category of the semantic fragment, extracting candidate information elements based on the field composition and semantic constraint rules defined in the template, and verifying the logical consistency of the candidate information elements through the dependencies between fields in the template includes:

[0104] Using the topic category of the semantic fragment as the index key, locate the associated candidate information extraction template, extract the semantic feature vector of the semantic fragment to calculate the semantic similarity, and select the candidate information extraction template with the highest semantic similarity as the matching template.

[0105] The structured field composition is obtained from the matching template, the semantic constraint rules associated with each field are extracted, and the semantic constraint rules are parsed into semantic pattern descriptors and constraint condition expressions. The semantic pattern descriptor defines the textual representation of the field, and the constraint condition expression defines the semantic boundary conditions for the value.

[0106] Semantic matching is performed in the semantic fragment to identify text intervals that conform to the semantic pattern descriptor, extract entity references and attribute values, determine whether the constraint condition expression is satisfied, and use the values that meet the conditions as candidate field values to form candidate information elements.

[0107] Based on the dependencies between fields in the matching template, a dependency verification graph for the candidate information elements is established. Nodes represent candidate field values, and directed edges represent dependencies and are labeled with constraints. The verification graph is traversed in dependency order to check the dependency constraints on each edge. The source node field values are substituted into the constraints as a premise to determine whether the target node field values meet the constraints. When all constraints are met, the candidate information elements are determined to have logical consistency.

[0108] In this specific embodiment, after obtaining the topic category of the semantic fragment, the category is used as the index key to locate the associated candidate information extraction template in the predefined information extraction template library. The template library adopts a multi-level index structure, with the topic category as the first-level index and the sub-category as the second-level index to achieve fast query. For example, when the semantic fragment is classified as the topic of "task allocation", the query returns all templates related to the topic, such as "R&D task template", "market task template", "regular task template", etc., which constitute a candidate template set.

[0109] For each candidate template, the semantic feature vector of the semantic segment is extracted and its semantic similarity with the standard sample of the template is calculated. The semantic feature vector is extracted by the encoder of the pre-trained large language model and obtained by weighted fusion of the last few hidden layer states of the model. Cosine similarity is used as the similarity measure: the semantic segment feature vector is compared with the standard sample feature vector of each candidate template to quantify the degree of semantic similarity. The candidate template with the highest similarity is selected as the final matching template for the subsequent information extraction process.
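
Template selection by cosine similarity can be sketched as follows. The function names are hypothetical and the three-dimensional vectors in the usage note are toy values; real feature vectors would come from the language model encoder.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)


def match_template(fragment_vec: Sequence[float],
                   template_vecs: Dict[str, Sequence[float]]) -> Tuple[str, float]:
    """Return the candidate template name with the highest similarity."""
    scored = ((name, cosine_similarity(fragment_vec, vec))
              for name, vec in template_vecs.items())
    return max(scored, key=lambda pair: pair[1])
```

For instance, calling `match_template([0.9, 0.1, 0.3], {...})` with one toy standard-sample vector per candidate template returns the name of the closest template together with its similarity.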

[0110] In a practical application, during a research and development meeting, a semantic fragment appeared stating, "The front-end team needs to complete the login page optimization by Thursday. Xiao Wang is responsible for coordination, prioritizing the display effect on mobile devices." After topic classification, it was identified as the "task allocation" topic, and three candidate templates were retrieved: "Research and Development Task Template," "Design Task Template," and "Regular Task Template." Through semantic similarity calculation, the similarity between this fragment and "Research and Development Task Template" was 0.86, which was significantly higher than the other candidate templates. Therefore, "Research and Development Task Template" was selected as the matching template.

[0111] The structured fields are extracted from the matching template. These fields define the key information elements that need to be extracted from the semantic fragments. Taking the R&D task template as an example, its structured fields include "task name", "responsible person", "deadline", "priority", "related teams", and "focus". Each field is associated with specific semantic constraint rules to guide the information extraction process.

[0112] Extract the semantic constraint rules associated with each field and parse them into two parts: a semantic pattern descriptor and a constraint expression. The semantic pattern descriptor defines the textual representation of the field, describing how the field is expressed in natural language. For example, the semantic pattern descriptor of the "deadline" field includes patterns such as "before..." and "until...", which are formally described using regular expressions or lexical rules. The constraint expression defines the semantic boundary conditions for the value, specifying the semantic constraints that the field value must satisfy. For example, the constraint expression for "deadline" stipulates that the time point must be a future time point and should have a clear time precision.

[0113] Semantic pattern descriptors are represented using a specific description language and support various text matching modes, including exact matching, fuzzy matching, and context-dependent matching. Constraint expressions are written using a conditional logic language and include comparison operators, logical operators, and specific semantic functions to validate the validity of field values. These rule expressions are pre-compiled into executable validation functions to improve runtime efficiency.

[0114] Semantic matching is performed on the semantic segment using a multi-stage strategy. Based on the semantic pattern descriptors, a pattern matching algorithm identifies text intervals in the segment that match the description. For highly structured expressions, rule matching is applied directly; for implicit expressions, semantic-level matching is performed using the understanding ability of the large language model. The matching process takes context into account and handles coreference and ellipsis.

[0115] Entity references and attribute values are extracted from the matched text range. Named entity recognition technology and attribute value extraction rules are applied to convert unstructured text into structured information. For the aforementioned R&D meeting example, "login page optimization" is identified as the task name, "Xiao Wang" as the person in charge, "before Thursday" as the deadline, "front-end team" as the relevant team, and "mobile display effect" as the focus.

[0116] Each extracted value is checked against its constraint expression: the value is substituted into the constraint to verify that it meets the semantic requirements. For example, for the deadline "before Thursday", the relative time expression is first converted into an absolute date, which is then checked against the "future time point" constraint. Values that meet the conditions are retained as valid candidate field values; values that do not are marked as invalid, and the violated constraint is recorded for subsequent processing.
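
The "before Thursday" check can be sketched with a regular expression and date arithmetic. This is a simplified illustration: the pattern covers only weekday names introduced by "before", "until", or "by" (the last is an added assumption), a weekday equal to today is assumed to mean the following week, and the function names are hypothetical.

```python
import re
from datetime import date, timedelta
from typing import Optional

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
DEADLINE_PATTERN = re.compile(
    r"\b(?:before|until|by)\s+(" + "|".join(WEEKDAYS) + r")\b", re.IGNORECASE)


def resolve_deadline(text: str, today: date) -> Optional[date]:
    """Convert a relative weekday expression into an absolute date."""
    match = DEADLINE_PATTERN.search(text)
    if match is None:
        return None
    target = WEEKDAYS.index(match.group(1).lower())
    # Days until the next occurrence; the same weekday maps to next week.
    days_ahead = (target - today.weekday()) % 7 or 7
    return today + timedelta(days=days_ahead)


def is_future_time_point(deadline: Optional[date], today: date) -> bool:
    # Constraint expression for "deadline": must be a future time point.
    return deadline is not None and deadline > today
```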

[0117] Based on the dependencies between fields defined in the matching template, a dependency verification graph for candidate information elements is established. In this graph structure, nodes represent candidate field values, directed edges represent dependencies and are labeled with constraints. Dependencies can be divided into strong dependencies and weak dependencies. Strong dependencies represent fields that must exist simultaneously, while weak dependencies represent optional conditional relationships. For example, in the task template, "focus" depends on the existence of "task name," which is a strong dependency; while "priority" should have a clear "deadline" when it is high, which is a weak dependency.

[0118] After the dependency verification graph is constructed, it is traversed and verified according to the dependency order. The traversal uses a topological sorting algorithm to ensure that the source node is processed before the target node. For each edge in the graph, the dependency constraint it represents is checked. The field value of the source node is substituted into the constraint as a premise to determine whether the field value of the target node meets the constraint requirements.
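
A minimal sketch of the traversal, using the standard-library `graphlib.TopologicalSorter` (Python 3.9 or later) to order the checks. The `verify` function, the edge tuple layout, and the two example constraints are illustrative assumptions modeled on the task template examples.

```python
from datetime import date
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Tuple

# A constraint receives (source value, target value) and reports satisfaction.
Constraint = Callable[[object, object], bool]
Edge = Tuple[str, str, Constraint]  # (source field, target field, constraint)


def verify(values: Dict[str, object], edges: List[Edge]) -> List[str]:
    """Check every dependency edge in topological order; return failed edges."""
    graph: Dict[str, set] = {name: set() for name in values}
    for src, dst, _ in edges:
        graph.setdefault(src, set())
        graph.setdefault(dst, set()).add(src)  # src must be processed before dst
    order = list(TopologicalSorter(graph).static_order())
    rank = {node: i for i, node in enumerate(order)}
    failures = []
    for src, dst, check in sorted(edges, key=lambda e: rank[e[1]]):
        if not check(values.get(src), values.get(dst)):
            failures.append(f"{src} -> {dst}")
    return failures


EXAMPLE_EDGES: List[Edge] = [
    # "focus" may only appear when a "task name" exists.
    ("task name", "focus", lambda s, t: t is None or s is not None),
    # High priority requires a specific deadline date.
    ("priority", "deadline", lambda s, t: s != "high" or isinstance(t, date)),
]
```

An empty result from `verify` corresponds to the candidate information elements being logically consistent.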

[0119] In the aforementioned R&D meeting example, we verify whether "mobile display effect" (focus) is semantically related to "login page optimization" (task name); check whether "before Thursday" (deadline) provides sufficient time precision; verify whether "Xiao Wang" (responsible person) is a team member, etc. When all the constraints on the dependency edges are met, we determine that the candidate information elements have logical consistency and form effective structured information.

[0120] For issues discovered during dependency validation, such as missing necessary fields or logical conflicts between fields, a detailed description of the problem is recorded for subsequent information completion or conflict resolution. For example, if dependency validation finds that the "priority" field is missing, the information is flagged as needing its priority clarified; if it finds that the "deadline" conflicts with the overall project schedule, a potential risk warning is flagged.

[0121] After dependency verification is completed, the verified candidate information elements are combined to form the final structured information, which includes topic categories, values of each field and their credibility scores. This structured information serves as the result of extracting key information from the meeting and enters the subsequent knowledge push stage. Based on the importance and relevance of the information, it is pushed to the corresponding meeting participants and relevant personnel.

[0122] The above process, through precise template matching, rigorous semantic constraint verification, and comprehensive dependency checks, ensures that the key information extracted from conference audio is highly accurate and structured, providing a reliable foundation for subsequent knowledge management and task tracking. Simultaneously, the semantic understanding capabilities based on a large language model enable the system to handle conference content in various expressions, adapting to the information extraction needs of different conference scenarios.

[0123] In one optional implementation, associating and aggregating the key information fragments, performing deep reasoning on the argumentation and citation relationships between them, and constructing a knowledge graph of the evolution of meeting discussions that includes node importance weights and edge semantic type annotations comprises:

[0124] Extract information elements, timestamp information and participant identification from the key information fragments, calculate the time interval decay factor and semantic association strength between the preceding and subsequent fragments, identify causal triggering relationships, establish causal triggering edges and calculate triggering strength weights;

[0125] Extract the claims and supporting evidence from the key information segments, identify the supporting and refuting relationships between the segments, establish argument relationship edges, and label the argument direction type;

[0126] Based on participant identification, the system tracks response sequences and opposing viewpoint sequences between key information fragments, identifies collaborative support patterns and adversarial debate patterns among participants, and calculates the influence weight of each participant.

[0127] Using the key information fragments as nodes and causal triggering edges and argumentation relationship edges as edges, a heterogeneous knowledge graph is constructed. The triggering intensity weight of the causal triggering edge and the influence weight of the participant are integrated as dual adjustment factors for node importance. The node importance weight is obtained through graph propagation and iterative update.

[0128] The trigger strength weight, argument direction type, and node importance weight are integrated into a heterogeneous knowledge graph to generate a knowledge graph of the evolution of the meeting discussion.

[0129] In this specific embodiment, the core theme, participant identity, and timestamp information in each key information segment are identified first. Specifically, named entity recognition is used to extract the participant's name, position, and other identifiers; time series analysis is used to extract the speaking time point; and topic modeling is used to extract the core topic words and key concepts in the segment. For example, in a product development meeting, the participant, time, and core viewpoint can be identified from the information segment "Engineer Zhang raised the issue of a performance bottleneck in product A at 14:05".

[0130] Based on the extracted timestamp information, the time interval decay factor between adjacent segments is calculated. The time interval decay factor adopts an exponential decay function, that is, the larger the time interval between two information segments, the more obvious the decay of their association strength. At the same time, the semantic similarity between segments is calculated through a word vector model, and the comprehensive association strength is obtained by combining the time decay factor. For information segments that are semantically similar and close in time, their association strength is high; otherwise, it is low.
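
The decay and combination step can be sketched as below. The half-life parameterization, the default of 120 seconds, and the use of a simple product to combine the two factors are assumptions; the specification states only that the decay is exponential and that the two quantities are combined.

```python
import math


def time_decay(interval_seconds: float, half_life_seconds: float = 120.0) -> float:
    """Exponential decay factor: 1.0 at zero interval, 0.5 at one half-life."""
    return math.exp(-math.log(2) * interval_seconds / half_life_seconds)


def association_strength(semantic_similarity: float, interval_seconds: float,
                         half_life_seconds: float = 120.0) -> float:
    # Comprehensive association strength, assumed here to be the product of
    # semantic similarity and the time interval decay factor.
    return semantic_similarity * time_decay(interval_seconds, half_life_seconds)
```

Under these assumptions, two fragments with similarity 0.8 spoken two minutes apart receive a strength of about 0.4, while the same pair spoken back to back keeps the full 0.8.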

[0131] Identify causal triggering relationships between fragments. Through a causal relationship extraction model, analyze causal markers (such as "cause", "incident", "prompt") and logical structures in information fragments to determine whether there is a causal relationship between fragments. For the identified causal relationships, calculate the triggering strength weight based on the semantic association strength and logical determinism. For example, when discussing the causes of system failures, a causal triggering edge can be established between "excessive server load" and "slow system response". Its triggering strength depends on the logical determinism and semantic association between the two.

[0132] This process extracts the claims and supporting evidence from information fragments. Using an argument identification model, it analyzes the rhetorical structure of the fragments to identify the argument statement and supporting evidence. Based on the consistency and conflict of the claims, it identifies the supporting and refuting relationships between different fragments. Supporting relationships are labeled as "supplementary," "strengthened," or "expanded," while refuting relationships are labeled as "questioning," "negating," or "restricting." For example, when participant A suggests product feature improvement and participant B provides relevant market data, a "strengthening" supporting edge is established between the two fragments.

[0133] Based on participant identification, the system tracks response sequences and viewpoint interaction patterns during the meeting. By analyzing the order of speeches and content relevance, it identifies collaborative support patterns and adversarial debate patterns among participants. Collaborative support patterns are characterized by participants complementing each other and progressively developing viewpoints. Adversarial debate patterns are characterized by participants questioning each other and proposing counterexamples. Based on the frequency of citations, the degree to which viewpoints are adopted, and the weight of speeches, the system calculates the influence weight of each participant. Participants who are frequently cited and whose viewpoints are often adopted by other participants receive higher influence weights.

[0134] In the knowledge graph construction phase, key information fragments are used as nodes, and causal triggering relationships and argumentation relationships are used as edges to construct an initial heterogeneous knowledge graph. Node attributes include information content, speaker, and timestamp; edge attributes include relationship type and either trigger strength or argumentation direction. The trigger strength weight of the causal triggering edges and the influence weight of the participants are integrated as dual adjustment factors for node importance.

[0135] The PageRank algorithm is used for graph propagation calculation, iteratively updating node importance weights. Specifically, each node is initially assigned the same importance value. In each iteration, a node receives importance propagated from neighboring nodes through its incoming edges, and the amount propagated is determined by both the source node's importance and the edge weight. The effect of each participant's influence weight on propagation is also taken into account. The iterative process continues until the node importance values converge or the preset number of iterations is reached.
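
A minimal sketch of the propagation, written as a weighted, personalized PageRank. Edge weights stand for trigger strengths, and participant influence is assumed to enter through the teleport distribution; the specification does not say how influence is applied, so this is one possible reading. The function name, the damping value of 0.85, and the handling of nodes without outgoing edges are likewise assumptions.

```python
from typing import Dict, List, Tuple


def node_importance(nodes: List[str],
                    edges: List[Tuple[str, str, float]],
                    influence: Dict[str, float],
                    damping: float = 0.85,
                    max_iter: int = 100,
                    tol: float = 1e-9) -> Dict[str, float]:
    """edges: (source, target, trigger strength); influence: node -> weight."""
    n = len(nodes)
    total_influence = sum(influence.get(v, 1.0) for v in nodes)
    teleport = {v: influence.get(v, 1.0) / total_influence for v in nodes}
    out: Dict[str, List[Tuple[str, float]]] = {v: [] for v in nodes}
    for src, dst, weight in edges:
        out[src].append((dst, weight))
    score = {v: 1.0 / n for v in nodes}  # equal initial importance
    for _ in range(max_iter):
        nxt = {v: (1.0 - damping) * teleport[v] for v in nodes}
        for src in nodes:
            out_total = sum(w for _, w in out[src])
            if out_total == 0:
                # No outgoing edges: redistribute via the teleport distribution.
                for v in nodes:
                    nxt[v] += damping * score[src] * teleport[v]
            else:
                for dst, w in out[src]:
                    nxt[dst] += damping * score[src] * w / out_total
        delta = sum(abs(nxt[v] - score[v]) for v in nodes)
        score = nxt
        if delta < tol:  # converged
            break
    return score
```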

[0136] The calculated trigger strength weights, argument direction types, and node importance weights are integrated into a heterogeneous knowledge graph to generate a knowledge graph of the evolution of meeting discussions. This graph intuitively shows the shift of focus, evolution of viewpoints, and decision-making process in the meeting discussion. For example, in a product strategy discussion meeting, the graph can clearly identify key decision points, core viewpoints that affect the decision, and the contribution weights of participants, providing a structured foundation for subsequent interpretation of meeting minutes and decision tracking.

[0137] In practical applications, the discussion content of a technology company's quarterly product planning meeting can be used to construct a knowledge graph using this method. During the meeting, the product manager puts forward user needs, the technical director analyzes technical feasibility, and the marketing specialist assesses market acceptance, forming a product iteration plan. Through the constructed knowledge graph, it is clearly shown that the data provided by the marketing specialist supports the product manager's suggestions on feature priority, while the technical director's assessment of the implementation difficulty triggers the adjustment of the solution, ultimately forming a decision path that balances user needs and technical feasibility. In addition, the graph also shows that the chief technology officer's opinion has a high influence weight, and his speech often becomes a key turning point in the decision.

[0138] The knowledge graph of meeting discussion evolution constructed in this way not only records the content of the meeting, but also captures the evolution of relationships between viewpoints, the interaction patterns of participants, and the path of decision formation, providing structured support for meeting summaries, decision retrospection, and experience accumulation.

[0139] In one optional implementation, the system tracks response sequences and opposing viewpoint sequences between key information fragments based on participant identification, identifies collaborative support patterns and adversarial debate patterns among participants, and calculates the influence weight of each participant, including:

[0140] A speaking sequence index is established based on the participant's identity. Key information fragments and timestamps of each participant are extracted. By calculating the temporal interval density and topic continuity of adjacent key information fragments of each participant, immediate response fragments and delayed response fragments are identified.

[0141] Extract the position semantic features and argument direction features from key information fragments, calculate the position difference degree and argument conflict degree between key information fragments of different participants, and mark the corresponding information fragment sequence as a viewpoint opposition sequence when both meet the opposition threshold.

[0142] For immediate response segments and delayed response segments, the cumulative support response value and response duration level of the participants are calculated. When both meet the collaboration threshold, it is identified as a collaborative support mode. For opposing viewpoint sequences, the rounds of argumentation and defense and the degree of stance persistence of the opposing sides are extracted. When both meet the adversarial threshold, it is identified as an adversarial debate mode.

[0143] The cumulative value of support responses is used as the contribution of collaboration, the number of rounds of argumentation and defense is used as the debate activity, and the frequency of citation of immediate response fragments is used as the appeal of viewpoints. The influence weight of the participants is obtained by weighting and summing the three components.

[0144] In this specific embodiment, participant identity information is linked through pre-registered user accounts in the conference system, and each participant is assigned a unique identifier. During the conference, when a participant speaks, the speech recognition module converts the speech into text and appends the participant identifier and timestamp to the text, forming a basic dataset. Based on the timestamp data, a participant speaking sequence is constructed. For any two adjacent key information segments i and j proposed by different participants, the temporal interval density $T_{i,j}$ is calculated as the start time of segment j minus the end time of segment i. If $T_{i,j}$ is less than a preset density threshold (e.g., 10 seconds), a temporal closeness relationship is considered to exist. Simultaneously, a word vector model is used to calculate the cosine similarity $S_{i,j}$ between the two segments, which indicates topic continuity. If $S_{i,j}$ exceeds the topic continuation threshold (e.g., 0.6), a topic continuation relationship is considered to exist.

[0145] When key information fragment j meets both the requirements of temporal closeness and topic continuity with fragment i, j is marked as an immediate response fragment to i. If it only meets the requirements of topic continuity but the time interval is long (e.g., more than 10 seconds but less than 60 seconds), it is marked as a delayed response fragment. This constructs a response sequence relationship diagram among the participants.
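As a minimal sketch of the classification rule above, the following code labels a fragment as an immediate or delayed response using the example thresholds from the text (10 seconds, 60 seconds, and a similarity of 0.6). The `Fragment` class, the function names, and the treatment of a gap of exactly 10 seconds as delayed are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Fragment:
    speaker: str
    start: float   # seconds from meeting start
    end: float
    vector: tuple  # hypothetical word-vector embedding of the fragment


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify_response(prev, cur, density_threshold=10.0,
                      delayed_limit=60.0, topic_threshold=0.6):
    """Return "immediate", "delayed", or None for fragment cur relative to prev."""
    if prev.speaker == cur.speaker:
        return None  # only fragments from different participants are compared
    if cosine(prev.vector, cur.vector) <= topic_threshold:
        return None  # no topic continuation relationship
    gap = cur.start - prev.end  # temporal interval density T_{i,j}
    if gap < density_threshold:
        return "immediate"
    if gap < delayed_limit:
        return "delayed"  # assumption: a gap of exactly 10 s counts as delayed
    return None
```

Applying `classify_response` to each adjacent pair in the speaking sequence yields the labeled edges of the response sequence relationship diagram.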

[0146] To identify opposing viewpoint sequences, the semantic features of each key information segment are extracted. Specifically, a position dictionary containing affirmative, negative, and adversative conjunctions is pre-constructed, and each word is assigned a position vector value. The overall position vector $V_i$ of the segment is calculated by statistically analyzing the frequency and positional weights of position words within the segment. For two adjacent key information fragments i and j presented by different participants, the position difference degree $D_{i,j}$ is calculated as the Euclidean distance between their position vectors $V_i$ and $V_j$.

[0147] Simultaneously, argument extraction technology is used to identify the core arguments and their directions in each information segment. Argument direction features are extracted by analyzing action words, modal words, and modifiers in the arguments, forming an argument direction vector $P_i$. The degree of argument conflict $C_{i,j}$ between adjacent segments is calculated as the negative of the cosine of the angle between the two argument direction vectors $P_i$ and $P_j$. When $D_{i,j}$ exceeds the position difference threshold (e.g., 0.7) and $C_{i,j}$ exceeds the argument conflict threshold (e.g., 0.6), the pair of information fragments is marked as opposing viewpoints, and related fragments are tracked to form an opposing viewpoint sequence.
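The two measures above can be sketched as follows, using the example thresholds of 0.7 and 0.6 from the text. The function names are illustrative, and the position and argument direction vectors are assumed to have been produced already by the dictionary-based and argument-extraction steps.

```python
import math


def position_difference(v_i, v_j):
    """D_{i,j}: Euclidean distance between two position vectors."""
    return math.dist(v_i, v_j)


def argument_conflict(p_i, p_j):
    """C_{i,j}: negative cosine of the angle between argument direction vectors."""
    dot = sum(a * b for a, b in zip(p_i, p_j))
    norm = math.hypot(*p_i) * math.hypot(*p_j)
    return -dot / norm if norm else 0.0


def is_opposed(v_i, v_j, p_i, p_j, diff_threshold=0.7, conflict_threshold=0.6):
    """Mark a fragment pair as opposing only when both thresholds are exceeded."""
    return (position_difference(v_i, v_j) > diff_threshold
            and argument_conflict(p_i, p_j) > conflict_threshold)
```

Because the conflict degree is a negated cosine, it lies in [-1, 1] and approaches 1 when the two arguments point in opposite directions.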

[0148] For the identified response sequences, the number of immediate responses received by each participant is counted and denoted as the cumulative support response value $R_i$. Each response chain is analyzed to calculate the chain depth from the initial statement to the final response, denoted as the response continuation level $L_i$. When the response chain triggered by participant A's speech satisfies $R_i$ greater than the collaboration support threshold and $L_i$ greater than the continuation level threshold, it is determined that a collaborative support mode has been formed between A and the respondents.
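A minimal sketch of these two statistics follows, assuming the response relations are available as (reply fragment, target fragment, kind) tuples. The text gives no threshold values, so the defaults in `is_collaborative` are hypothetical, as is the choice to take a participant's continuation level as the deepest chain started by any of their fragments.

```python
from collections import defaultdict


def support_metrics(owner, replies):
    """Compute cumulative support values and continuation levels.

    owner:   fragment id -> participant id
    replies: list of (reply_id, target_id, kind), kind is "immediate" or "delayed"
    Assumes replies follow time order, so the reply structure has no cycles.
    """
    support = defaultdict(int)
    children = defaultdict(list)
    for reply_id, target_id, kind in replies:
        children[target_id].append(reply_id)
        if kind == "immediate":
            support[owner[target_id]] += 1  # immediate responses received

    def hops(fragment_id):
        # Number of response hops from this fragment to the final response.
        kids = children.get(fragment_id, [])
        return max((1 + hops(k) for k in kids), default=0)

    level = defaultdict(int)
    for fragment_id, participant in owner.items():
        level[participant] = max(level[participant], hops(fragment_id))
    return dict(support), dict(level)


def is_collaborative(r_value, l_value, support_threshold=3, level_threshold=2):
    """Both values must exceed their (hypothetical) thresholds."""
    return r_value > support_threshold and l_value > level_threshold
```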

[0149] For the sequence of opposing viewpoints, the number of times each participant speaks in the sequence is counted and denoted as the number of argumentation rounds $A_i$. By analyzing the stability of the participant's position vector throughout the entire opposition sequence, the position persistence degree $S_i$ is calculated. When $A_i$ exceeds the argumentation round threshold and $S_i$ exceeds the position persistence threshold, the interaction of the participants in the opposing sequence is marked as an adversarial debate mode.

[0150] The cumulative support response value $R_i$, after standardization, is used as the collaborative contribution component. The number of argumentation rounds $A_i$, after standardization, is used as the debate activity component. The frequency $F_i$ with which each participant's immediate response segments are cited in subsequent speeches is calculated and, after standardization, used as the viewpoint appeal component. The three components are weighted and summed according to preset weights $\alpha$, $\beta$, and $\gamma$ ($\alpha + \beta + \gamma = 1$) to obtain the participant's overall influence weight: $W_i = \alpha R_i + \beta A_i + \gamma F_i$.
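The weighted sum of the three standardized components can be sketched as below. The text does not say which standardization is used; min-max scaling is assumed here, and the default weights are hypothetical values chosen only so that they sum to 1.

```python
def min_max(values):
    """Scale a participant -> value mapping into [0, 1] (assumed standardization)."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {k: 0.0 for k in values}
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}


def influence_weights(support, rounds, citations, alpha=0.5, beta=0.2, gamma=0.3):
    """Combine R_i, A_i and F_i into an overall influence weight W_i.

    support:   participant -> cumulative support response value R_i
    rounds:    participant -> number of argumentation rounds A_i
    citations: participant -> citation frequency F_i
    """
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("alpha + beta + gamma must equal 1")
    r, a, f = min_max(support), min_max(rounds), min_max(citations)
    return {p: alpha * r[p] + beta * a[p] + gamma * f[p] for p in support}
```

For example, a participant with the highest support value but the lowest citation frequency in a two-person meeting receives exactly `alpha` when both participants have the same number of argumentation rounds.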

[0151] In real-world meeting scenarios, this method can effectively identify key participants who lead discussions, discover opinion leaders, detect potential conflicts, and identify team member combinations that work well together, providing data support for meeting management and team optimization.

[0152] In one optional implementation, based on the knowledge graph of the meeting discussion evolution and the preset meeting process, the knowledge content to be pushed and its push timing are determined, and knowledge push instructions for different participants are generated, including:

[0153] Semantic mapping is used to associate and label the historical speaking content of each participant with knowledge nodes in the knowledge graph of the conference discussion evolution, thereby identifying the set of knowledge nodes that each participant has covered and the set of knowledge nodes that have not been covered.

[0154] Based on the preset meeting process and the knowledge graph of the meeting discussion evolution, the predicted meeting process that is about to enter the discussion is determined as the prediction node. Precursor dependency analysis is performed on the predicted meeting process to obtain the necessary prerequisite nodes. The difference operation is performed with the knowledge node set already covered by each participant to obtain the knowledge gap node of each participant for the predicted node.

[0155] Based on the knowledge gaps of each participant, key information fragments and their argumentation links are extracted and packaged into differentiated knowledge content to be pushed out, including supplementary prior knowledge and logical connection explanations.

[0156] When a new node pointing to a predicted node is detected in the knowledge graph of the meeting discussion evolution, or when a participant's speech involves a precursor topic of a knowledge gap node, the timing for pushing is determined, and a knowledge push instruction is generated in combination with the differentiated knowledge content to be pushed.

[0157] In this embodiment, it is necessary to obtain the knowledge graph of the current meeting discussion evolution. Semantic mapping technology is used to associate the participants' historical speaking content with nodes in the knowledge graph. Natural language processing is performed on the participants' speaking text, including word segmentation, part-of-speech tagging, and named entity recognition, to extract key concepts and semantic elements. Syntactic analysis and semantic understanding technologies are then used to map and match the extracted semantic elements with nodes in the knowledge graph. The matching process uses a vector space model to calculate semantic similarity. When the similarity exceeds a preset threshold (e.g., 0.75), it is considered that the participant's speech covers the corresponding knowledge node, forming a set of covered nodes and a set of uncovered nodes for each participant. For example, if the marketing director mentions "competitive pricing analysis" and "market positioning" multiple times in their speech, these nodes are marked as covered nodes for that participant.
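The threshold-based matching step can be sketched as follows, assuming that utterances and knowledge nodes have already been embedded in the same vector space. The function `split_coverage` and its inputs are illustrative; the 0.75 default is the example threshold given above.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def split_coverage(utterance_vectors, node_vectors, threshold=0.75):
    """Return (covered, uncovered) node sets for one participant.

    utterance_vectors: list of vectors for the participant's past utterances
    node_vectors:      node id -> vector of the knowledge node
    """
    covered = {
        node for node, nv in node_vectors.items()
        if any(cosine(uv, nv) > threshold for uv in utterance_vectors)
    }
    return covered, set(node_vectors) - covered
```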

[0158] Based on the pre-set meeting flow and current meeting progress, the system predicts the upcoming meeting flow. This prediction uses a time-series prediction model, combining the position of currently discussed content within the knowledge graph to predict the knowledge nodes involved in the next stage. Precursor dependency analysis is performed on the predicted nodes, traversing all incoming edges pointing to those nodes in the knowledge graph to extract a set of nodes with prerequisite dependencies. These nodes constitute the essential knowledge foundation for understanding and discussing the predicted nodes. A difference operation is performed on each participant, subtracting the set of nodes already covered by that participant from the set of essential prerequisite nodes to obtain a personalized set of knowledge gap nodes. For example, if the predicted next stage will discuss "price elasticity analysis," its prerequisite nodes include "historical sales data analysis" and "market segmentation model." If the finance manager did not cover "market segmentation model" in their presentation, that node is identified as a knowledge gap.
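The precursor dependency analysis and the difference operation reduce to a pair of set computations, sketched below. The graph is assumed to be given as (source, target) edge pairs, and only direct incoming edges of the predicted nodes are followed; the function names are illustrative.

```python
def prerequisite_nodes(edges, predicted_nodes):
    """Collect the sources of all edges that point into a predicted node."""
    return {src for src, dst in edges if dst in predicted_nodes}


def knowledge_gaps(edges, predicted_nodes, covered_by):
    """Return participant -> set of prerequisite nodes they have not covered.

    covered_by: participant -> set of knowledge nodes already covered
    """
    required = prerequisite_nodes(edges, predicted_nodes)
    return {p: required - covered for p, covered in covered_by.items()}
```

Using the example from the text, a finance manager who has covered "historical sales data analysis" but not "market segmentation model" ends up with the latter as the only gap node for "price elasticity analysis".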

[0159] For identified knowledge gap nodes, relevant information fragments are extracted from the knowledge base. Knowledge extraction employs a path-based subgraph retrieval algorithm, starting from the knowledge gap node and tracing back along the edges of the graph to obtain related prior knowledge and argumentation links. The extracted content includes concept definitions, key data, argumentation relationships, and conclusion justifications. Knowledge is then reorganized and personalized, and the depth of information presentation and the use of professional terminology are adjusted according to the background and role of the participants. For participants with technical backgrounds, professional terminology and technical details are retained; for management participants, the impact on decision-making is emphasized, resulting in differentiated knowledge content to be pushed, including supplementary prior knowledge and logical connections.
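The backward tracing step can be illustrated with a breadth-first traversal over incoming edges. This is a sketch under stated assumptions rather than the retrieval algorithm itself: the edge format (source, target, relation) and the `max_depth` limit are hypothetical, since the text does not say how far back the trace goes.

```python
from collections import deque


def trace_back(edges, gap_node, max_depth=2):
    """Return the (source, relation, target) edges reached by walking backward.

    edges:    list of (source, target, relation) tuples
    gap_node: the knowledge gap node to start from
    """
    incoming = {}
    for src, dst, relation in edges:
        incoming.setdefault(dst, []).append((src, relation))

    seen = {gap_node}
    collected = []
    queue = deque([(gap_node, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for src, relation in incoming.get(node, []):
            collected.append((src, relation, node))
            if src not in seen:
                seen.add(src)
                queue.append((src, depth + 1))
    return collected
```

The collected edges form the argumentation link that is then reorganized and adapted to each participant's background.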

[0160] Real-time monitoring of meeting discussion progress determines the timing of knowledge dissemination. Two trigger mechanisms are used to determine this timing: first, graph triggering, where a newly added node in the knowledge graph pointing to a predicted node indicates the discussion is about to shift to the predicted topic; second, semantic triggering, where a participant's speech relates to a preceding topic of a knowledge gap node, indicating the relevant discussion has begun. When either trigger condition is met, a specific dissemination instruction is generated based on the differentiated knowledge content to be disseminated. This instruction includes elements such as the target audience, content, method, and priority. For example, when the CTO mentions "the need to consider product pricing strategy," this is detected as a preceding topic for pricing strategy discussion, and basic pricing theory knowledge is disseminated to sales managers lacking relevant background information.
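The two trigger mechanisms and the resulting instruction can be sketched as follows. The `PushInstruction` fields mirror the elements listed above (target, content, method, priority), while the field types, the `"terminal_notification"` method label, and the fixed priority of 1 are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PushInstruction:
    target: str    # participant who receives the content
    content: str   # differentiated knowledge content
    method: str    # delivery method (hypothetical label)
    priority: int  # hypothetical scale, lower is more urgent


def graph_triggered(new_edges, predicted_nodes):
    """Graph trigger: a newly added node points to a predicted node."""
    return any(dst in predicted_nodes for _, dst in new_edges)


def semantic_triggered(utterance_topics, precursor_topics):
    """Semantic trigger: the speech touches a precursor topic of a gap node."""
    return bool(set(utterance_topics) & set(precursor_topics))


def build_push_instructions(new_edges, predicted_nodes, utterance_topics,
                            precursor_topics, content_by_participant):
    """Generate instructions when either trigger condition is met."""
    if not (graph_triggered(new_edges, predicted_nodes)
            or semantic_triggered(utterance_topics, precursor_topics)):
        return []
    return [
        PushInstruction(target=p, content=c, method="terminal_notification", priority=1)
        for p, c in sorted(content_by_participant.items())
    ]
```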

[0161] In a real-world application scenario, during a company's quarterly product planning meeting, attendees included product managers, marketing directors, technical leads, and finance representatives. The meeting progressed to the market strategy discussion stage, and it was predicted that the next stage would focus on "channel expansion strategies." Through semantic mapping, it was discovered that the technical lead had not covered essential prerequisites such as "channel cost structure" and "channel cooperation models" in his previous presentations. When the marketing director mentioned "how to optimize sales channels," the system recognized the opportune moment to push this information to the technical lead's device. This included knowledge content integrating channel type analysis, cost structure charts, and case study data, enabling the technical lead to understand the business logic behind channel expansion decisions in the subsequent discussion and providing more targeted support from a technical perspective.

[0162] The above methods enable intelligent and differentiated knowledge delivery based on knowledge graphs and meeting processes, effectively bridging the knowledge gap among participants and improving meeting efficiency and decision-making quality.

[0163] This invention relates to a real-time meeting key information extraction and knowledge push system based on a large language model, comprising:

[0164] The first unit is used to acquire real-time audio data streams generated during the meeting;

[0165] The second unit is used to perform real-time transcription and semantic parsing of the real-time voice data stream, and to divide it into multiple semantic segments based on semantic integrity constraints and time window constraints.

[0166] The third unit is used to identify and extract key information from each semantic fragment based on a predefined information extraction template and a dynamically learned conference topic model, calculate the importance score of each information element, and construct a structured key information fragment.

[0167] The fourth unit is used to associate and aggregate the key information fragments, perform deep reasoning on the argumentation and citation relationships between the key information fragments, and construct a knowledge graph of the evolution of meeting discussions that includes node importance weights and edge semantic type annotations.

[0168] The fifth unit is used to determine the knowledge content to be pushed and the timing of its push based on the knowledge graph of the meeting discussion evolution and the preset meeting process, generate knowledge push instructions for different participants, execute the knowledge push instructions, and push the knowledge content to be pushed to the participants' terminal devices.

[0169] A third aspect of the present invention provides an electronic device, comprising:

[0170] A processor;

[0171] A memory for storing processor-executable instructions;

[0172] wherein the processor is configured to invoke the instructions stored in the memory to execute the aforementioned method.

[0173] A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.

[0174] This invention can be a method, apparatus, system, and / or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for performing various aspects of the invention.

[0175] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for real-time extraction of key information and knowledge pushing of a conference based on a large language model, characterized in that the method comprises: Acquire real-time audio data streams generated during the meeting; The real-time voice data stream is transcribed and semantically parsed in real time, and divided into multiple semantic segments based on semantic integrity constraints and time window constraints; Based on predefined information extraction templates and a dynamically learned conference topic model, key information is identified and extracted from each semantic fragment. Importance scores for each information element are calculated, and structured key information fragments are constructed, including: Construct a predefined information extraction template library, which contains multiple information extraction templates. Each information extraction template defines the structured field composition of a specific type of key information, the dependencies between fields, and the semantic constraint rules for field values. Extract the topic feature vector of the semantic fragment and input it into the dynamically learned conference topic model. The model updates the topic cluster center and probability distribution parameters in real time based on the topic feature distribution of the processed semantic fragments, and outputs the topic category of the semantic fragment and its membership confidence. Information extraction templates are retrieved based on the topic category of semantic fragments. Candidate information elements are extracted based on the field composition and semantic constraint rules defined in the template. The logical consistency of the candidate information elements is verified through the dependency relationship between fields in the template. For candidate information elements that pass the logical consistency verification, the preset weight of their corresponding fields in the template and the topic membership confidence of their semantic fragments are combined to calculate the importance score. Candidate information elements whose importance scores exceed the importance threshold are organized into structured fields and bound to the timestamp and participant identifier of their semantic fragments to form key information fragments. The key information fragments are associated and aggregated, and the argumentation and citation relationships between the key information fragments are deeply reasoned to construct a knowledge graph of the evolution of meeting discussions that includes node importance weights and edge semantic type annotations. Based on the knowledge graph of the meeting discussion evolution and the preset meeting process, the knowledge content to be pushed and the timing of its push are determined, knowledge push instructions for different participants are generated, and the knowledge push instructions are executed to push the knowledge content to be pushed to the participants' terminal devices.

2. The method of claim 1, wherein the real-time speech data stream is transcribed and semantically parsed in real time, and divided into multiple semantic segments based on semantic integrity constraints and time window constraints, including: The real-time speech data stream is subjected to streaming speech recognition, and text segments with timestamps are output in real time. The text segments are then spliced and the sentence structure is adjusted to form a continuous text transcription sequence. Real-time semantic analysis is performed on the text transcription sequence to identify topic transition markers and semantic boundary markers. The topic transition markers include topic introduction words and transition words, and the semantic boundary markers include semantic closure markers and logical termination markers. Text intervals that simultaneously satisfy the condition of no topic transition and semantic closure are selected as candidate segments under semantic integrity constraints. A sliding time window is established based on a preset duration. The sliding time window is moved in chronological order on the text transcription sequence. When the text interval covered by the time window crosses the boundary of the candidate segment, it is determined whether there is a complete candidate segment within the time window. If there is, segmentation is performed at the boundary position of the candidate segment. If there is no complete candidate segment, the sliding continues until the next candidate segment boundary is encountered, resulting in multiple semantic segments.

3. The method of claim 1, wherein information extraction templates are retrieved based on the topic category of semantic fragments, candidate information elements are extracted based on the field composition and semantic constraint rules defined in the templates, and the logical consistency of the candidate information elements is verified through the dependencies between fields in the templates, including: Using the topic category of the semantic fragment as the index key, locate the associated candidate information extraction template, extract the semantic feature vector of the semantic fragment to calculate the semantic similarity, and select the candidate information extraction template with the highest semantic similarity as the matching template. The structured field composition is obtained from the matching template, the semantic constraint rules associated with each field are extracted, and the semantic constraint rules are parsed into semantic pattern descriptors and constraint condition expressions. The semantic pattern descriptor defines the textual representation of the field, and the constraint condition expression defines the semantic boundary conditions for the value. Semantic matching is performed in the semantic fragment to identify text intervals that conform to the semantic pattern descriptor, extract entity references and attribute values, determine whether the constraint condition expression is satisfied, and use the values that meet the conditions as candidate field values to form candidate information elements. Based on the dependencies between fields in the matching template, a dependency verification graph for the candidate information elements is established. Nodes represent candidate field values, and directed edges represent dependencies and are labeled with constraints. The verification graph is traversed in dependency order to check the dependency constraints on each edge. The source node field values are substituted into the constraints as a premise to determine whether the target node field values meet the constraints. When all constraints are met, the candidate information elements are determined to have logical consistency.

4. The method of claim 1, wherein, The key information fragments are associated and aggregated, and deep reasoning is performed on the argumentation and citation relationships between them to construct a knowledge graph of the meeting discussion evolution that includes node importance weights and edge semantic type annotations, including: Extract information elements, timestamp information and participant identification from the key information fragments, calculate the time interval decay factor and semantic association strength between the preceding and subsequent fragments, identify causal triggering relationships, establish causal triggering edges and calculate triggering strength weights; Extract the claims and supporting evidence from the key information segments, identify the supporting and refuting relationships between the segments, establish argument relationship edges, and label the argument direction type; Based on participant identification, the system tracks response sequences and opposing viewpoint sequences between key information fragments, identifies collaborative support patterns and adversarial debate patterns among participants, and calculates the influence weight of each participant. Using the key information fragments as nodes and causal triggering edges and argumentation relationship edges as edges, a heterogeneous knowledge graph is constructed. The triggering intensity weight of the causal triggering edge and the influence weight of the participant are integrated as dual adjustment factors for node importance. The node importance weight is obtained through graph propagation and iterative update. The trigger intensity weight, argument direction type, and node importance weight are integrated into a heterogeneous knowledge graph to generate a knowledge graph of the evolution of the meeting discussion.

5. The method of claim 4, wherein, Based on participant identification, the system tracks response sequences and opposing viewpoint sequences between key information fragments to identify collaborative support patterns and adversarial debate patterns among participants, and calculates the influence weight of each participant, including: A speaking sequence index is established based on the participant's identity. Key information fragments and timestamps of each participant are extracted. By calculating the temporal interval density and topic continuity of adjacent key information fragments of each participant, immediate response fragments and delayed response fragments are identified. Extract the position semantic features and argument direction features from key information fragments, calculate the position difference degree and argument conflict degree between key information fragments of different participants, and mark the corresponding information fragment sequence as a viewpoint opposition sequence when both meet the opposition threshold. For immediate response segments and delayed response segments, the cumulative support response value and response duration level of the participants are calculated. When both meet the collaboration threshold, it is identified as a collaborative support mode. For opposing viewpoint sequences, the rounds of argumentation and defense and the degree of stance persistence of the opposing sides are extracted. When both meet the adversarial threshold, it is identified as an adversarial debate mode. The cumulative value of support responses is used as the contribution of collaboration, the number of rounds of argumentation and defense is used as the debate activity, and the frequency of citation of immediate response fragments is used as the appeal of viewpoints. The influence weight of the participants is obtained by weighting and summing the three components.

6. The method according to claim 1, characterized in that, Based on the knowledge graph of the meeting discussion evolution and the preset meeting process, the knowledge content to be pushed and the timing of its push are determined, and knowledge push instructions for different participants are generated, including: Semantic mapping is used to associate and label the historical speaking content of each participant with knowledge nodes in the knowledge graph of the conference discussion evolution, thereby identifying the set of knowledge nodes that each participant has covered and the set of knowledge nodes that have not been covered. Based on the preset meeting process and the knowledge graph of the meeting discussion evolution, the predicted meeting process that is about to enter the discussion is determined as the prediction node. Precursor dependency analysis is performed on the predicted meeting process to obtain the necessary prerequisite nodes. The difference operation is performed with the knowledge node set already covered by each participant to obtain the knowledge gap node of each participant for the predicted node. Based on the knowledge gaps of each participant, key information fragments and their argumentation links are extracted and packaged into differentiated knowledge content to be pushed out, including supplementary prior knowledge and logical connection explanations. When a new node pointing to a predicted node is detected in the knowledge graph of the meeting discussion evolution, or when a participant's speech involves a precursor topic of a knowledge gap node, the timing for pushing is determined, and a knowledge push instruction is generated in combination with the differentiated knowledge content to be pushed.

7. A real-time extraction and knowledge push system for key meeting information based on a large language model, used to implement the method as described in any one of claims 1-6, characterized in that the system comprises: The first unit is used to acquire real-time audio data streams generated during the meeting; The second unit is used to perform real-time transcription and semantic parsing of the real-time voice data stream, and to divide it into multiple semantic segments based on semantic integrity constraints and time window constraints; The third unit is used to identify and extract key information from each semantic fragment based on a predefined information extraction template and a dynamically learned conference topic model, calculate the importance score of each information element, and construct a structured key information fragment; The fourth unit is used to associate and aggregate the key information fragments, perform deep reasoning on the argumentation and citation relationships between the key information fragments, and construct a knowledge graph of the evolution of meeting discussions that includes node importance weights and edge semantic type annotations; The fifth unit is used to determine the knowledge content to be pushed and the timing of its push based on the knowledge graph of the meeting discussion evolution and the preset meeting process, generate knowledge push instructions for different participants, execute the knowledge push instructions, and push the knowledge content to be pushed to the participants' terminal devices.

8. An electronic device, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to execute the method according to any one of claims 1 to 6.

9. A computer-readable storage medium having stored thereon computer program instructions, wherein, when the computer program instructions are executed by a processor, they implement the method described in any one of claims 1 to 6.