A post-pancreatic surgery complication detection system and apparatus
By normalizing postoperative pancreatic surgery medical record data, applying multi-agent collaborative diagnosis, and optimizing with physician feedback, the system addresses the heterogeneity of cross-institutional medical record data and the limitations of single models, enabling efficient and accurate diagnosis of postoperative pancreatic complications.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING DIGITAL CHINA CLOUD COMPUTING CO LTD
- Filing Date
- 2026-02-28
- Publication Date
- 2026-06-12
Figure CN122201712A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of computer technology, and in particular to a system and device for detecting postoperative complications of pancreatic surgery.
Background Technology
[0002] Early and accurate diagnosis of postoperative complications after pancreatic surgery (such as pancreatic fistula, hemorrhage, and infection) is crucial for improving patient prognosis. Clinically, this requires comprehensive judgment based on multi-source information, including surgical records, disease progression descriptions, and laboratory indicators. Currently, with the advancement of hospital information technology, massive amounts of medical record data are accumulated through Hospital Information Systems (HIS). However, these records are mostly in the form of unstructured text and scanned images, and differences in writing habits exist in medical records from different institutions, posing challenges to the standardized processing of complication diagnoses.
[0003] To improve diagnostic efficiency, existing technologies have begun to incorporate Large Language Models (LLMs) for medical record information extraction and diagnostic reasoning, attempting to replace some manual work. These systems typically guide the model to extract key information from medical records using prompts, and then generate diagnostic results based on preset rules, demonstrating some practicality in simple case scenarios. However, with the deepening of clinical applications, the adaptability of existing systems to complex clinical scenarios has gradually become insufficient, making it difficult to meet the actual needs of accurate diagnosis.
[0004] Existing intelligent diagnostic systems for postoperative pancreatic complications still suffer from low diagnostic accuracy.
Summary of the Invention
[0005] This application provides a system and device for detecting postoperative complications of pancreatic surgery, which can improve the accuracy of diagnosis.
[0006] To achieve the above objectives, this application adopts the following technical solution. In a first aspect, this application provides a system for detecting postoperative complications of pancreatic surgery, comprising:
an acquisition module, used to acquire the original medical record data of users after pancreatic surgery and to perform normalization processing on the original medical record data to obtain a structured input semantic space, where the normalization processing includes at least text format conversion, medical terminology standardization, and timeline standardization;
an extraction module, used to construct prompt words based on the structured input semantic space, input the prompt words into the large language model, drive the large language model to extract key clinical information from the structured input semantic space, and generate structured JSON data;
a scoring module, used to obtain the diagnostic conclusions and clinical causal reasoning chains produced by multiple agents after processing the structured JSON data, and to score the clinical causal reasoning chains in multiple dimensions to obtain a score for each clinical causal reasoning chain; and
a detection module, used to perform conflict detection on the multiple diagnostic conclusions and, in the absence of conflict, to fuse the diagnostic conclusions according to the scores of the clinical causal reasoning chains to obtain a detection report on postoperative pancreatic complications.
[0007] Optionally, the acquisition module is specifically used to convert the original medical record data into parsable text using optical character recognition technology if the original medical record data is in non-text format, and to retain the correspondence between the paragraph structure, date and key clinical information of the original medical record through layout analysis technology.
[0008] Optionally, the acquisition module is specifically used to map non-standard medical terms in the original medical record data to standard medical terms and generate a term mapping log. The term mapping log records the non-standard medical terms, the standard medical terms, the SNOMED-CT (Systematized Nomenclature of Medicine, Clinical Terms) codes associated with the standard medical terms, and the position span in the text of the term corresponding to each SNOMED-CT code.
[0009] Optionally, the acquisition module is specifically used to identify non-standard dates in the original medical record data and convert the non-standard dates into standard dates in a preset format.
[0010] Optionally, the scoring module is specifically used to determine a first sub-score based on a multi-dimensional assessment and the weight of each dimension, and to correct the first sub-score based on a contradiction penalty and a hallucination penalty to obtain the multi-dimensional score. The multi-dimensional assessment covers clinical guideline compliance, logical consistency, adequacy of evidence utilization, completeness of key factors, expression of uncertainty, and low redundancy. The contradiction penalty is determined by the number of mutually exclusive conclusions, and the hallucination penalty is determined by the number of cited items of key clinical information that do not exist.
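The scoring rule in [0010] can be illustrated with a minimal Python sketch. The source names the six dimensions and the two penalties but gives no weights, no penalty sizes, and no exact formula, so the weight values, the per-item penalty units, the subtractive correction, and the clamp at zero below are all assumptions chosen only for illustration.

```python
# Hypothetical weights: the source lists the dimensions but not their values.
WEIGHTS = {
    "guideline_compliance": 0.30,
    "logical_consistency": 0.20,
    "evidence_utilization": 0.20,
    "key_factor_completeness": 0.15,
    "uncertainty_expression": 0.10,
    "low_redundancy": 0.05,
}


def score_chain(dim_scores, n_exclusive, n_hallucinated,
                contradiction_unit=0.10, hallucination_unit=0.15):
    """Score one clinical causal reasoning chain.

    dim_scores     -- per-dimension scores in [0, 1], keyed like WEIGHTS
    n_exclusive    -- number of mutually exclusive conclusions in the chain
    n_hallucinated -- number of cited clinical facts that do not exist
    The *_unit penalty sizes are assumed values, not taken from the source.
    """
    # First sub-score: weighted sum over the six assessment dimensions.
    first_sub_score = sum(WEIGHTS[d] * dim_scores[d] for d in WEIGHTS)
    # Correction: subtract penalties proportional to the two counts (assumed form).
    penalty = contradiction_unit * n_exclusive + hallucination_unit * n_hallucinated
    # Clamp so that a heavily penalized chain scores zero rather than negative.
    return max(0.0, first_sub_score - penalty)
```

Under these assumed values, a chain that scores perfectly on every dimension but contains one contradiction and one hallucinated fact would drop from about 1.0 to about 0.75.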
[0011] Optionally, the detection module is also used to eliminate conflicting conclusions in the event of a conflict.
[0012] Optionally, the detection module is specifically used to calculate the score difference between the clinical causal reasoning chains corresponding to the conflicting conclusions. If the score difference is greater than or equal to a difference threshold, the conflicting conclusion with the lower score is removed. If the score difference is less than the difference threshold, a review instruction is issued to each agent that output a conflicting conclusion, instructing that agent to regenerate its diagnostic conclusion and clinical causal reasoning chain.
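The threshold rule in [0012] can be sketched as follows. The `AgentResult` type, the function name, and the threshold value of 0.2 are hypothetical; the source specifies only the comparison against a difference threshold and the two outcomes.

```python
from dataclasses import dataclass


@dataclass
class AgentResult:
    """Hypothetical container for one agent's output."""
    agent: str
    conclusion: str
    chain_score: float


def resolve_conflict(a, b, diff_threshold=0.2):
    """Resolve two conflicting conclusions.

    Returns (kept, to_review): the results that survive, and the results
    whose agents should receive a review instruction to regenerate.
    The default threshold is an assumed value.
    """
    diff = abs(a.chain_score - b.chain_score)
    if diff >= diff_threshold:
        # Clear winner: the lower-scoring conflicting conclusion is removed.
        winner = a if a.chain_score > b.chain_score else b
        return [winner], []
    # Scores too close to call: both agents are asked to regenerate.
    return [], [a, b]
```

For example, scores of 0.9 and 0.6 would keep only the 0.9 conclusion, whereas scores of 0.80 and 0.75 would send both agents back for review.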
[0013] Optionally, the system further includes a feedback module, which is used to collect feedback information from doctors on the test reports. The feedback information includes the type of diagnostic error, deviations in the processing of key clinical information, and the basis for correct diagnosis.
[0014] Optionally, the extraction module is also used to adjust the prompt words based on feedback information.
[0015] In a second aspect, this application provides a device for detecting postoperative pancreatic complications, which includes the postoperative pancreatic complication detection system described above.
[0016] As can be seen from the above technical solution, this application has at least the following beneficial effects: First, through normalization processing such as text format conversion, medical terminology standardization, and timeline standardization, combined with SNOMED-CT encoding mapping and terminology log recording, the writing differences and format heterogeneity of medical records across institutions are effectively resolved, transforming unstructured medical records into a unified structured semantic space. This provides a high-quality, unambiguous data foundation for subsequent diagnostic reasoning and significantly reduces diagnostic bias caused by data inconsistency.
[0017] Secondly, the end-to-end extraction mode of prompt words based on structured context, coupled with a multi-dimensional scoring system covering clinical guideline compliance, logical consistency, and other dimensions, not only leverages the information extraction advantages of large language models, but also constrains the reasoning process through the contradiction penalty and hallucination penalty mechanisms. This ensures that diagnostic conclusions have both clear evidence and quantifiable rationality, improving the accuracy and interpretability of diagnosing complex complications and overcoming the trust barrier of traditional black box models.
[0018] Furthermore, the multi-agent parallel diagnosis and conflict resolution mechanism, through the determination of scoring differences and the issuance of review instructions, achieves collaborative decision-making similar to expert consultation, effectively avoiding the cognitive limitations of a single model and improving the system's adaptability to difficult cases. Meanwhile, the dynamic adjustment of prompts driven by doctor feedback forms a closed loop of diagnosis, feedback, and optimization, enabling the system to continuously align with actual clinical needs, gradually reduce the risk of missed diagnoses and misdiagnoses, and adapt to changes in different clinical scenarios.
[0019] Finally, the entire system process balances automation efficiency with clinical applicability. It reduces the workload of manual data processing through standardized processing and intelligent extraction, while providing effective support for doctors' decision-making through structured reports and inference chain presentation. It improves diagnostic efficiency while ensuring the safety of clinical decision-making, and provides reliable technical support for early and accurate intervention of postoperative pancreatic complications.
[0020] It should be understood that the descriptions of technical features, technical solutions, beneficial effects, or similar language in this application do not imply that all features and advantages can be achieved in any single embodiment. Rather, the description of a feature or beneficial effect means that a specific technical feature, technical solution, or beneficial effect is included in at least one embodiment. Therefore, the descriptions of technical features, technical solutions, or beneficial effects in this specification do not necessarily refer to the same embodiment. Furthermore, the technical features, technical solutions, and beneficial effects described in the embodiments can be combined in any suitable manner. Those skilled in the art will understand that embodiments can be implemented without one or more specific technical features, technical solutions, or beneficial effects of a particular embodiment. In other embodiments, additional technical features and beneficial effects may be present that do not appear in all embodiments.
Attached Figure Description
[0021] Figure 1 is a schematic diagram of a pancreatic postoperative complication detection system provided in an embodiment of this application.
Detailed Implementation
[0022] The terms "first," "second," and "third," etc., used in this application specification and accompanying drawings are used to distinguish different objects, not to limit a specific order.
[0023] In the embodiments of this application, the terms "exemplary" or "for example" are used to indicate that something is an example, illustration, or description. Any embodiment or design that is described as "exemplary" or "for example" in the embodiments of this application should not be construed as being more preferred or advantageous than other embodiments or designs. Rather, the use of the terms "exemplary" or "for example" is intended to present the relevant concepts in a specific manner.
[0024] To ensure clarity and conciseness in the description of the following embodiments, a brief introduction to the related technologies is given first: Postoperative complications of pancreatic surgery are common adverse clinical events following pancreatic surgery (such as pancreaticoduodenectomy, pancreatic tumor resection, etc.), mainly including pancreatic fistula, bleeding, infection, bile fistula, etc. Early and accurate identification of these complications directly affects the adjustment of the patient's treatment plan and prognosis.
[0025] The existing intelligent diagnostic systems for postoperative pancreatic complications have technical problems in meeting the clinical needs for accuracy, reliability, and adaptability. Specifically, cross-institutional medical record data is difficult to process uniformly, diagnostic conclusions lack traceable reasoning, single models are prone to cognitive limitations in complex cases, and the system cannot dynamically adapt to changes in clinical diagnostic standards and doctors' decision-making habits, ultimately leading to a high risk of missed diagnoses and misdiagnoses, making it difficult to support early and accurate intervention for postoperative pancreatic complications.
[0026] The reasons for the above problems can be summarized in four points. First, the format heterogeneity of medical records across institutions (such as images and text), differences in expression (such as abbreviations and full names of medical terms), and inconsistent date formats create data silos, leading to inconsistent and ambiguous input to the model. Second, traditional systems lack effective constraints on and verification of the reasoning process, making the model prone to hallucinations or logical contradictions, and diagnostic conclusions lack clear reasoning links, making it difficult to trace the source of errors. Third, a single model cannot fully cover the diagnostic logic of the various complications and lacks an effective mechanism for handling conflicts between multiple conclusions, so contradictions between different reasoning paths go unresolved. Fourth, most systems are statically designed and do not establish a dynamic optimization closed loop driven by physician feedback, making it difficult to stay aligned with actual clinical needs as medicine advances.
[0027] In view of this, and addressing the shortcomings of existing intelligent diagnostic systems for pancreatic postoperative complications, such as insufficient accuracy and reliability due to data heterogeneity, unconstrained reasoning, limitations of single models, and lack of dynamic adaptation mechanisms, this application breaks down the barriers between medical record data from different institutions through normalization processing, constrains the model reasoning process with structured prompts and multi-dimensional scoring, overcomes the cognitive limitations of single models through collaboration of multiple intelligent agents, and constructs a dynamic optimization closed loop by combining physician feedback. Ultimately, this achieves intelligent detection of pancreatic postoperative complications that is efficient, accurate, interpretable, and adaptable, providing technical support for early and precise clinical intervention.
[0028] To make the technical solution of this application clearer and easier to understand, the following description, in conjunction with the accompanying drawings, introduces a pancreatic postoperative complication detection system provided by an embodiment of this application. As shown in Figure 1, which is a schematic diagram of the system, the system includes an acquisition module 101, an extraction module 102, a scoring module 103, and a detection module 104.
[0029] The acquisition module 101 is used to acquire the original medical record data of users after pancreatic surgery, and to perform normalization processing on the original medical record data to obtain a structured input semantic space. The normalization processing includes at least text format conversion, medical terminology standardization and timeline standardization.
[0030] Post-pancreatic surgery users are patients who have undergone pancreatic-related surgeries (such as pancreaticoduodenectomy, pancreatic tumor resection, etc.) and are the target group of the system's services.
[0031] Raw medical record data is unprocessed patient medical record information extracted from the hospital information system (HIS). It includes scanned images, PDF documents, free text medical records, etc., and contains key clinical information such as surgical records, laboratory indicators, symptom descriptions, and medical orders.
[0032] Normalization is the process of converting heterogeneous, non-standardized raw data into standardized data with a unified format and semantics. It aims to eliminate ambiguity caused by data differences and ensure data consistency and usability.
[0033] The structured input semantic space is a set of data that has been normalized, resulting in a logically clear, uniformly formatted, and semantically unambiguous dataset. It provides a high-quality input foundation for subsequent large language models to extract information and perform reasoning and diagnosis.
[0034] Text format conversion is the process of converting non-textual original medical records (such as scanned images or uneditable PDFs) into a parsable and processable text format.
[0035] Medical terminology standardization involves mapping different expressions of the same medical concept in original medical records (such as abbreviations, common names, and full names) to unified standard terms and associating them with internationally recognized codes (such as SNOMED-CT) to eliminate terminological ambiguity.
[0036] Timeline standardization converts date and time information in medical records in different formats, such as relative time "day 3 post-surgery" and non-standard absolute time "April 3", into a preset unified standard format, such as YYYY-MM-DD, to clarify the time sequence of each clinical event.
[0037] The role of module 101 is to provide standardized, high-quality data support for system diagnosis. The specific workflow is divided into two steps: The first step is data collection, which extracts the original medical record data of patients after pancreatic surgery from the hospital information system. This data may be in various forms such as images, PDFs, and text, and contains various clinical information related to the patient's surgery. The second step is data standardization processing, which converts non-text data into parsable text through text format conversion, eliminates differences in the expression of the same concept through the standardization of medical terminology, and standardizes date and time formats through timeline standardization. Finally, the heterogeneous and messy original medical record data is transformed into a structured input semantic space with a unified format, consistent semantics, and clear logic, thus clearing data obstacles for information extraction and diagnostic reasoning in subsequent modules.
[0038] The following are the methods for standardizing non-text formats, non-standard medical terminology, and non-standard dates: The acquisition module 101 is specifically used to convert the original medical record data into parsable text using optical character recognition technology if the original medical record data is in non-text format, and to retain the correspondence between the paragraph structure, date and key clinical information of the original medical record through layout analysis technology.
[0039] Non-text formats refer to medical record formats that cannot be directly parsed and have their information extracted by a computer. These include scanned images, uneditable PDF documents, and photographs of handwritten medical records.
[0040] Optical Character Recognition (OCR) is a technology that can convert text information in images into editable and parsable text. It can accurately recognize printed or handwritten text in medical record images, realizing the conversion from image to text.
[0041] Parsable text refers to text formats that computers can recognize, extract, and process, such as TXT and editable documents. Subsequent large language models can directly read clinical information from this text.
[0042] Page layout analysis technology is a technique that assists OCR recognition. It can analyze the page layout of the original medical record, identify structural information such as paragraph division, heading level, table position, and date annotation, and restore the original layout logic of the medical record.
[0043] Key clinical information refers to data related to the diagnosis of postoperative complications of pancreatic surgery, including surgical records, laboratory indicators (such as amylase levels), symptom descriptions (such as abdominal pain and fever), and drainage status.
[0044] When processing raw medical record data, module 101 ensures data availability through two steps, particularly for non-text formats (such as scanned images or non-editable PDFs): The first step uses Optical Character Recognition (OCR) technology to accurately extract text from images and convert it into a computer-readable text format, solving the problem of images not being directly readable. The second step uses layout analysis technology to restore the original medical record's paragraph divisions, heading levels, and other structures, while accurately preserving the correspondence between dates and key clinical information such as surgical records and test results, avoiding information misalignment caused by format conversion, such as a certain test indicator being out of sync with its corresponding test date.
[0045] This design not only makes non-text medical records readable, but also ensures the integrity and relevance of information, providing a clear and accurate foundation text for subsequent processing such as the standardization of medical terminology and timelines.
[0046] The acquisition module 101 is specifically used to map non-standard medical terms in the original medical record data to standard medical terms and generate a term mapping log. The term mapping log records the non-standard medical terms, the standard medical terms, the SNOMED-CT (Systematized Nomenclature of Medicine, Clinical Terms) codes associated with the standard medical terms, and the position span in the text of the term corresponding to each SNOMED-CT code.
[0047] Non-standard medical terminology refers to the inconsistent expression of the same medical concept in the original medical record, including abbreviations (such as "AMY" and "Hb"), common names, and shortened forms (such as "ALT↑"), which arise from differences in writing habits among different hospitals and doctors.
[0048] Standard medical terminology refers to medical concepts that conform to medical norms and are used uniformly. For example, "amylase", "hemoglobin", and "elevated alanine aminotransferase" serve as a unified benchmark for clinical diagnosis and data exchange.
[0049] A terminology mapping log is a structured log file that records the correspondence between non-standard medical terms and standard medical terms, serving as a traceable record of the terminology standardization process.
[0050] SNOMED-CT (Systematized Nomenclature of Medicine, Clinical Terms) is an internationally recognized clinical terminology coding system. Each code uniquely corresponds to a standard medical concept, enabling semantic consistency and mutual recognition of medical information from different sources.
[0051] Location span refers to the start and end positions (such as character index range) of the standard medical term corresponding to the SNOMED-CT encoding in the parsable text, which is used to accurately locate the specific source of the term in the original medical record.
[0052] The task of module 101 in the medical terminology standardization process is to eliminate terminological ambiguity and achieve semantic uniformity. This process involves two steps: The first step is terminology mapping and conversion. Through the system's built-in medical terminology knowledge base, non-standard medical terms such as "AMY", "Hb", and "ALT↑" in the original medical records are accurately mapped to standard medical terms such as "amylase", "hemoglobin", and "elevated alanine aminotransferase". Each standard term is associated with a unique SNOMED-CT code to ensure the semantic consistency of medical concepts across institutions and writing conventions. The second step is to generate a terminology mapping log, which records in detail the correspondence between each set of non-standard terms, standard terms, SNOMED-CT codes, and text position spans. This provides an auditable and traceable basis for the terminology standardization process and facilitates quick location of the original term position for verification when errors occur.
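The two steps in [0052] can be sketched in Python as below. The knowledge base `TERM_KB` is a hypothetical three-entry stand-in for the system's built-in terminology base, and the code strings are placeholders, not real SNOMED-CT identifiers. Recording both the span in the normalized text and the span in the source text is an assumption, since the source describes only a single position span.

```python
import re

# Hypothetical knowledge base. The codes are PLACEHOLDERS, not real SNOMED-CT codes.
TERM_KB = {
    "AMY": ("amylase", "SCT-PLACEHOLDER-1"),
    "Hb": ("hemoglobin", "SCT-PLACEHOLDER-2"),
    "ALT↑": ("elevated alanine aminotransferase", "SCT-PLACEHOLDER-3"),
}

# Longest terms first, and no match inside a longer alphanumeric token (e.g. "HbA1c").
_ALTERNATION = "|".join(re.escape(t) for t in sorted(TERM_KB, key=len, reverse=True))
_PATTERN = re.compile(r"(?<![A-Za-z0-9])(?:" + _ALTERNATION + r")(?![A-Za-z0-9])")


def standardize_terms(text):
    """Return (normalized_text, mapping_log) for one parsable text."""
    pieces, log = [], []
    cursor = 0    # read position in the source text
    out_len = 0   # length of the normalized text built so far
    for m in _PATTERN.finditer(text):
        gap = text[cursor:m.start()]
        pieces.append(gap)
        out_len += len(gap)
        standard, code = TERM_KB[m.group()]
        log.append({
            "non_standard": m.group(),
            "standard": standard,
            "snomed_ct": code,
            "span": (out_len, out_len + len(standard)),  # in normalized text
            "source_span": (m.start(), m.end()),         # in original text
        })
        pieces.append(standard)
        out_len += len(standard)
        cursor = m.end()
    pieces.append(text[cursor:])
    return "".join(pieces), log
```

Keeping the source span alongside the normalized span is one way to support the verification use described above: a reviewer can slice the original text at `source_span` to see exactly what was replaced.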
[0053] This design fundamentally solves the problem of model comprehension bias caused by confusing terminology, and provides high-quality data support with semantic consistency for subsequent diagnostic reasoning.
[0054] The acquisition module 101 is specifically used to identify non-standard dates in the original medical record data and convert the non-standard dates into standard dates in a preset format.
[0055] Non-standard dates refer to date and time information in the original medical record that is not recorded in a uniform format. These include relative time descriptions such as "3 days post-surgery," "yesterday," and "5 days after admission," as well as abbreviated absolute times such as "April 3" and "2025-4-3," which are caused by differences in doctors' writing habits.
[0056] The preset format of the standard date is a uniform date representation set in advance by the system, such as the YYYY-MM-DD format (for example, 2025-04-03), which ensures the consistency and calculability of date information.
[0057] The goal of module 101 in the timeline standardization process is to unify the date format and clarify the time nodes of clinical events. The specific operation procedure is as follows: First, using a built-in date recognition algorithm, non-standard date expressions such as "3rd day post-surgery" and "April 3rd" are accurately filtered from the original medical record data. Then, by combining anchor information such as the surgery date and admission date from the medical record context, logical deduction is performed to convert relative times into absolute times. Simultaneously, abbreviated absolute times are completed into the preset YYYY-MM-DD standard format. For example, if the surgery date is 2025-04-03, then "3rd day post-surgery" will be converted to 2025-04-06; and "April 3rd" will be completed to 2025-04-03 based on the medical record year.
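The conversion described in [0057] can be sketched as follows. The regular expressions cover only the three expression shapes quoted in the text, the function name is hypothetical, and using the surgery date's year as "the medical record year" is an assumption. Expressions that need a different anchor (such as "yesterday" or "5 days after admission") are returned as `None` rather than guessed.

```python
import re
from datetime import date, timedelta

MONTHS = {name: i for i, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}


def normalize_date(expr, surgery_date):
    """Convert one date expression to YYYY-MM-DD, or return None if unsupported."""
    expr = expr.strip()

    # Relative time anchored on surgery: "3rd day post-surgery",
    # "3 days post-surgery", "day 3 post-surgery".
    m = re.fullmatch(
        r"(?:day\s+)?(\d+)(?:st|nd|rd|th)?(?:\s+days?)?\s+post-surgery",
        expr, re.IGNORECASE)
    if m:
        return (surgery_date + timedelta(days=int(m.group(1)))).isoformat()

    # Abbreviated absolute time without a year: "April 3", "April 3rd".
    # Assumption: the year is taken from the surgery date.
    m = re.fullmatch(r"([A-Za-z]+)\s+(\d{1,2})(?:st|nd|rd|th)?", expr)
    if m and m.group(1).capitalize() in MONTHS:
        month = MONTHS[m.group(1).capitalize()]
        return date(surgery_date.year, month, int(m.group(2))).isoformat()

    # Absolute time without zero padding: "2025-4-3".
    m = re.fullmatch(r"(\d{4})-(\d{1,2})-(\d{1,2})", expr)
    if m:
        return date(*(int(g) for g in m.groups())).isoformat()

    return None
```

With a surgery date of 2025-04-03, this reproduces the worked example in the text: "3rd day post-surgery" becomes 2025-04-06 and "April 3rd" becomes 2025-04-03.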
[0058] This design solves the obstacles to time series analysis caused by chaotic date formats, ensuring that the time nodes of various clinical events such as abnormal test indicators and symptom onset are clear and traceable, providing accurate time basis for subsequent time-based complication diagnosis reasoning.
[0059] The extraction module 102 is used to construct prompt words based on the structured input semantic space, input the prompt words into the large language model, drive the large language model to extract key clinical information from the structured input semantic space, and generate structured JSON data.
[0060] The structured input semantic space is a set of medical record data with a unified format and consistent semantics, formed after normalization processing by the acquisition module. It contains standardized text content, medical terminology, and time information.
[0061] The prompt words are instruction texts specifically designed based on the features and extraction requirements of the structured input semantic space, used to guide the large language model to accurately locate and extract key information.
[0062] Large Language Models (LLMs) are artificial intelligence models with powerful natural language understanding and generation capabilities. They can parse text based on prompts, identify key content, and output results in a specified format.
[0063] Structured JSON data is structured information organized in JSON format. It stores extracted key clinical information according to preset fields (such as "operation date", "laboratory indicators", and "symptom description") to facilitate subsequent module parsing and processing.
[0064] The function of extraction module 102 is to transform standardized medical record data into structured key information required for diagnosis. The specific process is as follows: First, prompt words containing extraction scope and data format requirements are constructed based on the structured input semantic space. The extraction scope clearly defines the types and boundaries of key clinical information to be extracted, such as pancreatic fistula-related indicators within 3 days postoperatively. The data format requirements specify the field structure of the structured JSON data. Then, the prompt words are input into the large language model, which drives the model to accurately extract key clinical information from the structured input semantic space strictly according to the extraction scope, and generate preliminary structured JSON data according to the data format requirements.
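The prompt construction step in [0064] might look like the sketch below, which combines an extraction scope with a data format requirement. The prompt wording, the function name, and the field names are hypothetical; the source does not disclose the actual prompt text, and the call to the large language model itself is deliberately left out.

```python
import json


def build_extraction_prompt(record_text, scope, fields):
    """Assemble a prompt from an extraction scope and a JSON field layout.

    record_text -- normalized text from the structured input semantic space
    scope       -- plain-language statement of what to extract
    fields      -- mapping of field name to an expected-type hint
    """
    # Render the required field structure as a JSON skeleton for the model.
    schema_hint = json.dumps(
        {name: f"<{hint}>" for name, hint in fields.items()}, indent=2)
    return (
        "Extract clinical information from the normalized postoperative record.\n"
        f"Extraction scope: {scope}\n"
        "Return only a JSON object with exactly these fields:\n"
        f"{schema_hint}\n"
        "Use null for any field the record does not state.\n"
        "Record:\n"
        f"{record_text}"
    )
```

Asking for `null` on unstated fields is a design choice in this sketch: it gives the downstream validation step an explicit signal for missing data instead of an invented value.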
[0065] To further ensure data quality, the extraction module 102 also employs a two-level verification mechanism, consisting of JSON Schema validation and a self-healing pipeline, to correct the structured JSON data. First, JSON Schema validation is performed. Based on a pre-defined standardized JSON Schema, which includes a list of required fields, field data type constraints, time format specifications, and numerical range thresholds, it automatically detects formatting errors in the data. Missing fields are either filled with preset default values or flagged as missing. Issues such as inconsistent time formats (for example, times not in YYYY-MM-DD HH:MM format) and incorrect field types (for example, numeric fields stored as text values) are forcibly converted according to the Schema definition. Then, the self-healing pipeline is triggered to verify and calibrate content deviations in depth. Using the standard medical terminology, dates, and key clinical values in the original structured input semantic space, it cross-validates the content extracted into the structured JSON data. If discrepancies are detected between the extracted values and the original medical records, or mismatches between terminology codes and text descriptions, it automatically goes back to the original data for a second extraction and calibration. Deviations that cannot be corrected automatically are marked with anomaly tags and logged. Finally, after both levels of verification and correction, high-quality structured JSON data with complete fields, a compliant format, and accurate content is output.
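The first verification level in [0065] can be sketched with the standard library alone. The `SCHEMA` dictionary below is a simplified, hypothetical stand-in for a real JSON Schema document, and its field names, numeric range, and accepted input time formats are assumptions. The self-healing cross-check against the source text is not shown.

```python
from datetime import datetime

# Hypothetical, simplified schema (not actual JSON Schema syntax).
SCHEMA = {
    "operation_date": {"type": "datetime"},
    "drain_amylase_u_per_l": {"type": "number", "min": 0, "max": 100000},
    "symptom_description": {"type": "string", "default": ""},
}
TARGET_FORMAT = "%Y-%m-%d %H:%M"
# Input formats tried in order; date-only inputs gain a 00:00 time (assumption).
INPUT_FORMATS = (TARGET_FORMAT, "%Y-%m-%d", "%Y/%m/%d %H:%M", "%Y/%m/%d")


def validate_and_repair(record):
    """Return (repaired_record, issues) after schema-level checks."""
    fixed, issues = dict(record), []
    for name, rule in SCHEMA.items():
        value = fixed.get(name)
        if value is None:
            # Missing field: fill the preset default, or flag it.
            fixed[name] = rule.get("default")
            if "default" not in rule:
                issues.append(f"{name}: missing")
            continue
        if rule["type"] == "number":
            if isinstance(value, str):
                # Numeric field stored as text: force conversion.
                try:
                    value = float(value)
                except ValueError:
                    issues.append(f"{name}: not numeric")
                    continue
            if not isinstance(value, (int, float)):
                issues.append(f"{name}: not numeric")
                continue
            if not rule["min"] <= value <= rule["max"]:
                issues.append(f"{name}: out of range")
            fixed[name] = value
        elif rule["type"] == "datetime":
            for fmt in INPUT_FORMATS:
                try:
                    parsed = datetime.strptime(str(value), fmt)
                except ValueError:
                    continue
                fixed[name] = parsed.strftime(TARGET_FORMAT)
                break
            else:
                issues.append(f"{name}: unparseable time")
    return fixed, issues
```

Items left in `issues` correspond to the deviations the text says are tagged and logged when they cannot be corrected automatically.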
[0066] This design, through the dual constraints of prompt words and the corrective effect of the verification mechanism, ensures both the integrity and relevance of key clinical information and improves the accuracy and standardization of structured data, providing reliable data support for subsequent diagnostic reasoning.
[0067] The scoring module 103 is used to obtain the diagnostic conclusions and clinical causal reasoning chains obtained by multiple agents after processing structured JSON data, and to score the clinical causal reasoning chains in multiple dimensions to obtain the scores of each clinical causal reasoning chain.
[0068] The multiple agents are sub-models, each focusing on the diagnosis of a different postoperative complication of pancreatic surgery. The sub-models divide the work and cooperate, each outputting a diagnostic conclusion and its reasoning basis for a specific complication, which simulates a clinical expert consultation.
[0069] The diagnostic conclusion is the final judgment result derived by the intelligent agent based on structured JSON data, such as pancreatic fistula grade B, no postoperative infection, etc., which directly points to whether there are complications and the specific type and grade of the complications.
[0070] The clinical causal reasoning chain is the logical path by which an agent derives its diagnostic conclusion. It clearly presents the relationship between key clinical information, intermediate inferences, and the diagnostic conclusion. For example, an amylase level of 500 U/L in the drainage fluid on the 5th postoperative day, together with signs of pancreatic juice leakage and abdominal pain, leads to a determination of pancreatic fistula grade B; the chain thereby explains the basis for the diagnosis.
[0071] Multidimensional scoring is a quantitative assessment of the rationality and reliability of clinical causal reasoning chains from multiple dimensions, including compliance with clinical guidelines, logical consistency, adequacy of evidence utilization, completeness of key factors, expression of uncertainty, and low redundancy. The final output is a comprehensive score.
[0072] Clinical guideline compliance assesses whether the clinical causal reasoning chain conforms to authoritative guidelines, standards, or clinical consensus for the diagnosis of postoperative pancreatic complications. It examines whether the reasoning process covers the key indicators, judgment thresholds, and grading conditions required for diagnosis, ensuring that the reasoning is consistent with clinical norms.
[0073] Logical consistency measures whether the causal relationships between each link in the reasoning chain are rigorous and without contradictions. This includes the absence of mutually exclusive conclusions, the absence of leaps in reasoning steps, and the consistency in the interpretation of the same clinical information, ensuring that the derivation process conforms to medical logic.
[0074] The adequacy of evidence utilization is used to determine whether the reasoning chain fully utilizes key clinical information in the structured JSON data, such as test indicators, symptoms, and surgical records. It examines whether each step of the reasoning is supported by corresponding clinical evidence to avoid unfounded subjective inferences.
[0075] The completeness of key factors is an assessment of whether the reasoning chain covers all factors required for the diagnosis of the complication, including the time dimension (such as a specific period after surgery), symptoms and signs, test results, interventions, and other key information, without any omissions of important diagnostic steps.
[0076] Uncertainty expression examines whether the reasoning chain appropriately reflects the uncertainty of the diagnosis when faced with marginal data (such as indicators approaching the diagnostic threshold) or scenarios with insufficient information. It adopts rigorous and prudent expressions, such as "further observation is needed" or "high probability", avoiding overly absolute judgments.
[0077] Low redundancy measures the conciseness of the reasoning chain. It assesses whether the chain is free of duplicate statements, irrelevant information, and redundant reasoning steps, ensuring that the reasoning process focuses on diagnostic logic and presents causal relationships as concisely as possible.
[0078] The function of scoring module 103 is to screen out reliable diagnostic reasoning results through quantitative assessment. The specific process is as follows: First, the system receives output data from multiple agents, including diagnostic conclusions generated by each agent for structured JSON data, and clinical causal reasoning chains supporting those conclusions. Then, a multi-dimensional scoring mechanism is activated to comprehensively evaluate each clinical causal reasoning chain from multiple dimensions and calculate a comprehensive score. Finally, a quantitative score for each agent's corresponding reasoning chain is obtained, providing a quantifiable decision-making basis for subsequent conflict detection and conclusion fusion.
[0079] This design breaks through the limitations of traditional approaches that only focus on the diagnostic conclusion itself. It improves the reliability of the diagnosis by evaluating the rationality of the reasoning process, while providing clear standards for subsequent screening of high-quality diagnostic conclusions and elimination of unreasonable reasoning.
[0080] The overall score is calculated as follows. The scoring module 103 is specifically used to determine a first sub-score from the scores for clinical guideline compliance, logical consistency, adequacy of evidence utilization, completeness of key factors, expression of uncertainty, and low redundancy, together with the weight of each dimension. The first sub-score is then corrected by a contradiction penalty and a hallucination penalty to obtain the multi-dimensional score. The contradiction penalty is determined by the number of mutually exclusive conclusions, and the hallucination penalty is determined by the number of references to key clinical information that does not exist.
[0081] The first sub-score is a preliminary composite score calculated based on the scores and corresponding weights of the six rating dimensions, reflecting the basic performance of the reasoning chain in the evaluation dimensions.
[0082] The contradiction penalty applies when a reasoning chain contains mutually exclusive conclusions (such as the same indicator being used both to support and to deny pancreatic fistula). The more mutually exclusive conclusions there are, the stronger the penalty.
[0083] The hallucination penalty applies when the reasoning chain references key clinical information that does not exist in the structured JSON data (such as a fabricated test indicator that was never recorded). The more non-existent information is referenced, the stronger the penalty.
[0084] The multidimensional score is the final comprehensive score obtained after correction of the first sub-score. It is the final quantitative assessment result of the rationality and reliability of the clinical causal reasoning chain.
[0085] The first sub-score is calculated as:
$$S_1 = w_1 s_1 + w_2 s_2 + w_3 s_3 + w_4 s_4 + w_5 s_5 + w_6 s_6$$
[0086] where $S_1$ is the first sub-score; $s_1$ is the clinical guideline compliance score and $w_1$ is the first weight; $s_2$ is the logical consistency score and $w_2$ is the second weight; $s_3$ is the score for adequacy of evidence utilization and $w_3$ is the third weight; $s_4$ is the score for completeness of key factors and $w_4$ is the fourth weight; $s_5$ is the score for expression of uncertainty and $w_5$ is the fifth weight; $s_6$ is the low redundancy score and $w_6$ is the sixth weight.
[0087] The first sub-score $S_1$ is corrected as:
$$S = S_1 - P_c - P_h, \qquad P_c = \alpha N_c, \qquad P_h = \beta N_h$$
[0088] where $S$ is the multi-dimensional score, i.e., the final comprehensive score; $P_c$ is the contradiction penalty, $\alpha$ is the contradiction penalty coefficient, and $N_c$ is the number of mutually exclusive conclusions; $P_h$ is the hallucination penalty, $\beta$ is the hallucination penalty coefficient, and $N_h$ is the number of references to key clinical information that does not exist.
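As an illustration of the scoring described in paragraphs [0080] to [0084], the following Python sketch computes a weighted first sub-score and then applies the two penalties. It rests on several assumptions that the text does not state: each penalty is taken as a coefficient multiplied by a count and subtracted from the first sub-score, the result is clamped at zero, and the dimension keys and default coefficients are hypothetical.

```python
# Hypothetical keys for the six scoring dimensions named in the text.
DIMENSIONS = (
    "guideline_compliance",
    "logical_consistency",
    "evidence_utilization",
    "key_factor_completeness",
    "uncertainty_expression",
    "low_redundancy",
)


def first_sub_score(scores, weights):
    """Weighted sum of the six dimension scores."""
    return sum(weights[d] * scores[d] for d in DIMENSIONS)


def multidimensional_score(scores, weights, n_contradictions, n_hallucinations,
                           contradiction_coeff=0.1, hallucination_coeff=0.15):
    """First sub-score corrected by the contradiction and hallucination penalties.

    Assumption: penalties are subtractive and linear in the counts; the
    coefficients here are placeholders, not values from the application.
    """
    base = first_sub_score(scores, weights)
    penalty = (contradiction_coeff * n_contradictions
               + hallucination_coeff * n_hallucinations)
    return max(0.0, base - penalty)  # clamping at zero is also an assumption


if __name__ == "__main__":
    scores = {d: 0.8 for d in DIMENSIONS}
    weights = dict(zip(DIMENSIONS, (0.25, 0.2, 0.2, 0.15, 0.1, 0.1)))
    print(multidimensional_score(scores, weights, n_contradictions=1,
                                 n_hallucinations=0))
```

With weights that sum to 1 and dimension scores in the range 0 to 1, the first sub-score also stays in that range, which keeps it comparable with the 0.9 and 0.4 example scores used later in the description.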
[0089] The detection module 104 is used to perform conflict detection on multiple diagnostic conclusions. In the absence of conflict, the diagnostic conclusions are fused according to the scores of each clinical causal reasoning chain to obtain a detection report on postoperative pancreatic complications.
[0090] Conflict detection verifies the consistency of diagnostic conclusions output by multiple agents, identifying whether there are contradictory judgments, such as one model determining the presence of pancreatic fistula while another model determines that there is no pancreatic fistula.
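A minimal Python sketch of this pairwise check follows. The data layout (each agent mapped to its findings per complication) and the rule that any two differing findings for the same complication count as a conflict are assumptions made for illustration; a real system might, for example, treat different grades of the same complication more leniently.

```python
from itertools import combinations


def find_conflicts(conclusions):
    """Return (complication, agent_a, agent_b) for every contradictory pair.

    conclusions: agent name -> {complication: finding}
    """
    conflicts = []
    for (a, findings_a), (b, findings_b) in combinations(conclusions.items(), 2):
        # Only complications judged by both agents can be in conflict.
        for complication in findings_a.keys() & findings_b.keys():
            if findings_a[complication] != findings_b[complication]:
                conflicts.append((complication, a, b))
    return sorted(conflicts)


if __name__ == "__main__":
    example = {
        "A": {"pancreatic_fistula": "present", "infection": "absent"},
        "B": {"pancreatic_fistula": "absent"},
        "C": {"infection": "absent"},
    }
    print(find_conflicts(example))
```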
[0091] Conclusion fusion combines the conclusions that are conflict-free, or that remain after conflicts have been resolved, according to the multi-dimensional scores of each agent's reasoning chain, to form a unified and reliable final judgment.
[0092] The detection report is a structured clinical diagnostic document output by the system, containing information such as the final complication determination results, core reasoning basis, and scoring references, providing decision support for clinicians.
[0093] The function of the detection module 104 is to integrate the diagnostic results of multiple agents and generate an accurate and unified detection report. The specific process is as follows: First, a conflict detection mechanism is activated to compare the diagnostic conclusions output by all agents one by one to determine whether there are any contradictions between the conclusions of different models. If the detection confirms that there are no conflicts, that is, the conclusions of each model are consistent or there are no mutually exclusive judgments, then a weighted fusion strategy is adopted based on the multi-dimensional scores of the clinical causal reasoning chain corresponding to each model. The higher the score of the model, the greater the weight of the conclusion, and a final unified diagnostic conclusion is formed. Finally, this conclusion, along with the core reasoning basis, score summary and other information, is compiled into a structured pancreatic postoperative complication detection report to provide clinicians with a clear and reliable diagnostic reference.
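The weighted fusion step could look roughly like the Python sketch below. The text only says that a higher-scoring model carries more weight; summing reasoning-chain scores per finding and reporting the winning finding's share of the total weight is one possible reading, and the function name and data layout are hypothetical.

```python
from collections import defaultdict


def fuse(conclusions, scores):
    """Fuse per-agent findings into one finding per complication.

    conclusions: agent name -> {complication: finding}
    scores:      agent name -> multi-dimensional score of its reasoning chain
    """
    support = defaultdict(lambda: defaultdict(float))
    for agent, findings in conclusions.items():
        for complication, finding in findings.items():
            support[complication][finding] += scores[agent]  # score acts as weight

    report = {}
    for complication, by_finding in support.items():
        best = max(by_finding, key=by_finding.get)
        total = sum(by_finding.values())
        report[complication] = {
            "finding": best,
            "weight_share": by_finding[best] / total if total else 0.0,
        }
    return report


if __name__ == "__main__":
    conclusions = {
        "A": {"pancreatic_fistula": "grade B"},
        "B": {"pancreatic_fistula": "grade B", "infection": "absent"},
    }
    print(fuse(conclusions, {"A": 0.9, "B": 0.6}))
```

In the conflict-free case every agent that judged a complication reports the same finding, so the weight share is 1; the share becomes informative only if the same routine is reused after conflict resolution.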
[0094] This design leverages the collaborative advantages of multiple agents through multi-conclusion cross-validation and weighted scoring, effectively enhancing the authority and credibility of the final diagnostic results.
[0095] The detection module 104 is also used to eliminate conflicting conclusions in the event of a conflict.
[0096] When the detection module 104 detects contradictory diagnostic conclusions from multiple agents during the conflict detection phase, it initiates a conflict resolution mechanism to eliminate conflicting conclusions. Specifically, the detection module 104 combines multi-dimensional scores of the clinical causal reasoning chains corresponding to each conflicting conclusion, prioritizing conclusions with higher scores and more reliable reasoning, while discarding conflicting conclusions with lower scores and insufficient reasoning basis. For example, if model A determines the presence of pancreatic fistula with a reasoning chain score of 0.9, while model B determines the absence of pancreatic fistula with a reasoning chain score of 0.4, a conflict exists, and model B's conclusion is eliminated. This design uses quantitative scoring to screen high-quality conclusions, quickly resolving diagnostic contradictions.
[0097] The detection module 104 is specifically used to calculate the score difference of the clinical causal reasoning chain corresponding to the conflicting conclusions: if the score difference is greater than or equal to the difference threshold, the conflicting conclusion with the lower score is removed; it is also used to issue a review instruction to the agent that outputs the conflicting conclusion if the score difference is less than the difference threshold. The review instruction is used to instruct the agent that outputs the conflicting conclusion to regenerate the diagnostic conclusion and the clinical causal reasoning chain.
[0098] When processing conflicting conclusions, the detection module 104 quantifies the difference between them and handles it in a graded manner. The specific process is as follows. First, it calculates the score difference between the clinical causal reasoning chains corresponding to the conflicting conclusions. It then compares this difference with a preset difference threshold. If the score difference is greater than or equal to the difference threshold, for example a score difference of 0.4 against a difference threshold of 0.3, there is a significant gap in the reasoning quality of the conflicting conclusions. In this case, the conflicting conclusion with the lower score is eliminated directly, and the conclusion with the higher score and more reliable reasoning is retained, so that low-quality conclusions do not affect the final judgment.
[0099] If the score difference is less than the difference threshold, for example a score difference of 0.2 against a difference threshold of 0.3, the reasoning quality of the conflicting conclusions is similar, and the existing scores are not enough to decide which is better. In this case, a review instruction is issued to the agents that output the conflicting conclusions, requiring them to re-analyze the data, verify the reasoning logic, and generate new diagnostic conclusions and clinical causal reasoning chains. This secondary reasoning reduces conflicts caused by the bias of a single analysis.
[0100] This design not only quickly processes obviously inferior conflicting conclusions through quantitative standards, but also initiates a review mechanism for conflicting conclusions with similar reasoning quality, further improving the reliability of the final diagnostic results.
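The graded handling of a conflicting pair can be sketched in Python as follows. The threshold of 0.3 and the example scores are taken from the description; the function name and the shape of the returned dictionary are hypothetical.

```python
def resolve_conflict(agent_a, score_a, agent_b, score_b, diff_threshold=0.3):
    """Decide how to handle two agents whose conclusions contradict each other.

    Returns {"action": "eliminate", "agents": [loser]} when the score gap is
    at least the threshold, otherwise {"action": "review", "agents": [both]}.
    """
    diff = abs(score_a - score_b)
    if diff >= diff_threshold:
        # Clear quality gap: drop the conclusion with the lower-scoring chain.
        loser = agent_a if score_a < score_b else agent_b
        return {"action": "eliminate", "agents": [loser]}
    # Similar quality: ask both agents to regenerate conclusion and chain.
    return {"action": "review", "agents": [agent_a, agent_b]}


if __name__ == "__main__":
    print(resolve_conflict("model_A", 0.9, "model_B", 0.4))  # gap of 0.5
    print(resolve_conflict("model_A", 0.6, "model_B", 0.4))  # gap of about 0.2
```

Because the scores are floating-point values, a gap that is meant to equal the threshold exactly may land just below it; rounding the scores or using `decimal.Decimal` would be one way to make the boundary case predictable.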
[0101] The pancreatic postoperative complication detection system also includes a feedback module, which is used to collect feedback information from doctors on the detection report. The feedback information includes the types of errors in the diagnostic conclusions, deviations in the processing of key clinical information, and the basis for the correct diagnosis.
[0102] Feedback information consists of doctors' evaluations of and corrections to the detection report, based on their clinical experience and the actual condition of the patient. It is a key basis for system optimization.
[0103] Error types in diagnostic conclusions are the categories of deviation that doctors point out in the final diagnostic conclusion of the detection report, such as missing a pancreatic fistula, misdiagnosing a mild infection as a severe infection, or misjudging the type of complication.
[0104] Key clinical information processing bias refers to errors in the extraction or interpretation of key clinical information by the system identified by the physician, such as missing abnormal peak values of amylase in postoperative drainage fluid, misinterpreting bleeding time, and failing to correctly associate symptoms with test indicators.
[0105] The basis for a correct diagnosis is the diagnostic support information provided by the doctor that is consistent with the patient's actual condition, including key clinical data that should be adopted, the application logic of authoritative diagnostic standards, and the reasoning path that is consistent with the condition.
[0106] The feedback module added to this pancreatic postoperative complication detection system builds a two-way communication bridge between the system and clinical practice, continuously collecting professional feedback from doctors on the detection report.
[0107] Specifically, the feedback module collects three key types of feedback: first, the types of diagnostic errors, which clarify the discrepancies between the final diagnosis in the detection report and the patient's actual condition; second, deviations in the processing of key clinical information, which pinpoint specific errors in the system's information extraction and interpretation stages; and third, the basis for the correct diagnosis, which supplies professional diagnostic logic and supporting data that match the patient's condition. This feedback information is retained by the system and used as a basis for subsequent optimization, providing precise direction for adjusting prompt words, optimizing the weights of the scoring dimensions, and improving the agents' reasoning logic, thus driving continuous iteration of the system.
[0108] The extraction module 102 is also used to adjust the prompt words based on feedback information.
[0109] The extraction module 102 has dynamic optimization capabilities, and adjusts the prompt words it constructs in a targeted manner based on the doctor feedback information collected by the feedback module.
[0110] Specifically, if feedback indicates that the system has missed key clinical information, such as failing to extract key values of postoperative drainage fluid, the extraction scope of the prompt words will be adjusted, and the extraction requirements for this type of information will be clarified. If feedback indicates that the structured JSON data has a disordered format or missing fields, the data format requirements in the prompt words will be optimized, and clear field specifications and format standards will be added. If feedback indicates that there are interpretation biases in information extraction, such as misreading the units of test indicators, the interpretation explanations and verification prompts for this type of indicator will be added to the prompt words.
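One way to picture this feedback-driven adjustment is the Python sketch below, in which each feedback item appends a targeted instruction to the extraction prompt. The three category keys, the instruction templates, and the `ExtractionPrompt` class are hypothetical and serve only to illustrate the mapping from feedback type to prompt change.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractionPrompt:
    base: str
    extra_instructions: list = field(default_factory=list)

    def render(self):
        """Join the base prompt and all accumulated instructions."""
        return "\n".join([self.base, *self.extra_instructions])


# Hypothetical feedback categories, mirroring the three cases in the text.
TEMPLATES = {
    "missed_information": "Always extract: {detail}.",
    "format_error": "Output format requirement: {detail}.",
    "interpretation_bias": (
        "When reading {detail}, verify the unit and value against the source text."
    ),
}


def adjust_prompt(prompt, feedback_items):
    """Append one instruction per (category, detail) pair, skipping repeats."""
    for category, detail in feedback_items:
        template = TEMPLATES.get(category)
        if template is None:
            continue  # unknown category: leave the prompt unchanged
        line = template.format(detail=detail)
        if line not in prompt.extra_instructions:
            prompt.extra_instructions.append(line)
    return prompt


if __name__ == "__main__":
    prompt = ExtractionPrompt("Extract key clinical information as JSON.")
    adjust_prompt(prompt, [
        ("missed_information", "postoperative drainage fluid amylase values"),
        ("interpretation_bias", "laboratory indicators"),
    ])
    print(prompt.render())
```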
[0111] Through this feedback and adjustment loop, the prompt words can continuously adapt to actual clinical needs, guide the large language model to extract key clinical information more accurately, generate high-quality structured JSON data, and reduce diagnostic bias from the source.
[0112] Based on the above description, this application has the following beneficial effects: First, through normalization processing such as text format conversion, medical terminology standardization, and timeline standardization, combined with SNOMED-CT encoding mapping and terminology log recording, the writing differences and format heterogeneity of medical records across institutions are effectively resolved, transforming unstructured medical records into a unified structured semantic space. This provides a high-quality, unambiguous data foundation for subsequent diagnostic reasoning and significantly reduces diagnostic bias caused by data inconsistency.
[0113] Secondly, the end-to-end extraction mode of prompt words based on structured context, coupled with a multi-dimensional scoring system such as clinical criterion compliance and logical consistency, not only leverages the information extraction advantages of large language models, but also constrains the reasoning process through contradiction punishment and hallucination punishment mechanisms. This ensures that diagnostic conclusions not only have clear evidence, but also quantifiable rationality, improving the accuracy and interpretability of diagnosing complex complications and breaking through the trust barrier of traditional black box models.
[0114] Furthermore, the multi-agent parallel diagnosis and conflict resolution mechanism, through the determination of scoring differences and the issuance of review instructions, achieves collaborative decision-making similar to expert consultation, effectively avoiding the cognitive limitations of a single model and improving the system's adaptability to difficult cases. Meanwhile, the dynamic adjustment of prompts driven by doctor feedback forms a closed loop of diagnosis, feedback, and optimization, enabling the system to continuously align with actual clinical needs, gradually reduce the risk of missed diagnoses and misdiagnoses, and adapt to changes in different clinical scenarios.
[0115] Finally, the entire system process balances automation efficiency with clinical applicability. It reduces the workload of manual data processing through standardized processing and intelligent extraction, while providing effective support for doctors' decision-making through structured reports and inference chain presentation. It improves diagnostic efficiency while ensuring the safety of clinical decision-making, and provides reliable technical support for early and accurate intervention of postoperative pancreatic complications.
[0116] This invention also provides a device for detecting postoperative complications of pancreatic surgery. The device includes any of the detection systems described in the foregoing embodiments, and the specific scheme is consistent with the above embodiments.
Claims
1. A system for detecting postoperative complications of pancreatic surgery, characterized in that it comprises: an acquisition module, used to acquire the original medical record data of users after pancreatic surgery and to perform normalization processing on the original medical record data to obtain a structured input semantic space, the normalization processing including at least text format conversion, medical terminology standardization, and timeline standardization; an extraction module, used to construct prompt words based on the structured input semantic space, input the prompt words into a large language model, and drive the large language model to extract key clinical information from the structured input semantic space and generate structured JSON data; a scoring module, used to obtain the diagnostic conclusions and clinical causal reasoning chains produced by multiple agents after processing the structured JSON data, and to score the clinical causal reasoning chains in multiple dimensions to obtain a score for each clinical causal reasoning chain; and a detection module, used to perform conflict detection on the multiple diagnostic conclusions and, in the absence of conflict, to fuse the diagnostic conclusions according to the scores of the clinical causal reasoning chains to obtain a detection report on postoperative pancreatic complications.
2. The system according to claim 1, characterized in that the acquisition module is specifically used to convert the original medical record data into parsable text using optical character recognition technology if the original medical record data is in a non-text format, and to retain, through layout analysis technology, the correspondence between the paragraph structure, dates, and key clinical information of the original medical record.
3. The system according to claim 1, characterized in that the acquisition module is specifically used to map non-standard medical terms in the original medical record data to standard medical terms and to generate a term mapping log, the term mapping log recording the non-standard medical terms, the standard medical terms, the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT) codes associated with the standard medical terms, and the position spans of the SNOMED-CT codes in the text.
4. The system according to claim 1, characterized in that the acquisition module is specifically used to identify non-standard dates in the original medical record data and convert the non-standard dates into standard dates in a preset format.
5. The system according to claim 1, characterized in that the multiple dimensions include clinical guideline compliance, logical consistency, adequacy of evidence utilization, completeness of key factors, expression of uncertainty, and low redundancy; the scoring module is specifically used to determine a first sub-score based on the clinical guideline compliance, logical consistency, adequacy of evidence utilization, completeness of key factors, expression of uncertainty, low redundancy, and the weight of each dimension, and to correct the first sub-score based on a contradiction penalty and a hallucination penalty to obtain a multi-dimensional score; the contradiction penalty is determined based on the number of mutually exclusive conclusions, and the hallucination penalty is determined based on the number of references to non-existent key clinical information.
6. The system according to claim 1, characterized in that the detection module is also used to eliminate conflicting conclusions in the event of a conflict.
7. The system according to claim 6, characterized in that the detection module is specifically used to calculate the score difference of the clinical causal reasoning chains corresponding to the conflicting conclusions; if the score difference is greater than or equal to a difference threshold, the conflicting conclusion with the lower score is removed; the detection module is also used to issue a review instruction to the agents that output the conflicting conclusions if the score difference is less than the difference threshold, the review instruction being used to instruct the agents that output the conflicting conclusions to regenerate the diagnostic conclusion and the clinical causal reasoning chain.
8. The system according to claim 1, characterized in that the system also includes a feedback module, which is used to collect feedback information from doctors on the detection report, the feedback information including the types of errors in the diagnostic conclusions, deviations in the processing of key clinical information, and the basis for the correct diagnosis.
9. The system according to claim 8, characterized in that the extraction module is also used to adjust the prompt words based on the feedback information.
10. A device for detecting postoperative complications of pancreatic surgery, characterized in that it includes the system described in any one of claims 1-9.