Drug and instrument evaluation method and device based on atlas annotation collaboration and storage medium

By constructing a medical ontology knowledge graph and collaborating with intelligent annotation, we can achieve unified collection and standardization of multi-source medical data. Combined with the comprehensive clinical representation function of patients, we can conduct multi-dimensional evaluation, which solves the problems of inconsistent data formats, limited entity recognition capabilities, and incomplete evaluation in existing technologies. This enables comprehensive evaluation of drugs and medical devices and continuous optimization of the model.

CN122201631A · Pending · Publication Date: 2026-06-12 · WUXI HEALTH STATISTICS & INFORMATION CENTER

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: WUXI HEALTH STATISTICS & INFORMATION CENTER
Filing Date: 2026-04-17
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

Existing technologies suffer from problems such as inconsistent data formats, limited entity recognition capabilities, and incomplete evaluation in multi-source medical data integration, unstructured text processing, and drug and medical device evaluation methods. They also lack continuous feedback and update capabilities, leading to inaccurate evaluation results and model aging.

Method used

By constructing a medical ontology knowledge graph and collaborating with intelligent annotation, we can achieve unified collection, desensitization, and standardization of multi-source medical data. We can then combine this with a comprehensive clinical representation function of patients for multidimensional evaluation and optimize the model through a feedback update mechanism, forming a closed-loop iterative optimization.

Benefits of technology

It improves the availability and relevance of heterogeneous medical data, enhances the accuracy of entity recognition and relation extraction, strengthens the comprehensiveness and interpretability of drug and medical device evaluation, and solves the problems of data dispersion and model aging in existing technologies.

Generated by Eureka AI based on patent content.

Abstract

The application discloses a drug and medical device evaluation method, apparatus, electronic device and storage medium based on graph annotation collaboration, and belongs to the technical field of medical big data processing. The method comprises the following steps: collecting, desensitizing, standardizing and quality-screening multi-source medical data, and forming standardized research samples based on a sample availability evaluation function; extracting medical entities and entity relationships, and constructing a medical ontology knowledge graph through a medical ontology edge confidence function; obtaining optimized annotation results by using an annotation sample priority function to direct a small amount of manual correction; completing patient stratification by fusing multi-dimensional medical features through a patient comprehensive clinical representation function; comprehensively evaluating the target drug or medical device through a drug and medical device real-world comprehensive evaluation index function and identifying risk signals; and feeding back abnormal evaluation results to the knowledge graph and the annotation model, with each round of optimization assessed through a system iterative gain function. The application can improve the data utilization efficiency, evaluation accuracy and system evolution capability of real-world research on drugs and medical devices.

Description

Technical Field

[0001] This application relates to the field of medical big data processing technology, and in particular to a method, apparatus, electronic device and storage medium for drug and medical device evaluation based on graph annotation collaboration.

Background Technology

[0002] With the continuous development of medical informatization and regional medical big data platforms, a large amount of data, including electronic medical records, laboratory examinations, medication records, medical device usage records, and follow-up data, has been accumulated. Conducting clinical evaluations of innovative drugs and medical devices based on real-world data has become an important direction for pharmaceutical R&D and medical management. Related technologies are evolving from simple statistical analysis to the collaborative application of medical knowledge graphs, natural language processing, machine learning, and multi-source data fusion. The development trend is to improve the utilization rate of unstructured medical data and enhance the ability to identify the efficacy, safety, and applicable populations of drugs and medical devices.

[0003] Existing technical solutions primarily rely on structured data from hospital information systems, electronic medical record systems, and laboratory testing systems. They employ rule extraction, keyword matching, traditional statistical analysis, or single natural language processing models to analyze drug or medical device usage, adverse events, and some efficacy indicators. Some solutions introduce pre-trained language models to perform entity recognition and relation extraction from medical record text, or construct local medical knowledge graphs to assist in terminology standardization, but these typically only serve a single aspect.

[0004] However, existing technologies have significant technical shortcomings: First, multi-source medical data is scattered across multiple systems such as electronic medical records, laboratory tests, imaging, pharmaceuticals, and medical devices, with inconsistent data formats, field definitions, and terminology systems, leading to difficulties in cross-system association and low efficiency in structured integration. Second, a large amount of medical record text, report text, and explanatory documents exist in unstructured form, and existing solutions have limited capabilities in entity recognition, relation extraction, and terminology normalization for complex medical texts, resulting in a large workload and high cost for manual annotation. Third, existing technologies typically process knowledge extraction, text annotation, and clinical evaluation separately, lacking a collaborative mechanism, making it difficult to promptly correct front-end extraction errors, resulting in insufficient overall accuracy and stability. Fourth, existing evaluation methods often focus on single efficacy indicators or single risk indicators, making it difficult to simultaneously consider efficacy, safety, applicable population, and the impact of combined treatments, resulting in incomplete evaluation results. Fifth, existing systems generally lack continuous feedback and update capabilities, making it difficult to adapt to scenarios such as the launch of new drugs and medical devices and changes in diagnostic and treatment terminology, and prone to problems such as model aging, graph lag, and evaluation distortion during long-term operation.

[0005] Therefore, there is an urgent need for a real-world clinical comprehensive evaluation method for pharmaceuticals and medical devices that can be applied to multi-source medical data and achieve coordinated linkage of knowledge construction, intelligent annotation, patient stratification, comprehensive evaluation, and continuous feedback updates, so as to improve the efficiency of innovative drug and medical device research and development and the level of clinical application decision-making.

Summary of the Invention

[0006] The present invention aims to provide a method, apparatus, electronic device and storage medium for drug and medical device evaluation based on graph annotation collaboration, so as to overcome the shortcomings of the prior art. The technical problem to be solved by the present invention is addressed through the following technical solutions.

[0007] According to the first aspect of this application, a drug and medical device evaluation method based on graph annotation collaboration is provided, comprising the following steps:

[0008] S1. Collect, desensitize, standardize, and screen the quality of multi-source medical data from the hospital. Based on the sample availability evaluation function, score the comprehensive availability of each piece of medical data to form a standardized research sample.

[0009] S2. Based on the standardized research samples, extract medical entities and entity relationships, and perform comprehensive confidence calculation and screening of candidate relationship edges based on the medical ontology edge confidence function to construct a medical ontology knowledge graph.

[0010] S3. Perform initial prediction annotation on the medical text based on the medical ontology knowledge graph, score the manual-correction priority of each text fragment based on the annotation sample priority function, and send the text fragments with priority higher than the preset threshold to the manual correction queue for correction to obtain the optimized annotation result.

[0011] S4. Based on the medical ontology knowledge graph and the optimized annotation results, the patient's disease information, laboratory test information, drug use information, medical device use information, adverse event information and follow-up information are fused through the patient comprehensive clinical representation function to form a patient comprehensive clinical representation vector and complete the patient stratification.

[0012] S5. Based on the patient's comprehensive clinical representation vector and patient stratification results, the efficacy score, safety risk score, suitable population matching score and R&D guidance gain score of the target drug and device are comprehensively calculated using the drug and device real-world comprehensive evaluation index function to obtain the comprehensive evaluation result, and risk signal identification is performed based on the comprehensive evaluation result.

[0013] S6. Feed back the abnormal evaluation results and risk signals obtained in step S5 and the manual correction information in step S3 to the medical ontology knowledge graph in step S2 and the intelligent annotation model in step S3. Evaluate the update effect based on the system iterative gain function to form a closed-loop iterative optimization mechanism.

[0014] Preferably, step S1 includes:

[0015] S110. Collect multi-source medical data from hospital information systems, electronic medical record systems, laboratory systems, imaging report systems, drug usage record systems, medical device usage record systems, and adverse event record systems, and establish a unified data access layer.

[0016] S120. Perform patient master index mapping on the collected multi-source medical data, mapping one or more of the outpatient number, inpatient number, and examination number to the same research object, so as to realize cross-system association of patients;

[0017] S130. Perform terminology standardization processing on the collected multi-source medical data, construct a standard vocabulary of diseases, drugs, medical devices, test items and operation behaviors, and map the original names to unified standard terms;

[0018] S140. Perform time axis alignment processing on the collected multi-source medical data, unify the format of the time fields in each system, and reconstruct the patient-level event chain to correct the order of diagnosis and treatment events, examination events, medication events, device use events, and follow-up events.

[0019] S150. Perform desensitization processing on the collected multi-source medical data, replace sensitive information in structured fields according to preset rules, and clean up privacy information in free text by combining keyword matching and pattern recognition.

[0020] S160. The quality of each piece of medical data is assessed based on a sample availability evaluation function, wherein the sample availability evaluation function is:

[0021] $Q_d = w_{d1} C_s + w_{d2} C_t + w_{d3} C_n + w_{d4} C_a$

[0022] Where $Q_d$ (written Qd in the text) represents the comprehensive availability score of a single medical data record, $C_s$ represents the structural integrity index, $C_t$ represents the time consistency index, $C_n$ represents the terminology standardization index, $C_a$ represents the desensitization compliance index, and $w_{d1}$, $w_{d2}$, $w_{d3}$, and $w_{d4}$ represent the weight parameters of structural integrity, time consistency, terminology standardization, and desensitization compliance, respectively, with $w_{d1} + w_{d2} + w_{d3} + w_{d4} = 1$.

[0023] S170. Based on the comprehensive availability score Qd, medical data are classified and processed. High-availability samples directly enter the subsequent process, repairable samples enter the correction queue, and low-availability samples are temporarily excluded from the subsequent research process.
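
The scoring and triage described in S160 and S170 can be sketched as follows. This is a minimal illustration, assuming the four indices are already normalized to [0, 1]. The weight values, the two cut-offs (0.8 and 0.5), and the names `QualityIndices`, `availability_score` and `triage` are hypothetical; the application only states the weighted-sum form and that the weights sum to 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityIndices:
    cs: float  # structural integrity index
    ct: float  # time consistency index
    cn: float  # terminology standardization index
    ca: float  # desensitization compliance index


# Hypothetical weights (wd1..wd4); the source only requires that they sum to 1.
WEIGHTS = (0.3, 0.2, 0.3, 0.2)


def availability_score(q: QualityIndices, weights=WEIGHTS) -> float:
    """Weighted sum Qd = wd1*Cs + wd2*Ct + wd3*Cn + wd4*Ca."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    w1, w2, w3, w4 = weights
    return w1 * q.cs + w2 * q.ct + w3 * q.cn + w4 * q.ca


def triage(qd: float, high: float = 0.8, low: float = 0.5) -> str:
    """Route a record by its score; both cut-offs are assumed values."""
    if qd >= high:
        return "high"        # enters the subsequent process directly
    if qd >= low:
        return "repairable"  # goes to the correction queue
    return "excluded"        # temporarily left out of the research process


record = QualityIndices(cs=0.9, ct=0.7, cn=0.6, ca=1.0)
label = triage(availability_score(record))
```

Keeping the score and the routing in separate functions lets the thresholds be tuned without touching the weighting.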

[0024] Preferably, step S2 includes:

[0025] S210. Split the medical record texts, test instructions, examination reports, drug instructions, and device usage records in the standardized research samples into paragraphs and sentences, and slice them by context, to form medical text units to be parsed.

[0026] S220. Utilize a medical pre-trained model and sequence labeling network to perform entity recognition on the medical text unit to be parsed, extract one or more medical entities from diseases, symptoms, drugs, devices, test indicators, adverse reactions, indications and contraindications, and map the medical entities to a unified terminology database;

[0027] S230. Take one or more of the following as candidate relation edges: entity pairs co-occurring in the same sentence, adjacent entity pairs across sentences, and entity pairs co-occurring within the same patient's time window. Perform relation identification on these candidates to obtain one or more candidate relations among treatment relations, concurrent relations, risk relations, indication relations, combination relations, and time-related relations between entities;

[0028] S240. Calculate the comprehensive confidence of candidate relation edges based on the medical ontology edge confidence function, wherein the medical ontology edge confidence function is:

[0029] $R_e(i,j) = a_r S_e(i,j) + b_r S_r(i,j) + c_r S_t(i,j) + d_r S_k(i,j)$

[0030] Where $R_e(i,j)$ (written Re(i,j) in the text) represents the comprehensive confidence of the candidate relation edge between entity i and entity j, $S_e(i,j)$ represents the semantic similarity score of the entity context, $S_r(i,j)$ represents the relation probability score output by the relation recognition model, $S_t(i,j)$ represents the temporal co-occurrence score, $S_k(i,j)$ represents the knowledge constraint matching score, and $a_r$, $b_r$, $c_r$, and $d_r$ represent the weight parameters of semantic similarity, relation probability, temporal co-occurrence, and knowledge constraint, respectively.

[0031] S250. When Re(i,j) is greater than the preset threshold, the relationship between entity i and entity j is written into the medical ontology knowledge graph; when Re(i,j) is less than or equal to the preset threshold, the candidate relationship edge is retained as a low-confidence candidate relationship or sent to the manual review queue.
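
The confidence calculation and routing in S240 and S250 can be illustrated with a short sketch. It assumes all four component scores lie in [0, 1]. The default weights, the 0.7 threshold, and the names `EdgeScores`, `edge_confidence` and `route_edge` are hypothetical and do not come from the application.

```python
from typing import NamedTuple


class EdgeScores(NamedTuple):
    se: float  # semantic similarity of the entity contexts
    sr: float  # relation probability from the relation recognition model
    st: float  # temporal co-occurrence score
    sk: float  # knowledge constraint matching score


def edge_confidence(s: EdgeScores, a: float = 0.25, b: float = 0.4,
                    c: float = 0.15, d: float = 0.2) -> float:
    """Re(i,j) = ar*Se + br*Sr + cr*St + dr*Sk, with assumed weights."""
    return a * s.se + b * s.sr + c * s.st + d * s.sk


def route_edge(s: EdgeScores, threshold: float = 0.7) -> str:
    """Write the edge only when Re is strictly above the threshold (S250)."""
    if edge_confidence(s) > threshold:
        return "write_to_graph"
    return "low_confidence_or_review"


decision = route_edge(EdgeScores(se=0.9, sr=0.95, st=0.6, sk=1.0))
```

The strict comparison mirrors the wording of S250, where an edge whose confidence equals the threshold is not written to the graph.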

[0032] Preferably, step S3 includes:

[0033] S310. Segment the electronic medical records, test instructions, examination reports, adverse event records, and follow-up texts into sentence-level or short paragraph-level segments;

[0034] S320. Use a pre-trained medical language model to perform initial prediction on the text segment to obtain entity recognition results, relation recognition results, and term mapping results;

[0035] S330. Perform consistency verification between the initial prediction results and the medical ontology knowledge graph to identify one or more of the following: entity type conflict, relation direction conflict, terminology mapping conflict, and ontology constraint conflict.

[0036] S340. Calculate the priority score for each text segment based on the annotation sample priority function, wherein the annotation sample priority function is:

[0037] $P_a(x) = a_u U(x) + a_v V(x) + a_g G(x) - a_h H(x)$

[0038] Where $P_a(x)$ (written Pa(x) in the text) represents the manual-correction priority score of text segment x, $U(x)$ represents the prediction uncertainty of the model for text segment x, $V(x)$ represents the medical information density of text segment x, $G(x)$ represents the knowledge graph conflict degree, $H(x)$ represents the historical annotation repetition degree, and $a_u$, $a_v$, $a_g$, and $a_h$ represent the weight parameters corresponding to prediction uncertainty, medical information density, knowledge graph conflict degree, and historical annotation repetition degree, respectively.

[0039] S350. Sort the text segments according to the manual correction priority score Pa(x), and send the text segments with a priority higher than the preset threshold to the manual correction queue.

[0040] S360. The annotation personnel correct the text fragments in the manual correction queue to obtain the corrected entity boundaries, entity categories, relation types and term mapping results.

[0041] S370. The corrected high-value text fragments are fed back to the training set for incremental training, and the entity aliases, relation edges and terminology mapping rules in the medical ontology knowledge graph are updated simultaneously.
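
Steps S340 and S350 amount to scoring each text segment and queueing the high-scoring ones in descending order. The sketch below assumes the four inputs U, V, G and H have already been computed and scaled to [0, 1]. The weights, the 0.5 threshold, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# (U, V, G, H): uncertainty, information density, graph conflict, repetition
Features = Tuple[float, float, float, float]


def priority(u: float, v: float, g: float, h: float,
             au: float = 0.4, av: float = 0.2,
             ag: float = 0.3, ah: float = 0.1) -> float:
    """Pa(x) = au*U + av*V + ag*G - ah*H; repetition lowers the priority."""
    return au * u + av * v + ag * g - ah * h


def build_correction_queue(segments: Dict[str, Features],
                           threshold: float = 0.5) -> List[str]:
    """Return segment ids whose priority exceeds the threshold, highest first."""
    scored = [(priority(*feats), seg_id) for seg_id, feats in segments.items()]
    scored.sort(reverse=True)
    return [seg_id for score, seg_id in scored if score > threshold]


queue = build_correction_queue({
    "seg-a": (0.9, 0.8, 0.9, 0.0),  # uncertain, dense, conflicting
    "seg-b": (0.1, 0.1, 0.0, 0.9),  # confident and already seen often
    "seg-c": (0.7, 0.6, 0.5, 0.2),
})
```

The negative sign on H(x) is what keeps annotators from repeatedly correcting segments that resemble ones already labeled.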

[0042] Preferably, step S4 includes:

[0043] S410. Based on the patient master index, diseases, symptoms, signs, drugs, devices and related clinical objects in the medical ontology knowledge graph and annotation results are grouped by patient and encoded as medical entity feature vectors Ep.

[0044] S420. Construct a patient-level event sequence based on one or more of the following: admission time, medication time, device usage time, follow-up time, and re-examination time. Extract the event sequence, event interval, and trend of change to form a time series feature vector Tp.

[0045] S430. Extract key indicators from the testing system, imaging reports and pathology results, and apply unit unification, outlier cleaning and result coding to form the test and examination feature vector Lp;

[0046] S440. Extract the drug administration route, dosage changes, combined drug regimens, device models and durations from prescription records, medical order records and device usage records to form a drug-device intervention feature vector Mp.

[0047] S450. Extract the type, severity and occurrence time of adverse reactions from adverse event records and readmission records to form an adverse event and safety feature vector Ap;

[0048] S460. Ep, Tp, Lp, Mp, and Ap are fused based on a comprehensive patient clinical representation function, wherein the comprehensive patient clinical representation function is:

[0049] $F_p = b_{p1} E_p + b_{p2} T_p + b_{p3} L_p + b_{p4} M_p + b_{p5} A_p$

[0050] Where $F_p$ (written Fp in the text) represents the comprehensive clinical representation vector of patient p, and $b_{p1}$, $b_{p2}$, $b_{p3}$, $b_{p4}$, and $b_{p5}$ represent the fusion weight parameters of $E_p$, $T_p$, $L_p$, $M_p$, and $A_p$, respectively.

[0051] S470. Based on the comprehensive clinical representation vector Fp, perform similarity comparison, rule grouping or feature clustering on patients to complete patient stratification, and output the main features and sources of difference of each patient subgroup.
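
The fusion in S460 and one possible form of the similarity comparison in S470 can be sketched as below. A weighted sum of vectors is only defined when all five vectors share one dimension, so the sketch assumes each feature vector has already been projected to a common length. Assigning a patient to the nearest subgroup centroid by cosine similarity is one illustrative choice, not the method fixed by the application, and all names and weights here are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def fuse(features: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Element-wise Fp = bp1*Ep + bp2*Tp + bp3*Lp + bp4*Mp + bp5*Ap."""
    if len(features) != len(weights):
        raise ValueError("one weight per feature vector is required")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must share one dimension")
    return [sum(w * f[k] for w, f in zip(weights, features))
            for k in range(dim)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign_stratum(fp: Vector, centroids: Dict[str, Vector]) -> str:
    """Pick the subgroup whose (assumed) centroid is most similar to Fp."""
    return max(centroids, key=lambda name: cosine(fp, centroids[name]))


# Toy 2-dimensional vectors standing in for Ep, Tp, Lp, Mp, Ap.
fp = fuse([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [1.0, 0.0]],
          weights=[0.2, 0.2, 0.2, 0.2, 0.2])
stratum = assign_stratum(fp, {"group-A": [1.0, 0.0], "group-B": [0.0, 1.0]})
```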

[0052] Preferably, step S5 includes:

[0053] S510. Based on the patient's comprehensive clinical representation vector Fp, patients who receive the same drug or device or the same drug or device regimen are grouped into the target study subjects, and patients are subdivided according to one or more of the following: age, underlying disease, combined treatment method, device usage method and follow-up period.

[0054] S520. For the target drug / device d, calculate the real-world efficacy score Ef(d) based on one or more of the following: symptom improvement, changes in laboratory indicators, improvement in imaging results, decrease in recurrence rate, and improvement in follow-up outcomes.

[0055] S530. For the target drug or medical device d, calculate the safety risk score Rs(d) based on one or more of the following: adverse event incidence, proportion of serious adverse events, reasons for discontinuation of drug or medical device, abnormal test results, and medical device-related complications.

[0056] S540. Compare the real-world performance of each patient subgroup with the characteristics of the target population and calculate the population matching score Ad(d).

[0057] S550. Calculate the R&D guidance gain score Pg(d) according to how strongly the real-world evaluation results support one or more of the following: indication optimization, adjustment of trial inclusion / exclusion criteria, and combination regimen optimization.

[0058] S560. Based on the real-world comprehensive evaluation index function for pharmaceuticals and medical devices, Ef(d), Rs(d), Ad(d), and Pg(d) are comprehensively calculated. The real-world comprehensive evaluation index function for pharmaceuticals and medical devices is as follows:

[0059] $C_e(d) = c_{e1} E_f(d) - c_{e2} R_s(d) + c_{e3} A_d(d) + c_{e4} P_g(d)$

[0060] Where $C_e(d)$ (written Ce(d) in the text) represents the real-world comprehensive evaluation index of the target drug / device d, and $c_{e1}$, $c_{e2}$, $c_{e3}$, and $c_{e4}$ represent the weight parameters corresponding to efficacy, safety risk, population matching, and R&D guidance gain, respectively.

[0061] S570. When Rs(d) is higher than the preset risk threshold, Ad(d) is lower than the preset matching threshold in a specific subgroup, or Ef(d) fluctuates abnormally within a preset time window, risk signal identification and a review of population suitability are triggered, and an explanation of the source of the anomaly is output.
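
The index in S560 and the three trigger conditions in S570 can be sketched as follows. The application does not define "abnormal fluctuation", so the sketch assumes it means that the range of Ef(d) within the window exceeds a limit. The weights, the three thresholds, and the function names are likewise hypothetical.

```python
from typing import Dict, List, Sequence


def comprehensive_index(ef: float, rs: float, ad: float, pg: float,
                        ce: Sequence[float] = (0.4, 0.3, 0.2, 0.1)) -> float:
    """Ce(d) = ce1*Ef - ce2*Rs + ce3*Ad + ce4*Pg; risk enters negatively."""
    ce1, ce2, ce3, ce4 = ce
    return ce1 * ef - ce2 * rs + ce3 * ad + ce4 * pg


def risk_signals(rs: float, subgroup_ad: Dict[str, float],
                 ef_window: Sequence[float], risk_max: float = 0.6,
                 match_min: float = 0.4, ef_range_max: float = 0.2) -> List[str]:
    """Collect human-readable reasons for each S570 condition that fires."""
    signals: List[str] = []
    if rs > risk_max:
        signals.append("safety risk above threshold")
    for name, ad in sorted(subgroup_ad.items()):
        if ad < match_min:
            signals.append(f"low population match in subgroup {name}")
    # Assumed definition of abnormal fluctuation: range within the window.
    if ef_window and max(ef_window) - min(ef_window) > ef_range_max:
        signals.append("abnormal efficacy fluctuation")
    return signals


index = comprehensive_index(ef=0.8, rs=0.7, ad=0.6, pg=0.5)
signals = risk_signals(rs=0.7,
                       subgroup_ad={"elderly": 0.3, "adult": 0.8},
                       ef_window=[0.50, 0.55, 0.52])
```

Returning the reasons as strings gives a simple starting point for the "explanation of the source of the anomaly" that S570 calls for.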

[0062] Preferably, step S6 includes:

[0063] S610. Establish a feedback collection channel to summarize the abnormal evaluation results, risk signals, low-matching subgroup results, and manual review and correction records generated in step S5, and classify them according to terminology issues, relationship issues, model recognition issues, and evaluation stability issues.

[0064] S620. When the feedback information involves new drug names, new medical device aliases, or new abbreviations, update the terminology mapping relationships and entity alias sets in the medical ontology knowledge graph.

[0065] S630. When the feedback information involves incorrect relation direction, missing relation type, or incomplete ontology rules, the relation edges, ontology constraints, and logical rules in the medical ontology knowledge graph shall be corrected.

[0066] S640. When the feedback information involves entity omission, relationship misjudgment, or terminology normalization error, the feedback information is written into the incremental training sample pool, and the intelligent annotation model is incrementally trained.

[0067] S650. First, a candidate update set is formed, and consistency verification and local manual confirmation are performed on the candidate update set. Then, the updated content that passes the verification is written into the formal medical ontology knowledge graph.

[0068] S660. Evaluate the effect of this round of feedback update based on the system iterative gain function, wherein the system iterative gain function is:

[0069] $G_i(t) = g_{s1} D_a(t) + g_{s2} G_a(t) + g_{s3} R_a(t) + g_{s4} S_a(t)$

[0070] Where $G_i(t)$ (written Gi(t) in the text) represents the system's overall iterative gain after the t-th feedback update, $D_a(t)$ represents the improvement in structured extraction accuracy, $G_a(t)$ represents the improvement in knowledge graph consistency, $R_a(t)$ represents the improvement in real-world evaluation stability, $S_a(t)$ represents the improvement in safety risk identification sensitivity, and $g_{s1}$, $g_{s2}$, $g_{s3}$, and $g_{s4}$ represent the weight parameters corresponding to structured extraction accuracy, knowledge graph consistency, evaluation stability, and risk identification sensitivity, respectively.

[0071] S670. When Gi(t) is greater than the preset gain threshold, retain the update result of this round; when Gi(t) is less than or equal to the preset gain threshold, correct or roll back the update strategy.
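
The gain check in S660 and S670 can be sketched as a keep-or-revert decision. The sketch assumes each improvement term is a signed delta measured against the previous round, and it models only the rollback branch; the "correct the update strategy" alternative in S670 is not shown. The weights, the zero threshold, and the names `iterative_gain` and `apply_update` are hypothetical.

```python
from typing import Sequence, Tuple, TypeVar

State = TypeVar("State")


def iterative_gain(da: float, ga: float, ra: float, sa: float,
                   gs: Sequence[float] = (0.3, 0.2, 0.3, 0.2)) -> float:
    """Gi(t) = gs1*Da + gs2*Ga + gs3*Ra + gs4*Sa, with assumed weights."""
    gs1, gs2, gs3, gs4 = gs
    return gs1 * da + gs2 * ga + gs3 * ra + gs4 * sa


def apply_update(current: State, candidate: State,
                 deltas: Tuple[float, float, float, float],
                 gain_threshold: float = 0.0) -> Tuple[State, float]:
    """Keep the candidate state only if the gain exceeds the threshold."""
    gain = iterative_gain(*deltas)
    if gain > gain_threshold:
        return candidate, gain  # retain this round's update
    return current, gain        # roll back to the previous state


kept, gain_ok = apply_update("graph-v1", "graph-v2", (0.05, 0.02, 0.01, 0.03))
reverted, gain_bad = apply_update("graph-v1", "graph-v2", (-0.05, 0.0, -0.02, 0.0))
```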

[0072] According to a second aspect of this application, a drug and medical device evaluation apparatus based on graph annotation collaboration is provided, which employs the aforementioned drug and medical device evaluation method based on graph annotation collaboration and comprises:

[0073] The data preprocessing module is used to collect, desensitize, standardize, and screen the quality of multi-source medical data from the hospital. Based on the sample availability evaluation function, it performs a comprehensive availability score on each piece of medical data to form a standardized research sample.

[0074] The graph construction module is used to extract medical entities and entity relationships based on the standardized research samples, and to perform comprehensive confidence calculation and screening of candidate relationship edges based on the medical ontology edge confidence function to construct a medical ontology knowledge graph.

[0075] The intelligent annotation module is used to perform initial predictive annotation on medical text based on the medical ontology knowledge graph, rank text segments for manual correction based on the annotation sample priority function, and synchronously update the optimized annotation results to the medical ontology knowledge graph.

[0076] The patient representation module is used to fuse multidimensional medical information of patients based on the medical ontology knowledge graph and the optimized annotation results, through the patient comprehensive clinical representation function, to form a patient comprehensive clinical representation vector and complete patient stratification.

[0077] The comprehensive evaluation module is used to comprehensively evaluate the target drugs and medical devices through the drug and medical device real-world comprehensive evaluation index function, based on the patient's comprehensive clinical representation vector and patient stratification results, and to identify risk signals.

[0078] The feedback update module is used to feed back abnormal evaluation results and manual correction information to the graph construction module and the intelligent annotation module, evaluate the update effect based on the system iterative gain function, and form a closed-loop iterative optimization mechanism.

[0079] According to a third aspect of this application, an electronic device is provided, comprising: a memory and a processor; the memory stores a computer program, and the processor executes the computer program to implement the above-described drug and medical device evaluation method based on graph annotation collaboration.

[0080] According to a fourth aspect of this application, a computer-readable storage medium is provided, on which a computer program is stored, which, when executed by a processor, implements the above-described drug and medical device evaluation method based on graph annotation collaboration.

[0081] The embodiments of the present invention have the following advantages:

[0082] First, by unifying the collection of multi-source medical data, mapping patient master indexes, standardizing terminology, aligning timelines, and performing compliant de-identification processing, and by quantitatively classifying data quality based on a sample availability evaluation function, the correlation and usability of heterogeneous medical data are directly improved, solving the technical problems of data dispersion, inconsistent standards, and difficulty in direct use for research in existing technologies.

[0083] Second, through a two-way collaborative mechanism between the medical ontology knowledge graph and small-sample intelligent annotation, the graph provides entity categories, relation constraints and ontology rules to the annotation model, and the manual corrections made to the annotation model's output in turn update the graph's aliases, relation edges and term mappings. The two reinforce each other, directly improving the accuracy of entity recognition, relation extraction and term normalization of unstructured medical text, while significantly reducing the cost of manual annotation.

[0084] Third, by prioritizing text fragments using a sample priority function, limited human resources are concentrated on high-value samples with high model uncertainty, high medical information density, and significant conflicts with knowledge graphs, significantly improving annotation efficiency and model generalization ability. This solves the problems of high cost and slow iteration caused by blind full-scale manual annotation in existing technologies.

[0085] Fourth, by integrating the characteristics of five dimensions—disease, time series, laboratory tests, drug and medical device interventions, and safety—through a comprehensive clinical representation function for patients, a unified patient representation vector is formed. This allows patients to be compared and stratified within the same feature space, directly improving the comprehensiveness and accuracy of efficacy comparisons and risk analyses.

[0086] Fifth, by using a real-world comprehensive evaluation index function for pharmaceuticals and medical devices to simultaneously quantify four dimensions—efficacy, safety risks, matching of applicable populations, and R&D guidance gains—a unified framework for evaluating benefits and risks can be achieved. Furthermore, by linking a risk signal identification mechanism, the comprehensiveness and interpretability of the comprehensive evaluation can be directly improved.

[0087] Sixth, by using the system iterative gain function, the effect of each round of feedback updates is quantified along four dimensions: structured extraction accuracy, knowledge graph consistency, evaluation stability, and risk identification sensitivity. This forms a continuously evolving closed-loop optimization mechanism, solving the problems of model aging, knowledge lag, and evaluation distortion in existing technologies.

Attached Figure Description

[0088] Figure 1 is a schematic diagram of the overall process of a drug and medical device evaluation method based on graph annotation collaboration according to the present invention;

[0089] Figure 2 is a structural block diagram of a drug and medical device evaluation apparatus based on graph annotation collaboration according to the present invention;

[0090] Figure 3 is a schematic diagram of the hardware structure of an electronic device according to the present invention;

[0091] Figure 4 is a schematic diagram of the structure of a computer storage medium according to the present invention.

Detailed Implementation

[0092] It should be noted that, unless otherwise specified, the embodiments and features described in this application can be combined with each other. The present invention will now be described in detail with reference to the accompanying drawings and embodiments.

[0093] It should be noted that the above detailed descriptions are exemplary and intended to provide further explanation of this application. Unless otherwise specified, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains.

[0094] Example 1

[0095] This embodiment provides a collaborative drug and medical device evaluation method based on knowledge graph annotation, applied to a server. This method constructs a complete technical chain from multi-source data governance, medical knowledge graph construction, small-sample intelligent annotation, multi-dimensional patient feature fusion, comprehensive evaluation and risk identification to feedback closed-loop updates. The six steps are interconnected, mutually supportive, and cyclically reinforced, forming a collaborative closed loop.

[0096] As shown in Figure 1, the method includes the following steps:

[0097] Step S1 involves collecting, desensitizing, standardizing, and quality-screening the multi-source medical data from the hospital. Based on the sample usability evaluation function, a comprehensive usability score is given for each piece of medical data to form a standardized research sample.

[0098] The specific implementation process of this step includes the following sub-steps:

[0099] Step S110 involves aggregating multi-source medical data from the Hospital Information System (HIS), Electronic Medical Record System (EMR), Laboratory Information System (LIS), Picture Archiving and Communication System / Radiology Information System (PACS / RIS), Medication Usage Record System, Medical Device Usage Record System, and Adverse Event Record System to establish a unified data access layer. The multi-source medical data includes structured data, semi-structured data, and unstructured data, and corresponding parsing is performed on database tables, interface messages, PDF reports, image report texts, and scanned documents.

[0100] Step S120 involves performing patient master index mapping on the aggregated multi-source medical data, mapping one or more of the outpatient number, inpatient number, and examination number to the same research subject to achieve cross-system patient association. This approach combines deterministic rules and probabilistic matching to address the difficulty of association caused by the use of different identifiers for the same patient in different systems.
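
The deterministic part of the mapping in step S120 can be illustrated with a minimal sketch. It assumes each record carries some subset of the hypothetical keys `outpatient_no`, `inpatient_no`, and `exam_no`, and it treats any shared identifier value as evidence that two records belong to the same research subject. The probabilistic matching mentioned above is not shown.

```python
def build_master_index(records):
    """Assign one subject id per record, merging records that share an identifier.

    records: list of dicts with optional keys 'outpatient_no', 'inpatient_no',
    'exam_no' (hypothetical field names). Returns a list of subject ids.
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller index as the representative subject id.
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}
    for idx, rec in enumerate(records):
        for key in ("outpatient_no", "inpatient_no", "exam_no"):
            value = rec.get(key)
            if value is None:
                continue
            token = (key, value)
            if token in first_seen:
                union(idx, first_seen[token])
            else:
                first_seen[token] = idx
    return [find(i) for i in range(len(records))]
```

Records linked only through a chain of shared identifiers (for example, an outpatient visit and an examination that both reference the same inpatient number) end up under the same subject id.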

[0101] Step S130: Perform terminology standardization processing on the collected multi-source medical data, construct a standard thesaurus covering diseases, drugs, medical devices, test items and operation behaviors, and map the original names to unified standard terms to eliminate statistical splitting problems caused by synonyms, abbreviations and historical names.

[0102] Step S140: Perform time axis alignment processing on the collected multi-source medical data, unify the format of the time fields in each system, and reconstruct the patient-level event chain to correct the order of diagnosis and treatment events, examination events, medication events, device use events and follow-up events, and ensure the logical rationality of the event chain.
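
As an illustration of step S140, the following sketch unifies a few time formats and rebuilds a patient-level event chain by sorting on the parsed time. The format list and the event labels are assumptions chosen for the example, not formats named in this application.

```python
from datetime import datetime

# Assumed source formats; a real deployment would list the formats of each system.
TIME_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y/%m/%d %H:%M", "%Y%m%d%H%M%S")

def parse_time(raw):
    """Parse a raw time string using the first matching format."""
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(raw, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized time format: {raw!r}")

def build_event_chain(events):
    """events: list of (raw_time, event_type). Returns events sorted by parsed time."""
    parsed = [(parse_time(raw), kind) for raw, kind in events]
    return sorted(parsed, key=lambda event: event[0])
```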

[0103] Step S150: Desensitize the collected multi-source medical data, replace sensitive information such as name, ID number, telephone number, and address in the structured fields according to preset rules, and clean up privacy information in the free text using keyword matching and pattern recognition.
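
The free-text cleanup in step S150 combines keyword matching with pattern recognition. A minimal sketch follows, assuming two illustrative patterns (an 18-character ID number and an 11-digit mobile number) and a caller-supplied keyword list. The patterns and placeholder tokens are assumptions for the example.

```python
import re

# Assumed patterns: 18-character ID number (last character may be X) and
# 11-digit mobile number starting with 1. Lookarounds avoid matching inside
# longer digit runs.
ID_PATTERN = re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)")
PHONE_PATTERN = re.compile(r"(?<!\d)1\d{10}(?!\d)")

def desensitize_text(text, sensitive_keywords=()):
    """Replace pattern matches and listed keywords with placeholder tokens."""
    text = ID_PATTERN.sub("[ID]", text)
    text = PHONE_PATTERN.sub("[PHONE]", text)
    for keyword in sensitive_keywords:
        text = text.replace(keyword, "[REDACTED]")
    return text
```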

[0104] Step S160: The quality of each piece of medical data is assessed based on the sample availability evaluation function. The sample availability evaluation function is:

[0105] Qd = wd1Cs + wd2Ct + wd3Cn + wd4Ca

[0106] Wherein, Qd represents the comprehensive usability score of a single piece of medical data; Cs represents the structural integrity index, used to measure whether key fields such as patient mapping identifier, time field, diagnosis field, medication or device field are complete; Ct represents the temporal consistency index, used to measure whether the sequence logic of events such as diagnosis, examination, medication, device use and follow-up is reasonable; Cn represents the terminology standardization index, used to measure whether the names of diseases, drugs, devices, test items, etc. have been mapped to a unified standard terminology system; Ca represents the desensitization compliance index, used to measure whether patient identity de-identification and sensitive information cleanup have been completed; wd1, wd2, wd3, and wd4 represent the weight parameters of structural integrity, temporal consistency, terminology standardization and desensitization compliance, respectively, and wd1+wd2+wd3+wd4=1.

[0107] Step S170: The medical data is graded according to the comprehensive usability score Qd: High-usability samples (Qd is higher than the first preset threshold) directly enter the subsequent knowledge extraction and evaluation process; repairable samples (Qd is between the first preset threshold and the second preset threshold) enter the correction queue, and are re-scored after completion and correction; low-usability samples (Qd is lower than the second preset threshold) are not included in the subsequent research process, and the reasons for unusability are recorded for manual review.

[0108] In some embodiments, wd1 ranges from 0.25 to 0.35, wd2 ranges from 0.20 to 0.30, wd3 ranges from 0.20 to 0.30, and wd4 ranges from 0.15 to 0.25. The specific values of each weight can be dynamically adjusted according to the compliance requirements of the research scenario.
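
Steps S160 and S170 can be sketched as follows. The default weights are one choice inside the ranges given above that sums to 1. The two thresholds (0.8 and 0.5) and the handling of scores exactly on a threshold are assumptions, since the application only refers to preset thresholds.

```python
def usability_score(cs, ct, cn, ca, weights=(0.30, 0.25, 0.25, 0.20)):
    """Qd = wd1*Cs + wd2*Ct + wd3*Cn + wd4*Ca, with the weights summing to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    wd1, wd2, wd3, wd4 = weights
    return wd1 * cs + wd2 * ct + wd3 * cn + wd4 * ca

def grade_sample(qd, first_threshold=0.8, second_threshold=0.5):
    """Grade a sample by Qd; threshold values here are illustrative assumptions."""
    if qd > first_threshold:
        return "high"
    if qd >= second_threshold:
        return "repairable"
    return "low"
```

For example, a record with complete structure and desensitization (Cs = Ca = 1) but only half-satisfied temporal and terminology checks (Ct = Cn = 0.5) scores 0.75 and falls into the repairable grade under these assumed thresholds.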

[0109] Step S2: Extract medical entities and entity relationships based on the standardized research samples, calculate and screen candidate relationship edges based on the medical ontology edge confidence function, and construct a medical ontology knowledge graph.

[0110] The specific implementation process of this step includes the following sub-steps:

[0111] Step S210 involves segmenting, paragraphing, and contextually slicing the medical records, discharge summaries, progress notes, laboratory test results, examination reports, medication instructions, medical device usage records, and medical literature in the standardized research sample into medical text units to be parsed. A sliding window mechanism is used during slicing, with the window size dynamically adjusted according to the text type to ensure the complete preservation of entity relationship context.
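
The sliding window slicing in step S210 can be illustrated with a short sketch. It assumes the text has already been split into sentences. The window size and stride are plain parameters here, and choosing them per text type is left to the caller.

```python
def slide_windows(sentences, window_size, stride):
    """Cut a sentence list into overlapping windows so cross-sentence context is kept."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    if not sentences:
        return []
    windows = []
    start = 0
    while True:
        windows.append(sentences[start:start + window_size])
        if start + window_size >= len(sentences):
            break  # the last window already reaches the end of the text
        start += stride
    return windows
```

With a stride smaller than the window size, adjacent windows overlap, so an entity pair that straddles a window boundary still appears together in at least one text unit.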

[0112] Step S220: Entity recognition is performed on the medical text unit to be parsed using a medical pre-trained model and sequence labeling network. One or more medical entities are extracted from diseases, symptoms, signs, drugs, medical devices, test indicators, adverse reactions, indications, and contraindications. The medical entities are then mapped to a unified terminology database to complete entity standardization.

[0113] Step S230: Using one or more of the following as candidate relation edges: co-existing entity pairs in the same sentence, adjacent entity pairs across sentences, and co-existing entity pairs within the same patient's time window, perform relation identification on the candidate relation edges to obtain one or more candidate relations among treatment relations, concurrent relations, risk relations, adaptation relations, joint relations, and time-related relations between entities.

[0114] Step S240: Calculate the comprehensive confidence of candidate relation edges based on the medical ontology edge confidence function. The medical ontology edge confidence function is:

[0115] Re(i,j) = arSe(i,j) + brSr(i,j) + crSt(i,j) + drSk(i,j)

[0116] Where Re(i,j) represents the overall confidence of the candidate relation edge between entity i and entity j; Se(i,j) represents the semantic similarity score of the entity context, used to measure the strength of the association between two entities in the text context; Sr(i,j) represents the relation probability score output by the relation recognition model, used to represent the credibility of the model in believing that two entities have a target relation; St(i,j) represents the temporal co-occurrence score, used to measure whether two entities co-occur in the same clinical process, the same time window, or in a reasonable sequence of logic; Sk(i,j) represents the knowledge constraint matching score, used to measure whether the candidate relation satisfies the type constraint, semantic constraint, and logical constraint in the ontology; ar, br, cr, and dr represent the weight parameters of semantic similarity, relation probability, temporal co-occurrence, and knowledge constraint, respectively, and the sum of each weight is 1.

[0117] Step S250: Candidate relation edges are filtered based on the comparison result between Re(i,j) and a preset threshold. When Re(i,j) is greater than the preset threshold, the relationship between entity i and entity j is written into the medical ontology knowledge graph. When Re(i,j) is less than or equal to the preset threshold, the candidate relation edge is retained as a low-confidence candidate relation or sent to the manual review queue. The source text, frequency of occurrence, time range, and confidence level information of the entity nodes and relation edges written into the medical ontology knowledge graph are recorded to facilitate subsequent tracing, updating, and manual verification.
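
Steps S240 and S250 can be sketched as a weighted sum followed by a threshold split. The weight values, the threshold of 0.6, and the `CandidateEdge` structure are assumptions for illustration. The application only requires that the four weights sum to 1.

```python
from dataclasses import dataclass

@dataclass
class CandidateEdge:
    head: str
    tail: str
    relation: str
    se: float  # context semantic similarity score Se(i,j)
    sr: float  # relation probability score Sr(i,j)
    st: float  # temporal co-occurrence score St(i,j)
    sk: float  # knowledge constraint matching score Sk(i,j)

def edge_confidence(edge, weights=(0.25, 0.35, 0.20, 0.20)):
    """Re(i,j) = ar*Se + br*Sr + cr*St + dr*Sk, with the weights summing to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ar, br, cr, dr = weights
    return ar * edge.se + br * edge.sr + cr * edge.st + dr * edge.sk

def partition_edges(edges, threshold=0.6):
    """Split edges into those written to the graph and those held for review."""
    accepted, review = [], []
    for edge in edges:
        target = accepted if edge_confidence(edge) > threshold else review
        target.append(edge)
    return accepted, review
```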

[0118] Step S3: Perform initial prediction annotation on the medical text based on the medical ontology knowledge graph, calculate a manual correction priority score for each text fragment based on the annotation sample priority function, and send the text fragments whose priority score is higher than the preset threshold to the manual correction queue for correction to obtain the optimized annotation result.

[0119] The specific implementation process of this step includes the following sub-steps:

[0120] Step S310: Segment one or more medical texts from electronic medical records, discharge summaries, laboratory test results, examination reports, adverse event records, and follow-up texts to form sentence-level segments or short segment-level segments.

[0121] Step S320: Use a pre-trained medical language model to perform initial prediction on the text fragment to obtain the corresponding entity recognition results, relation recognition results, and term mapping results.

[0122] Step S330: The initial prediction result is checked for consistency with the medical ontology knowledge graph constructed in step S2 to identify one or more of entity type conflicts, relationship direction conflicts, terminology mapping conflicts, and ontology constraint conflicts, providing input for subsequent manual correction priority scoring.

[0123] Step S340: Calculate the manual correction priority score for each text segment based on the annotation sample priority function. The annotation sample priority function is:

[0124] Pa(x) = auU(x) + avV(x) + agG(x) - ahH(x)

[0125] Wherein, Pa(x) represents the priority score of manual correction for text fragment x; U(x) represents the prediction uncertainty of the model for text fragment x, reflecting the degree of ambiguity in the current model's judgment on this sample, such as unstable entity boundaries, similar category probabilities, or uncertain relationship judgment; V(x) represents the medical information density of text fragment x, reflecting whether it simultaneously contains multiple high-value information objects such as diseases, drugs, medical devices, test indicators, operational behaviors, and adverse events; G(x) represents the knowledge graph conflict degree, reflecting the degree of inconsistency between the automatic prediction result of the current sample and existing medical ontology knowledge; H(x) represents the historical annotation duplication degree, reflecting the similarity between the current text fragment and existing labeled samples. The higher the similarity, the lower the marginal value of adding manual annotations, so it is subtracted in the formula; au, av, ag, and ah represent the weight parameters corresponding to prediction uncertainty, medical information density, knowledge graph conflict degree, and historical annotation duplication degree, respectively.

[0126] Step S350: Sort the text segments in descending order according to the manual correction priority score Pa(x), and send the text segments with a priority higher than the preset threshold to the manual correction queue.
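
Steps S340 and S350 can be sketched as follows. The weight values, the threshold of 0.4, and the fragment dictionary layout are assumptions for the example. Note that the duplication term H(x) enters with a negative sign, as in the function above.

```python
def correction_priority(u, v, g, h, weights=(0.4, 0.2, 0.3, 0.1)):
    """Pa(x) = au*U(x) + av*V(x) + ag*G(x) - ah*H(x)."""
    au, av, ag, ah = weights
    return au * u + av * v + ag * g - ah * h

def build_correction_queue(fragments, threshold=0.4):
    """Return fragment ids above the threshold, highest priority first.

    fragments: list of dicts like {"id": ..., "features": {"u":, "v":, "g":, "h":}}
    (a hypothetical layout).
    """
    scored = [(correction_priority(**f["features"]), f["id"]) for f in fragments]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [fragment_id for score, fragment_id in scored if score > threshold]
```

A fragment that closely duplicates already-labeled samples (high H) is pushed down the queue even when the model is somewhat uncertain about it, which matches the marginal-value argument given for the subtraction.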

[0127] Step S360: Annotators correct the text fragments in the manual correction queue to obtain the corrected entity boundaries, entity categories, relation types, and term mapping results. Annotators only need to correct the system's pre-annotated results rather than annotate from scratch, which reduces the annotation workload.

[0128] Step S370: The corrected high-value text fragments are fed back into the training set for incremental training, and the entity aliases, relation edges, and terminology mapping rules in the medical ontology knowledge graph are updated at the same time, so that the quality of subsequent automatic annotation continues to improve.

[0129] Step S4: Based on the medical ontology knowledge graph and the optimized annotation results, the patient's disease information, laboratory test information, drug use information, medical device use information, adverse event information and follow-up information are fused through the patient comprehensive clinical representation function to form a patient comprehensive clinical representation vector and complete the patient stratification.

[0130] The specific implementation process of this step includes the following sub-steps:

[0131] Step S410: Based on the patient master index, diseases, symptoms, signs, surgeries, drugs, devices and related clinical objects in the medical ontology knowledge graph and annotation results are grouped by patient to form a medical entity feature set, which is then encoded as a medical entity feature vector Ep.

[0132] Step S420: Construct a patient-level event sequence based on one or more of the following: admission time, examination time, medication time, device usage time, re-examination time, and follow-up time. Extract one or more of the following: event sequence, event interval, duration, trend of change, and follow-up rhythm to form a time series feature vector Tp.

[0133] Step S430: Extract key indicators from one or more data sources, including the testing system, imaging reports, and pathology results; perform unit unification, outlier cleanup, and result coding on the key indicators to form a test and examination feature vector Lp.

[0134] Step S440: Extract the drug administration route, dosage changes, combined drug regimen, device model, device usage method and duration from one or more of the prescription record, medical order record and device usage record to form a drug-device intervention feature vector Mp.

[0135] Step S450: Extract the type, severity, occurrence time and recovery status of adverse reactions from one or more of the following: adverse event records, abnormal test changes, reasons for discontinuation of medication or medical devices, and readmission records, to form an adverse event and safety feature vector Ap.

[0136] Step S460: Ep, Tp, Lp, Mp, and Ap are fused based on the patient comprehensive clinical representation function. The patient comprehensive clinical representation function is:

[0137] Fp = bp1Ep + bp2Tp + bp3Lp + bp4Mp + bp5Ap

[0138] Wherein, Fp represents the comprehensive clinical representation vector of patient p, which is a unified structured expression of all key medical information of the patient within a certain observation window; bp1, bp2, bp3, bp4, and bp5 represent the fusion weight parameters of Ep, Tp, Lp, Mp, and Ap, respectively, which are used to reflect the importance of various features in the current research task.

[0139] Step S470: Normalize and align the dimensions of Ep, Tp, Lp, Mp and Ap, and perform similarity comparison, rule grouping or feature clustering of patients based on the comprehensive clinical representation vector Fp to complete patient stratification, and output the main characteristics and sources of difference of each patient subgroup, providing a stratification basis for subsequent comparison of drug and device efficacy and safety analysis.
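
A minimal sketch of the fusion in steps S460 and S470 follows. The weighted sum Fp only makes sense once the five vectors share one dimension, so the sketch first aligns each vector and then L2-normalizes it. Zero-padding or truncating as the alignment strategy, equal weights, and cosine similarity as the comparison measure are all assumptions for illustration.

```python
import math

def align(vec, dim):
    """Zero-pad or truncate a vector to dim entries (assumed alignment strategy)."""
    return (list(vec) + [0.0] * dim)[:dim]

def l2_normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fuse_patient_vector(features, weights, dim):
    """Fp = bp1*Ep + bp2*Tp + bp3*Lp + bp4*Mp + bp5*Ap over aligned, normalized vectors."""
    fused = [0.0] * dim
    for name, weight in weights.items():
        vec = l2_normalize(align(features[name], dim))
        for k in range(dim):
            fused[k] += weight * vec[k]
    return fused

def cosine_similarity(a, b):
    """Cosine similarity between two patient vectors, usable for grouping patients."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0
```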

[0140] Step S5: Based on the patient's comprehensive clinical representation vector and patient stratification results, the efficacy score, safety risk score, suitable population matching score, and R&D guidance gain score of the target drug and device are comprehensively calculated using the drug and device real-world comprehensive evaluation index function to obtain a comprehensive evaluation result, and risk signal identification is performed based on the comprehensive evaluation result.

[0141] The specific implementation process of this step includes the following sub-steps:

[0142] Step S510: Based on the patient's comprehensive clinical representation vector Fp, patients who receive the same drug or device or the same drug or device regimen are grouped into the target study subjects, and patients are subdivided according to one or more of the following: age, underlying disease, combined treatment method, device usage method and follow-up period.

[0143] Step S520: For the target drug / device d, calculate the real-world efficacy score Ef(d) of the target drug / device d based on one or more of the following: symptom improvement, changes in laboratory indicators, improvement in imaging results, decrease in recurrence rate, change in readmission rate, and improvement in follow-up outcome.

[0144] Step S530: For the target drug / device d, calculate the safety risk score Rs(d) of the target drug / device d based on one or more of the following: adverse event incidence, proportion of serious adverse events, reasons for discontinuation of drug / device, abnormal test results, and device-related complications.

[0145] Step S540: Compare the real-world performance of each patient subgroup with the characteristics of the target population, and calculate the population matching score Ad(d) of the target drug / device d in a specific age group, comorbidity group, risk stratification group, or combination therapy scenario.

[0146] Step S550: Based on the degree of support for one or more of the following: indication optimization, trial inclusion and exclusion criteria adjustment, endpoint design adjustment, combination therapy optimization, and post-marketing re-evaluation, calculate the R&D guidance gain score Pg(d) for the target drug / device d.

[0147] Step S560: Ef(d), Rs(d), Ad(d), and Pg(d) are comprehensively calculated based on the real-world comprehensive evaluation index function for pharmaceuticals and medical devices. The real-world comprehensive evaluation index function for pharmaceuticals and medical devices is:

[0148] Ce(d) = ce1Ef(d) - ce2Rs(d) + ce3Ad(d) + ce4Pg(d)

[0149] Where Ce(d) represents the real-world comprehensive evaluation index of the target drug / device d; ce1, ce2, ce3, and ce4 represent the weight parameters corresponding to efficacy, safety risk, population matching, and R&D guidance gain, respectively, used to control the degree of influence of each sub-item in the comprehensive evaluation. The higher Ce(d), the greater the overall benefit of the drug / device, the stronger its applicability, the higher its R&D reference value, and the relatively controllable risk.

[0150] Step S570: Establish risk monitoring rules based on the comprehensive evaluation index Ce(d) and its sub-items: when Rs(d) is higher than the preset risk threshold, trigger a safety risk warning; when Ad(d) is lower than the preset matching threshold in a specific subgroup, trigger a population adaptation review; when Ef(d) fluctuates abnormally within a preset time window, trigger an abnormal efficacy tracing; and output an explanation of the source of the abnormality to provide targeted support for clinical decision-making and drug and device research and development optimization.
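
Steps S560 and S570 can be sketched as follows. The weights and all three thresholds are illustrative assumptions. In particular, "abnormal fluctuation" of Ef(d) is interpreted here as the range of scores within the window exceeding a threshold, which is only one possible reading.

```python
def comprehensive_index(ef, rs, ad, pg, weights=(0.4, 0.3, 0.2, 0.1)):
    """Ce(d) = ce1*Ef(d) - ce2*Rs(d) + ce3*Ad(d) + ce4*Pg(d)."""
    ce1, ce2, ce3, ce4 = weights
    return ce1 * ef - ce2 * rs + ce3 * ad + ce4 * pg

def risk_signals(rs, subgroup_ad, ef_history,
                 risk_threshold=0.5, match_threshold=0.4, fluctuation_threshold=0.2):
    """Apply the three monitoring rules and return the triggered signals in order."""
    signals = []
    if rs > risk_threshold:
        signals.append("safety_risk_warning")
    for subgroup, ad in subgroup_ad.items():
        if ad < match_threshold:
            signals.append(f"population_adaptation_review:{subgroup}")
    # Assumed fluctuation test: spread of efficacy scores within the time window.
    if ef_history and max(ef_history) - min(ef_history) > fluctuation_threshold:
        signals.append("abnormal_efficacy_tracing")
    return signals
```

Because the safety risk term is subtracted, a drug or device with strong efficacy can still receive a modest Ce(d) when Rs(d) is high, and the sub-item rules flag that case separately from the aggregate index.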

[0151] Step S6: The abnormal evaluation results and risk signals obtained in step S5, as well as the manual correction information in step S3, are fed back to the medical ontology knowledge graph in step S2 and the intelligent annotation model in step S3. The update effect is evaluated based on the system iterative gain function, forming a closed-loop iterative optimization mechanism.

[0152] The specific implementation process of this step includes the following sub-steps:

[0153] Step S610: Establish a feedback collection channel to summarize the abnormal evaluation results, risk signals, low-matching subgroup results, and manual review and correction records generated in step S5, and classify them into one or more of the following categories: terminology issues, relationship issues, model recognition issues, and evaluation stability issues.

[0154] Step S620: When the feedback information involves a new drug name, a new medical device alias, or a new abbreviation, the feedback information is sent to the terminology mapping and alias extension module to update the terminology mapping relationship and entity alias set in the medical ontology knowledge graph.

[0155] Step S630: When the feedback information involves incorrect relationship direction, missing relationship type, or incomplete ontology rules, the feedback information is sent to the graph revision module to correct the relationship edges, ontology constraints, and logical rules in the medical ontology knowledge graph.

[0156] Step S640: When the feedback information involves entity omission, relationship misjudgment, or terminology normalization error, the feedback information is written into the incremental training sample pool, and the intelligent annotation model is incrementally trained or fine-tuned with small steps based on the incremental training sample pool.

[0157] Step S650: During the knowledge graph update process, a candidate update set is first formed, and consistency verification and local manual confirmation are performed on it. Only the updates that pass verification are then written into the formal medical ontology knowledge graph, which prevents erroneous feedback from directly polluting the knowledge base.
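
The routing described in steps S610 to S640 can be sketched as a lookup from feedback category to target module. The category labels and module names below are hypothetical. Since the steps above do not name a destination for evaluation stability issues, the sketch returns such items separately instead of guessing one.

```python
# Hypothetical category labels and module names, following steps S620 to S640.
ROUTES = {
    "terminology": "terminology_mapping_and_alias_extension",
    "relationship": "graph_revision",
    "model_recognition": "incremental_training_sample_pool",
}

def route_feedback(items):
    """Route (category, payload) feedback items to their target modules.

    Returns (routed, unrouted): routed maps a module name to its payloads,
    and unrouted collects items whose category has no route defined here.
    """
    routed = {module: [] for module in ROUTES.values()}
    unrouted = []
    for category, payload in items:
        module = ROUTES.get(category)
        if module is None:
            unrouted.append((category, payload))
        else:
            routed[module].append(payload)
    return routed, unrouted
```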

[0158] Step S660: After completing the knowledge graph update and intelligent annotation model update, the effect of this round of feedback update is evaluated based on the system iterative gain function. The system iterative gain function is:

[0159] Gi(t) = gs1Da(t) + gs2Ga(t) + gs3Ra(t) + gs4Sa(t)

[0160] Where Gi(t) represents the overall iterative gain of the system after the t-th feedback update; Da(t) represents the improvement in structured extraction accuracy, reflecting the degree of improvement in entity recognition, relation extraction, and terminology normalization capabilities compared to before the update; Ga(t) represents the improvement in knowledge graph consistency, reflecting improvements such as fewer conflicting edges, a higher logical constraint satisfaction rate, and more complete alias merging; Ra(t) represents the improvement in real-world evaluation stability, reflecting how much more consistent the overall evaluation results are across different batches and time periods; Sa(t) represents the improvement in safety risk identification sensitivity, reflecting the degree of enhancement in the system's ability to detect potential adverse reactions and abnormal usage patterns; gs1, gs2, gs3, and gs4 represent the weight parameters corresponding to structured extraction accuracy, knowledge graph consistency, evaluation stability, and risk identification sensitivity, respectively.

[0161] Step S670: Determine whether the current round of feedback update is effective based on the system's comprehensive iterative gain Gi(t): When Gi(t) is greater than the preset gain threshold, retain the update result of this round and retain the version record and difference comparison; when Gi(t) is less than or equal to the preset gain threshold, modify, roll back or limit the expansion range of the update strategy to prevent invalid updates from reducing the overall system performance.
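
Steps S660 and S670 can be sketched as follows. The weights and the gain threshold of 0 are assumptions. The four inputs are taken to be signed improvements relative to the state before the update, so a negative value means a metric got worse.

```python
def iterative_gain(da, ga, ra, sa, weights=(0.3, 0.3, 0.2, 0.2)):
    """Gi(t) = gs1*Da(t) + gs2*Ga(t) + gs3*Ra(t) + gs4*Sa(t)."""
    gs1, gs2, gs3, gs4 = weights
    return gs1 * da + gs2 * ga + gs3 * ra + gs4 * sa

def decide_update(gain, gain_threshold=0.0):
    """Retain the round when the gain exceeds the threshold, otherwise revise or roll back."""
    return "retain" if gain > gain_threshold else "revise_or_rollback"
```

A round that improves extraction accuracy but degrades evaluation stability by a larger weighted amount yields a non-positive gain and is not retained under this assumed threshold.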

[0162] Example 2

[0163] This embodiment provides a drug and medical device evaluation device based on knowledge graph annotation collaboration. As shown in Figure 2, the device includes:

[0164] The data preprocessing module is used to collect, desensitize, standardize, and screen the quality of multi-source medical data from hospitals. Based on the sample availability evaluation function Qd=wd1Cs+wd2Ct+wd3Cn+wd4Ca, it performs a comprehensive availability score on each piece of medical data and divides the samples into three categories: high availability, repairable, and low availability according to the score results, thus forming standardized research samples.

[0165] The graph construction module is used to extract medical entities and entity relationships based on the standardized research samples and, using the medical ontology edge confidence function

[0166] Re(i,j) = arSe(i,j) + brSr(i,j) + crSt(i,j) + drSk(i,j), to perform comprehensive confidence calculation and filtering on candidate relation edges, so as to construct a medical ontology knowledge graph containing entity nodes, relation edges, source information and confidence information.

[0167] The intelligent annotation module is used to perform initial prediction annotation on medical text based on the medical ontology knowledge graph. It scores and sorts text fragments by manual correction priority based on the annotation sample priority function Pa(x)=auU(x)+avV(x)+agG(x)-ahH(x), sends high-priority fragments to the manual correction queue, and feeds the corrected high-value text fragments back to the training set for incremental training. It also synchronously updates the entity aliases, relation edges, and term mapping rules in the medical ontology knowledge graph.

[0168] The patient representation module is used to form a comprehensive clinical representation vector for patients and complete patient stratification based on the medical ontology knowledge graph and the optimized annotation results, by fusing the patient's medical entity feature vector Ep, time series feature vector Tp, test and examination feature vector Lp, drug and medical device intervention feature vector Mp, and adverse event and safety feature vector Ap through the comprehensive clinical representation function Fp=bp1Ep+bp2Tp+bp3Lp+bp4Mp+bp5Ap.

[0169] The comprehensive evaluation module is used, based on the patient comprehensive clinical representation vector and the patient stratification results, to apply the real-world comprehensive evaluation index function for drugs and medical devices

[0170] Ce(d) = ce1Ef(d) - ce2Rs(d) + ce3Ad(d) + ce4Pg(d), so as to comprehensively evaluate the target drug or medical device, and to establish risk monitoring rules based on the sub-item scores in order to identify risk signals and verify population adaptability.

[0171] The feedback update module categorizes abnormal evaluation results, risk signals, and manual correction information into terminology issues, relationship issues, model recognition issues, and evaluation stability issues, and sends them to the graph construction module and the intelligent annotation module respectively to perform the corresponding updates. Based on the system iterative gain function

[0172] Gi(t) = gs1Da(t) + gs2Ga(t) + gs3Ra(t) + gs4Sa(t), it evaluates the effect of each round of updates, forming a quantifiable closed-loop iterative optimization mechanism.

[0173] Example 3

[0174] This embodiment provides an electronic device. As shown in Figure 3, the electronic device includes a processor, a memory, a communication interface, and a bus. The processor, memory, and communication interface are interconnected via the bus to communicate with each other. The processor can be a general-purpose processor, such as a central processing unit (CPU), or a dedicated processor, such as a graphics processing unit (GPU). The memory is used to store programs and specifically includes random access memory (RAM) and read-only memory (ROM). When the processor executes the computer program stored in the memory, it implements the drug and medical device evaluation method based on knowledge graph annotation collaboration as described in Embodiment 1.

[0175] Example 4

[0176] As shown in Figure 4, this application also provides a computer-readable storage medium storing a computer program thereon. When the computer program is executed by a processor, it implements the drug and medical device evaluation method based on knowledge graph annotation collaboration as described in Embodiment 1. The computer-readable storage medium can be a non-transitory computer-readable storage medium, such as a USB flash drive, a portable hard drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk, or any other medium capable of storing program code.

[0177] It should be noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments according to this application. As used herein, the singular form is intended to include the plural form as well, unless the context clearly indicates otherwise. Furthermore, it should be understood that when the terms "comprising" and / or "including" are used in this specification, they indicate the presence of features, steps, operations, devices, components, and / or combinations thereof.

[0178] It should be noted that the terms "first," "second," etc., used in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such terms can be used interchangeably where appropriate so that the embodiments of this application described herein can be implemented in sequences other than those illustrated or described herein.

[0179] Furthermore, the terms “comprising” and “having”, and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or apparatus that includes a series of steps or units is not necessarily limited to those steps or units that are explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to such process, method, product, or apparatus.

[0180] For ease of description, spatial relative terms such as "above," "on top of," "on the upper surface of," "above," etc., are used herein to describe the spatial positional relationship of a device or feature as shown in the figures to other devices or features. It should be understood that spatial relative terms are intended to encompass different orientations in use or operation beyond the orientation of the device as described in the figures. For example, if the device in the figures were inverted, a device described as "above" or "on top of" other devices or structures would subsequently be positioned as "below" or "under" other devices or structures. Thus, the exemplary term "above" can include both "above" and "below." The device may also be positioned in other different ways, such as rotated 90 degrees or in other orientations, and the spatial relative descriptions used herein will be interpreted accordingly.

[0181] In the detailed description above, reference has been made to the accompanying drawings, which form part of this document. In the drawings, similar symbols typically identify similar parts unless the context indicates otherwise. The illustrated embodiments described in the detailed specification, drawings, and claims are not intended to be limiting. Other embodiments may be used and other changes may be made without departing from the spirit or scope of the subject matter presented herein.

[0182] The above description is merely a preferred embodiment of the present invention and is not intended to limit the invention. Various modifications and variations can be made to the present invention by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

1. A drug-device evaluation method based on map annotation collaboration, characterized in that it includes the following steps:
S1. Collect, desensitize, and standardize multi-source medical data from the hospital and screen it for quality; based on the sample availability evaluation function, score the comprehensive availability of each piece of medical data to form standardized research samples;
S2. Based on the standardized research samples, extract medical entities and entity relationships, and calculate the comprehensive confidence of candidate relationship edges and screen them based on the medical ontology edge confidence function, to construct a medical ontology knowledge graph;
S3. Perform initial prediction annotation on the medical text based on the medical ontology knowledge graph, score the manual correction priority of the text fragments based on the annotation sample priority function, and send the text fragments with a priority higher than the preset threshold to the manual correction queue for correction, to obtain the optimized annotation results;
S4. Based on the medical ontology knowledge graph and the optimized annotation results, fuse the patient's disease information, laboratory test information, drug use information, medical device use information, adverse event information, and follow-up information through the patient comprehensive clinical representation function to form a patient comprehensive clinical representation vector, and complete patient stratification;
S5. Based on the patient comprehensive clinical representation vector and the patient stratification results, comprehensively calculate the efficacy score, safety risk score, suitable population matching score, and R&D guidance gain score of the target drug or device using the drug and device real-world comprehensive evaluation index function to obtain the comprehensive evaluation result, and perform risk signal identification based on the comprehensive evaluation result;
S6. Feed back the abnormal evaluation results and risk signals obtained in step S5 and the manual correction information from step S3 to the medical ontology knowledge graph of step S2 and the intelligent annotation model of step S3, and evaluate the update effect based on the system iterative gain function, to form a closed-loop iterative optimization mechanism.

2. The drug-device evaluation method based on map annotation collaboration according to claim 1, characterized in that step S1 includes:
S110. Collect multi-source medical data from hospital information systems, electronic medical record systems, laboratory systems, imaging report systems, drug usage record systems, medical device usage record systems, and adverse event record systems, and establish a unified data access layer;
S120. Perform patient master index mapping on the collected multi-source medical data, mapping one or more of the outpatient number, inpatient number, and examination number to the same research subject, so as to associate patients across systems;
S130. Perform terminology standardization on the collected multi-source medical data, construct a standard vocabulary of diseases, drugs, medical devices, test items, and operation behaviors, and map the original names to unified standard terms;
S140. Perform time axis alignment on the collected multi-source medical data, unify the format of the time fields in each system, and reconstruct the patient-level event chain to correct the order of diagnosis and treatment events, examination events, medication events, device use events, and follow-up events;
S150. Perform desensitization on the collected multi-source medical data, replacing sensitive information in structured fields according to preset rules and cleaning private information from free text by combining keyword matching and pattern recognition;
S160. Assess the quality of each piece of medical data based on the sample availability evaluation function, wherein the sample availability evaluation function is:
$Q_d = w_{d1} C_s + w_{d2} C_t + w_{d3} C_n + w_{d4} C_a$
where $Q_d$ represents the comprehensive availability score of a single piece of medical data, $C_s$ represents the structural integrity index, $C_t$ represents the time consistency index, $C_n$ represents the terminology standardization index, $C_a$ represents the desensitization compliance index, and $w_{d1}$, $w_{d2}$, $w_{d3}$, and $w_{d4}$ represent the weight parameters of structural integrity, time consistency, terminology standardization, and desensitization compliance, respectively, with $w_{d1} + w_{d2} + w_{d3} + w_{d4} = 1$;
S170. Classify and process the medical data based on the comprehensive availability score $Q_d$: high-availability samples directly enter the subsequent process, repairable samples enter the correction queue, and low-availability samples are temporarily excluded from the subsequent research process.
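
The scoring and triage in S160 and S170 can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the weight values, the two thresholds, and the names `QualityIndicators`, `availability_score`, and `triage` are assumptions, and each indicator is assumed to be normalized to [0, 1]. The claim itself only fixes the weighted-sum form and the constraint that the weights sum to 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityIndicators:
    """Per-record quality indicators, each assumed to lie in [0, 1]."""
    structural_integrity: float         # C_s
    time_consistency: float             # C_t
    terminology_standardization: float  # C_n
    desensitization_compliance: float   # C_a


# Hypothetical values: the claim only requires that the weights sum to 1.
WEIGHTS = (0.3, 0.2, 0.3, 0.2)   # w_d1 .. w_d4
HIGH_THRESHOLD = 0.8             # assumed cut-off for high-availability samples
REPAIR_THRESHOLD = 0.5           # assumed cut-off for repairable samples


def availability_score(ind, weights=WEIGHTS):
    """Q_d = w_d1*C_s + w_d2*C_t + w_d3*C_n + w_d4*C_a."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    values = (ind.structural_integrity, ind.time_consistency,
              ind.terminology_standardization, ind.desensitization_compliance)
    return sum(w * v for w, v in zip(weights, values))


def triage(score):
    """Route a record as in S170, using the assumed thresholds."""
    if score >= HIGH_THRESHOLD:
        return "high"        # enters the subsequent process directly
    if score >= REPAIR_THRESHOLD:
        return "repairable"  # enters the correction queue
    return "low"             # temporarily excluded
```

A record would then be routed with `triage(availability_score(indicators))`.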

3. The drug-device evaluation method based on map annotation collaboration according to claim 1, characterized in that step S2 includes:
S210. Split the medical record texts, test instructions, examination reports, drug instructions, and device usage records in the standardized research samples into paragraphs and sentences and slice them by context, to form the medical text units to be parsed;
S220. Use a medical pre-trained model and a sequence labeling network to perform entity recognition on the medical text units to be parsed, extract one or more kinds of medical entities among diseases, symptoms, drugs, devices, test indicators, adverse reactions, indications, and contraindications, and map the medical entities to a unified terminology database;
S230. Taking one or more of the following as candidate relation edges: entity pairs co-occurring in the same sentence, adjacent entity pairs across sentences, and entity pairs co-occurring within the same patient's time window, perform relation identification to obtain one or more candidate relations among treatment relations, concurrent relations, risk relations, adaptive relations, joint relations, and time-related relations between entities;
S240. Calculate the comprehensive confidence of the candidate relation edges based on the medical ontology edge confidence function, wherein the medical ontology edge confidence function is:
$R_e(i,j) = a_r S_e(i,j) + b_r S_r(i,j) + c_r S_t(i,j) + d_r S_k(i,j)$
where $R_e(i,j)$ represents the comprehensive confidence of the candidate relation edge between entity $i$ and entity $j$, $S_e(i,j)$ represents the semantic similarity score of the entity context, $S_r(i,j)$ represents the relation probability score output by the relation recognition model, $S_t(i,j)$ represents the temporal co-occurrence score, $S_k(i,j)$ represents the knowledge constraint matching score, and $a_r$, $b_r$, $c_r$, and $d_r$ represent the weight parameters of semantic similarity, relation probability, temporal co-occurrence, and knowledge constraint, respectively;
S250. When $R_e(i,j)$ is greater than the preset threshold, write the relationship between entity $i$ and entity $j$ into the medical ontology knowledge graph; when $R_e(i,j)$ is less than or equal to the preset threshold, retain the candidate relation edge as a low-confidence candidate relation or send it to the manual review queue.
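
The thresholding in S240 and S250 can be illustrated with a short sketch. The weights, the threshold of 0.7, the dictionary used as a stand-in for the knowledge graph, and the names `EdgeScores`, `edge_confidence`, and `route_edge` are all assumptions made for illustration. Of the two options S250 allows for a low-confidence edge, the sketch takes the manual review queue.

```python
from typing import NamedTuple


class EdgeScores(NamedTuple):
    semantic: float       # S_e(i, j): context semantic similarity
    relation_prob: float  # S_r(i, j): relation model probability
    temporal: float       # S_t(i, j): temporal co-occurrence
    knowledge: float      # S_k(i, j): knowledge constraint match


def edge_confidence(scores, weights=(0.25, 0.4, 0.15, 0.2)):
    """R_e(i,j) = a_r*S_e + b_r*S_r + c_r*S_t + d_r*S_k (weights are hypothetical)."""
    a_r, b_r, c_r, d_r = weights
    return (a_r * scores.semantic + b_r * scores.relation_prob
            + c_r * scores.temporal + d_r * scores.knowledge)


def route_edge(graph, review_queue, i, j, relation, scores, threshold=0.7):
    """Write the edge into the graph when R_e exceeds the assumed threshold;
    otherwise hand it to the manual review queue."""
    confidence = edge_confidence(scores)
    if confidence > threshold:
        graph[(i, j)] = (relation, confidence)
    else:
        review_queue.append((i, j, relation, confidence))
    return confidence


graph, review_queue = {}, []
route_edge(graph, review_queue, "drug_A", "disease_B", "treatment",
           EdgeScores(0.9, 0.9, 0.8, 1.0))
route_edge(graph, review_queue, "drug_A", "symptom_C", "risk",
           EdgeScores(0.2, 0.3, 0.1, 0.0))
```

With these inputs the first edge lands in `graph` and the second in `review_queue`.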

4. The drug-device evaluation method based on map annotation collaboration according to claim 1, characterized in that step S3 includes:
S310. Segment the electronic medical records, test instructions, examination reports, adverse event records, and follow-up texts into sentence-level or short-paragraph-level text fragments;
S320. Use a pre-trained medical language model to perform initial prediction on the text fragments to obtain entity recognition results, relation recognition results, and term mapping results;
S330. Verify the consistency of the initial prediction results against the medical ontology knowledge graph to identify one or more of the following: entity type conflicts, relation direction conflicts, terminology mapping conflicts, and ontology constraint conflicts;
S340. Calculate the priority score of each text fragment based on the annotation sample priority function, wherein the annotation sample priority function is:
$P_a(x) = a_u U(x) + a_v V(x) + a_g G(x) - a_h H(x)$
where $P_a(x)$ represents the manual correction priority score of text fragment $x$, $U(x)$ represents the prediction uncertainty of the model for text fragment $x$, $V(x)$ represents the medical information density of text fragment $x$, $G(x)$ represents the knowledge graph conflict degree, $H(x)$ represents the historical annotation repetition degree, and $a_u$, $a_v$, $a_g$, and $a_h$ represent the weight parameters corresponding to prediction uncertainty, medical information density, knowledge graph conflict degree, and historical annotation repetition degree, respectively;
S350. Sort the text fragments by the manual correction priority score $P_a(x)$, and send the text fragments with a priority higher than the preset threshold to the manual correction queue;
S360. Have annotators correct the text fragments in the manual correction queue to obtain the corrected entity boundaries, entity categories, relation types, and term mapping results;
S370. Feed the corrected high-value text fragments back into the training set for incremental training, and simultaneously update the entity aliases, relation edges, and terminology mapping rules in the medical ontology knowledge graph.
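
As a rough illustration of S340 and S350, the sketch below scores text fragments and builds the manual correction queue. The weights, the threshold of 0.5, the feature values, and the names `priority_score` and `build_correction_queue` are hypothetical. The subtraction of the repetition term follows the formula in the claim, so fragments resembling already-annotated text are ranked lower.

```python
def priority_score(u, v, g, h, weights=(0.4, 0.3, 0.3, 0.2)):
    """P_a(x) = a_u*U(x) + a_v*V(x) + a_g*G(x) - a_h*H(x) (weights are hypothetical)."""
    a_u, a_v, a_g, a_h = weights
    return a_u * u + a_v * v + a_g * g - a_h * h


def build_correction_queue(fragments, threshold=0.5):
    """Return fragment ids whose priority exceeds the assumed threshold,
    ordered from highest to lowest priority."""
    scored = [(priority_score(**features), fragment_id)
              for fragment_id, features in fragments.items()]
    scored.sort(reverse=True)
    return [fragment_id for score, fragment_id in scored if score > threshold]


# Illustrative feature values only: u = uncertainty, v = information density,
# g = knowledge graph conflict degree, h = historical repetition degree.
fragments = {
    "frag-1": {"u": 0.9, "v": 0.8, "g": 0.7, "h": 0.1},
    "frag-2": {"u": 0.1, "v": 0.2, "g": 0.0, "h": 0.9},
    "frag-3": {"u": 0.6, "v": 0.6, "g": 0.5, "h": 0.0},
}
queue = build_correction_queue(fragments)
```

Here `frag-2` is confident, sparse, and highly repetitive, so it stays out of the queue.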

5. The drug-device evaluation method based on map annotation collaboration according to claim 1, characterized in that step S4 includes:
S410. Using the patient master index, group by patient the diseases, symptoms, signs, drugs, devices, and related clinical objects found in the medical ontology knowledge graph and the annotation results, and encode them as the medical entity feature vector $E_p$;
S420. Construct a patient-level event sequence based on one or more of the following: admission time, medication time, device usage time, follow-up time, and re-examination time, and extract the event order, event intervals, and change trends to form the time series feature vector $T_p$;
S430. Extract key indicators from the testing system, imaging reports, and pathology results, and perform unit unification, outlier cleaning, and result coding to form the test and examination feature vector $L_p$;
S440. Extract the drug administration route, dosage changes, combined drug regimens, device models, and durations of use from prescription records, medical order records, and device usage records to form the drug-device intervention feature vector $M_p$;
S450. Extract the type, severity, and occurrence time of adverse reactions from adverse event records and readmission records to form the adverse event and safety feature vector $A_p$;
S460. Fuse $E_p$, $T_p$, $L_p$, $M_p$, and $A_p$ based on the patient comprehensive clinical representation function, wherein the patient comprehensive clinical representation function is:
$F_p = b_{p1} E_p + b_{p2} T_p + b_{p3} L_p + b_{p4} M_p + b_{p5} A_p$
where $F_p$ represents the comprehensive clinical representation vector of patient $p$, and $b_{p1}$, $b_{p2}$, $b_{p3}$, $b_{p4}$, and $b_{p5}$ represent the fusion weight parameters of $E_p$, $T_p$, $L_p$, $M_p$, and $A_p$, respectively;
S470. Based on the comprehensive clinical representation vector $F_p$, perform similarity comparison, rule-based grouping, or feature clustering on patients to complete patient stratification, and output the main features and sources of difference of each patient subgroup.
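
The fusion in S460 is a weighted element-wise sum, which only makes sense when the five feature vectors share one dimension. The claim does not say how that is achieved, so the sketch below simply assumes each vector has already been projected to a common length. The names `fuse_representation` and `cosine_similarity` are hypothetical, and cosine similarity is shown only as one possible choice for the similarity comparison mentioned in S470.

```python
import math


def fuse_representation(vectors, weights):
    """F_p = b_p1*E_p + b_p2*T_p + b_p3*L_p + b_p4*M_p + b_p5*A_p.

    `vectors` is [E_p, T_p, L_p, M_p, A_p], assumed to share one dimension.
    """
    if len(vectors) != len(weights):
        raise ValueError("need exactly one weight per feature vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("feature vectors must share one dimension")
    return [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]


def cosine_similarity(a, b):
    """One possible similarity measure between two patients' F_p vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy two-dimensional vectors with equal, hypothetical fusion weights.
patient_vector = fuse_representation(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 2.0]],
    [0.2, 0.2, 0.2, 0.2, 0.2],
)
```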

6. The drug-device evaluation method based on map annotation collaboration according to claim 1, characterized in that step S5 includes:
S510. Based on the patient comprehensive clinical representation vector $F_p$, group patients who received the same drug, the same device, or the same drug or device regimen as the target study subjects, and subdivide them according to one or more of the following: age, underlying disease, combined treatment method, device usage method, and follow-up period;
S520. For the target drug or device $d$, calculate the real-world efficacy score $E_f(d)$ based on one or more of the following: symptom improvement, changes in laboratory indicators, improvement in imaging results, decrease in recurrence rate, and improvement in follow-up outcomes;
S530. For the target drug or device $d$, calculate the safety risk score $R_s(d)$ based on one or more of the following: adverse event incidence, proportion of serious adverse events, reasons for discontinuing the drug or device, abnormal test results, and device-related complications;
S540. Compare the real-world performance of each patient subgroup with the characteristics of the target population and calculate the population matching score $A_d(d)$;
S550. Calculate the R&D guidance gain score $P_g(d)$ based on the degree to which the real-world evaluation results support one or more of the following: indication optimization, adjustment of trial inclusion and exclusion criteria, and combined protocol optimization;
S560. Comprehensively calculate $E_f(d)$, $R_s(d)$, $A_d(d)$, and $P_g(d)$ based on the drug and device real-world comprehensive evaluation index function, wherein the drug and device real-world comprehensive evaluation index function is:
$C_e(d) = c_{e1} E_f(d) - c_{e2} R_s(d) + c_{e3} A_d(d) + c_{e4} P_g(d)$
where $C_e(d)$ represents the real-world comprehensive evaluation index of the target drug or device $d$, and $c_{e1}$, $c_{e2}$, $c_{e3}$, and $c_{e4}$ represent the weight parameters corresponding to efficacy, safety risk, population matching, and R&D guidance gain, respectively;
S570. When $R_s(d)$ is higher than the preset risk threshold, $A_d(d)$ is lower than the preset matching threshold in a specific subgroup, or $E_f(d)$ fluctuates abnormally within a preset time window, trigger risk signal identification and population suitability review, and output an explanation of the source of the anomaly.
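
The index in S560 and the three triggers in S570 can be sketched as below. Every numeric value is a placeholder, and the names `Scores`, `composite_index`, and `risk_signals` are hypothetical. The claim does not define what counts as an abnormal fluctuation of the efficacy score, so the sketch assumes a simple rule: the range of the score within the window exceeds a fixed swing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    efficacy: float          # E_f(d)
    safety_risk: float       # R_s(d)
    population_match: float  # A_d(d)
    rd_gain: float           # P_g(d)


def composite_index(s, weights=(0.4, 0.3, 0.2, 0.1)):
    """C_e(d) = c_e1*E_f - c_e2*R_s + c_e3*A_d + c_e4*P_g (weights are hypothetical)."""
    c1, c2, c3, c4 = weights
    return (c1 * s.efficacy - c2 * s.safety_risk
            + c3 * s.population_match + c4 * s.rd_gain)


def risk_signals(s, subgroup_match, efficacy_window,
                 risk_threshold=0.6, match_threshold=0.4, max_swing=0.3):
    """Return a list of triggered signals; all thresholds are assumed values."""
    signals = []
    if s.safety_risk > risk_threshold:
        signals.append("safety risk above threshold")
    for subgroup, match in subgroup_match.items():
        if match < match_threshold:
            signals.append(f"low population match in subgroup {subgroup}")
    # Assumed fluctuation rule: range of E_f within the time window.
    if efficacy_window and max(efficacy_window) - min(efficacy_window) > max_swing:
        signals.append("abnormal efficacy fluctuation")
    return signals
```

Because the safety term enters with a negative sign, a drug or device with high efficacy can still receive a low index if its safety risk score is high.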

7. The drug-device evaluation method based on map annotation collaboration according to claim 1, characterized in that step S6 includes:
S610. Establish a feedback collection channel to gather the abnormal evaluation results, risk signals, low-matching subgroup results, and manual review and correction records generated in step S5, and classify them as terminology issues, relationship issues, model recognition issues, or evaluation stability issues;
S620. When the feedback information involves new drug names, new medical device aliases, or new abbreviations, update the terminology mapping relationships and entity alias sets in the medical ontology knowledge graph;
S630. When the feedback information involves an incorrect relation direction, a missing relation type, or incomplete ontology rules, correct the relation edges, ontology constraints, and logical rules in the medical ontology knowledge graph;
S640. When the feedback information involves entity omission, relationship misjudgment, or terminology normalization error, write the feedback information into the incremental training sample pool and incrementally train the intelligent annotation model;
S650. First form a candidate update set and perform consistency verification and local manual confirmation on it, then write the updated content that passes verification into the formal medical ontology knowledge graph;
S660. Evaluate the effect of this round of feedback update based on the system iterative gain function, wherein the system iterative gain function is:
$G_i(t) = g_{s1} D_a(t) + g_{s2} G_a(t) + g_{s3} R_a(t) + g_{s4} S_a(t)$
where $G_i(t)$ represents the system's overall iterative gain after the $t$-th feedback update, $D_a(t)$ represents the improvement in structured extraction accuracy, $G_a(t)$ represents the improvement in knowledge graph consistency, $R_a(t)$ represents the improvement in real-world evaluation stability, $S_a(t)$ represents the improvement in safety risk identification sensitivity, and $g_{s1}$, $g_{s2}$, $g_{s3}$, and $g_{s4}$ represent the weight parameters corresponding to structured extraction accuracy, knowledge graph consistency, evaluation stability, and risk identification sensitivity, respectively;
S670. When $G_i(t)$ is greater than the preset gain threshold, retain the update result of this round; when $G_i(t)$ is less than or equal to the preset gain threshold, modify or roll back the update strategy.
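
The gate in S660 and S670 can be illustrated as follows. The weights, the gain threshold of 0.01, the version labels, and the names `iterative_gain` and `apply_update` are assumptions. Of the two fallback options the claim allows, the sketch shows only rollback to the previous state; modifying the update strategy is left out.

```python
def iterative_gain(d_a, g_a, r_a, s_a, weights=(0.3, 0.3, 0.2, 0.2)):
    """G_i(t) = g_s1*D_a(t) + g_s2*G_a(t) + g_s3*R_a(t) + g_s4*S_a(t).

    Each argument is an improvement over the previous round and may be negative.
    The weights are hypothetical.
    """
    g1, g2, g3, g4 = weights
    return g1 * d_a + g2 * g_a + g3 * r_a + g4 * s_a


def apply_update(current_state, candidate_state, improvements, gain_threshold=0.01):
    """Keep the candidate when the gain exceeds the assumed threshold;
    otherwise roll back to the current state."""
    gain = iterative_gain(*improvements)
    if gain > gain_threshold:
        return candidate_state, gain
    return current_state, gain


# Illustrative improvement values only.
kept, kept_gain = apply_update("graph-v1", "graph-v2", (0.05, 0.02, 0.03, 0.01))
rolled_back, lost_gain = apply_update("graph-v1", "graph-v2", (-0.02, 0.0, 0.01, 0.0))
```

In the second call the accuracy regression outweighs the small stability gain, so the previous state is returned.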

8. A drug-device evaluation device based on map annotation collaboration, employing the drug-device evaluation method based on map annotation collaboration as described in any one of claims 1 to 7, characterized in that it includes:
a data preprocessing module, used to collect, desensitize, and standardize multi-source medical data from the hospital and screen it for quality, and to score the comprehensive availability of each piece of medical data based on the sample availability evaluation function to form standardized research samples;
a graph construction module, used to extract medical entities and entity relationships based on the standardized research samples, and to calculate the comprehensive confidence of candidate relationship edges and screen them based on the medical ontology edge confidence function, to construct a medical ontology knowledge graph;
an intelligent annotation module, used to perform initial prediction annotation on medical text based on the medical ontology knowledge graph, sort text fragments by manual correction priority based on the annotation sample priority function, and synchronously update the optimized annotation results to the medical ontology knowledge graph;
a patient representation module, used to fuse the patient's multidimensional medical information through the patient comprehensive clinical representation function, based on the medical ontology knowledge graph and the optimized annotation results, to form a patient comprehensive clinical representation vector and complete patient stratification;
a comprehensive evaluation module, used to comprehensively evaluate the target drugs and devices through the drug and device real-world comprehensive evaluation index function, based on the patient comprehensive clinical representation vector and the patient stratification results, and to identify risk signals;
a feedback update module, used to feed back abnormal evaluation results and manual correction information to the graph construction module and the intelligent annotation module, and to evaluate the update effect based on the system iterative gain function, to form a closed-loop iterative optimization mechanism.

9. An electronic device, comprising a memory and a processor, characterized in that the memory stores a computer program, and the processor executes the computer program to implement the drug-device evaluation method based on map annotation collaboration as described in any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the drug-device evaluation method based on map annotation collaboration as described in any one of claims 1 to 7.