Initialization report determining method and apparatus, storage medium and electronic device

By acquiring images to be diagnosed and candidate reports, and using disease label matching and image features to determine the initial report, the problem of low efficiency in report writing requiring extensive modifications by doctors in existing technologies is solved, and efficient and accurate medical report generation is achieved.

WO2026124323A1 · PCT designated stage · Publication Date: 2026-06-18 · SHANGHAI UNITED IMAGING INTELLIGENCE CO LTD

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
SHANGHAI UNITED IMAGING INTELLIGENCE CO LTD
Filing Date
2025-12-03
Publication Date
2026-06-18


Abstract

Disclosed in the present description are an initialization report determining method and apparatus, a storage medium and an electronic device. The method comprises: acquiring an image to be diagnosed and candidate reports; on the basis of said image, determining, from amongst disease labels of the candidate reports, a disease label of said image as a target disease label; and on the basis of the target disease label, determining an initialization report from amongst the candidate reports, wherein the initialization report is determined on the basis of each candidate report the disease label of which is the target disease label. Because the initialization report is determined on the basis of the disease label, the disease corresponding to the initialization report is similar to the disease reflected by the image to be diagnosed. The medical report that a doctor needs to write for said image therefore has relatively high consistency with the initialization report, and the doctor needs to make only minor modifications to the initialization report, such that writing medical reports on the basis of the initialization report can greatly increase the writing speed.

Description

An initialization report determination method, apparatus, storage medium, and electronic device.

[0001] Cross-references to related applications

[0002] This application claims priority to Chinese Patent Application No. 202411826315.1, filed on December 11, 2024, entitled "A Method, Apparatus, Storage Medium and Electronic Device for Determining Initialization Report", the entire contents of which are incorporated herein by reference.

Technical Field

[0003] This specification relates to the field of computer technology, and in particular to an initialization report determination method, apparatus, storage medium, and electronic device.

Background Technology

[0004] In clinical settings, doctors need to write medical imaging reports for patients based on their images. To improve writing efficiency, doctors typically select a suitable report template from a template library based on the patient's actual symptoms as an initial report, and then modify or supplement the initial report to obtain a personalized medical report for the patient.

[0005] Existing template libraries typically contain report templates for different body parts or different diseases, and the content of these templates is relatively fixed. Because patient symptoms are diverse, doctors often need to make many modifications or additions when writing medical reports from existing templates, so the templates do little to improve writing efficiency.

[0006] Therefore, this specification provides a method for determining an initialization report.

Summary of the Invention

[0007] This specification provides an initialization report determination method, apparatus, storage medium, and electronic device to at least partially solve the aforementioned problems existing in the prior art.

[0008] The following technical solution is adopted in this specification:

[0009] This specification provides a method for determining an initialization report, including:

[0010] Obtain the images to be diagnosed and the candidate reports;

[0011] Based on the image to be diagnosed and each candidate report, the disease label of the image to be diagnosed is determined from the disease labels of each candidate report and used as the target disease label;

[0012] An initialization report is determined from the candidate reports based on the target disease label.

[0013] Optionally, the method for determining each candidate report is as follows:

[0014] Identify the medical images corresponding to each historical report;

[0015] Based on the similarity between the medical images, the medical images are clustered to obtain each target cluster.

[0016] Each candidate report is determined based on the cluster center of each target cluster.

[0017] Optionally, based on the image to be diagnosed and the candidate reports, a disease label for the image to be diagnosed is determined from the disease labels of the candidate reports as the target disease label, specifically including:

[0018] Determine the medical images corresponding to each candidate report and the disease labels for each candidate report;

[0019] Based on the similarity between the medical images corresponding to each candidate report and the image to be diagnosed, at least one related image of the image to be diagnosed is determined;

[0020] The target disease label is determined based on the disease label corresponding to the at least one related image.

[0021] Optionally, a target disease label is determined based on the disease label corresponding to the at least one related image, specifically including:

[0022] Obtain various disease labels corresponding to each relevant image;

[0023] For each disease label, the number of related images corresponding to that disease label is determined and used as the statistical frequency of that disease label;

[0024] The number of relevant images is determined as the total number. The ratio of the statistical frequency of the disease label to the total number is then determined to obtain the relevant probability of the disease label.

[0025] Based on the relevant probabilities of various disease labels, the target disease label corresponding to the image to be diagnosed is determined.
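The frequency-and-ratio computation in paragraphs [0022] to [0025] can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names are hypothetical, and the final selection rule (keep every label whose relevant probability reaches a threshold of 0.5) is an assumption, since the text only says that the target disease label is determined "based on" the probabilities.

```python
from collections import Counter


def relevant_probabilities(labels_per_image):
    """Return {disease label: relevant probability}.

    labels_per_image holds one collection of disease labels per related image.
    The relevant probability of a label is the number of related images that
    carry it divided by the total number of related images.
    """
    total = len(labels_per_image)
    if total == 0:
        return {}
    counts = Counter()
    for labels in labels_per_image:
        counts.update(set(labels))  # count each label at most once per image
    return {label: count / total for label, count in counts.items()}


def select_target_labels(labels_per_image, threshold=0.5):
    """Keep labels whose relevant probability reaches the threshold.

    The threshold rule is an assumption made for this sketch.
    """
    probs = relevant_probabilities(labels_per_image)
    return {label for label, p in probs.items() if p >= threshold}
```

With four related images of which three carry "pulmonary nodule", that label has a relevant probability of 3/4 and would be kept under the assumed threshold.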

[0026] Optionally, based on the target disease label, an initialization report is determined from the candidate reports, specifically including:

[0027] At least one candidate report corresponding to the target disease label is used as a similar report;

[0028] Based on the similar reports, determine the initialization report.

[0029] Optionally, based on the similar reports, an initialization report is determined, specifically including:

[0030] The similarity between the image features of the image to be diagnosed and the text features of each similar report is determined;

[0031] Based on the similarities, an initialization report is determined from the similar reports.

[0032] Optionally, using at least one candidate report corresponding to the target disease label as a similar report specifically includes:

[0033] Determine whether the positive or negative status at each position of the candidate report's disease label is completely consistent with the target disease label;

[0034] When all the coded values in the disease label of a candidate report are the same as the coded values of the target disease label, the candidate report is determined to be a similar report.

[0035] The disease label consists of multiple disease codes in a preset order, and each code value represents the disease status of the corresponding disease.

[0036] Optionally, the disease label consists of multiple disease codes in a preset order. The code values represent the disease status of the corresponding disease and include: a first status value indicating a positive disease, a second status value indicating a negative disease, and a third status value indicating an uncertain disease. The third status value represents a disease status in which imaging features suggest a lesion but a definitive diagnosis cannot be made, and is represented by a preset intermediate value.

[0037] Optionally, the candidate report may contain diagnostic conclusions for multiple diagnostic items;

[0038] Based on the image to be diagnosed and the candidate reports, the disease label of the image to be diagnosed is determined from the disease labels of the candidate reports as the target disease label, specifically including:

[0039] For each diagnostic item, the corresponding disease labels are determined based on the disease labels of each candidate report;

[0040] Based on the disease labels corresponding to the diagnostic item, determine the disease label of the image to be diagnosed in the diagnostic item.

[0041] Based on the disease labels of the images to be diagnosed in each diagnostic item, the target disease label is determined.

[0042] This specification provides an initialization report determining device, the device comprising:

[0043] The acquisition module retrieves the images to be diagnosed and the candidate reports.

[0044] The prediction module determines the disease label of the image to be diagnosed from the disease labels of the candidate reports based on the image to be diagnosed and the candidate reports, and uses it as the target disease label.

[0045] The initialization report determination module determines the initialization report from the candidate reports based on the target disease label.

[0046] This specification provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described initialization report determination method.

[0047] This specification provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the above-described initialization report determination method.

[0048] The above-mentioned technical solutions adopted in this specification can achieve the following beneficial effects:

[0049] In the initialization report determination method provided in this specification, the image to be diagnosed and each candidate report are acquired. Based on the image to be diagnosed, the disease label of the image to be diagnosed is determined from the disease labels of each candidate report, and this disease label is used as the target disease label. Based on the target disease label, the initialization report is determined from each candidate report.

[0050] The candidate reports in this method are pre-selected from the historical reports included in the historical diagnostic records. Therefore, the initialization report in this specification is a historical report selected from the historical diagnostic records. Furthermore, because the initialization report is determined based on disease labels, the disease type corresponding to the initialization report is similar to the disease type reflected in the image to be diagnosed. The medical report that the doctor needs to write for the image to be diagnosed is therefore highly consistent with the initialization report, the doctor needs to make fewer changes to it, and writing the medical report based on this initialization report can greatly improve the writing speed.

Description of the Drawings

[0051] The accompanying drawings, which are included to provide a further understanding of this specification and form part of this specification, illustrate exemplary embodiments and are used to explain this specification, but do not constitute an undue limitation thereof. In the drawings:

[0052] Figure 1 is a flowchart illustrating one method for determining an initialization report in this specification;

[0053] Figure 2 is a schematic diagram of an initial cluster provided in an embodiment of this specification;

[0054] Figure 3 is a schematic diagram of the target cluster corresponding to an initial cluster provided in the embodiments of this specification;

[0055] Figure 4 is a schematic diagram of a process for determining a related image provided in an embodiment of this specification;

[0056] Figure 5 is a schematic diagram of a target disease label determination process corresponding to Figure 4 provided in the embodiments of this specification;

[0057] Figure 6 is a schematic diagram of an initialization report determination principle provided in the embodiments of this specification;

[0058] Figure 7 is a schematic diagram of an initialization report determination device provided in this specification;

[0059] Figure 8 is a schematic diagram of the electronic device corresponding to Figure 1 provided in this specification.

Detailed Implementation

[0060] To make the objectives, technical solutions, and advantages of this specification clearer, the technical solutions of this specification will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this specification, and not all of them. All other embodiments obtained by those skilled in the art based on the embodiments in this specification without creative effort are within the scope of protection of this application.

[0061] The technical solutions provided in the various embodiments of this specification are described in detail below with reference to the accompanying drawings.

[0062] Figure 1 is a flowchart illustrating an initialization report determination method according to this specification, which specifically includes the following steps:

[0063] S100: Obtain the images to be diagnosed and the candidate reports.

[0064] All steps of the initialization report determination method provided in this specification can be implemented by any electronic device with computing capabilities, such as a terminal or a server. For ease of description, the method is described below with a server as the executing entity.

[0065] During each diagnosis, when the doctor needs to write a medical report for the patient based on medical images, the server obtains the image to be diagnosed and applies the initialization report determination method in this specification to determine the initialization report.

[0066] This specification does not limit the type of the image to be diagnosed; it can be any type of medical image, such as a magnetic resonance imaging (MRI), computed tomography (CT), or positron emission tomography-computed tomography (PET-CT) image.

[0067] Historical diagnostic records contain a large number of historical reports, which are medical reports written by doctors for past patients during the historical diagnostic process. These records also include the diagnostic basis for each medical report, such as medical images obtained from imaging examinations and laboratory test results.

[0068] The initialization report determined by the method in this specification is not a traditional, pre-designed, fixed-format template for a disease or body part. Instead, it is a historical report retrieved from the historical diagnostic records whose imaging information and disease characteristics are similar to those of the image to be diagnosed.

[0069] Therefore, to retrieve the initialization report from the historical reports, the server, after acquiring the image to be diagnosed, also needs to acquire the historical reports used for retrieval, i.e., the candidate reports. The candidate reports are a subset selected from the historical reports.

[0070] Because the number of historical reports in the historical diagnostic records is quite large, searching for the initial report within all historical reports would be time-consuming. Furthermore, the number of reports for various diseases or body parts within the historical reports may be uneven. Therefore, before step S100, the server uses a preset filtering method to select a subset of historical reports from the historical diagnostic records as candidate reports for matching with the image to be diagnosed. This specification does not limit the specific filtering method used; it can be based on clustering or manual filtering.

[0071] The process of determining candidate reports is completed before the server acquires the image to be diagnosed. During each diagnosis process, the server only needs to acquire the image to be diagnosed and the already determined candidate reports required for the current diagnosis process, and select the initial report from the candidate reports based on the image to be diagnosed, without having to repeat the process of determining candidate reports.

[0072] S102: Based on the image to be diagnosed and the candidate reports, determine the target disease label of the image to be diagnosed from the disease labels of the candidate reports.

[0073] The core of a medical report is the disease or abnormality seen in the medical images. Since negative reports often contain similar content, the key to determining an initialization report is to predict the patient's disease or abnormality automatically, accurately, and efficiently. Therefore, in the method described in this specification, each historical report in the historical diagnostic records has a disease label, and the disease type for the initialization report of the image to be diagnosed is predicted based on these disease labels.

[0074] The disease label can be recorded in the Radiology Information System (RIS), obtained through keyword search, or manually annotated. The disease label of a historical report indicates the type of disease diagnosed in that historical report. This specification does not limit the specific form of the disease label; it can be a disease name, disease number, disease code, etc.

[0075] The number of disease labels corresponding to a historical report depends on the actual diagnosis in that report; that is, a historical report may have one or more disease labels. Typically, the specific number of disease labels can be determined by medical experts based on the image scan type and scan site, or by technicians summarizing and analyzing large sets of historical reports.

[0076] Taking the disease name as a disease label as an example, for historical reports obtained based on chest CT diagnosis, the disease labels that the historical report may have may consist of one or more of the following disease labels: lung shadow, pulmonary nodule, pneumonia, pulmonary tuberculosis, pulmonary edema, pulmonary bullae; bronchiectasis, tracheal diverticulum, bronchitis; mediastinal mass, mediastinal lymphadenopathy; pericardial effusion, cardiac hypertrophy, aortic hypertension, pulmonary artery thickening; fracture, bone destruction, bone tumor, etc.

[0077] For example, in the chest CT diagnosis scenario, a standardized coding system for 18 chest diseases is first constructed so that target disease labels can be determined. The disease coding order is fixed as follows: 1. Lung shadow, 2. Lung nodule, 3. Pneumonia, 4. Pulmonary tuberculosis, 5. Pulmonary edema, 6. Pulmonary bullae, 7. Bronchiectasis, 8. Tracheal diverticulum, 9. Bronchitis, 10. Mediastinal mass, 11. Mediastinal lymphadenopathy, 12. Pericardial effusion, 13. Cardiac hypertrophy, 14. Aortic hypertension, 15. Pulmonary artery thickening, 16. Fracture, 17. Bone destruction, 18. Bone tumor. Each coding position corresponds to a preset disease type. A value of "1" indicates a positive disease (lesion present), a value of "0" indicates a negative disease (no lesion), and a value of "0.5" indicates an uncertain disease (imaging features suggest a lesion, but a definitive diagnosis cannot be made).
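As a rough illustration of this coding scheme, the sketch below turns a set of findings into an 18-position code. The constant and function names are hypothetical, the lower-case disease names are chosen for the example (position 7 uses "bronchiectasis", as in the chest CT list in paragraph [0076]), and treating every unlisted disease as negative is an assumption of this sketch.

```python
# Fixed coding order for the 18 chest CT diseases (position 1 first).
CHEST_CT_DISEASES = [
    "lung shadow", "lung nodule", "pneumonia", "pulmonary tuberculosis",
    "pulmonary edema", "pulmonary bullae", "bronchiectasis",
    "tracheal diverticulum", "bronchitis", "mediastinal mass",
    "mediastinal lymphadenopathy", "pericardial effusion",
    "cardiac hypertrophy", "aortic hypertension",
    "pulmonary artery thickening", "fracture", "bone destruction",
    "bone tumor",
]

POSITIVE, NEGATIVE, UNCERTAIN = 1, 0, 0.5


def encode_disease_label(findings):
    """Encode {disease name: status} as a tuple in the fixed coding order.

    Diseases missing from `findings` are treated as negative (an assumption).
    """
    unknown = set(findings) - set(CHEST_CT_DISEASES)
    if unknown:
        raise ValueError(f"unknown disease names: {sorted(unknown)}")
    for name, status in findings.items():
        if status not in (POSITIVE, NEGATIVE, UNCERTAIN):
            raise ValueError(f"invalid status for {name}: {status}")
    return tuple(findings.get(name, NEGATIVE) for name in CHEST_CT_DISEASES)


# Example: lung nodule positive, pneumonia uncertain, everything else negative.
code = encode_disease_label({"lung nodule": POSITIVE, "pneumonia": UNCERTAIN})
print(code)
```

A tuple is used so that two codes can be compared directly for equality.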

[0078] Based on the image to be diagnosed and each candidate report, the server determines the target disease label of the image to be diagnosed from the disease labels of each candidate report.

[0079] Specifically, the server first identifies the medical images corresponding to each selected candidate report and the disease labels for each candidate report from the historical diagnostic records. The medical images in a candidate report are the images upon which the doctor based their diagnosis and are one of the diagnostic criteria for that candidate report.

[0080] Secondly, the server determines the relevant images of the image to be diagnosed based on the similarity between the medical images corresponding to each candidate report and the image to be diagnosed.

[0081] The server can input the medical images corresponding to each candidate report into the first image encoder to obtain the image features of each candidate report's medical image. The image to be diagnosed is then input into the first image encoder to obtain the image features of the image to be diagnosed. The server determines the feature similarity between the image features of each candidate report's medical image and the image features of the image to be diagnosed, and uses this as the similarity between the medical image corresponding to each candidate report and the image to be diagnosed. This specification does not limit the specific method for determining feature similarity; it can be cosine similarity, Euclidean distance, etc.
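The feature comparison described here can be sketched with cosine similarity, one of the measures the text names. The sketch assumes the image features have already been produced by some encoder and are plain sequences of floats; the function names and the choice to keep the k most similar images are hypothetical and only illustrate one way to pick related images.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_related(query_feature, candidate_features, k=3):
    """Return ids of the k candidate images most similar to the query image.

    candidate_features maps a candidate report id to the feature vector of
    its medical image.
    """
    scored = [
        (cosine_similarity(query_feature, feature), report_id)
        for report_id, feature in candidate_features.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [report_id for _, report_id in scored[:k]]
```

Euclidean distance could be substituted by sorting in ascending order of distance instead.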

[0082] Then, the server predicts the disease label of the image to be diagnosed based on the disease labels in the corresponding historical reports of the relevant images, in order to determine the target disease label. The target disease label is the type of disease that the image to be diagnosed may represent, as initially determined in this step.

[0083] Here, the server can determine one or more medical images corresponding to candidate reports as relevant images based on the similarity between the medical images corresponding to each candidate report and the image to be diagnosed.

[0084] If the server determines that there is only one relevant image, then the server can use the disease label corresponding to that one relevant image as the target disease label.

[0085] If the server identifies multiple relevant images, it can use the disease labels corresponding to these relevant images together as the target disease label. Alternatively, it can further filter the disease labels corresponding to the relevant images to determine the target disease label.

[0086] The related images are medical images that are visually similar to the image to be diagnosed, as determined by image similarity. Visual features such as brightness, contrast, and texture in a medical image reflect the tissue structure and lesion morphology of the scanned area. The diseases reflected in the related images are therefore likely to be similar to the disease reflected in the image to be diagnosed, so the server can predict the disease type that the image to be diagnosed may exhibit based on the disease labels of the related images.

[0087] Medical reports for similar disease types tend to have similar content. Therefore, the target disease label determined in this step can be used to accurately filter, from the large number of historical reports in the historical diagnostic records, those that are highly similar to the medical report to be written for the image to be diagnosed.

[0088] S104: Determine an initial report from the candidate reports based on the target disease label.

[0089] The target disease label is the disease label that the server predicts may correspond to the image to be diagnosed. The candidate reports themselves have corresponding disease labels. Therefore, the server can select the candidate report corresponding to the target disease label as a similar report among the candidate reports.

[0090] A target disease label consists of at least one disease label, and a candidate report may likewise correspond to one or more disease labels. The server can therefore use different methods to determine similar reports depending on the number of disease labels contained in the target disease label.

[0091] When the target disease label contains only one disease label, the server identifies the candidate report with the same disease label as the target disease label from the candidate reports that correspond to only one disease label, and uses it as a similar report.

[0092] When a target disease label contains multiple disease labels, the server checks each candidate report to see if the disease labels contained in the target disease label are the same as the disease labels corresponding to that candidate report. If so, the candidate report is considered a similar report. If not, the candidate report is not considered a similar report.

[0093] When at least one candidate report corresponding to the target disease label is used as a similar report, the server determines whether the positive or negative status at each position of the candidate report's disease label is completely consistent with the target disease label. When every coded value in the disease label of the candidate report is the same as the corresponding coded value of the target disease label, the candidate report is determined to be a similar report. Here, the disease label is composed of multiple disease codes in a preset order, and each coded value represents the disease status of the corresponding disease.

[0094] The encoded value is used to represent the disease status of the corresponding disease, including: a first status value indicating a positive disease, a second status value indicating a negative disease, and a third status value indicating an uncertain disease; the third status value is used to represent a disease status in which imaging features suggest a lesion but a clear diagnosis cannot be made, and the third status value is represented by a preset intermediate value.

[0095] Determining whether the disease labels included in the target disease label are the same as those of a candidate report requires considering not only the positive disease labels, but also the negative and uncertain ones. The candidate report and the target disease label must have identical values at every coding position. For example, if lung nodules are positive and fractures are negative in the target disease label, a matching candidate report must also be positive for lung nodules and negative for fractures. As a further example, the target disease label (0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0) indicates that lung nodules are positive (position 2 = 1) and fractures are positive (position 16 = 1), while all other diseases are negative. In this case, the system rejects a candidate report coded as (0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) because its fracture value differs at position 16 (target positive, candidate negative). Similarly, the system rejects a report coded (0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0) because it is positive for bone destruction at position 17, and a report coded (0, 1, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0) because its pneumonia value at position 3 is uncertain. Only candidate reports with identical codes are included in the set of similar reports. This full matching mechanism keeps positive diagnoses, negative exclusions, and uncertain conclusions consistent at the same time, which improves the clinical reliability of the initialization report.
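The position-by-position comparison can be sketched as follows, assuming 18-position codes in which position 2 is lung nodule, position 3 is pneumonia, position 16 is fracture, and position 17 is bone destruction. The helper names and the report ids "A" to "D" are hypothetical and exist only for this example.

```python
def make_code(overrides, length=18):
    """Build a code of `length` zeros, then set 1-based positions from `overrides`."""
    code = [0] * length
    for position, value in overrides.items():
        code[position - 1] = value
    return tuple(code)


def is_similar_report(candidate_code, target_code):
    """True only if every coding position holds the same value in both codes."""
    if len(candidate_code) != len(target_code):
        raise ValueError("codes must have the same number of positions")
    return all(c == t for c, t in zip(candidate_code, target_code))


def filter_similar_reports(candidate_codes, target_code):
    """Return ids of candidate reports whose code equals the target code."""
    return [
        report_id
        for report_id, code in candidate_codes.items()
        if is_similar_report(code, target_code)
    ]


# Target: lung nodule positive (position 2) and fracture positive (position 16).
target = make_code({2: 1, 16: 1})
candidates = {
    "A": make_code({2: 1}),                  # fracture negative: rejected
    "B": make_code({2: 1, 16: 1, 17: 1}),    # extra positive at 17: rejected
    "C": make_code({2: 1, 3: 0.5, 16: 1}),   # pneumonia uncertain: rejected
    "D": make_code({2: 1, 16: 1}),           # identical: kept
}
print(filter_similar_reports(candidates, target))  # prints ['D']
```

Because the three status values 0, 0.5, and 1 are exactly representable as floating-point numbers, plain equality is safe here.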

[0096] After the server identifies similar reports, it determines the initialization report based on the similar reports.

[0097] Specifically, the server can use all similar reports as the initial report. Alternatively, it can further filter the similar reports and select one or a small number of similar reports as the initial report.

[0098] In some scenarios, if the number of similar reports identified by the server is relatively small, or even only one or two, the server can directly use all of the similar reports as the initialization report.

[0099] In some scenarios, the number of historical reports contained in the historical diagnostic records is very large, possibly hundreds of thousands, millions, or more. Even after the preliminary screening that limits matching to the candidate reports, a large number of similar reports may still be identified. In that case not all similar reports can be used directly as the initialization report, and further screening is required to determine the initialization report. This specification does not restrict the method of further screening; for example, image-text feature matching or image-image feature matching can be used.

[0100] In this embodiment, the disease label corresponding to the similar report is completely consistent with the target disease label, because the disease label corresponding to the similar report is the type of disease diagnosed through the similar report in the historical diagnosis process, and the target disease label represents the type of disease that the image to be diagnosed may have as predicted by the server.

[0101] The type of disease a patient exhibits is a crucial basis for doctors when writing medical reports. In this embodiment, the initialization report is determined from similar reports, whose content is highly consistent with the medical report that the doctor needs to write for the patient based on the image to be diagnosed. Therefore, the doctor only needs to make minor modifications to the initialization report to obtain the medical report for the image to be diagnosed, which reduces the doctor's workload and increases writing speed.

[0102] In the initialization report determination method shown in Figure 1, each candidate report is pre-selected from the historical reports included in the historical diagnostic records, so the initialization report in this specification is a historical report selected from the historical diagnostic records. Furthermore, because the initialization report is determined based on the target disease label of the image to be diagnosed, the method can, in each diagnosis, specifically identify historical reports that match the possible disease type of the image to be diagnosed and use them as the initialization report. The medical report that the doctor needs to write for the image to be diagnosed is therefore highly consistent with the initialization report, the doctor needs to make fewer changes to it, and writing the medical report based on the initialization report can greatly improve the writing speed.

[0103] The target disease label defined in this specification may include multiple disease labels. Because the disease labels included in the target disease label are the disease types that the image to be diagnosed may show as predicted in this specification, the method in this specification can simultaneously predict multiple disease types based on the image to be diagnosed.

[0104] Because lesions of different diseases vary significantly in imaging, existing disease detection methods based on images can only detect a single type of disease and cannot simultaneously detect multiple diseases. In other words, existing disease detection methods cannot simultaneously predict multiple different disease types that may be present in the images to be diagnosed; they can only identify a target disease label composed of a single disease tag, and cannot obtain the target disease label composed of multiple different disease tags as described in this method.

[0105] Therefore, the method described in this specification can be used to detect various types of diseases that a patient may have, identify more comprehensive and accurate target disease labels, and then, based on the target disease labels, identify an initial report that is more suitable for the image to be diagnosed from among the candidate reports.

[0106] Prior to step S100, the historical diagnostic record contains all historical reports determined during the entire diagnostic process. Typically, the number of historical reports in the historical diagnostic record is quite large. Therefore, in order to improve the speed of determining the initial report, the server performs a preliminary screening of each historical report before acquiring the image to be diagnosed, and uses the screened historical reports as candidate reports.

[0107] This specification does not restrict the preliminary screening method; for example, screening can be done through clustering or through manual selection.

[0108] If candidate reports are determined through manual selection, a subset of the historical reports in the historical diagnostic records can be manually chosen as candidate reports based on specific selection criteria. This specification does not limit the specific content of the selection criteria for manual selection; they can be set as needed. For example, representative typical reports, or rare reports indicating rare diseases, can be selected from the historical reports as candidate reports.

[0109] If clustering is used for filtering, the server can perform filtering by single or multiple clustering operations, depending on the granularity of the filtering and the number of candidate reports to be retained.

[0110] In one or more embodiments of this specification, the server also performs a preliminary screening of each historical report in the historical diagnostic records through a single clustering.

[0111] Specifically, the server identifies the historical reports contained in the historical diagnostic records, as well as the corresponding medical images for each historical report. Based on the similarity between the medical images, the images are clustered to obtain target clusters. Based on the historical reports corresponding to the cluster centers of each target cluster, candidate reports are determined.

[0112] The number of candidate reports can be controlled by setting the number of target clusters. In this embodiment, when candidate reports are determined through a single clustering, the number of candidate reports obtained by the server is the same as the number of target clusters to be formed in that clustering.

[0113] Before step S100 above, the server can perform two clustering operations to classify each historical report in the historical diagnostic records into more granular categories, filter out more candidate reports, and make the content of each candidate report more diverse.

[0114] First, the server performs an initial clustering of the medical images corresponding to each historical report based on the similarity between the images.

[0115] Then, for each initial cluster, the server performs a second clustering of each medical image contained in that initial cluster to obtain each target cluster corresponding to that initial cluster.

[0116] Finally, the server determines each candidate report based on the cluster center of each target cluster.

[0117] Specifically, the server determines the target cluster corresponding to each initial cluster and uses the historical report corresponding to the cluster center of each target cluster as a candidate report.
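As a non-limiting illustration, the selection of candidate reports from cluster centers described in [0114] to [0117] can be sketched as follows. The sketch assumes that the coarse and fine clustering have already been performed by any suitable clustering algorithm, so that each medical image carries a pair (initial cluster, target cluster). The function name `select_candidate_reports`, the use of Euclidean distance, and the choice of the member nearest to the cluster mean as the "cluster center" are assumptions made for illustration and are not prescribed by this specification.

```python
from collections import defaultdict
from math import dist


def select_candidate_reports(features, cluster_ids, report_ids):
    """Pick one historical report per target cluster.

    features    : list of image feature vectors (one per historical report)
    cluster_ids : list of hashable cluster keys, e.g. (initial_cluster, target_cluster)
    report_ids  : list of report identifiers aligned with `features`

    Assumption: the "cluster center" is taken to be the member image whose
    features are closest (Euclidean distance) to the mean of the cluster.
    """
    groups = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        groups[cid].append(idx)

    candidates = {}
    for cid, members in groups.items():
        dim = len(features[members[0]])
        # Mean feature vector of the cluster.
        centroid = [
            sum(features[i][d] for i in members) / len(members) for d in range(dim)
        ]
        # Member image nearest to the mean stands in for the cluster center.
        center_idx = min(members, key=lambda i: dist(features[i], centroid))
        candidates[cid] = report_ids[center_idx]
    return candidates


# Hypothetical data: two scan sites, one target cluster each.
features = [(0.0, 0.0), (0.2, 0.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (5.0, 7.0)]
cluster_ids = [("chest", 0)] * 3 + [("abdomen", 0)] * 3
report_ids = ["r0", "r1", "r2", "r3", "r4", "r5"]
print(select_candidate_reports(features, cluster_ids, report_ids))
```

With a single clustering, the same function applies if `cluster_ids` holds only the target cluster of each image.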

[0118] Taking CT images corresponding to various historical reports as an example, CT images can be categorized according to the scanning location, such as chest CT plain scan, abdominal CT plain scan, pelvic CT plain scan, coronary artery CT angiography, head CT plain scan, and urinary system CT plain scan. Within each location category, there are diseases that may be detected in that location. Therefore, disease labels can be divided according to different locations, and the disease label for a disease diagnosed from a particular scanning location can be attributed to that category.

[0119] The medical images corresponding to each historical report, after the first clustering, may be clustered into initial clusters based on different scan sites. Each initial cluster corresponds to a CT image of a scan site. After a second clustering, the medical images contained in each initial cluster can be further subdivided according to different disease labels, resulting in target clusters. Each target cluster contains medical images of the same disease label from that scan site.

[0120] Figure 2 is a schematic diagram of an initial cluster provided in an embodiment of this specification. In Figure 2, different shapes represent medical images of different scan sites. It can be seen that after one clustering, medical images of the same scan site are grouped together to form an initial cluster. As shown in Figure 2, the location corresponding to the cluster center in each initial cluster is marked with shaded areas, and the historical report corresponding to the medical image of the cluster center is given in Figure 2.

[0121] Taking the initial cluster corresponding to the chest CT plain scan shown in Figure 2 as an example, the second clustering is explained in conjunction with Figure 3. Figure 3 is a schematic diagram of the target clusters corresponding to an initial cluster provided in the embodiments of this specification. In Figure 3, the circle marked with dotted shading represents the cluster center of the initial cluster obtained in the first clustering, and the four circles marked with mesh-like shading represent the cluster centers of the four target clusters obtained after the second clustering. Each target cluster in Figure 3 contains historical reports that are similar in disease type and have similar disease labels.

[0122] In this embodiment, the first clustering is coarse clustering, and the second clustering is fine clustering. As shown in Figure 3, the number of target clusters determined by the secondary clustering is greater than the number determined by the single clustering in the previous embodiment. Therefore, this embodiment can identify more candidate reports. This allows for finer-grained matching when screening candidate reports based on the target disease label, resulting in an initial report that is more similar to the required medical report, further improving the speed of medical report writing.

[0123] The methods of screening by manual selection and screening by clustering described above can be selected individually or simultaneously. When executed simultaneously, in one embodiment, the server can combine the candidate reports determined by manual selection and the candidate reports determined by clustering selection as the final candidate report for determining the initialization report.

[0124] In step S102 above, the server can select relevant images from the medical images corresponding to each historical report based on a similarity threshold or a first quantity threshold.

[0125] That is, the server can determine a preset similarity threshold and identify, as relevant images, the medical images corresponding to candidate reports whose similarity to the image to be diagnosed is greater than or equal to the threshold. Alternatively, the server can determine a preset first quantity threshold, sort the medical images corresponding to the candidate reports by similarity, and select, in descending order of similarity, as many medical images as the first quantity threshold specifies as relevant images.

[0126] The specific number of relevant images determined by the server is related to the specific value of the similarity threshold or the first quantity threshold. Therefore, the corresponding similarity threshold or the first quantity threshold can be set according to the specific number of relevant images to be determined. For example, in a scenario where 3 relevant images need to be determined, the first quantity threshold can be set to 3.
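As a non-limiting illustration, the two selection modes described in [0124] to [0126] can be sketched as follows. Cosine similarity is used here only as an assumed similarity measure; this specification does not prescribe one. The names `cosine_similarity` and `select_relevant_images` and all feature values are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_relevant_images(query, candidates, similarity_threshold=None, top_k=None):
    """Return names of relevant images, most similar first.

    query                : feature vector of the image to be diagnosed
    candidates           : dict mapping image name -> feature vector
    similarity_threshold : keep images with similarity >= this value, if given
    top_k                : keep at most this many images, if given
                           (plays the role of the first quantity threshold)
    """
    scored = sorted(
        ((cosine_similarity(query, feat), name) for name, feat in candidates.items()),
        reverse=True,
    )
    if similarity_threshold is not None:
        scored = [item for item in scored if item[0] >= similarity_threshold]
    if top_k is not None:
        scored = scored[:top_k]
    return [name for _, name in scored]


# Hypothetical features for four candidate images.
candidates = {"a": (1.0, 0.0), "b": (1.0, 1.0), "c": (0.0, 1.0), "d": (-1.0, 0.0)}
print(select_relevant_images((1.0, 0.0), candidates, top_k=2))
print(select_relevant_images((1.0, 0.0), candidates, similarity_threshold=0.9))
```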

[0127] Figure 4 is a schematic diagram of a relevant image determination process provided in an embodiment of this specification. In Figure 4, Image1 to Image8 represent the medical images corresponding to eight different candidate reports, and the content in parentheses is the disease label of each candidate report. As shown in Figure 4, I0 represents the image features of the image to be diagnosed, and I1 to I8 represent the image features of the medical images corresponding to the candidate reports. For each i, a similarity is determined between the image features I0 of the image to be diagnosed and the image features Ii of the medical image corresponding to the i-th historical report.

[0128] Assuming relevant images are selected according to a preset quantity threshold, and the first quantity threshold is set to 3, the server sorts the medical images corresponding to each historical report according to similarity, and selects the top three medical images as relevant images. In Figure 4, the determined Top 1 to Top 3 relevant images are marked with shaded areas, namely Image5, Image7, and Image3.

[0129] It should be noted that Figure 4 uses eight historical reports as an example to illustrate the process of determining relevant images at the principle level. In practical applications, the number of historical reports is enormous. Among the medical images corresponding to these numerous historical reports, there are usually related images with a high degree of similarity to the image to be diagnosed. In medical image-based diagnostic scenarios, doctors write medical reports based on the image information presented by the medical images. Therefore, this embodiment can identify historical reports with a high degree of similarity to the medical report that the doctor needs to write for the image to be diagnosed from the historical reports corresponding to the relevant images, and use these as initial reports.

[0130] In step S102 above, when the server determines multiple related images, the server can use the disease tags corresponding to each related image as the target disease tag. Alternatively, the server can further filter among the disease tags corresponding to each related image to determine the target disease tag.

[0131] Taking Figure 4 as an example, when the server uses the disease labels corresponding to each relevant image as the target disease labels, the target disease labels determined by the server are "pulmonary nodules", "bone tumors" and "mediastinal masses", which are all the disease labels of the medical images corresponding to Image5, Image7 and Image3.

[0132] In another embodiment, the server further filters among the disease tags corresponding to each relevant image to determine the target disease tag.

[0133] First, the server determines the number of each relevant image as the total number. Then, for each disease label in the candidate reports corresponding to each relevant image, the server determines the number of relevant images corresponding to that disease label as the statistical frequency of that disease label.

[0134] Then, the server determines the ratio of the statistical frequency of this type of disease tag to the total number, obtaining the relevant probability of this type of disease tag. Based on the relevant probabilities of various disease tags, the target disease tag is determined among all disease tags.

[0135] Specifically, the server can select at least one disease label from various disease labels as the target disease label based on a relevant probability threshold or a second quantity threshold.

[0136] If the server determines the target disease label based on a relevant probability threshold, the server can determine a preset relevant probability threshold and use the disease labels whose relevant probabilities are greater than or equal to that threshold as target disease labels. If the server selects based on a second quantity threshold, it can sort the disease labels by relevant probability and select, in descending order of relevant probability, as many disease labels as the second quantity threshold specifies as target disease labels.

[0137] The specific values of the relevant probability threshold or the second quantity threshold can be set according to actual selection needs. The specific number of disease labels included in the target disease label in this specification is related to the value of the relevant probability threshold or the second quantity threshold set when determining the target disease label.

[0138] Since relevant images are medical images identified based on the similarity between images, the disease type that the image to be diagnosed may represent is similar to the disease type corresponding to the disease labels of each relevant image. The frequency with which a disease label appears in the candidate reports corresponding to each relevant image represents the number of relevant images with that type of disease label.

[0139] The higher the statistical frequency of a disease label, the greater the proportion of images with that disease label among all relevant images. Therefore, the greater the probability that the image to be diagnosed also has that disease label. Thus, the correlation probability represents the likelihood that a disease label matches the disease type represented by the image to be diagnosed. For a particular disease label, the more frequently it appears among the disease labels in the candidate reports corresponding to each relevant image, the greater the correlation probability of that disease label.

[0140] Figure 5 is a schematic diagram of a target disease label determination process corresponding to Figure 4, provided in an embodiment of this specification. The target disease label determination process will be described below with reference to Figure 5. In the embodiment shown in Figure 5, disease labels are screened based on a relevant probability threshold, and the relevant probability threshold is set to 0.5.

[0141] Based on the description of Figure 4, the server has identified the Top 1 to Top 3 relevant images as Image5, Image7, and Image3. Based on each relevant image, the server determines the disease labels for the candidate reports corresponding to each image. Specifically, the historical reports corresponding to Image5 have the disease labels "pulmonary nodule" and "bone tumor," the candidate reports corresponding to Image3 have the disease label "pulmonary nodule," and the candidate reports corresponding to Image7 have the disease label "mediastinal mass."

[0142] The server determined that the historical reports corresponding to the relevant images contained three disease tags: "pulmonary nodule," "mediastinal mass," and "bone tumor." Then, for each disease tag, the server determined the relevance probability as follows: "pulmonary nodule" = 2 / 3, "mediastinal mass" = 1 / 3, and "bone tumor" = 1 / 3. Among these three probabilities, only the relevance probability for "pulmonary nodule" is greater than the relevance probability threshold of 0.5; therefore, the server determined "pulmonary nodule" as the target disease tag.
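As a non-limiting illustration, the computation in [0133] to [0136] and the worked example of Figure 5 can be sketched as follows. The function names `relevance_probabilities` and `select_target_labels` are hypothetical; the label sets are those given above for Image5, Image7 and Image3.

```python
from collections import Counter


def relevance_probabilities(labels_per_image):
    """Map each disease label to (relevant images carrying it) / (all relevant images)."""
    total = len(labels_per_image)
    # set() ensures a label is counted at most once per relevant image.
    counts = Counter(label for labels in labels_per_image for label in set(labels))
    return {label: count / total for label, count in counts.items()}


def select_target_labels(probabilities, probability_threshold=None, top_n=None):
    """Select target disease labels by probability threshold and/or by count.

    top_n plays the role of the second quantity threshold.
    With neither argument, every label is returned (no screening).
    """
    ranked = sorted(probabilities.items(), key=lambda kv: (-kv[1], kv[0]))
    if probability_threshold is not None:
        ranked = [kv for kv in ranked if kv[1] >= probability_threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return [label for label, _ in ranked]


# Disease labels of the candidate reports for Image5, Image7 and Image3.
labels_per_image = [
    ["pulmonary nodule", "bone tumor"],  # Image5
    ["mediastinal mass"],                # Image7
    ["pulmonary nodule"],                # Image3
]
probs = relevance_probabilities(labels_per_image)
print(probs)
print(select_target_labels(probs, probability_threshold=0.5))
```

With the threshold of 0.5 used in Figure 5, only "pulmonary nodule" (2/3) is retained; calling `select_target_labels(probs)` without arguments corresponds to the embodiment in which all labels of the relevant images are used directly.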

[0143] In the above embodiment where disease labels from relevant images are directly used as target disease labels without prior screening, the identified target disease labels are "pulmonary nodule," "bone tumor," and "mediastinal mass." In this embodiment, after screening, the identified target disease label is "pulmonary nodule." A comparison shows that the method used in this embodiment to determine target disease labels can identify target disease labels containing fewer disease labels, meaning the identified target disease labels have a higher degree of matching with the disease type of the image to be diagnosed.

[0144] In step S104 above, the server can determine the similarity between each similar report and the image to be diagnosed, and further filter the similar reports based on this similarity to determine the initial report. The similarity can be determined through image-text feature matching or image-image feature matching.

[0145] In one or more embodiments of this specification, the server uses image and text feature matching to further filter similar reports.

[0146] Specifically, the server inputs the image to be diagnosed into the second image encoder to obtain the image features of the image. Then, it inputs each similar report into the text encoder to obtain the text features of each similar report.

[0147] Then, the server determines the initial report among the similar reports based on the similarity between the image features of the image to be diagnosed and the text features of each similar report.

[0148] This specification does not impose any restrictions on the specific network structures of the second image encoder and the text encoder. However, the second image encoder and the text encoder must have undergone contrastive learning training, so that the features obtained by the two different encoders are mapped to the same feature space and are comparable.

[0149] In one or more embodiments of this specification, the server may also use image-image feature matching to further filter similar reports.

[0150] Specifically, the server inputs the image to be diagnosed into a third image encoder to obtain the image features of the image to be diagnosed. It then identifies the medical images corresponding to each similar report and inputs these medical images into the third image encoder to obtain the image features of the medical images corresponding to each similar report.

[0151] Then, based on the similarity between the image features of the image to be diagnosed and the image features of the medical image corresponding to each similar report, the server determines the initial report from among the similar reports.

[0152] In the two embodiments described above, this specification does not limit the specific number of the final determined initialization reports. The server selects one or more similar reports as initialization reports as needed.

[0153] When there is only one initialization report, the doctor can directly write the medical report based on that initialization report. When there are multiple initialization reports, the server can display each initialization report to the doctor, allowing the doctor to select one as needed to write the medical report. This gives the doctor more choice and makes the initialization report determination method in this specification more flexible.

[0154] In this specification, the disease type that the image to be diagnosed may represent is first predicted based on that image, and the initial report is then selected from the candidate reports according to the predicted disease type, that is, the target disease label. The initial report determined by the method in this specification is therefore highly similar to the medical report that the doctor will write, and the doctor only needs to make a few modifications to complete the current medical report.

[0155] Figure 6 is a schematic diagram illustrating the principle of initialization report determination provided in the embodiments of this specification. In Figure 6, candidate reports are represented by rectangles, and the disease label of each candidate report is marked inside the rectangle. It can be seen that some candidate reports have one disease label, while others have multiple disease labels. The candidate reports in Figure 6 are determined through clustering, and the spatial distribution of the candidate reports reflects the similarity between the medical images corresponding to them. After clustering, medical images that share at least one disease label are grouped close together in space. The higher the similarity of the disease labels corresponding to different medical images, the closer the spatial distance between these medical images.

[0156] In the embodiment corresponding to Figure 6, the target disease label is "pulmonary nodule," and the disease label of the similar reports is the target disease label. The three shaded rectangles in Figure 6 represent the three identified similar reports. Below each similar report, the similarity between that report and the image to be diagnosed is indicated.

[0157] If the report with the highest similarity is selected as the initialization report, then the report with a similarity of 0.7 will be used as the initialization report. In the three similar reports in Figure 6, the rectangles marked with dark shading represent the similar reports selected as initialization reports, while the two rectangles marked with light shading represent the similar reports not selected as initialization reports.
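As a non-limiting illustration, the selection in step S104 can be sketched as follows: candidate reports carrying the target disease label are kept as similar reports, and the similar report with the highest similarity to the image to be diagnosed becomes the initialization report. The function name `choose_initialization_reports`, the report identifiers and all similarity values other than 0.7 are hypothetical. Treating a candidate report as matching when it shares at least one label with the target disease label is an assumption of this sketch.

```python
def choose_initialization_reports(candidates, target_labels, similarity, top_n=1):
    """Return the ids of the top_n similar reports, most similar first.

    candidates    : list of dicts with keys "id" and "labels"
    target_labels : iterable of target disease labels
    similarity    : dict mapping report id -> similarity to the image to be diagnosed
                    (obtained by image-text or image-image feature matching)
    """
    target = set(target_labels)
    # Assumption: a report is "similar" if it shares at least one target label.
    similar = [c for c in candidates if target & set(c["labels"])]
    ranked = sorted(similar, key=lambda c: similarity[c["id"]], reverse=True)
    return [c["id"] for c in ranked[:top_n]]


# Hypothetical candidate reports and similarity values.
candidates = [
    {"id": "r1", "labels": ["pulmonary nodule"]},
    {"id": "r2", "labels": ["pulmonary nodule", "bone tumor"]},
    {"id": "r3", "labels": ["pulmonary nodule"]},
    {"id": "r4", "labels": ["mediastinal mass"]},
]
similarity = {"r1": 0.7, "r2": 0.5, "r3": 0.4, "r4": 0.9}
print(choose_initialization_reports(candidates, ["pulmonary nodule"], similarity))
```

Here "r4" is excluded despite its high similarity value because it does not carry the target disease label; raising `top_n` returns several initialization reports for the doctor to choose from.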

[0158] In step S104 above, the server can also perform personalized filtering on each candidate report based on the personalized information entered by the doctor to obtain each preliminary report. Then, among each preliminary report, the server determines the initial report based on the preliminary report with the disease label as the target disease label.

[0159] The personalized information here can be set according to actual needs. It can be a patient's basic information, such as the patient's gender and age, or the doctor's department, or the examination site, etc.

[0160] Through this embodiment, based on the personalized information input by the doctor, a portion of candidate reports with low matching degree with the medical report that the doctor needs to write for the image to be diagnosed can be filtered out to obtain a preliminary report. Within the scope of the preliminary report, the disease type is matched according to the predicted target disease label, and the preliminary report that matches the disease type shown by the image to be diagnosed is selected as the final initial report.

[0161] Even for the same type of disease, the morphology, size, and onset characteristics of lesions generally differ among patients of different genders or ages, resulting in variations in the descriptions of lesions in medical reports. This embodiment first matches historical reports with consistent disease tags and personalized information, achieving a very high degree of match with the medical reports that doctors need to write based on the images to be diagnosed. Therefore, doctors need to make very few modifications to the initial report; minimal changes are required to quickly obtain the medical report written for the current diagnostic process.

[0162] For example, in one application scenario of this embodiment, a doctor is reviewing a patient's medical images. The patient is a 72-year-old female. Based on this personalized information, the server selects, from among the candidate reports, the candidate reports for female patients aged 70-80 as the preliminary reports.

[0163] Then, based on the target disease label of the image to be diagnosed, the server selects the initial report from the preliminary reports that match the target disease label. This initial report is a historical report that matches the patient's age, gender, and disease type in the current diagnostic process.
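As a non-limiting illustration, the personalized filtering of [0158] to [0163] can be sketched as follows. The field names `age` and `gender`, the ten-year age band, and the treatment of the band as including its lower bound and excluding its upper bound are assumptions of this sketch.

```python
def age_band(age, width=10):
    """Return the (low, high) age band containing `age`, e.g. 72 -> (70, 80)."""
    low = (age // width) * width
    return low, low + width


def personalized_filter(candidates, patient_age, patient_gender):
    """Keep candidate reports whose patient matches the gender and age band.

    candidates : list of dicts with keys "id", "age", "gender"
    """
    low, high = age_band(patient_age)
    return [
        c
        for c in candidates
        if c["gender"] == patient_gender and low <= c["age"] < high
    ]


# Hypothetical candidate reports.
candidates = [
    {"id": "r1", "age": 74, "gender": "female"},
    {"id": "r2", "age": 71, "gender": "male"},
    {"id": "r3", "age": 45, "gender": "female"},
    {"id": "r4", "age": 79, "gender": "female"},
]
preliminary = personalized_filter(candidates, patient_age=72, patient_gender="female")
print([c["id"] for c in preliminary])
```

The preliminary reports obtained in this way are then matched against the target disease label to determine the initial report.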

[0164] In one or more embodiments of this specification, in order to enable doctors to more quickly identify the content that needs to be modified in the initialization report, the server may perform obfuscation processing on the initialization report.

[0165] Specifically, in the initialization report determined in step S104 above, the server identifies information related to the specific condition of the patient, such as information indicating the location and shape of the lesion.

[0166] Then, the identified information related to the specific details of the patient's condition is removed from the initialization report, resulting in a blurred initialization report.

[0167] This specification does not restrict the specific method of removal. For example, the server may replace the identified information related to the patient's specific condition with preset characters, or mark the position of the identified information as a field to be filled in.

[0168] In this embodiment, doctors write medical reports based on the blurred initial report. In this way, doctors can directly fill in the information related to the patient's specific condition after the removal operation in the blurred initial report to obtain a complete medical report.
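As a non-limiting illustration, the replacement with preset characters mentioned in [0167] can be sketched as follows. The sketch assumes that the patient-specific phrases have already been identified by an upstream step, which this specification does not limit. The function name `blur_report`, the placeholder string and the sample sentence are hypothetical.

```python
def blur_report(report_text, patient_specific_phrases, placeholder="____"):
    """Replace each identified patient-specific phrase with a preset placeholder."""
    blurred = report_text
    # Longer phrases first, so a phrase containing a shorter one is replaced whole.
    for phrase in sorted(patient_specific_phrases, key=len, reverse=True):
        blurred = blurred.replace(phrase, placeholder)
    return blurred


# Hypothetical report sentence and identified phrases.
text = "A nodule measuring 6 mm is seen in the right upper lobe."
print(blur_report(text, ["6 mm", "right upper lobe"]))
```

The doctor then fills in each placeholder with the findings for the current patient.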

[0169] In another embodiment, the server may also blur the historical reports contained in the historical diagnostic records, or blur the candidate reports obtained after preliminary screening, or blur similar reports. The specific method of blurring is the same as in the embodiments described above.

[0170] Typically, medical reports need to list the diagnostic conclusions for multiple different diagnostic items in bullet points, with each diagnostic item corresponding to a structure or tissue location to be examined. Taking a chest CT scan as an example, the medical report may include diagnostic conclusions for items such as "lungs," "trachea," "mediastinum," "heart and blood vessels," "chest cavity and chest wall," and "bones."

[0171] In one or more embodiments of this specification, the server can split a medical report into diagnostic conclusions corresponding to different diagnostic items, and determine different item templates for different diagnostic items based on the image to be diagnosed. Finally, multiple item templates are combined to obtain an initial report. This achieves more granular matching of the initial report and determines an initial report that is more suitable for the image to be diagnosed.

[0172] In step S102 above, the server determines the corresponding disease labels for each diagnostic item from the disease labels of each candidate report.

[0173] A disease tag corresponding to a diagnostic item represents the various disease types that may be diagnosed from that diagnostic item. In this embodiment, the correspondence between each diagnostic item and the disease tag can be pre-summarized manually by combining historically reported disease tags with the diagnostic items that diagnosed the corresponding disease types.

[0174] For example, the disease labels corresponding to the diagnostic item "Lungs" can include "Lung Shadows," "Lung Nodules," "Pneumonia," "Pulmonary Tuberculosis," "Pulmonary Edema," and "Pulmonary Bullae." The disease labels corresponding to the diagnostic item "Trachea" can include "Bronchiectasis," "Tracheal Diverticulum," and "Bronchitis." The disease labels corresponding to the diagnostic item "Mediastinum" can include "Mediastinal Mass" and "Mediastinal Lymph Node Enlargement." The disease labels corresponding to the diagnostic item "Heart and Blood Vessels" can include "Pericardial Effusion," "Cardiac Hypertrophy," "Aortic Hypertension," and "Pulmonary Artery Enlargement." The disease labels corresponding to the diagnostic item "Pleural Cavity and Chest Wall" can include "Pleural Effusion," "Pneumothorax," and "Pleural Thickening." The disease labels corresponding to the diagnostic item "Bone" can include "Fracture," "Bone Tumor," and "Bone Destruction."

[0175] Then, the server determines the disease label of the image to be diagnosed in the diagnostic item based on the disease labels corresponding to the diagnostic item.

[0176] Based on the example above, if the diagnostic item is "lungs", the server will only predict the disease label of the image to be diagnosed in that diagnostic item from the disease labels corresponding to "lungs", namely "lung shadows", "lung nodules", "pneumonia", "pulmonary tuberculosis", "pulmonary edema" and "pulmonary bullae".

[0177] The server determines the target disease label for the image to be diagnosed based on the disease labels of each diagnostic item. This target disease label is used in step S104 for disease label matching to determine the initial report. Because this embodiment performs more granular prediction of the target disease label for each diagnostic item at the diagnostic item level, it can determine a more accurate target disease label.

[0178] In step S104 above, for each diagnostic item, the server determines the item template for that diagnostic item based on the candidate reports whose disease label in that diagnostic item is the target disease label. Finally, an initialization report is generated based on the item templates of the diagnostic items.

[0179] Specifically, the server uses the candidate reports whose disease label in the diagnostic item is the same as the target disease label of the image to be diagnosed in that diagnostic item as the item similar reports for the diagnostic item.

[0180] The server uses the diagnostic conclusions corresponding to the diagnostic item in the item similar reports as candidate conclusions for that diagnostic item. Then, based on the similarity between the image features of the image to be diagnosed and the text features of the candidate conclusions of the diagnostic item, the server determines the item template for that diagnostic item from among the candidate conclusions.

[0181] The server combines the item templates of the diagnostic items to generate an initialization report.

[0182] The specific process of determining the disease label for the diagnostic item and matching the item template is similar to the process of determining the target disease label and initializing the report within the range of disease labels corresponding to the entire candidate report. Please refer to the corresponding descriptions of steps S102 to S104 above.

[0183] The initialization report in this embodiment is obtained by recombining item templates from different diagnostic items. The item templates included in the initialization report obtained using this embodiment may originate from multiple candidate reports. At the level of diagnostic items, the candidate conclusions that match the image to be diagnosed are selected for each diagnostic item from the candidate reports, thereby obtaining an initialization report that is better adapted to the disease type shown by the image to be diagnosed, reducing the amount of modification required by doctors when writing medical reports, and further improving the writing speed.
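As a non-limiting illustration, the assembly of an initialization report from per-item templates can be sketched as follows. The sketch assumes that, for each diagnostic item, the candidate conclusions and their similarity to the image to be diagnosed have already been obtained. The function name `assemble_report`, the "item: conclusion" line layout, the sample conclusions and the similarity values are hypothetical.

```python
def assemble_report(item_candidates, item_order):
    """Build a report by choosing the most similar conclusion for each diagnostic item.

    item_candidates : dict mapping diagnostic item -> list of (conclusion_text, similarity)
    item_order      : order in which diagnostic items appear in the report
    """
    lines = []
    for item in item_order:
        options = item_candidates.get(item, [])
        if not options:
            continue  # no candidate conclusion available for this item
        # The candidate conclusion with the highest similarity is the item template.
        template_text, _ = max(options, key=lambda option: option[1])
        lines.append(f"{item}: {template_text}")
    return "\n".join(lines)


# Hypothetical candidate conclusions, possibly taken from different candidate reports.
item_candidates = {
    "Lungs": [("Nodule in the upper lobe.", 0.7), ("Patchy shadow.", 0.4)],
    "Bone": [("No abnormality.", 0.6)],
}
print(assemble_report(item_candidates, ["Lungs", "Trachea", "Bone"]))
```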

[0184] The above is the method for determining the initialization report provided in this specification. Based on the same idea, this specification also provides a corresponding device for determining the initialization report, as shown in Figure 7.

[0185] Figure 7 is a schematic diagram of an initialization report determination device provided in this specification, specifically including:

[0186] The acquisition module 200 is used to acquire the image to be diagnosed and each candidate report;

[0187] The prediction module 202 is used to determine the disease label of the image to be diagnosed from the disease labels of the candidate reports based on the image to be diagnosed and the candidate reports, and use it as the target disease label;

[0188] The initialization report determination module 204 is used to determine an initialization report from the candidate reports based on the target disease label.

[0189] Optionally, the device further includes an initial screening module 206;

[0190] The initial screening module 206 is used to determine the medical images corresponding to each historical report, cluster the medical images based on the similarity between them to obtain the target clusters, and determine the candidate reports based on the cluster center of each target cluster.

[0191] Optionally, the prediction module 202 is specifically used to determine the medical image corresponding to each candidate report and the disease label of each candidate report, determine at least one related image of the image to be diagnosed based on the similarity between the medical image corresponding to each candidate report and the image to be diagnosed, and determine the target disease label based on the disease label corresponding to the at least one related image.

[0192] Optionally, there are multiple related images. The prediction module 202 is specifically used to obtain various disease labels corresponding to each related image, determine the number of related images corresponding to each disease label as the statistical frequency of the disease label, determine the total number of related images as the total number, determine the ratio of the statistical frequency of the disease label to the total number to obtain the correlation probability of the disease label, and determine the target disease label corresponding to the image to be diagnosed based on the correlation probability of each disease label.
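
The ratio described above can be written as p(label) = (number of related images carrying the label) / (total number of related images). The following sketch computes it; the final selection rule (keeping labels whose probability reaches a `threshold` of 0.5) is an assumption, since this passage only states that the target disease label is determined based on the correlation probabilities.

```python
from collections import Counter


def correlation_probabilities(related_labels):
    """related_labels: one collection of disease labels per related image."""
    total = len(related_labels)
    # Count each label at most once per related image.
    counts = Counter(label
                     for labels in related_labels
                     for label in set(labels))
    return {label: n / total for label, n in counts.items()}


def target_labels(related_labels, threshold=0.5):
    """Assumed rule: keep labels whose probability reaches the threshold."""
    probs = correlation_probabilities(related_labels)
    return sorted(label for label, p in probs.items() if p >= threshold)


# Made-up labels for four related images.
related = [["pneumonia", "effusion"], ["pneumonia"],
           ["nodule"], ["pneumonia", "nodule"]]
probs = correlation_probabilities(related)
chosen = target_labels(related)
```

With four related images, "pneumonia" appears in three of them (0.75), "nodule" in two (0.5), and "effusion" in one (0.25), so the assumed threshold keeps "nodule" and "pneumonia".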

[0193] Optionally, the initialization report determination module 204 is specifically used to take at least one candidate report corresponding to the target disease label as a similar report, and to determine the initialization report based on the similar report.

[0194] Optionally, the initialization report determination module 204 is specifically used to determine the similarity between the image features of the image to be diagnosed and the text features of each similar report, and to determine the initialization report from among the similar reports based on each similarity.
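
As an illustration of this selection, the sketch below assumes that the image features and the text features have already been projected into a common vector space by some encoder (not shown, and not specified here), and that the similar report with the highest cosine similarity is chosen. The function name `pick_initialization_report` and the sample vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_initialization_report(image_feature, similar_reports):
    """similar_reports: non-empty list of (report_text, text_feature).

    Returns the text of the best-matching report and its similarity.
    """
    scored = [(cosine(image_feature, feature), text)
              for text, feature in similar_reports]
    best_score, best_text = max(scored, key=lambda s: s[0])
    return best_text, best_score


reports = [("Report A", [1.0, 0.0]), ("Report B", [0.1, 0.9])]
text, score = pick_initialization_report([0.2, 0.8], reports)
```

Other selection rules, such as keeping several top-ranked reports for the doctor to choose from, would fit the same description equally well.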

[0195] Optionally, the candidate report contains diagnostic conclusions for multiple diagnostic items. The prediction module 202 is specifically used to determine, for each diagnostic item, the corresponding disease label for that diagnostic item based on the disease labels of each candidate report, determine the disease label of the image to be diagnosed in that diagnostic item based on the corresponding disease labels of that diagnostic item, and determine the target disease label based on the disease labels of the image to be diagnosed in each diagnostic item.
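
One possible reading of the per-item determination is sketched below: for each diagnostic item, the labels that the related candidate reports carry for that item are collected, and the most frequent one is taken as the label of the image to be diagnosed for that item. The majority-vote rule and the name `target_label_by_item` are assumptions; the passage above does not prescribe how the per-item label is derived.

```python
from collections import Counter


def target_label_by_item(related_item_labels):
    """related_item_labels: list of {diagnostic_item: label},
    one dictionary per related candidate report."""
    votes = {}
    for labels in related_item_labels:
        for item, label in labels.items():
            votes.setdefault(item, Counter())[label] += 1
    # Assumed rule: the most frequent label wins for each item.
    return {item: counter.most_common(1)[0][0]
            for item, counter in votes.items()}


# Made-up per-item labels from three related candidate reports.
related = [
    {"lung": "nodule", "pleura": "normal"},
    {"lung": "nodule", "pleura": "effusion"},
    {"lung": "normal", "pleura": "normal"},
]
per_item = target_label_by_item(related)
```

The resulting dictionary plays the role of the target disease label composed from the per-item labels.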

[0196] This specification also provides a computer-readable storage medium storing a computer program that can be used to execute the method for determining the initialization report provided in Figure 1 above.

[0197] This specification also provides a schematic structural diagram of the electronic device shown in Figure 8. As shown in Figure 8, at the hardware level, the electronic device includes a processor, an internal bus, a network interface, memory, and non-volatile memory, and may also include other hardware required for business operations. The processor reads the corresponding computer program from the non-volatile memory into memory and then runs it to implement the method for determining the initialization report described in Figure 1. Of course, in addition to the software implementation, this specification does not exclude other implementation methods, such as logic devices or a combination of hardware and software, etc. That is to say, the execution subject of the following processing flow is not limited to individual logic units, but can also be hardware or logic devices.

[0198] Improvements in a technology can be clearly distinguished as either hardware improvements (e.g., improvements to the circuit structure of diodes, transistors, switches, etc.) or software improvements (improvements to the methodology). However, with technological advancements, many improvements to the methodology can now be considered direct improvements to the hardware circuit structure. Designers almost always obtain the corresponding hardware circuit structure by programming the improved methodology into the hardware circuit. Therefore, it cannot be said that an improvement in methodology cannot be implemented using hardware physical modules. For example, a Programmable Logic Device (PLD) (such as a Field Programmable Gate Array (FPGA)) is such an integrated circuit whose logic function is determined by the user programming the device. Designers can program and "integrate" a digital system onto a PLD themselves, without needing chip manufacturers to design and manufacture dedicated integrated circuit chips. Furthermore, nowadays, instead of manually manufacturing integrated circuit chips, this programming is mostly implemented using "logic compiler" software. Similar to the software compiler used in program development, the original code before compilation must also be written in a specific programming language, called a Hardware Description Language (HDL). There are many HDLs, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language). Currently, the most commonly used are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art should also understand that by simply performing some logic programming on the method flow using one of these hardware description languages and programming it into an integrated circuit, the hardware circuit implementing the logical method flow can be easily obtained.

[0199] The controller can be implemented in any suitable manner. For example, it can take the form of a microprocessor or processor and a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers, and embedded microcontrollers. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of the memory. Those skilled in the art will also recognize that, in addition to implementing the controller in purely computer-readable program code form, the same functionality can be achieved by logically programming the method steps to make the controller take the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, and embedded microcontrollers. Therefore, such a controller can be considered a hardware component, and the means included therein for implementing various functions can also be considered as structures within the hardware component. Alternatively, the means for implementing various functions can be considered as both software modules implementing the method and structures within the hardware component.

[0200] The systems, devices, modules, or units described in the above embodiments can be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer. Specifically, a computer can be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or any combination of these devices.

[0201] For ease of description, the above devices are described as being divided into various units by function. Of course, in implementing this specification, the functions of the units can be implemented in one or more pieces of software and/or hardware.

[0202] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0203] This application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flowchart illustrations and/or one or more block diagrams.

[0204] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flowcharts and/or one or more block diagrams.

[0205] These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions, which execute on the computer or other programmable apparatus, provide steps for implementing the functions specified in one or more flowcharts and/or one or more block diagrams.

[0206] In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

[0207] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.

[0208] Computer-readable media include both permanent and non-permanent, removable and non-removable media that can store information using any method or technology. Information can be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.

[0209] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0210] Those skilled in the art will understand that the embodiments of this specification can be provided as methods, systems, or computer program products. Therefore, this specification may take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this specification may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0211] This specification can be described in the general context of computer-executable instructions that are executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform a specific task or implement a specific abstract data type. This specification can also be practiced in distributed computing environments, where tasks are performed by remote processing devices connected via a communication network. In distributed computing environments, program modules can reside in local and remote computer storage media, including storage devices.

[0212] The various embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are basically similar to the method embodiments, so their description is relatively brief; for relevant parts, refer to the description of the method embodiments.

[0213] The above description is merely an embodiment of this specification and is not intended to limit this specification. Various modifications and variations can be made to this specification by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this specification should be included within the scope of the claims of this application.

Claims

1. A method for determining an initialization report, characterized in that the method includes: acquiring an image to be diagnosed and candidate reports; based on the image to be diagnosed and the candidate reports, determining the disease label of the image to be diagnosed from the disease labels of the candidate reports, as a target disease label; and determining an initialization report from the candidate reports based on the target disease label.

2. The method as described in claim 1, characterized in that each candidate report is determined as follows: identifying the medical image corresponding to each historical report; clustering the medical images based on the similarity between the medical images to obtain target clusters; and determining each candidate report based on the cluster center of each target cluster.

3. The method as described in claim 1, characterized in that determining, based on the image to be diagnosed and the candidate reports, the disease label of the image to be diagnosed from the disease labels of the candidate reports as the target disease label specifically includes: determining the medical image corresponding to each candidate report and the disease label of each candidate report; determining at least one related image of the image to be diagnosed based on the similarity between the medical image corresponding to each candidate report and the image to be diagnosed; and determining the target disease label based on the disease label corresponding to the at least one related image.

4. The method as described in claim 3, characterized in that determining the target disease label based on the disease label corresponding to the at least one related image specifically includes: obtaining the disease labels corresponding to each related image; for each disease label, determining the number of related images corresponding to that disease label as the statistical frequency of that disease label; determining the number of related images as the total number; determining the ratio of the statistical frequency of the disease label to the total number to obtain the correlation probability of the disease label; and determining the target disease label corresponding to the image to be diagnosed based on the correlation probability of each disease label.

5. The method as described in claim 1, characterized in that determining an initialization report from the candidate reports based on the target disease label specifically includes: taking at least one candidate report corresponding to the target disease label as a similar report; and determining the initialization report based on the similar report.

6. The method as described in claim 5, characterized in that determining the initialization report based on the similar report specifically includes: determining the similarity between the image features of the image to be diagnosed and the text features of each similar report; and determining the initialization report from among the similar reports based on each similarity.

7. The method as described in claim 5 or 6, characterized in that taking at least one candidate report corresponding to the target disease label as a similar report includes: determining whether the positive or negative status at each position of the disease label of the candidate report is completely consistent with the target disease label; and when all code values in the disease label of a candidate report are the same as the code values of the target disease label, determining the candidate report to be a similar report; wherein the disease label consists of multiple disease codes in a preset order, and each code value represents the disease status of the corresponding disease.

8. The method as described in claim 7, characterized in that the disease label consists of multiple disease codes in a preset order, and the code value representing the disease status of the corresponding disease includes: a first status value indicating that the disease is positive, a second status value indicating that the disease is negative, and a third status value indicating that the disease is uncertain; the third status value represents a disease status in which imaging features suggest a lesion but a definitive diagnosis cannot be made, and is represented by a preset intermediate value.

9. The method as described in claim 1, characterized in that the candidate report contains diagnostic conclusions for multiple diagnostic items, and determining, based on the image to be diagnosed and the candidate reports, the disease label of the image to be diagnosed from the disease labels of the candidate reports as the target disease label specifically includes: for each diagnostic item, determining the disease labels corresponding to the diagnostic item based on the disease labels of each candidate report; determining the disease label of the image to be diagnosed in the diagnostic item based on the disease labels corresponding to the diagnostic item; and determining the target disease label based on the disease labels of the image to be diagnosed in each diagnostic item.

10. An initialization report determining device, characterized in that the device includes: an acquisition module, used to acquire an image to be diagnosed and candidate reports; a prediction module, used to determine, based on the image to be diagnosed and the candidate reports, the disease label of the image to be diagnosed from the disease labels of the candidate reports, as a target disease label; and an initialization report determination module, used to determine an initialization report from the candidate reports based on the target disease label.

11. A storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method described in any one of claims 1 to 9.

12. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the method described in any one of claims 1 to 9.
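
For illustration only, and without limiting the claims, the exact-match screening of claims 7 and 8 can be sketched as follows. The concrete status values (1.0 for positive, 0.0 for negative, 0.5 as the preset intermediate value for uncertain), the disease order, and the function names are assumptions chosen for this example.

```python
# Assumed encodings; the claims only require three distinguishable values,
# with the uncertain status represented by a preset intermediate value.
POSITIVE, NEGATIVE, UNCERTAIN = 1.0, 0.0, 0.5


def is_similar(candidate_label, target_label):
    """True when every code value matches position by position."""
    return (len(candidate_label) == len(target_label)
            and all(c == t for c, t in zip(candidate_label, target_label)))


def similar_reports(target_label, candidates):
    """candidates: list of (report_id, disease_label)."""
    return [report_id for report_id, label in candidates
            if is_similar(label, target_label)]


# Hypothetical preset disease order: (pneumonia, nodule, effusion).
target = (POSITIVE, UNCERTAIN, NEGATIVE)
candidates = [
    ("r1", (POSITIVE, UNCERTAIN, NEGATIVE)),
    ("r2", (POSITIVE, POSITIVE, NEGATIVE)),
    ("r3", (POSITIVE, UNCERTAIN, NEGATIVE)),
]
matches = similar_reports(target, candidates)
```

Here only the first and third candidate reports match the target label at every position; the second differs in the nodule code and is therefore not treated as a similar report.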