Information processing device, information processing method, and information processing program

The information processing device addresses the challenge of multiple term candidates in medical text normalization by using classification information and field-specific rules to derive accurate representative notations for structured information, enhancing the analysis of medical reports.

JP2026104637A · Pending · Publication Date: 2026-06-25 · FUJIFILM CORP

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
FUJIFILM CORP
Filing Date
2024-12-13
Publication Date
2026-06-25

AI Technical Summary

Technical Problem

Existing methods for normalizing terminology in medical texts, such as radiographic reports, face challenges when multiple terms are found as candidates during normalization, making accurate normalization difficult.

Method used

An information processing device that derives structured information by structuring text, obtains classification information, and normalizes it using a thesaurus or field-specific rules, ensuring accurate representation of terms.

Benefits of technology

Enables high-accuracy normalization of structured information by deriving a single representative notation for terms, even when multiple notations are possible, thereby improving search and analysis of medical reports.

Abstract

The present invention provides an information processing device, method, and program that enable accurate normalization of terms extracted by structuring text. [Solution] A processor acquires text, derives structured information by structuring the text, acquires classification information regarding terms contained in the text, and normalizes the structured information based on the classification information.

Description

Technical Field

[0001] This disclosure relates to an information processing apparatus, an information processing method, and an information processing program.

Background Art

[0002] Texts created in the medical field are written freely by medical workers such as doctors and, as written, are unstructured data that is difficult to reuse for statistics or content analysis. In a radiographic report, which is one type of medical document, a doctor observes images captured by medical devices such as CT (Computed Tomography) devices or MRI (Magnetic Resonance Imaging) devices and records, as findings in the findings section, what was determined about each disease, such as its location, size, shape, and internal structure. To make use of the information described in radiographic reports, various methods have been proposed for structuring the text by extracting terms of classes (attributes) such as anatomy, lesion, lesion size, characteristics, and disease name from the findings section of the report.

[0003] For example, Patent Document 1 proposes a method of classifying the attributes of the information described in the text in a certain unit of the text, analyzing the text for each same classification based on the classification result, and outputting the analysis result.

[0004] On the other hand, the way text is expressed in radiographic reports varies among medical workers. Therefore, normalization is performed: the terms extracted by structuring are looked up in a synonym dictionary prepared in advance, and each term is converted into the representative notation (or dictionary code) of the matching synonym. Normalization absorbs variations in notation and unifies the notation of terms, which makes radiographic reports easier to search and analyze.

Prior Art Documents

Patent Documents

[0005]

Patent Document 1

[0006] However, depending on the terminology used in the text, multiple terms may be found as candidates during normalization. In such cases, normalization of the terminology is not possible.

[0007] This disclosure is made in light of the circumstances described above, and aims to enable accurate normalization of structured information, such as terminology derived by structuring text.

Means for Solving the Problem

[0008] The information processing device according to this disclosure comprises a processor. The processor acquires text, derives structured information by structuring the text, acquires classification information regarding terms contained in the text, and normalizes the structured information based on the classification information.

[0009] In the information processing device according to this disclosure, the classification information may include at least one of synonyms of the structured information and information relating to the field related to the text.

[0010] In the information processing device according to this disclosure, the text may be medical text, and the field related to the text may include at least one of an organ, a body part, the modality that acquired the medical image used to create the text, and the medical department that uses the medical image.

[0011] In the information processing device according to this disclosure, the processor may normalize structured information by referring to a thesaurus classified according to classification information.

[0012] In the information processing device described herein, if the processor cannot normalize the structured information by referring to a thesaurus, it may supplement the structured information using rules corresponding to the field related to the document, and then normalize the supplemented structured information.

[0013] In the information processing device described herein, the processor may output the basis for normalizing structured information.

[0014] In the information processing device described herein, the processor may display normalized structured information.

[0015] In the information processing device according to this disclosure, if the processor can normalize the structured information without using classification information, it may normalize the structured information without using classification information and display the structured information normalized without using classification information separately from the structured information normalized with classification information.

[0016] In the information processing method according to this disclosure, a computer acquires text, derives structured information by structuring the text, acquires classification information regarding terms contained in the text, and normalizes the structured information based on the classification information.

[0017] The information processing program according to this disclosure causes a computer to execute a procedure for acquiring text, a procedure for deriving structured information by structuring the text, a procedure for acquiring classification information regarding terms contained in the text, and a procedure for normalizing the structured information based on the classification information.

[0018] Furthermore, the technology disclosed herein may be provided as a program product.

Effects of the Invention

[0019] According to the present disclosure, normalization of structured information derived by structuring text can be performed with high accuracy.

Brief Description of the Drawings

[0020]
[Figure 1] Diagram showing the schematic configuration of an information processing system to which the information processing apparatus according to this embodiment is applied
[Figure 2] Block diagram showing the hardware configuration of the information processing apparatus according to this embodiment
[Figure 3] Block diagram showing the functional configuration of the information processing apparatus according to this embodiment
[Figure 4] Diagram for explaining structuring
[Figure 5] Diagram showing an example of a synonym dictionary
[Figure 6] Diagram showing the display screen of the derivation result of representative notations in the first embodiment
[Figure 7] Flowchart showing the processes performed in the first embodiment
[Figure 8] Diagram showing an example of a template used in the second embodiment
[Figure 9] Diagram showing the display screen of the derivation result of representative notations in the second embodiment
[Figure 10] Flowchart showing the processes performed in the second embodiment
[Figure 11] Diagram showing another example of the display screen of the derivation result of representative notations

Modes for Carrying Out the Invention

[0021] Hereinafter, an example of an embodiment of the disclosed technology will be described while referring to the drawings. In each drawing, the same or equivalent components and parts are given the same reference numerals, and duplicate explanations are omitted. Also, the dimensional ratios in the drawings are exaggerated for convenience of explanation and may differ from the actual ratios.

[0022] First, with reference to Figure 1, an information processing system 1 to which the information processing device 20 according to the first embodiment of this disclosure is applied will be described. Figure 1 is a diagram showing the schematic configuration of the information processing system 1. The information processing system 1 is a system for analyzing image interpretation reports created by interpreting medical images acquired at medical institutions, etc.

[0023] Specifically, the information processing system 1 includes a medical institution system 10 already established in a medical institution such as a hospital, and an information processing device 20. The medical institution system 10 and the information processing device 20 are connected to each other via a wired or wireless network 9, enabling them to communicate with one another. The network 9 is, for example, a LAN (Local Area Network) or a WAN (Wide Area Network).

[0024] The medical institution system 10, based on examination orders from physicians in clinical departments using a known ordering system, performs imaging of the subject's area to be examined and stores the captured medical images. It also enables radiologists to interpret the medical images and create interpretation reports, as well as allowing physicians in the requesting clinical departments to view the interpretation reports.

[0025] The medical institution system 10 includes an imaging device 11, an image interpretation terminal (image interpretation WS (WorkStation)) 12, a medical consultation WS 13, an image server 14, and a report server 16. The medical institution system 10 may also include known information systems such as a RIS (Radiology Information System) 18 and a HIS (Hospital Information System) 19. The imaging device 11, image interpretation WS 12, medical consultation WS 13, image server 14, report server 16, RIS 18, and HIS 19 are connected to each other via a wired or wireless network 9, enabling them to communicate with one another.

[0026] Each device is a computer on which an application program is installed to function as a component of the medical institution system 10. The application program may be recorded and distributed on a recording medium such as a DVD-ROM (Digital Versatile Disc Read Only Memory) or a CD-ROM (Compact Disc Read Only Memory), and installed on the computer from that recording medium. Alternatively, it may be stored in a storage device or network storage of a server computer connected to the network 9 in an externally accessible state, and downloaded and installed on the computer upon request.

[0027] The imaging device 11 is a modality that generates medical images of the area to be diagnosed by imaging that area of the subject. Examples of imaging devices 11 include plain X-ray imaging devices, CT (Computed Tomography) devices, MRI (Magnetic Resonance Imaging) devices, PET (Positron Emission Tomography) devices, ultrasound diagnostic devices, endoscopes, and fundus cameras. The medical images generated by the imaging device 11 are transmitted to the image server 14.

[0028] The image interpretation WS 12 is a computer used by medical professionals, such as radiologists in the radiology department, for interpreting medical images and creating image interpretation reports. It consists of a processing unit, a display device such as a display, and input devices such as a keyboard and mouse. The image interpretation WS 12 requests medical images from the image server 14 for viewing, performs various image processing on the medical images received from the image server 14, displays medical images, and accepts text input related to medical images. The image interpretation WS 12 also performs analysis processing on medical images, assists in creating image interpretation reports based on the analysis results, requests the report server 16 to register image interpretation reports and to provide them for viewing, and displays image interpretation reports received from the report server 16. These processes are carried out by the image interpretation WS 12 executing a software program for each process.

[0029] The medical consultation WS 13 is a computer used by medical professionals, such as physicians in a clinical department, for detailed observation of medical images, viewing of image interpretation reports, and creation of electronic medical records. It consists of a processing unit, a display device such as a display, and input devices such as a keyboard and mouse. The medical consultation WS 13 performs tasks such as requesting the image server 14 to provide medical images for viewing, displaying medical images received from the image server 14, requesting the report server 16 to provide image interpretation reports for viewing, and displaying image interpretation reports received from the report server 16. These processes are performed by the medical consultation WS 13 executing a software program for each process.

[0030] The image server 14 is a general-purpose computer with a software program installed that provides the functionality of a database management system (DBMS). The image server 14 is connected to the image database 15. When the image server 14 receives a request to register a medical image from the imaging device 11, it formats the medical image into a database format and registers it in the image database 15. Also, when the image server 14 receives a viewing request from the image interpretation WS 12 or the medical consultation WS 13, it searches the image database 15 for the requested medical image and sends the retrieved medical image to the image interpretation WS 12 or the medical consultation WS 13 that made the viewing request.

[0031] The image database 15 is implemented using storage media such as an HDD (Hard Disk Drive), SSD (Solid State Drive), and flash memory. Medical images acquired by the imaging device 11 and metadata related to those medical images are registered in the image database 15. The connection method between the image server 14 and the image database 15 is not particularly limited; it may be connected via a data bus, or via a network such as a NAS (Network Attached Storage) or SAN (Storage Area Network).

[0032] Metadata may include identification information such as an image ID (identification) for identifying medical images, a tomographic ID assigned to each tomographic image contained in the medical image, a subject ID for identifying the subject, and an examination ID for identifying the examination. Metadata may also include information about the medical department that requested the acquisition of the medical images. Furthermore, metadata may include information related to the acquisition of medical images, such as the modality, acquisition method, acquisition conditions, acquisition purpose, and acquisition date and time. "Acquisition method" and "acquisition conditions" include, for example, the type of imaging device 11, manufacturer, acquisition site, acquisition protocol, acquisition sequence, imaging technique, use of contrast agent, and slice thickness in tomography. Metadata may also include information about the subject, such as the subject's name, date of birth, age, and sex. Metadata may be obtained, for example, from the RIS 18 and the HIS 19.

[0033] The report server 16 is a general-purpose computer with a software program installed that provides the functionality of a database management system. The report server 16 is connected to the report database 17. When the report server 16 receives a request to register an image interpretation report from the image interpretation WS 12, it formats the image interpretation report into a database format and registers it in the report database 17. Also, when the report server 16 receives a request to view an image interpretation report from the image interpretation WS 12 or the medical consultation WS 13, it searches the report database 17 for the requested report and sends the retrieved image interpretation report to the image interpretation WS 12 or the medical consultation WS 13 that made the viewing request.

[0034] The report database 17 is implemented using storage media such as HDDs, SSDs, and flash memory. The report database 17 stores the image interpretation reports created on the image interpretation WS 12. The connection method between the report server 16 and the report database 17 is not particularly limited; it may be connected via a data bus, or via a network such as a NAS or SAN.

[0035] Each device included in the medical institution system 10 may be located in the same facility (e.g., a hospital) or in different facilities. Furthermore, the number of devices included in the medical institution system 10 is not particularly limited, and each device may consist of multiple devices having similar functions.

[0036] The information processing device 20 analyzes the image interpretation reports obtained from the medical institution system 10. The information processing device 20 will be described below.

[0037] First, an example of the hardware configuration of the information processing device 20 will be described with reference to Figure 2. The information processing device 20 includes a CPU (Central Processing Unit) 21, a non-volatile storage unit 22, and a memory 23 as a temporary storage area. The information processing device 20 also includes a display 24 such as a liquid crystal display, an input unit 25 such as a keyboard and mouse, and a network interface 26. The network interface 26 is connected to network 9 and performs wired and / or wireless communication. The CPU 21, storage unit 22, memory 23, display 24, input unit 25, and network interface 26 are connected to each other via a bus 28 such as a system bus and a control bus, enabling the exchange of various types of information.

[0038] The storage unit 22 is implemented by a storage medium such as an HDD, SSD, or flash memory. The information processing program 27 of the information processing device 20 is stored in the storage unit 22. The CPU 21 reads the information processing program 27 from the storage unit 22, loads it into the memory 23, and executes the loaded information processing program 27. The CPU 21 is an example of the processor of this disclosure. As the information processing device 20, for example, a personal computer, server computer, smartphone, tablet terminal, or wearable terminal can be used as appropriate.

[0039] Next, an example of the functional configuration of the information processing device 20 will be described with reference to Figure 3. As shown in Figure 3, the information processing device 20 has an information acquisition unit 30, a structured information derivation unit 31, a classification information acquisition unit 32, a normalization unit 33, and a display control unit 34. When the CPU 21 executes the information processing program 27, the CPU 21 functions as the information acquisition unit 30, the structured information derivation unit 31, the classification information acquisition unit 32, the normalization unit 33, and the display control unit 34.

[0040] The information acquisition unit 30 acquires image interpretation reports from external devices such as the report server 16. An image interpretation report includes findings text that describes the image interpretation results. This findings text is an example of the text of this disclosure.

[0041] The structured information derivation unit 31 analyzes the image interpretation report acquired by the information acquisition unit 30, and structures the findings text by extracting terms of each class, such as anatomy, lesion, quantity, characteristics, and disease name, from the findings text contained in the image interpretation report, and derives the terms of each class as structured information. As a method for extracting terms from the findings text, known named entity recognition methods using natural language processing models such as BERT (Bidirectional Encoder Representations from Transformers) described in Patent Document 1 can be appropriately applied.

[0042] For example, the structured information derivation unit 31 derives, from the findings text, structured information including anatomy such as liver S1 and sacrum S2, lesions such as liver cysts, nodules, and ground-glass opacities, quantities such as 1 cm and 5 mm, characteristics such as irregularity and calcification, and disease names such as fractures and lung cancer.

[0043] Specifically, as shown in Figure 4, if the findings text 36 included in the image interpretation report is "A 6 cm diameter mass was found in liver segment S3, exhibiting early staining and washout, suggesting HCC," the structured information derivation unit 31 derives "liver segment S3" as the anatomy, "mass" as the lesion, "6 cm in diameter" as the quantity, "early staining, washout" as the characteristics, and "HCC (hepatocellular carcinoma)" as the disease name, as structured information 37.

[0044] The classification information acquisition unit 32 acquires classification information relating to terms included in the image interpretation report. The classification information includes at least one of synonyms of the structured information derived by the structured information derivation unit 31 and information relating to the field of the image interpretation report being analyzed. Examples of the latter include the organ containing the lesion described in the image interpretation report, the location within the organ, the imaging site (such as the chest or head), the type of modality (such as a CT device or an MRI device), and the medical department that requested the imaging of the medical image used to create the image interpretation report (such as orthopedics or internal medicine).

[0045] The image interpretation report is associated with the medical image from which it was created, and metadata is associated with that medical image. As mentioned above, the metadata includes information such as the modality that acquired the medical image, the imaging site, and the medical department. The image interpretation report itself may also include such information. The classification information acquisition unit 32 can therefore acquire classification information from the metadata of the medical image from which the image interpretation report was created, or from the corresponding information contained in the image interpretation report. Alternatively, the user may input classification information into the information processing device 20 in advance and save it in the storage unit 22, and the classification information acquisition unit 32 may acquire the saved classification information. As a further alternative, the classification information acquisition unit 32 may acquire classification information using a derivation model trained by machine learning to derive classification information (e.g., an organ name or a body part name) by analyzing the findings text.

[0046] For example, when a derivation model is used to obtain the organ as classification information, the findings text 36 in Figure 4, "A 6 cm diameter mass was found in liver segment S3, exhibiting early staining and washout, suggesting HCC," yields "liver" as classification information.
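The alternative sources of classification information described above can be sketched as follows. This is only an illustration under stated assumptions: the function and key names (acquire_classification, "organ", "body_part", and so on) are hypothetical, the priority order among the sources is an assumption (the text lists them as alternatives without ranking them), and keyword_stub is a toy keyword matcher standing in for the trained derivation model.

```python
from typing import Callable, Optional


def acquire_classification(
    image_metadata: dict,
    report_fields: dict,
    saved_by_user: Optional[dict],
    derive_from_text: Callable[[str], dict],
    findings_text: str,
) -> dict:
    """Return classification info from the first source that provides any.

    The order (image metadata, report fields, user-saved values, model)
    is an assumption made for this sketch.
    """
    keys = ("organ", "body_part", "modality", "department")
    for source in (image_metadata, report_fields, saved_by_user or {}):
        found = {k: source[k] for k in keys if source.get(k)}
        if found:
            return found
    # No stored information: fall back to analyzing the findings text.
    return derive_from_text(findings_text)


def keyword_stub(text: str) -> dict:
    """Toy stand-in for the trained derivation model (keyword match only)."""
    lowered = text.lower()
    for organ, cues in {"liver": ("liver", "hcc")}.items():
        if any(cue in lowered for cue in cues):
            return {"organ": organ}
    return {}


info = acquire_classification(
    {}, {}, None, keyword_stub,
    "A 6 cm diameter mass was found in liver segment S3, "
    "exhibiting early staining and washout, suggesting HCC",
)
print(info)  # {'organ': 'liver'}
```

Passing the derivation step in as a callable keeps the sketch independent of any particular model; a real derivation model would replace keyword_stub.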

[0047] The normalization unit 33 derives a representative notation for the structured information by normalizing the structured information based on the classification information acquired by the classification information acquisition unit 32.

[0048] In this embodiment, a thesaurus (synonym dictionary) is used for normalization. Figure 5 shows an example of the thesaurus. As shown in Figure 5, the thesaurus 39 associates dictionary codes, organs, representative notations, and synonyms with one another. In the thesaurus 39 shown in Figure 5, dictionary code 1100 is associated with "liver" as the organ, "liver S2" as the representative notation, and "liver S2, liver S2, S2" as the synonyms. Dictionary code 1200 is associated with "abdomen", "sacrum S1", and "sacrum S1, S1", and dictionary code 1201 is associated with "abdomen", "sacrum S2", and "sacrum S2, S2", as the organ, representative notation, and synonyms, respectively.

[0049] Suppose the findings text is "A liver cyst was found in S2." The structured information derivation unit 31 derives "S2" and "liver cyst" as structured information. When the thesaurus 39 shown in Figure 5 is consulted for the term "S2," it is not possible to determine whether the representative notation is liver S2 or sacrum S2.

[0050] In this embodiment, the normalization unit 33 derives a representative notation based on the classification information. Here, when the classification information acquisition unit 32 analyzes the finding statement "A liver cyst was found in S2" using the derivation model in order to acquire classification information, the derivation model outputs that the organ related to the finding statement is "liver" based on the description of "liver cyst". For this reason, the classification information acquisition unit 32 acquires "liver" as classification information.

[0051] The normalization unit 33 refers to the thesaurus 39 shown in Figure 5 and, since the organ obtained as classification information is the liver, derives "liver S2" as the representative notation for "S2".
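The disambiguation described above can be illustrated with a small sketch. The rows are transcribed from the description of Figure 5 (with the repeated synonym listed once); the names Entry, candidates, and normalize are hypothetical, and matching a term against synonyms by exact string equality is an assumption made to keep the example short.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    code: int             # dictionary code
    organ: str            # organ column used to classify the entry
    representative: str   # representative notation
    synonyms: tuple       # notations that normalize to this entry


# Rows as described for the thesaurus 39 of Figure 5.
THESAURUS = (
    Entry(1100, "liver", "liver S2", ("liver S2", "S2")),
    Entry(1200, "abdomen", "sacrum S1", ("sacrum S1", "S1")),
    Entry(1201, "abdomen", "sacrum S2", ("sacrum S2", "S2")),
)


def candidates(term: str) -> list:
    """All entries that list the term as a synonym."""
    return [e for e in THESAURUS if term in e.synonyms]


def normalize(term: str, organ: Optional[str] = None) -> Optional[Entry]:
    """Return the single matching entry, or None if none or several remain."""
    found = candidates(term)
    if organ is not None:
        # Narrow the candidates using the classification information.
        found = [e for e in found if e.organ == organ]
    return found[0] if len(found) == 1 else None


print([e.representative for e in candidates("S2")])   # ['liver S2', 'sacrum S2']
print(normalize("S2"))                                # None (ambiguous)
print(normalize("S2", organ="liver").representative)  # liver S2
```

Returning None for an ambiguous term, rather than picking the first candidate, mirrors the point made in the text: without classification information, "S2" has no single representative notation.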

[0052] The display control unit 34 displays the results of the derivation of representative notations of terms included in the image interpretation report on the display 24. Figure 6 is a diagram showing the display screen of the results of the derivation of representative notations in the first embodiment. As shown in Figure 6, the display screen 40 displays the findings text 41 included in the image interpretation report and the derivation results 42. The derivation results 42 include "liver S2" and "liver cyst". The display screen 40 also displays the text 43 "synonym dictionary", which indicates the basis for the normalization used to derive the derivation results 42.

[0053] The derived results are sent to the report server 16 and stored in the report database 17, associated with the image interpretation report from which the derived results were obtained.

[0054] Next, the process performed in the first embodiment will be described. Figure 7 is a flowchart of the process performed in the first embodiment. First, the information acquisition unit 30 acquires the image interpretation report from the medical institution system 10 (step ST1). Next, the structured information derivation unit 31 derives structured information by analyzing and structuring the findings text included in the image interpretation report (step ST2).

[0055] Next, the classification information acquisition unit 32 acquires classification information related to the terms included in the image interpretation report (step ST3). Then, the normalization unit 33 derives a representative notation by normalizing the structured information based on the classification information (step ST4). Finally, the display control unit 34 displays the result of the representative notation derivation on the display 24 (step ST5), and the process ends.

[0056] Thus, in this embodiment, a representative notation for a structured term is derived by normalizing the structured information, which is the term contained in the findings text, based on classification information regarding the term contained in the text. Therefore, even for terms that would yield multiple representative notations if a thesaurus were consulted, a single representative notation can be derived based on the classification information. Consequently, the normalization of terms extracted by structuring the text can be performed with high accuracy.

[0057] Next, a second embodiment of the present disclosure will be described. Note that the configuration of the information processing apparatus in the second embodiment is the same as the configuration of the information processing apparatus in the first embodiment described above, and only the processing performed is different, so a detailed description of the apparatus will be omitted here. The second embodiment differs from the first embodiment in that the normalization unit 33 refers to rules corresponding to the field related to the document when normalizing the structured information.

[0058] Suppose the findings text included in the image interpretation report is "Fractures were found in S1 and 2." The structured information derivation unit 31 structures the findings text to obtain "S1," "2," and "fracture" as structured information. In this case, the classification information acquisition unit 32 obtains "sacrum" as classification information from "fracture" and "S1" derived from the findings text. The normalization unit 33 then refers to the synonym dictionary 39 shown in Figure 5 and derives "sacrum S1", which has "S1" as a synonym, as the representative notation.

[0059] On the other hand, for the structured information "2", even though the classification information is sacrum, the synonym dictionary does not include "2", so the normalization unit 33 cannot normalize "2" and therefore cannot derive a representative notation. In the second embodiment, in such a case the structured information is supplemented by referring to rules corresponding to the field related to the text, and the supplemented structured information is then normalized by referring to the synonym dictionary. In the second embodiment, a template is used as the rule corresponding to the field related to the text.

[0060] Figure 8 shows an example of a template used in the second embodiment. As shown in Figure 8, each entry of the template 50 is associated with a specific body part. Specifically, the thoracic vertebrae are associated with the template "thoracic vertebra Th{}", and the sacrum is associated with the template "sacrum S{}". A number is substituted for "{}". Th is an abbreviation for thoracic vertebra, and S is an abbreviation for sacrum.

[0061] For the structured information "2", the normalization unit 33 identifies the body part as the sacrum by referring to the classification information "sacrum" derived from "S1". Next, the normalization unit 33 inserts "2" into the "{}" of the sacrum template in the template 50, thereby supplementing the structured information "2" to obtain "sacrum S2" as the supplemented structured information. Then, by referring to the thesaurus, it derives "sacrum S2" as the representative notation for the supplemented structured information "sacrum S2".
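The template-based supplementation can be sketched as a simple string fill. This is a minimal illustration, not the disclosed implementation: the TEMPLATES mapping and the supplement function are hypothetical names, the template strings are written to match the Figure 5 notation ("sacrum S2") so that the result can be found in the thesaurus, and accepting only digit strings reflects the statement that numbers are substituted for "{}".

```python
from typing import Optional

# Body part -> template; "{}" is replaced by the number found in the text.
# The exact strings are assumptions chosen to match the thesaurus notation.
TEMPLATES = {
    "thoracic vertebrae": "thoracic vertebra Th{}",
    "sacrum": "sacrum S{}",
}


def supplement(fragment: str, body_part: str) -> Optional[str]:
    """Complete a bare number into a full term using the body part's template."""
    template = TEMPLATES.get(body_part)
    if template is None or not fragment.isdigit():
        return None  # no template for this body part, or nothing to fill in
    return template.format(fragment)


print(supplement("2", "sacrum"))  # sacrum S2
print(supplement("2", "liver"))   # None
```

The body part passed in is the classification information, so the same fragment "2" would be completed differently for a report classified under the thoracic vertebrae.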

[0062] Figure 9 shows the display screen of the derived representative notation in the second embodiment. As shown in Figure 9, the display screen 45 shows the findings 46 and derived results 47 included in the radiology report. The derived results 47 include "sacral vertebra S1" and "sacral vertebra S2". The display screen 45 also shows the text 48 of the "synonym dictionary" and "template" which are the basis for the normalization used to derive the derived results 47.

[0063] Next, the process performed in the second embodiment will be described. Figure 10 is a flowchart showing the process performed in the second embodiment. First, the information acquisition unit 30 acquires the image interpretation report from the medical institution system 10 (step ST11). Next, the structured information derivation unit 31 derives structured information by analyzing and structuring the findings text included in the image interpretation report (step ST12). Next, the classification information acquisition unit 32 acquires classification information regarding the terms included in the image interpretation report (step ST13). Next, the normalization unit 33 determines whether the structured information can be normalized by referring to a synonym dictionary based on the classification information (step ST14).

[0064] If the determination in step ST14 is affirmative, the normalization unit 33 derives a representative notation by normalizing the structured information by referring to the synonym dictionary based on the classification information (step ST15). If the determination in step ST14 is negative, the normalization unit 33 supplements the structured information by referring to the template 50 (step ST16), and proceeds to step ST15 to derive a representative notation by normalizing the supplemented structured information. Then, the display control unit 34 displays the result of the representative notation derivation on the display 24 (step ST17), and the process ends.
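The flow of steps ST12 to ST17 can be sketched in Python as follows. This is an illustrative sketch only: the regular expression used for structuring, the keyword test used to obtain classification information, the sample finding sentence, and all names (`derive_structured_info`, `acquire_classification`, `process_report`) are hypothetical stand-ins for the units 31 to 34, whose internals are not specified here.

```python
import re

# Hypothetical data; the disclosure does not define these contents.
SYNONYMS = {
    "sacrum": {
        "S1": "sacral vertebra S1",
        "sacral vertebra S2": "sacral vertebra S2",
    },
}
TEMPLATES = {"sacrum": "sacral vertebra S{}"}


def derive_structured_info(findings: str) -> list:
    """ST12 (stand-in): extract terms such as "S1" or a bare number like "2"."""
    return re.findall(r"S\d+|\b\d+\b", findings)


def acquire_classification(findings: str):
    """ST13 (stand-in): a simple keyword test for the region."""
    return "sacrum" if "sacr" in findings.lower() else None


def process_report(findings: str) -> list:
    terms = derive_structured_info(findings)            # ST12
    classification = acquire_classification(findings)   # ST13
    dictionary = SYNONYMS.get(classification, {})
    template = TEMPLATES.get(classification)
    results = []
    for term in terms:
        basis = ["synonym dictionary"]
        if term not in dictionary and template:          # ST14 negative
            term = template.format(term)                 # ST16: supplement
            basis.append("template")
        results.append((dictionary.get(term), basis))    # ST15: normalize
    return results                                       # passed to display (ST17)
```

Each result pairs the representative notation with the basis used to derive it, which corresponds to showing "synonym dictionary" and "template" as the basis on the display screen.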

[0065] Thus, in the second embodiment, when the structured information cannot be normalized by referring to the synonym dictionary based on the classification information, the structured information is supplemented by referring to a template. Therefore, by using the supplemented structured information, the terms extracted by structuring the text can be normalized with high accuracy.

[0066] In each of the above embodiments, there are cases in which the structured information can be normalized without using classification information. For example, if the report states "A hepatic cyst was found in liver segment S2," the structured information would be "liver segment S2" and "hepatic cyst." In such a case, the classification information would be the organ "liver," but even without using classification information, "liver segment S2" can be derived as a representative notation by referring to the synonym dictionary. On the other hand, as in the first embodiment above, if the report states "A hepatic cyst was found in segment S2," it is not possible to derive "liver segment S2" as a representative notation without using classification information.
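The difference between the two reports can be illustrated with a short Python sketch that looks a term up across all classifications and accepts the result only when exactly one representative notation is found. The dictionary contents, including the assumption that a bare "S2" could also refer to a sacral vertebra, and the function names are hypothetical.

```python
# Hypothetical synonym dictionary classified by organ or region.
SYNONYMS = {
    "liver": {
        "liver segment S2": "liver segment S2",
        "S2": "liver segment S2",
    },
    "sacrum": {
        "sacral vertebra S2": "sacral vertebra S2",
        "S2": "sacral vertebra S2",
    },
}


def candidates(term: str) -> set:
    """Collect every representative notation for the term, ignoring classification."""
    return {entries[term] for entries in SYNONYMS.values() if term in entries}


def normalize_without_classification(term: str):
    """Return the notation only when it is unique; otherwise None."""
    found = candidates(term)
    return next(iter(found)) if len(found) == 1 else None
```

Under these assumptions, "liver segment S2" resolves to a single notation without classification information, whereas "S2" yields two candidates and therefore requires the classification information "liver" to be resolved.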

[0067] Therefore, the normalization unit 33 may determine whether structured information, such as a term, can be normalized without using classification information and, if so, may normalize the structured information without using classification information. In this case, the display control unit 34 may indicate on the display screen of the derived results that the structured information has been normalized without using classification information. For example, as shown in Figure 11, on the display screen 60 of the derived results, in addition to the finding statement 61, "A hepatic cyst was found in liver S2," and the derived result 62 (liver S2 and hepatic cyst), the text 63, "Classification information not used," may be displayed to indicate that classification information was not used. Displaying the text 63 is an example, in this disclosure, of displaying structured information normalized without using classification information so that it is distinguishable from structured information normalized using classification information.

[0068] In this case, the derived results obtained without using classification information may be displayed in a different text color or highlighted to distinguish them from the derived results obtained using classification information.
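A text-only Python sketch of such a distinguishing display is shown below. The `Derived` record, the `render` function, and the bracketed marker are hypothetical; an actual screen could instead use a different text color or highlighting as described above.

```python
from dataclasses import dataclass


@dataclass
class Derived:
    """One derived result and whether classification information was used."""
    notation: str
    used_classification: bool


def render(finding: str, results: list) -> str:
    """Build a plain-text view that marks results derived without classification."""
    lines = [f"Finding: {finding}"]
    for r in results:
        note = "" if r.used_classification else " [Classification information not used]"
        lines.append(f"- {r.notation}{note}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(
        "A hepatic cyst was found in liver S2",
        [Derived("liver S2", False), Derived("hepatic cyst", False)],
    ))
```

Keeping the flag on each result, rather than on the screen as a whole, allows results derived with and without classification information to be mixed in one view.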

[0069] Furthermore, in each of the above embodiments, the synonym dictionary, or the synonym dictionary and the template, is displayed as the basis for normalizing the structured information, but the disclosure is not limited to this. The display of the basis for normalizing the structured information may be omitted.

[0070] Furthermore, while the above embodiments target findings contained in medical documents such as radiology reports for structuring and normalization, the invention is not limited to this. The technology of this disclosure can also be applied to structuring and normalizing any documents other than medical documents. For example, the technology of this disclosure can be applied to structuring and normalizing documents containing place names by converting the place name descriptions contained in the documents into searchable descriptions (geographic coordinate values).

[0071] In this embodiment, each process may be executed on any computer. The computer may execute these processes using a processor as hardware, a program as software, or a combination thereof. In that case, the processor is configured to work in cooperation with the program to execute the various processes in this embodiment, and can function as a unit or means in this embodiment. The execution order of the processes by the processor is not limited to the order described and may be changed as appropriate. The computer may be a general-purpose computer, a computer designed for a specific purpose, a workstation, or any other system capable of executing each process.

[0072] A processor may consist of one or more hardware components, and the type of hardware is not limited. For example, a processor may consist of a CPU (Central Processing Unit), an MPU (Micro Processing Unit), a programmable logic device such as an FPGA (Field Programmable Gate Array), a dedicated circuit for executing a specific process such as an ASIC (Application Specific Integrated Circuit), a GPU (Graphics Processing Unit), or an NPU (Neural Processing Unit). Furthermore, the hardware may be a combination of different types of hardware. When multiple hardware components are configured to execute one or more processes of a given processor, these components may reside in physically separate devices or in the same device. Also, in any embodiment, the order of each process performed by the processor is not limited to the order described above and may be changed as appropriate. Hardware is composed of electrical circuits (circuitry) that combine circuit elements such as semiconductor elements.

[0073] Furthermore, the program may be firmware or software such as microcode. Alternatively, the program may be, for example, a set of program modules, each function of which may be implemented by a processor configured to perform its respective function. The program may be program code or multiple code segments stored on one or more non-transitory computer-readable media (e.g., storage media or other storage). The program may be divided and stored on multiple non-transitory computer-readable media located in physically separate devices. Program code or code segments may represent any combination of procedures, functions, subprograms, routines, subroutines, modules, software packages, classes, instructions, data structures, or program statements. Program code or code segments may be connected to other code segments or hardware circuits by sending and receiving information, data, arguments, parameters, or memory contents.

[0074] Furthermore, although the above embodiment describes a configuration in which the information processing program 27 is pre-stored (installed) in the storage unit 22, the invention is not limited to this configuration. The information processing program 27 may be provided in the form of a recording medium such as a CD-ROM (Compact Disc Read Only Memory), DVD-ROM (Digital Versatile Disc Read Only Memory), or USB (Universal Serial Bus) memory. Alternatively, the information processing program 27 may be provided in the form of a download from an external device via a network.

[0075] The technology disclosed herein extends to all program products. Program products include all forms of products for providing programs. For example, program products include programs provided via networks such as the Internet, and non-transitory computer-readable recording media such as CD-ROMs, DVDs, and USB memory sticks on which programs are stored.

[0076] The following additional notes are disclosed in connection with this disclosure.
(Additional note 1) An information processing device comprising a processor, wherein the processor: acquires text; derives structured information by structuring the text; acquires classification information regarding terms contained in the text; and normalizes the structured information based on the classification information.
(Additional note 2) The information processing device according to additional note 1, wherein the classification information includes at least one of synonyms for the structured information and information relating to the field related to the text.
(Additional note 3) The information processing device according to additional note 2, wherein the text is a medical document, and the information relating to the field related to the text includes at least one of an organ, a body part, the modality that acquired the medical image used to create the text, and the medical department that uses the medical image.
(Additional note 4) The information processing device according to additional note 1 or 2, wherein the processor normalizes the structured information by referring to a synonym dictionary classified according to the classification information.
(Additional note 5) The information processing device according to additional note 4, wherein the processor, when the structured information cannot be normalized by referring to the synonym dictionary, supplements the structured information using rules corresponding to the field related to the text, and normalizes the supplemented structured information.
(Additional note 6) The information processing device according to any one of additional notes 1 to 5, wherein the processor outputs the basis for normalizing the structured information.
(Additional note 7) The information processing device according to any one of additional notes 1 to 6, wherein the processor displays the normalized structured information.
(Additional note 8) The information processing device according to additional note 7, wherein the processor normalizes the structured information without using the classification information if the structured information can be normalized without using the classification information, and displays the structured information normalized without using the classification information so that it is distinguishable from the structured information normalized using the classification information.
(Additional note 9) An information processing method in which a computer: acquires text; derives structured information by structuring the text; acquires classification information regarding terms contained in the text; and normalizes the structured information based on the classification information.
(Additional note 10) An information processing program that causes a computer to execute: a procedure for acquiring text; a procedure for deriving structured information by structuring the text; a procedure for acquiring classification information regarding terms contained in the text; and a procedure for normalizing the structured information based on the classification information.

[Explanation of Symbols]

[0077]
1 Information processing system
9 Network
10 Medical institution system
11 Imaging device
12 Image interpretation workstation
13 Medical department workstation
14 Image server
15 Image database
16 Report server
17 Report database
18 RIS
19 HIS
20 Information processing device
21 CPU
22 Storage unit
23 Memory
24 Display
25 Input unit
26 Network interface
27 Information processing program
28 Bus
30 Information acquisition unit
31 Structured information derivation unit
32 Classification information acquisition unit
33 Normalization unit
34 Display control unit
36 Findings
37 Structured information
39 Synonym dictionary
40, 45, 60 Display screen
41, 46, 61 Findings
42, 47, 62 Derived results
43, 48, 63 Text
50 Template

Claims

1. An information processing device comprising a processor, wherein the processor: acquires text; derives structured information by structuring the text; acquires classification information regarding terms contained in the text; and normalizes the structured information based on the classification information.

2. The information processing device according to claim 1, wherein the classification information includes at least one of synonyms for the structured information and information relating to the field related to the text.

3. The information processing device according to claim 2, wherein the text is a medical document, and the information relating to the field related to the text includes at least one of an organ, a body part, the modality that acquired the medical image used to create the text, and the medical department that uses the medical image.

4. The information processing device according to claim 1 or 2, wherein the processor normalizes the structured information by referring to a synonym dictionary classified according to the classification information.

5. The information processing device according to claim 4, wherein the processor, when the structured information cannot be normalized by referring to the synonym dictionary, supplements the structured information using rules corresponding to the field related to the text, and normalizes the supplemented structured information.

6. The information processing device according to claim 1 or 2, wherein the processor outputs the basis for normalizing the structured information.

7. The information processing device according to claim 1 or 2, wherein the processor displays the normalized structured information.

8. The information processing device according to claim 7, wherein the processor normalizes the structured information without using the classification information if the structured information can be normalized without using the classification information, and displays the structured information normalized without using the classification information so that it is distinguishable from the structured information normalized using the classification information.

9. An information processing method in which a computer: acquires text; derives structured information by structuring the text; acquires classification information regarding terms contained in the text; and normalizes the structured information based on the classification information.

10. An information processing program that causes a computer to execute: a procedure for acquiring text; a procedure for deriving structured information by structuring the text; a procedure for acquiring classification information regarding terms contained in the text; and a procedure for normalizing the structured information based on the classification information.