Chinese clinical phenotype fine-grained named entity recognition method and system

A fine-grained named entity recognition technique, applied in the field of clinical medical record information processing, that addresses problems such as misleading analysis results.

Pending Publication Date: 2022-05-31
BEIJING JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

If only symptom-level extraction is performed, the model extracts "fever", "cough", "chest tightness" and "chest pain" as symptoms for clinical analysis without distinguishing negative symptoms from positive symptoms, which misleads the analysis results.
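The difference between symptom-level and fine-grained extraction can be illustrated with a small sketch. It assumes character-level BIO tagging; the label names `SYM`, `NEG_SYM` and `POS_SYM` and the sample sentence are hypothetical and are not taken from the patent's label set or data.

```python
from typing import List, Optional, Tuple


def bio_to_entities(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) spans from character-level BIO tags."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type: Optional[str] = None
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that is still open.
            if current:
                entities.append(("".join(current), current_type))
            current, current_type = [ch], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            # Continue the open entity.
            current.append(ch)
        else:
            # "O" or an inconsistent "I-" tag closes the open entity.
            if current:
                entities.append(("".join(current), current_type))
            current, current_type = [], None
    if current:
        entities.append(("".join(current), current_type))
    return entities


# Hypothetical sentence: "无发热，有咳嗽" ("no fever, has cough").
chars = list("无发热，有咳嗽")

# Symptom-level tags: both mentions become plain symptoms.
coarse_tags = ["O", "B-SYM", "I-SYM", "O", "O", "B-SYM", "I-SYM"]
# Fine-grained tags (hypothetical label names): the negated mention is kept apart.
fine_tags = ["O", "B-NEG_SYM", "I-NEG_SYM", "O", "O", "B-POS_SYM", "I-POS_SYM"]

coarse_entities = bio_to_entities(chars, coarse_tags)
fine_entities = bio_to_entities(chars, fine_tags)
```

With the symptom-level tags, "发热" (fever) is reported as a symptom even though the sentence denies it; with the fine-grained tags it is reported as a negative symptom and can be excluded from downstream analysis.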

Method used

The natural language pre-training model BERT extracts character-level embedding features from the clinical text; the bidirectional long short-term memory model BiLSTM integrates and encodes these features together with the sequence features of the text to obtain labels; and the conditional random field CRF decodes the labels to obtain the named entity recognition result.

Examples

Embodiment 1

[0037] Embodiment 1 provides a Chinese clinical phenotype fine-grained named entity recognition system, which includes:

[0038] An extraction module, used to extract character-level embedding features from clinical text through the natural language pre-training model BERT;

[0039] An encoding module, used to integrate and encode the character-level embedding features and the sequence features of the clinical text with the bidirectional long short-term memory model BiLSTM to obtain labels;

[0040] A decoding module, used to decode and predict the labels with the conditional random field CRF and obtain the named entity recognition result.
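The decoding step can be sketched with Viterbi decoding, the standard way to find the best label path in a linear-chain CRF. This is a minimal illustration under stated assumptions, not the patent's implementation: the label set, the emission scores (standing in for the encoding module's per-character output) and the transition scores are hypothetical, and start and end transitions are omitted.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label path for one character sequence."""
    if not emissions:
        return []
    # score[l] = best score of any path that ends in label l at this position.
    score = {l: emissions[0][l] for l in labels}
    backpointers: List[Dict[str, str]] = []
    for emit in emissions[1:]:
        new_score: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in labels:
            best_prev = max(labels, key=lambda p: score[p] + transitions[p][cur])
            new_score[cur] = score[best_prev] + transitions[best_prev][cur] + emit[cur]
            pointers[cur] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda l: score[l])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


labels = ["O", "B-SYM", "I-SYM"]
# Hypothetical transition scores: "O" followed by "I-SYM" is heavily penalised.
transitions = {prev: {cur: 0.0 for cur in labels} for prev in labels}
transitions["O"]["I-SYM"] = -100.0

# Hypothetical per-character scores from the encoder.
emissions = [
    {"O": 2.0, "B-SYM": 0.5, "I-SYM": 0.1},
    {"O": 0.2, "B-SYM": 1.0, "I-SYM": 1.2},
    {"O": 0.1, "B-SYM": 0.3, "I-SYM": 2.0},
]

greedy_path = [max(labels, key=lambda l: e[l]) for e in emissions]
best_path = viterbi_decode(emissions, transitions, labels)
```

Picking the top label per character gives `O, I-SYM, I-SYM`, which starts an entity with an inside tag. The Viterbi path is `O, B-SYM, I-SYM`, because the transition scores let the decoder trade a slightly lower emission score for a valid label sequence.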

[0041] In Embodiment 1, the above system is used to implement a Chinese clinical phenotype fine-grained named entity recognition method, which includes:

[0042] Extracting character-level embedding features from clinical text through the natural language pre-training model BERT;

[0043] Using...

Embodiment 2

[0055] Embodiment 2 provides Phenonizer, a fine-grained phenotype named entity recognition method based on Chinese clinical medical records, as shown in Figure 1. In this framework, the natural language pre-training model BERT extracts the character-level embedding features of clinical texts, the bidirectional long short-term memory model BiLSTM then integrates and encodes the character-level features and text sequence features, and finally the conditional random field CRF performs the decoding prediction of the labels.

[0056] In this embodiment 2, the Phenonizer technical framework includes three layers of text information processing modules, and the specific process is described as follows:

[0057] 1) Character-level embedding representation based on the BERT layer

[0058] The input to the natural language pre-training model BERT is the text sequence of the patient's clinical medical record.
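How a medical record text might be turned into a character-level input sequence can be sketched as follows. This is a simplified stand-in for a real BERT tokenizer, not the patent's preprocessing: it assumes every non-whitespace character becomes one token, adds BERT's `[CLS]` and `[SEP]` markers, and truncates to a hypothetical `max_len` of 512 tokens. The function name and sample text are illustrative.

```python
from typing import List


def to_character_tokens(text: str, max_len: int = 512) -> List[str]:
    """Split a record into single-character tokens with BERT-style markers."""
    # Drop whitespace; every remaining character is treated as one token.
    chars = [ch for ch in text if not ch.isspace()]
    # Leave room for the two special tokens.
    chars = chars[: max_len - 2]
    return ["[CLS]"] + chars + ["[SEP]"]


# Hypothetical record fragment: "发热 3天" ("fever for 3 days").
tokens = to_character_tokens("发热 3天")
```

Working at the character level means each character receives its own label, so no Chinese word segmentation step is needed before tagging.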

[0059] BERT is a pretrained language representat...

Embodiment 3

[0091] Embodiment 3 of the present invention provides an electronic device including a memory and a processor that communicate with each other. The memory stores program instructions executable by the processor, and the processor invokes these program instructions to execute the Chinese clinical phenotype fine-grained named entity recognition method, which includes the following steps:

[0092] Extracting character-level embedding features from clinical text through the natural language pre-training model BERT;

[0093] Using the bidirectional long short-term memory model BiLSTM to integrate and encode the character-level embedding features and the sequence features of the clinical text to obtain labels;

[0094] Using the conditional random field (CRF) to decode and predict the labels and obtain the named entity recognition result.

Abstract

The invention provides a Chinese clinical phenotype fine-grained named entity recognition method and system, belonging to the technical field of clinical medical record information processing. The method comprises the following steps: extracting character-level embedding features from a clinical text through the natural language pre-training model BERT; integrating the character-level embedding features and the sequence features of the clinical text with the bidirectional long short-term memory model BiLSTM and encoding the features to obtain labels; and decoding and predicting the labels with a conditional random field (CRF) to obtain a named entity recognition result. According to the method, a standard data set of clinical fine-grained phenotype entities is established for fine-grained named entity experiments, negative symptoms and positive symptoms are distinguished, and more accurate structured data is provided for clinical analysis.

Description

Technical field

[0001] The invention relates to the technical field of clinical medical record information processing, in particular to a Chinese clinical phenotype fine-grained named entity recognition method and system.

Background art

[0002] Chinese Electronic Medical Records (CEMRs), as important clinical data, record the patient's symptoms and signs, past history and diagnosis in text or semi-structured form. Therefore, the structured extraction of information from medical record texts is particularly important for subsequent clinical data analysis, in which Named Entity Recognition (NER) is one of the key technologies. Chinese clinical medical record named entity recognition refers to the use of artificial intelligence, data mining and other computer technologies to build an entity extraction model by training on clinical electronic medical record data. Such models can automatically extract patient phenotypic entities from medical record texts, usuall...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/295, G06K9/62, G06N3/04, G16H10/60
CPC: G06F40/295, G16H10/60, G06N3/044, G06N3/045, G06F18/214
Inventors: 周雪忠, 杨扩, 邹群盛, 程闯, 舒梓心
Owner: BEIJING JIAOTONG UNIV