
Medical text named entity recognition method and system

A named entity recognition technology applied in the field of medical text processing. It addresses the high complexity of long texts, sparse label spaces, and slow convergence, and offers stronger position awareness, a simple data structure, and more efficient follow-up operations.

Pending Publication Date: 2022-02-11
SHANDONG UNIV +1

AI Technical Summary

Problems solved by technology

The first method generates a large number of entity combinations, which makes it complex for long texts.
The second method leads to a sparse label space and slower convergence.


Examples


Embodiment 1

[0041] This embodiment discloses a span-based medical text named entity recognition method which, as shown in Figure 1, includes the following steps:

[0042] Step 1: Obtain the medical text to be recognized;

[0043] Step 2: Perform named entity recognition on the medical text based on the pre-trained named entity recognition model.

[0044] Wherein, the named entity recognition model includes an encoding layer and a decoding layer.
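
To illustrate what a decoding layer might do in a span-based setup, the following is a minimal sketch, not the patent's actual decoder. It assumes the model emits, for each character, a start label and an end label holding a category id (0 meaning no entity), and it pairs each start with the nearest following end of the same category. The function name `decode_spans` and the label layout are assumptions made for illustration.

```python
from typing import List, Tuple


def decode_spans(
    text: str, start_ids: List[int], end_ids: List[int]
) -> List[Tuple[str, int, int, int]]:
    """Pair each predicted start with the nearest following end of the same category.

    start_ids and end_ids hold one category id per character (0 = no entity).
    Returns (entity_text, start, end, category_id) with inclusive offsets.
    """
    entities = []
    for i, cat in enumerate(start_ids):
        if cat == 0:
            continue  # this character does not start an entity
        for j in range(i, len(end_ids)):
            if end_ids[j] == cat:
                entities.append((text[i:j + 1], i, j, cat))
                break  # nearest matching end wins
    return entities


# Hypothetical predictions: category 1 = body part, category 2 = disease.
result = decode_spans("左肺见结节", [1, 0, 0, 2, 0], [0, 1, 0, 0, 2])
```

Pairing with the nearest end keeps the decoding linear in practice and avoids enumerating every candidate span, which is the cost the summary attributes to one of the earlier approaches.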

[0045] In this embodiment, the named entity recognition model is trained using the RoBERTa model. As shown in Figure 2, the training process specifically includes:

[0046] (1) Obtain labeled medical text samples as a training set;

[0047] In this embodiment, the medical texts are obtained from electronic medical records and cover various record types, including admission records, first trip records, and discharge records. The medical texts are annotated according to specific labeling specifications, such as marking body parts...

Embodiment 2

[0085] The purpose of this embodiment is to provide a medical text named entity recognition system, said system comprising:

[0086] A data acquisition module, configured to acquire medical texts to be identified;

[0087] The named entity recognition module is used to perform named entity recognition on the medical text to be recognized based on the pre-trained named entity recognition model; wherein, the named entity recognition model training method includes:

[0088] Obtain a medical text training dataset that has been labeled with entities, and perform character-level encoding, entity position encoding, and entity category encoding on each training sample;

[0089] Starting from the selected Chinese pre-trained model, train the named entity recognition model using the character-level encoding and the corresponding entity position encoding and entity category encoding.
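
The three encodings named above can be sketched as follows. This is an illustrative assumption, not the encoding defined in the patent, whose details are not included in this excerpt: characters serve as tokens, entity positions are stored in start and end vectors, and the value stored at each position is the entity's category id. The category names and the helper `encode_sample` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical label set; the excerpt only mentions body parts as an example.
CATEGORIES = ["body_part", "disease"]
CAT2ID = {name: i + 1 for i, name in enumerate(CATEGORIES)}  # 0 = no entity


def encode_sample(text: str, entities: List[Tuple[int, int, str]]):
    """Encode one labeled text.

    entities holds (start, end, category) with inclusive character offsets.
    Returns the character tokens plus start and end label vectors.
    """
    chars = list(text)  # character-level encoding: one token per character
    start_ids = [0] * len(chars)
    end_ids = [0] * len(chars)
    for start, end, category in entities:
        if not (0 <= start <= end < len(chars)):
            raise ValueError(f"span ({start}, {end}) is outside the text")
        cat_id = CAT2ID[category]
        start_ids[start] = cat_id  # position code carries the category code
        end_ids[end] = cat_id
    return chars, start_ids, end_ids


chars, start_ids, end_ids = encode_sample(
    "左肺见结节", [(0, 1, "body_part"), (3, 4, "disease")]
)
```

In practice the character tokens would then be mapped to vocabulary ids by the tokenizer of the chosen pre-trained model; that step is omitted here.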

Embodiment 3

[0091] The purpose of this embodiment is to provide an electronic device.

[0092] An electronic device includes a memory, a processor, and a computer program stored on the memory and operable on the processor. When the processor executes the program, it implements a medical text named entity recognition method according to the embodiment.


Abstract

The invention discloses a medical text named entity recognition method and system. The method comprises the following steps of: acquiring a to-be-recognized medical text; based on a pre-trained named entity recognition model, performing named entity recognition on the to-be-recognized medical text; wherein a named entity recognition model training method comprises the steps of: acquiring a medical text training data set subjected to entity labeling, and performing character-level coding, entity position coding and entity category coding on each piece of training data; and performing training to obtain the named entity recognition model according to a selected Chinese pre-training model, the character-level coding, the corresponding entity position coding and the entity category coding. According to the invention, through the method of improving entity position coding and entity category coding, named entity prediction can be rapidly and efficiently carried out.

Description

Technical field

[0001] The invention belongs to the technical field of medical text processing, and in particular relates to a medical text named entity recognition method and system.

Background

[0002] The statements in this section merely provide background information related to the present invention and do not necessarily constitute prior art.

[0003] The task of named entity recognition is mainly to identify entities in text. The mainstream approaches currently in use include softmax classification, the conditional random field (CRF), and span-based modes. Softmax and CRF share the same data input format and generally use BIO, BIEO, or similar formats to encode the text output. Softmax treats label prediction as a conventional classification problem: the category whose normalized output has the highest probability is taken as the classifier output. CRF adds the category transition prob...
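
The BIO format and the softmax decision rule mentioned in the background can be sketched as below. This is a generic illustration of those standard techniques, not code from the patent; the tag names, scores, and the helper names `to_bio` and `softmax_predict` are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def to_bio(text: str, entities: List[Tuple[int, int, str]]) -> List[str]:
    """Tag each character with B-<cat>, I-<cat>, or O (inclusive offsets)."""
    tags = ["O"] * len(text)
    for start, end, category in entities:
        tags[start] = f"B-{category}"
        for k in range(start + 1, end + 1):
            tags[k] = f"I-{category}"
    return tags


def softmax_predict(scores: Sequence[float], labels: Sequence[str]):
    """Normalize scores with softmax and return the most probable label."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs


tags = to_bio("左肺见结节", [(0, 1, "BODY"), (3, 4, "DIS")])
label, probs = softmax_predict([0.1, 2.0, -1.0], ["O", "B-BODY", "I-BODY"])
```

Because softmax picks each character's label independently, nothing stops an invalid sequence such as an `I-` tag directly after `O`; modeling transitions between adjacent labels is what a CRF layer adds on top.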


Application Information

IPC(8): G06F40/295, G06N3/04, G06N3/08
CPC: G06F40/295, G06N3/08, G06N3/045
Inventor: 薛付忠, 胡锡峰, 季晓康, 陈耀祖, 张琪, 王永超, 仉率杰, 潘威, 张健
Owner SHANDONG UNIV