Disease symptom extraction method based on AC automaton

A symptom extraction method using AC automaton technology, applied in medical data mining, natural language data processing, computer-aided medical procedures, etc., that facilitates the design and optimization of spontaneous reporting systems for adverse drug reactions

Inactive Publication Date: 2019-03-26
DONGHUA UNIV
6 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

[0005] The technical problem to be solved by the present invention is how to quickly and effectively extract, from the long unstructured text of electronic medical records, the symptom words arising from adverse drug reactions.


Embodiment Construction

[0017] The present invention is further described below with reference to a specific embodiment.

[0018] This embodiment provides a method for extracting disease symptoms based on an AC automaton. First, a dictionary tree is constructed from a dictionary of symptom words; then the failure pointers are constructed. Once the AC automaton has been built and the input character string has been confirmed to be UTF-8 encoded, the symptom words are matched.

[0019] The specific implementation process is:

[0020] Step 1: Construct a dictionary tree from the symptom word dictionary. The ordered sequence of edges on the path from the root node to any node represents a prefix of a symptom word in the dictionary. As shown in Figure 1, "tinnitus" and "earplug" have the common prefix "ear", and "dizziness" and "headache" have the common prefix "head" (the shared prefixes are visible in the original Chinese terms, not in their English translations).
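The dictionary tree of Step 1 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the names `TrieNode`, `build_trie`, and `count_nodes` are hypothetical, and the English sample words are stand-ins chosen only because they share prefixes.

```python
from typing import Dict, Iterable, Optional


class TrieNode:
    """One node of the dictionary tree; each outgoing edge is labelled with one character."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        # Set to the full symptom word when a dictionary entry ends at this node.
        self.word: Optional[str] = None


def build_trie(symptom_words: Iterable[str]) -> TrieNode:
    """Insert every symptom word; words with a common prefix share the same path."""
    root = TrieNode()
    for word in symptom_words:
        node = root
        for ch in word:
            # Reuse the existing edge if present, otherwise create a new child.
            node = node.children.setdefault(ch, TrieNode())
        node.word = word
    return root


def count_nodes(node: TrieNode) -> int:
    """Count the nodes in the subtree rooted at `node` (including `node`)."""
    return 1 + sum(count_nodes(child) for child in node.children.values())


# Hypothetical sample dictionary (assumption, not from the patent).
trie = build_trie(["earache", "earplug", "headache", "heartburn"])
ear_node = trie.children["e"].children["a"].children["r"]
print(sorted(ear_node.children))  # the two branches that follow the shared prefix "ear"
```

Because shared prefixes are stored once, the tree holds fewer nodes than the total number of characters in the dictionary.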

[0021] Step 2: Construct the failure pointers. Set a pointer whose initial state points to the root no...
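Since the description of Step 2 is truncated here, the following is only a sketch of the standard Aho-Corasick failure-pointer construction (breadth-first over the dictionary tree), which may differ in detail from the patent's procedure. The array-based layout and the names `build_goto` and `build_fail` are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List


def build_goto(words: List[str]) -> List[Dict[str, int]]:
    """Dictionary tree stored as a list of edge maps; state 0 is the root."""
    goto: List[Dict[str, int]] = [{}]
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
    return goto


def build_fail(goto: List[Dict[str, int]]) -> List[int]:
    """Compute failure pointers breadth-first, so shallower states are finished first."""
    fail = [0] * len(goto)
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, child in goto[state].items():
            # Walk the parent's failure chain until some state has an edge on ch.
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            # Fall back to the root when no proper suffix continues with ch.
            fail[child] = goto[f].get(ch, 0)
            queue.append(child)
    return fail


# Classic textbook pattern set, used here only as sample data.
goto = build_goto(["he", "she", "his", "hers"])
fail = build_fail(goto)
print(fail)
```

Each failure pointer leads to the state spelling the longest proper suffix of the current path that is also a prefix of some dictionary word, which is what lets matching resume without rereading input.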


Abstract

The invention provides a disease symptom extraction method based on an AC automaton. The method comprises the following steps: 1, constructing a dictionary tree from a symptom word dictionary; 2, constructing failure pointers to implement the AC automaton algorithm; 3, converting the electronic medical record information into UTF-8 encoding; 4, matching the symptom words in the electronic medical record information using the AC automaton algorithm: if a complete match is achieved, marking and extracting the symptom word, and continuing to read the electronic medical record information until a termination symbol is read; 5, if one or more characters have been matched but a complete match is not achieved, moving from the current position in the symptom dictionary tree upward along its parent node to the failure node, and returning to step 4. The method can effectively and quickly extract symptom words from unstructured electronic medical records, thereby facilitating research on the automatic monitoring of adverse drug reactions and the design and optimization of spontaneous reporting systems for adverse drug reactions.
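Steps 3 to 5 can be illustrated with the sketch below, which decodes a UTF-8 record and scans it once with a textbook Aho-Corasick automaton. It is an assumption-laden illustration rather than the patented implementation: `build_automaton` and `extract_symptoms` are hypothetical names, the scan simply stops at the end of the decoded string instead of at an explicit termination symbol, and the sample record and dictionary are invented.

```python
from collections import deque
from typing import Dict, List, Tuple


def build_automaton(words: List[str]):
    """Return (goto, fail, out) for the symptom words; state 0 is the root."""
    goto: List[Dict[str, int]] = [{}]
    out: List[List[str]] = [[]]
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)

    fail = [0] * len(goto)
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, child in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit words that end at the failure target (suffix matches).
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out


def extract_symptoms(record: bytes, words: List[str]) -> List[Tuple[int, str]]:
    """Decode the record as UTF-8 and return (start index, symptom word) pairs."""
    text = record.decode("utf-8")  # raises UnicodeDecodeError if not valid UTF-8
    goto, fail, out = build_automaton(words)
    hits: List[Tuple[int, str]] = []
    state = 0
    for pos, ch in enumerate(text):
        # On a mismatch, follow failure pointers instead of restarting the scan.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((pos - len(word) + 1, word))
    return hits


# Invented sample record and dictionary (assumptions for illustration only).
record = "Patient reports headache and mild earache after dosing.".encode("utf-8")
for start, word in extract_symptoms(record, ["headache", "earache", "ache"]):
    print(start, word)
```

Matching on decoded characters rather than raw bytes keeps multi-byte characters, such as those in Chinese medical records, from being split across trie edges.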

Description

Technical field

[0001] The invention relates to the technical field of symptom matching in unstructured medical texts such as electronic medical records, and in particular to a method for extracting the disease symptoms involved in the detection of adverse drug reactions.

Background technique

[0002] Extracting the post-medication symptom information contained in a patient's unstructured electronic medical record is the basis for the automatic monitoring of adverse drug reactions.

[0003] The Aho-Corasick automaton algorithm (AC automaton algorithm for short) originated from the dictionary tree algorithm and is one of the main multi-pattern matching algorithms. Its advantages include linear worst-case time complexity, high flexibility, tolerance of short patterns, and resistance to complexity attacks. It is currently one of the preferred online matching algorithms for practitioners in related fields.

[0004...


Application Information

IPC(8): G16H10/60, G16H50/70, G06F17/22
CPC: G16H10/60, G16H50/70, G06F40/157
Inventor: 李继云, 王天磊, 孙莉, 俞捷, 林靖生, 乐嘉锦
Owner: DONGHUA UNIV