Text word segmentation analysis method and system for medical record text data structuring
A text word segmentation technology for text data, applied in patient-specific data, electronic digital data processing, natural language data processing, and related fields. It addresses the low efficiency of traditional medical record data mining, unsatisfactory case entity mapping relationships, and poor accuracy. Its effects are reduced manual recognition and repetitive manual work, accurate medical vocabulary, and improved word segmentation accuracy.
Examples
Embodiment 1
[0046] The text word segmentation analysis method for medical record text data structuring of the present invention comprises the following steps:
[0047] S100. Construct a medical thesaurus based on medical text data. The medical thesaurus includes medical words, weights and parts of speech, and the parts of speech include each word's traditional part of speech and its medical part of speech;
[0048] Generate all the words of the medical text data to be segmented based on the thesaurus dictionary, and construct a directed acyclic graph from these words;
[0049] S200. Based on the above medical thesaurus and directed acyclic graph, use dynamic programming to search for the maximum-probability path, that is, the segmentation combination that maximizes word frequency over the sentence, and obtain a word set with context order and part of speech;
[0050] S300. Based on the three dimensions of the position of the word, the original part of speech of the word, and the medical part of speec...
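The graph construction and path search described in [0048] and [0049] can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the toy thesaurus entries, their weights and tag names, and the helper names `build_dag`, `best_path` and `segment` are all hypothetical. The sketch also assumes that the thesaurus weights behave like word frequencies and that the path sought in S200 is the one with the highest summed log weight.

```python
import math

# Hypothetical toy thesaurus: word -> (weight, traditional POS, medical POS).
# Entries, weights and tag names are invented purely for illustration.
THESAURUS = {
    "患": (50, "v", "none"),
    "者": (50, "k", "none"),
    "患者": (800, "n", "person"),
    "头": (100, "n", "body_part"),
    "痛": (120, "a", "symptom"),
    "头痛": (600, "n", "symptom"),
    "三": (200, "m", "none"),
    "天": (300, "q", "time"),
    "三天": (150, "t", "time"),
}
TOTAL = sum(weight for weight, _, _ in THESAURUS.values())
MAX_LEN = max(len(word) for word in THESAURUS)


def build_dag(sentence):
    """Map each start index to the end indices of thesaurus words starting there."""
    n = len(sentence)
    dag = {}
    for i in range(n):
        ends = [j for j in range(i + 1, min(n, i + MAX_LEN) + 1)
                if sentence[i:j] in THESAURUS]
        # Fall back to a single character so every position has an outgoing edge.
        dag[i] = ends or [i + 1]
    return dag


def best_path(sentence, dag):
    """Right-to-left dynamic programming: best (score, next index) per position."""
    n = len(sentence)
    log_total = math.log(TOTAL)
    route = {n: (0.0, n)}
    for i in range(n - 1, -1, -1):
        route[i] = max(
            # Unknown single characters get weight 1 (an assumption of this sketch).
            (math.log(THESAURUS.get(sentence[i:j], (1,))[0]) - log_total
             + route[j][0], j)
            for j in dag[i]
        )
    return route


def segment(sentence):
    """Return the words in context order with both part-of-speech tags."""
    route = best_path(sentence, build_dag(sentence))
    i, words = 0, []
    while i < len(sentence):
        j = route[i][1]
        word = sentence[i:j]
        _, pos, medical_pos = THESAURUS.get(word, (1, "x", "none"))
        words.append((word, pos, medical_pos))
        i = j
    return words


print(segment("患者头痛三天"))
```

With this toy thesaurus the multi-character entries outweigh their single-character splits, so the sentence is cut into three words, each carrying a traditional and a medical tag. A real thesaurus would be built from medical text data as step S100 describes.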
Embodiment 2
[0073] The medical record text data structured text word segmentation analysis system of the present invention performs structured word segmentation and analysis on medical record text data using the method disclosed in Embodiment 1. The system includes a medical thesaurus building module, a word segmentation model building module, a word segmentation module, a triple analysis module and a standardization module. The medical thesaurus building module is used to build a medical thesaurus based on medical text data; the medical thesaurus includes medical words, weights and parts of speech, and the parts of speech include each word's traditional part of speech and its medical part of speech. The word segmentation model building module is used to generate all the words of the medical text data to be segmented based on the thesaurus dictionary, and build a directed acyclic graph based on all the abov...
Embodiment 3
[0082] The computer-readable medium of the present invention stores computer instructions that, when executed by a processor, cause the processor to execute the method disclosed in Embodiment 1. Specifically, a system or device equipped with a storage medium may be provided, on which software program code realizing the functions of any of the above embodiments is stored, and the computer (or CPU or MPU) of the system or device reads and executes the program code stored in the storage medium.