Text detection and recognition method for medical document structured knowledge extraction

A structured-knowledge and text-detection technology, applied in character recognition, pattern recognition, semantic analysis, and related fields. It addresses the low efficiency and low degree of intelligence of medical document information processing, improving processing efficiency and information utilization and making information processing more intelligent.

Pending Publication Date: 2020-09-11
成都知识视觉科技有限公司

AI Technical Summary

Problems solved by technology

[0003] The purpose of the present invention is to provide a text detection and recognition method for extracting structured knowledge from medical documents, so as to solve the low efficiency and low degree of intelligence of existing medical document information processing.


Examples


Embodiment 1

[0028] A text detection and recognition method for extracting structured knowledge from medical documents, comprising the following steps:

[0029] (1) Image recognition: perform OCR on the preprocessed medical document image;

[0030] (2) Template matching: match the recognized medical document image against the template database to find its corresponding template;

[0031] (3) Text detection: obtain the position of each piece of text in the image through deep-learning-based text detection;

[0032] (4) Misalignment adjustment: using a deep-learning graph convolutional network (GCN) and the spatial and semantic relationships between texts, automatically move misaligned text to its correct position;

[0033] (5) Text recognition: recognize the detected text with a deep-learning-based OCR model and convert it into text data, which provides the basic data for structured extraction;

[0034] (6) Result verification: the recognition...
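
As an illustration of the spatial side of step (4) only, the following Python sketch pairs each detected value with the field label whose vertical centre is closest. It is a plain geometric heuristic standing in for the GCN-based adjustment, which the text does not detail; the Box class, the function name, and the sample coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A detected text region (hypothetical layout: top-left corner plus size)."""
    text: str
    x: float
    y: float
    w: float
    h: float

    @property
    def cy(self) -> float:
        # Vertical centre of the box.
        return self.y + self.h / 2


def assign_values_to_labels(labels, values):
    """Pair each value box with the label to its left whose vertical centre is nearest.

    This is a geometric stand-in, not the GCN described in the method.
    """
    pairs = {}
    for v in values:
        candidates = [l for l in labels if l.x <= v.x]
        if not candidates:
            continue  # no label to the left; leave the value unassigned
        best = min(candidates, key=lambda l: abs(l.cy - v.cy))
        pairs.setdefault(best.text, []).append(v.text)
    return pairs


# Printed values that drifted a few pixels off their pre-printed rows.
labels = [Box("Name", 0, 10, 40, 10), Box("Amount", 0, 40, 40, 10)]
values = [Box("Zhang", 60, 14, 50, 10), Box("120.50", 60, 37, 50, 10)]
print(assign_values_to_labels(labels, values))
```

A learned model would additionally use the semantic relationship between label and value (for example, that an amount field expects a number), which this heuristic ignores.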

Embodiment 2

[0037] On the basis of Embodiment 1, after the recognition results are verified in step (6) using the rule engine, the vertical-domain knowledge graph, and value-range statistics, the system flags high-risk recognition items and gives error prompts for candidate items so that they can be verified manually. It also records the manual corrections, which supports iterative upgrading of subsequent models.
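
The verification and feedback loop described here might look roughly like the Python sketch below. It covers only format rules and value-range statistics; the knowledge-graph check is omitted. The field names, regular expressions, z-score threshold, and log layout are assumptions made for illustration and do not come from the patent text.

```python
import re
from dataclasses import dataclass, field
from statistics import mean, stdev


@dataclass
class RecognizedItem:
    name: str
    value: str
    flags: list = field(default_factory=list)


# Hypothetical format rules, one per field name.
RULES = {
    "amount": re.compile(r"\d+(\.\d{1,2})?"),
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def verify(items, history, z_max=3.0):
    """Flag items that break a format rule or fall far outside past values.

    `history` maps numeric field names to previously accepted values.
    Returns the flagged (high-risk) items for manual review.
    """
    for item in items:
        rule = RULES.get(item.name)
        if rule and not rule.fullmatch(item.value):
            item.flags.append("format")
            continue
        past = history.get(item.name, [])
        if len(past) >= 2:
            mu, sigma = mean(past), stdev(past)
            if sigma > 0 and abs(float(item.value) - mu) / sigma > z_max:
                item.flags.append("out_of_range")
    return [item for item in items if item.flags]


def record_correction(log, item, new_value):
    """Store the OCR value next to the human correction, then apply it."""
    log.append({"field": item.name, "ocr": item.value, "corrected": new_value})
    item.value = new_value
    item.flags.clear()


items = [
    RecognizedItem("amount", "12O.50"),   # letter O misread for zero
    RecognizedItem("amount", "9000.00"),  # valid format, unusual size
    RecognizedItem("date", "2020-09-11"),
]
risky = verify(items, {"amount": [100.0, 120.0, 110.0, 90.0]})
corrections = []
record_correction(corrections, risky[0], "120.50")
```

Keeping the original OCR value beside each correction is what would make the log usable as training data for a later model iteration.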



Abstract

The invention discloses a text detection and recognition method for medical document structured knowledge extraction. It belongs to the technical field of medical document information extraction and aims to solve the low processing efficiency and low degree of intelligence of existing medical document information processing. The method comprises the following steps: (1) image recognition: performing OCR on a medical document image; (2) template matching: matching the recognized image with its corresponding template; (3) text detection: obtaining the position information of the text in the image through text detection; (4) misalignment adjustment: moving misaligned text to its correct position by using the spatial and semantic relationships between texts; (5) text recognition: recognizing the text through OCR and converting it into text data; (6) result verification: verifying the recognition result based on a rule engine, a vertical-domain knowledge graph, and value-range statistics; and (7) structured output: structuring the recognized and verified text content and outputting it as editable data. The method is suitable for medical document text detection and recognition.
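
As a rough illustration of step (7), the Python sketch below serializes verified field values into JSON as one possible form of editable data. The record layout, the template name, and the needs_review marker are hypothetical; the source does not specify an output format.

```python
import json


def to_structured(template_name, pairs, flagged=()):
    """Serialize recognized field/value pairs as a JSON record.

    `flagged` holds the names of fields that verification marked as high risk.
    """
    record = {
        "template": template_name,
        "fields": [
            {"name": name, "value": value, "needs_review": name in flagged}
            for name, value in pairs.items()
        ],
    }
    # ensure_ascii=False keeps Chinese field values readable in the output.
    return json.dumps(record, ensure_ascii=False, indent=2)


print(to_structured(
    "outpatient_invoice",
    {"Name": "Zhang", "Amount": "120.50"},
    flagged={"Amount"},
))
```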

Description

Technical field

[0001] The invention belongs to the technical field of medical document information extraction, and in particular relates to a text detection and recognition method for extracting structured knowledge from medical documents.

Background art

[0002] The settlement of hospital outpatient and inpatient expenses produces a large number of paper medical bills, which hospitals and community clinics use as statistical records for settling expenses. For a long time, however, outdated management of medical bills in hospitals and community outpatient clinics has caused a series of problems that have troubled hospital administrators. In processing medical bill information, the vast majority of hospitals and almost all community outpatient clinics are still at the stage of "manual decentralized processing, paper-based warehouse storage, and manual query and update", which has bec...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06K9/00, G06F40/232, G06F40/30
CPC: G06F40/232, G06F40/30, G06V30/413, G06V30/10
Inventor: 向飞, 王一哲, 罗璟诣, 向宇, 王刚, 唐书毅, 黄驰, 曾欢
Owner: 成都知识视觉科技有限公司