Method for extracting text information through secondary semantic annotation

A semantic annotation and extraction method, applied in the field of information extraction. It addresses the problem of wrong labeling, reducing mislabeling and improving both the efficiency and the quality of information extraction.

Active Publication Date: 2014-05-21
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

However, this method faces another dilemma in Chinese: because of the characteristics of the language itself (no separators between words, a word can often have multiple parts of speech, and so on), mislabeling caused by overlapping phrases occurs easily.




Embodiment Construction

[0040] The present invention will now be described in detail with reference to the accompanying drawings and embodiments. The method of the present invention is carried out by a computer. Figure 1 is the flowchart of the first embodiment of the method, and the specific steps are as follows:

[0041] Step 1, generate a task-specific semantic dictionary for a specific task.

[0042] The semantic dictionary that serves concept semantic annotation is concept-oriented: for a specific semantic meaning already defined in the concept annotation semantic network, it supplements entries for the specific term forms of that concept. As shown in Figure 2, "fetal heart rate" is a concept marked as a clinical finding (FINDING), and the terms for this concept in clinical documents include "fetal heart rate", "fetal HR", "FHR" and so on. In the process of semantic annotation, when words and phrases matching these terms are found in the te...
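The concept-oriented dictionary described in [0042] can be pictured as a mapping from term forms to a shared concept. The following is a minimal sketch, not the patent's data model: the `Concept` class, `build_term_index`, and `lookup` are hypothetical names, and case-insensitive matching is an assumption made for the illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional, Set


@dataclass
class Concept:
    """A concept plus the term forms under which it appears in text (hypothetical model)."""
    name: str
    semantic_type: str
    terms: Set[str] = field(default_factory=set)


def build_term_index(concepts: Iterable[Concept]) -> Dict[str, Concept]:
    """Map every term form (lower-cased, an assumption) to its concept."""
    index: Dict[str, Concept] = {}
    for concept in concepts:
        for term in concept.terms:
            index[term.lower()] = concept
    return index


# The example from the text: one FINDING concept with several term forms.
fetal_heart_rate = Concept(
    name="fetal heart rate",
    semantic_type="FINDING",
    terms={"fetal heart rate", "fetal HR", "FHR"},
)
term_index = build_term_index([fetal_heart_rate])


def lookup(term: str) -> Optional[Concept]:
    """Return the concept a term form belongs to, or None if it is not in the dictionary."""
    return term_index.get(term.lower())
```

With this layout, `lookup("FHR")` and `lookup("fetal HR")` resolve to the same `Concept`, so every term form is annotated with the same semantic type.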


Abstract

The invention discloses a method for extracting text information through secondary semantic annotation. The method includes the following steps: first, a task semantic dictionary is established according to the task requirements for the text to be processed; second, a first semantic annotation is performed on the text to be processed using a prestored general semantic dictionary; third, a second semantic annotation is performed on the already-annotated text using the task semantic dictionary; fourth, the text that has undergone secondary semantic annotation is parsed with a semantics-based sublanguage grammar and its information is extracted, the information comprising the concepts required by the task and the relations corresponding to those concepts. The method addresses two problems: existing large-scale semantic dictionaries have low coverage, and their semantic labels cannot serve a specific sublanguage grammar. It thereby provides a better solution for building task-adaptive information extraction in a Chinese-language environment.
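The four steps in the abstract can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the patented procedure: the greedy longest-match annotator, the rule that task-dictionary spans replace overlapping general-dictionary spans, the toy "FINDING followed by VALUE" grammar, and all dictionary entries and function names are hypothetical. Matching is done on raw characters, so it does not depend on word separators.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end, semantic label)


def annotate(text: str, dictionary: Dict[str, str]) -> List[Span]:
    """Greedy left-to-right longest-match annotation over characters."""
    spans: List[Span] = []
    max_len = max((len(term) for term in dictionary), default=0)
    i = 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            label = dictionary.get(text[i:i + n])
            if label is not None:
                spans.append((i, i + n, label))
                i += n
                break
        else:
            i += 1  # no term starts here; move on one character
    return spans


def second_annotation(text: str, first: List[Span], task_dict: Dict[str, str]) -> List[Span]:
    """Re-annotate with the task dictionary; task spans replace overlapping spans (assumption)."""
    task_spans = annotate(text, task_dict)
    kept = [s for s in first
            if not any(s[0] < t[1] and t[0] < s[1] for t in task_spans)]
    return sorted(kept + task_spans)


def extract(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Toy sublanguage rule (assumption): a FINDING directly followed by a VALUE."""
    return [(text[a[0]:a[1]], text[b[0]:b[1]])
            for a, b in zip(spans, spans[1:])
            if a[2] == "FINDING" and b[2] == "VALUE"]


# Hypothetical dictionaries for the illustration.
general_dict = {"fetal": "BODY", "heart rate": "FINDING", "140": "VALUE"}
task_dict = {"fetal heart rate": "FINDING"}  # step 1: task semantic dictionary

text = "fetal heart rate 140"
first_pass = annotate(text, general_dict)                      # step 2
second_pass = second_annotation(text, first_pass, task_dict)   # step 3
relations = extract(text, second_pass)                         # step 4
```

In this example the first pass alone splits the phrase into "fetal" and "heart rate", so the toy rule would pair the value with "heart rate"; after the second pass the whole phrase "fetal heart rate" is a single FINDING span and is paired with "140".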

Description

Technical field

[0001] The invention relates to the field of information extraction, in particular to a text information extraction method using secondary semantic annotation.

Background

[0002] Today is an era of data explosion, but the use of information is limited by the form of the data. At present, in some fields, a large amount of information exists mainly in the form of free text, such as the medical records and inspection reports accumulated over a long time in the clinical field. Although these texts contain a large amount of valuable information, there are technical obstacles to using them directly for large-scale data analysis.

[0003] To cope with the challenges brought by the information explosion and to make better use of these massive text data, automated tools are urgently needed to extract the information. This technology is usually called natural language processing. Natural language technology was born in ...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/27
Inventor: 段会龙, 李昊旻, 张寅升, 葛彩霞
Owner: ZHEJIANG UNIV