Conditional random fields (CRF) based acronym expansion explanation recognition method

Conditional random field and acronym recognition techniques, applied in the field of machine learning, address two problems of existing methods: acronym extraction rules are complicated to induce and labor-intensive, and acronyms that do not appear together with their expansions are ignored.

Inactive Publication Date: 2014-05-07
NANKAI UNIV
Cites: 2; cited by: 12

AI Technical Summary

Problems solved by technology

[0004] 1) Inducing acronym extraction rules is complicated and labor-intensive.
[0005] 2) Existing methods ignore the fact that most acronyms and their expansions do not appear in pairs.

Examples

Embodiment Construction

[0020] To make the objectives, technical solution, and advantages of the present invention clearer, embodiments of the invention are described in further detail below with reference to the accompanying drawings.

[0021] To better identify acronym expansions in sequential text, the present invention recasts the traditional task of recognizing acronym and expansion pairs as a sequence labeling task, and uses a conditional random field to identify the expansions.

[0022] To model acronym expansion recognition as a sequence labeling problem, NP labels are used to describe the label category of each word to be recognized: "B" marks the first word of an expansion, "I" marks the remaining words of the expansion, and all other, irrelevant words are marked "O". A complete explanation of an acronym expansion should ...
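The B/I/O scheme described in [0022] can be illustrated with a small sketch. This is not the patent's implementation: the function names `bio_encode` and `bio_decode`, the half-open span convention, and the example sentence are all assumptions made for illustration.

```python
from typing import List, Tuple


def bio_encode(tokens: List[str], spans: List[Tuple[int, int]]) -> List[str]:
    """Label tokens with B/I/O given half-open [start, end) expansion spans.

    Hypothetical helper: the span convention is an assumption, not from the patent.
    """
    labels = ["O"] * len(tokens)
    for start, end in spans:
        labels[start] = "B"              # first word of the expansion
        for i in range(start + 1, end):
            labels[i] = "I"              # remaining words of the expansion
    return labels


def bio_decode(labels: List[str]) -> List[Tuple[int, int]]:
    """Recover half-open [start, end) spans from a B/I/O label sequence."""
    spans = []
    start = None
    for i, label in enumerate(labels):
        if label == "B":
            if start is not None:        # a new "B" closes any open span
                spans.append((start, i))
            start = i
        elif label == "O":
            if start is not None:        # "O" closes the open span
                spans.append((start, i))
                start = None
        # "I" continues the open span; a stray "I" with no open span is ignored
    if start is not None:                # span running to the end of the sequence
        spans.append((start, len(labels)))
    return spans


# Example sentence (an assumption, chosen only to show the labeling).
tokens = "We train a conditional random field ( CRF ) on tagged text".split()
labels = bio_encode(tokens, [(3, 6)])
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

In this sketch, the words "conditional random field" receive the labels B, I, I and every other token receives O; a sequence labeler such as a CRF would be trained to predict these labels, and `bio_decode` turns its predictions back into expansion spans.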

Abstract

The invention discloses a CRF-based method for recognizing acronym expansions and relates to the field of machine learning and the task of acronym recognition. In this method, the traditional task of recognizing acronym and expansion pairs is modeled as a sequence labeling task, and the structured CRF model is used to recognize the expansions of acronyms. Three types of features are designed and extracted to improve the model: spelling features, acronym correspondence features, and context-related features. The model takes into account the context information and structure information of acronym expansions and is able to learn latent sparse features. Various feature functions and combination methods are further designed, so that possible expansions can be recognized from text sequences.
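The three feature families named in the abstract can be sketched as a per-token feature function of the kind commonly fed to a linear-chain CRF. The abstract does not list the individual features, so every feature name below, the `token_features` helper, and the assumption that a candidate acronym string is available are hypothetical choices for illustration only.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int, acronym: str) -> Dict[str, object]:
    """Build a feature dict for tokens[i]; all feature names are illustrative."""
    tok = tokens[i]
    initial = tok[:1].upper()
    return {
        # Spelling features (assumed examples)
        "lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),
        "is_all_caps": tok.isupper(),
        "has_digit": any(c.isdigit() for c in tok),
        # Acronym correspondence features (assumed examples)
        "initial_in_acronym": initial.isalpha() and initial in acronym.upper(),
        "initial_is_acronym_head": initial == acronym[:1].upper(),
        # Context-related features (assumed examples)
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i + 1 < len(tokens) else "<EOS>",
    }


# Example sentence and candidate acronym (assumptions for illustration).
sentence = "We train a conditional random field ( CRF ) on tagged text".split()
features = [token_features(sentence, i, "CRF") for i in range(len(sentence))]
print(features[3])
```

One feature dict per token is the input shape expected by common CRF toolkits; the choice of toolkit and the way features are combined are left open here because the source does not specify them.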

Description

Technical field

[0001] The invention relates to the field of machine learning and the task of acronym recognition, and in particular to a method for recognizing acronym expansions based on a conditional random field.

Background

[0002] At present, methods for automatically recognizing and extracting English acronyms and their expansions are mainly rule-based methods and fully supervised machine learning methods. These methods usually require that the acronym appear in the text; they then apply hand-designed rules and features to match possible expansion candidates within a window of a certain size around the acronym.

[0003] In the process of realizing the present invention, the inventors found that the prior art methods have at least the following disadvantages and deficiencies:

[0004] 1) The rules for extracting acronyms are complicated to induce and labor-intensive.

[0005] 2) Ignore the phenomenon that most acronyms and their e...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F40/279, G06N5/025
Inventors: 刘杰, 陈季梦, 黄亚楼, 刘天笔, 王嫄
Owner: NANKAI UNIV