Method and device for extracting open-category named entities by means of a random walk on a graph

A random-walk-based named entity extraction technology, applied in special data processing applications, instruments, and electrical digital data processing. It addresses the problems that traditional methods do not consider the impact of seeds, do not consider quality differences between templates, and therefore cannot compute the confidence of candidate entities well, and it thereby improves system performance.

Active Publication Date: 2014-03-26
INST OF AUTOMATION CHINESE ACAD OF SCI
3 Cites, 12 Cited by

AI Technical Summary

Problems solved by technology

[0004] The traditional template-based open category named entity extraction method does not consider the impact of seeds when calculating the confidence of candi



Examples


Embodiment Construction

[0025] To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.

[0026] The basic idea of the present invention is to rank the candidate entities extracted by the templates according to their confidence, jointly considering the quality of the templates and the confidence of the candidate entities, thereby improving the accuracy of open-category named entity extraction.

[0027] In open-category named entity extraction, the main difficulty is computing the confidence of candidate entities. The invention addresses this by jointly considering the confidence of candidate entities and the quality of templates. Candidate entities are obtained by matching templates, and the following relationship holds between candidate entities and templates:...



Abstract

The invention discloses a method for extracting open-category named entities by means of a random walk on a graph. The method comprises four steps: 1, the contexts in which the seeds occur in a corpus are analyzed to obtain templates; 2, the templates are used to extract candidate entities from the corpus; 3, a graph is constructed from the relations among the seed entities, the templates, and the candidate entities; 4, the confidence of each candidate entity is computed by a random walk algorithm on the graph. The method overcomes the adverse effect that differences in template quality have on the computation of candidate-entity confidence, and effectively improves the accuracy of open-category named entity extraction. Experiments show that the average accuracy of the extraction results is improved by 4.36%.
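
Steps 3 and 4 of the abstract can be illustrated with a small sketch. It is not the patented algorithm: this excerpt does not give the edge weights, the transition probabilities, or the exact walk, so the sketch assumes an unweighted, undirected seed-template-candidate graph and a random walk with restart at the seed nodes. The function names, the `restart_prob` value, and the toy templates and entities are all hypothetical.

```python
from collections import defaultdict

def build_graph(seed_templates, template_candidates):
    """Link seeds to the templates they produced and templates to the
    candidates they extracted (assumed unweighted, undirected edges)."""
    graph = defaultdict(set)
    for seed, templates in seed_templates.items():
        for tpl in templates:
            graph[("seed", seed)].add(("tpl", tpl))
            graph[("tpl", tpl)].add(("seed", seed))
    for tpl, candidates in template_candidates.items():
        for cand in candidates:
            graph[("tpl", tpl)].add(("cand", cand))
            graph[("cand", cand)].add(("tpl", tpl))
    return graph

def random_walk_with_restart(graph, restart_nodes, restart_prob=0.15, iterations=50):
    """Iterate the walk; at each step the walker jumps back to a seed
    with probability restart_prob, otherwise moves to a random neighbor."""
    nodes = list(graph)
    restart = {n: (1.0 / len(restart_nodes) if n in restart_nodes else 0.0)
               for n in nodes}
    score = dict(restart)
    for _ in range(iterations):
        nxt = {n: restart_prob * restart[n] for n in nodes}
        for n in nodes:
            # Spread this node's mass evenly over its neighbors.
            share = (1.0 - restart_prob) * score[n] / len(graph[n])
            for m in graph[n]:
                nxt[m] += share
        score = nxt
    return score

def rank_candidates(score):
    """Return (candidate, score) pairs, highest score first."""
    pairs = [(n[1], s) for n, s in score.items() if n[0] == "cand"]
    return sorted(pairs, key=lambda p: -p[1])

# Toy data (hypothetical): which templates each seed produced,
# and which candidates each template extracted.
seed_templates = {
    "Beijing": ["cities such as X", "X is located in"],
    "Shanghai": ["cities such as X"],
}
template_candidates = {
    "cities such as X": ["Paris", "Tokyo"],
    "X is located in": ["Tokyo", "the hotel"],
}

graph = build_graph(seed_templates, template_candidates)
seeds = {("seed", s) for s in seed_templates}
scores = random_walk_with_restart(graph, seeds)
ranked = rank_candidates(scores)
print(ranked)
```

In this toy graph the candidate reached through both templates ("Tokyo") ranks above the candidates reached through only one, which is the kind of template-aware ranking the abstract describes.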

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, and in particular to a method and a device for extracting open-category named entities from large-scale text corpora.

Background

[0002] Named entities convey important information in human language, and their recognition and extraction is one of the key technologies in natural language processing research. The goal of open-category named entity extraction is to extract open-category named entities from massive, redundant, heterogeneous, and irregular web data, and then to build open-category named entity lists. These lists have important uses in both industry and academia. Open-category named entity extraction therefore has important theoretical significance and practical value.

[0003] Traditional open-category named entity extraction systems generally adopt a template method: the template is obtained by analyzing th...
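
The template method mentioned in the background (steps 1 and 2 of the abstract) can be sketched roughly as follows. The text here is cut off before it says how templates are built, so this sketch makes its own assumptions: a template is a fixed window of words to the left and right of a seed mention, sentences are split on whitespace, and entities are single tokens. `extract_templates`, `apply_templates`, and the toy corpus are hypothetical.

```python
def extract_templates(corpus, seeds, window=2):
    """Collect (left, right) word contexts around each seed mention.
    Assumption: a template is a fixed-size word window around the seed."""
    templates = set()
    for sentence in corpus:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok in seeds:
                left = tuple(tokens[max(0, i - window):i])
                right = tuple(tokens[i + 1:i + 1 + window])
                if left or right:
                    templates.add((left, right))
    return templates

def apply_templates(corpus, templates, seeds):
    """Return {template: set of non-seed tokens found in its slot}."""
    matches = {}
    for left, right in templates:
        found = set()
        for sentence in corpus:
            tokens = sentence.split()
            for i, tok in enumerate(tokens):
                if tok in seeds:
                    continue  # seeds are already known entities
                left_ctx = tuple(tokens[max(0, i - len(left)):i])
                right_ctx = tuple(tokens[i + 1:i + 1 + len(right)])
                if left_ctx == left and right_ctx == right:
                    found.add(tok)
        matches[(left, right)] = found
    return matches

# Toy corpus (hypothetical).
corpus = [
    "cities such as Beijing are crowded",
    "cities such as Tokyo are crowded",
    "cities such as Paris are expensive",
]
seeds = {"Beijing"}

templates = extract_templates(corpus, seeds)
matches = apply_templates(corpus, templates, seeds)
print(templates)
print(matches)
```

Exact-match windows like this are brittle ("Paris" is missed because its right context differs). A real system would need looser templates, and looser templates vary in quality, which is the weakness the invention targets.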


Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/367, G06F40/295
Inventor: 刘康, 赵军, 齐振宇
Owner: INST OF AUTOMATION CHINESE ACAD OF SCI