Entity Name Disambiguation Method Based on Keyword Extraction

A keyword-extraction-based technique for entity name disambiguation, aimed at high-precision matching between texts and entity names

Active Publication Date: 2021-08-24
BEIHANG UNIV

AI Technical Summary

Problems solved by technology

[0005] Existing disambiguation techniques target abbreviations and polyphonic characters, but there are essentially no approaches that process the original text to obtain a preliminary entity name and then disambiguate by calculating its similarity with extracted keywords.



Examples


Specific example

[0068] Step 1:

[0069] Starting state: the undisambiguated original texts, namely the label of entity A (for example, "national optical appliances") and the texts labeled with entity A: a (Guoguang Company and Didi Electric are in a market trade dispute), b (the number of turns-shaped changes in China's optoelectronic categories), and c (the country supports the development of new energy optoelectronic equipment);

[0070] Processing: delete the non-text portions of texts a, b, and c, and filter out invalid connecting words;

[0071] End state: obtain texts a' (Guoguang Company Madi Electric Co., Ltd.), b' (China's Optoelectronics Class Class During Town Change), and c' (the country strongly supports new energy optoelectronic equipment development);
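
The preprocessing in Step 1 can be illustrated with a minimal sketch. It is not the patent's implementation: the stop-word list, the regular expressions, and the function name `preprocess` are assumptions made for illustration, and whitespace tokenization stands in for the word segmentation that Chinese text would need.

```python
import re

# Hypothetical stop-word list; the source does not enumerate
# its "invalid connecting words".
STOP_WORDS = {"and", "the", "of", "a", "an", "in", "to"}


def preprocess(text, stop_words=STOP_WORDS):
    """Remove non-text portions, then drop stop words.

    Returns the remaining tokens in their original order.
    """
    text = re.sub(r"<[^>]+>", " ", text)        # strip markup tags
    text = re.sub(r"https?://\S+", " ", text)   # strip URLs
    text = re.sub(r"[^\w\s]", " ", text)        # strip punctuation and symbols
    tokens = text.split()                       # assumed whitespace tokenization
    return [t for t in tokens if t.lower() not in stop_words]


if __name__ == "__main__":
    raw = "<b>Guoguang</b> Company and Didi Electric: see http://example.com/x"
    print(preprocess(raw))
```

URLs are stripped before punctuation so that the URL pattern still sees the `://` it matches on.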

[0072] Step 2:

[0073] Starting state: the end state of the previous step;

[0074] Operation: take texts a' (Guoguang Company Madi Electric Co., Ltd.), b' (Chi...


Abstract

The present invention discloses an entity name disambiguation method based on keyword extraction, comprising a text preprocessing and invalid-word filtering phase, a word tagging and analysis phase, and a keyword extraction, combination, and comparison phase. The purpose is to find, among scattered, heterogeneous Internet text of widely varying quality, target texts that are related to an entity and in which the entity occupies an important position. The keywords extracted from a text are combined with the entity name to determine whether the text is related, which handles the matching of names when the name appears in the same text. By combining multiple processing stages, the invention greatly improves the accuracy of matching texts with entity names.
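
As a rough illustration of the keyword extraction and comparison phase, the sketch below picks the most frequent tokens as keywords and scores each against the entity name. The frequency-based extraction, the character-bigram Jaccard similarity, the threshold of 0.3, and all function names are assumptions chosen for illustration; the source does not specify which extraction or similarity measure the invention uses.

```python
from collections import Counter


def extract_keywords(tokens, top_k=3):
    """Return the top_k most frequent tokens (a simple stand-in for keyword extraction)."""
    counts = Counter(tokens)
    return [word for word, _ in counts.most_common(top_k)]


def char_bigrams(s):
    """Set of lowercase character bigrams of s."""
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def name_similarity(a, b):
    """Jaccard similarity of character bigrams (assumed measure)."""
    x, y = char_bigrams(a), char_bigrams(b)
    if not x or not y:
        return 0.0
    return len(x & y) / len(x | y)


def is_related(entity_name, tokens, top_k=3, threshold=0.3):
    """Treat a text as related if any keyword is similar enough to the entity name."""
    keywords = extract_keywords(tokens, top_k)
    best = max((name_similarity(entity_name, k) for k in keywords), default=0.0)
    return best >= threshold


if __name__ == "__main__":
    text_a = ["Guoguang", "Company", "Didi", "Electric", "market", "trade", "dispute"]
    text_c = ["country", "supports", "new", "energy", "equipment", "development"]
    print(is_related("Guoguang Company", text_a))
    print(is_related("Guoguang Company", text_c))
```

A real system would likely replace the frequency count with a weighting scheme such as TF-IDF and tune the threshold on labeled data; both choices are left open here.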

Description

Technical field

[0001] The present invention relates to the field of natural language processing. More particularly, it relates to an entity name disambiguation method based on keyword extraction.

Background technique

[0002] Named entity disambiguation is a fundamental research topic in natural language processing and has important practical value in semantic annotation, online recommendation systems, search engines, and other Internet applications, so research on entity name disambiguation methods is of great significance.

[0003] Named entity ambiguity means that a given named entity mention can have multiple meanings. When a mention can point to several entities, the main task of named entity disambiguation is to select the semantically correct entity from the context of the text. Named entity ambiguity has two causes: naming diversity and entity ambiguity. Naming diversity means that one named entity can be referred to by a variety of expressions, includ...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F40/295, G06F40/216, G06F40/242, G06K9/62
CPC: G06F18/295
Inventor: 吴俊杰, 部慧, 陈禹州, 李晔林, 罗炎林
Owner: BEIHANG UNIV