An Entity Disambiguation Method in Complex Chinese Text

An entity disambiguation technology applied in the information field, achieving accurate entity linking and improving the entity recall rate and link accuracy rate.

Active Publication Date: 2022-07-19
BEIJING UNIV OF POSTS & TELECOMM
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0005] In view of this, the purpose of the present invention is to provide an entity disambiguation method for complex Chinese text that effectively resolves entity ambiguity in such text and improves the entity recall rate and entity linking accuracy.

Examples

Detailed Description of the Embodiments

[0017] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings.

[0018] As shown in Figure 1, the entity disambiguation method for complex Chinese text of the present invention includes:

[0019] Step 1: Extract all entity references to be disambiguated from the Chinese text to be disambiguated;

[0020] Step 2: Use entity retrieval to select, for each entity reference to be disambiguated, several entities from the entity knowledge base as pre-candidate entities; all pre-candidate entities selected for a reference constitute its pre-candidate entity set. Then calculate the first similarity between each entity reference to be disambiguated and each pre-candidate entity in its pre-candidate entity set, and select several pre-candidate entities as candidate entities according to the first similarity. The candi...
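The two-stage candidate selection in Step 2 could be sketched roughly as follows. This is only an illustration under stated assumptions: the visible text does not say how entity retrieval or the first similarity is computed, so shared-character lookup and character-bigram Jaccard overlap are used here as stand-ins. The knowledge base contents, the function names, and the `limit` and `top_k` parameters are all hypothetical.

```python
from typing import List, Set, Tuple


def char_ngrams(text: str, n: int = 2) -> Set[str]:
    """Return the set of character n-grams of a string (whole string if shorter than n)."""
    if len(text) < n:
        return {text} if text else set()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def first_similarity(mention: str, entity_name: str) -> float:
    """ASSUMPTION: Jaccard overlap of character bigrams stands in for the
    first similarity, whose actual definition is not given in the source."""
    a, b = char_ngrams(mention), char_ngrams(entity_name)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve_pre_candidates(mention: str, knowledge_base: List[str],
                            limit: int = 50) -> List[str]:
    """ASSUMPTION: 'entity retrieval' is approximated by keeping every
    knowledge-base entity that shares at least one character with the mention."""
    hits = [name for name in knowledge_base if set(mention) & set(name)]
    return hits[:limit]


def select_candidates(mention: str, knowledge_base: List[str],
                      top_k: int = 3) -> List[Tuple[str, float]]:
    """Build the pre-candidate set, score it, and keep the top_k as candidates."""
    pre_candidates = retrieve_pre_candidates(mention, knowledge_base)
    scored = [(name, first_similarity(mention, name)) for name in pre_candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy knowledge base of character names from a novel.
kb = ["贾宝玉", "贾政", "林黛玉", "薛宝钗"]
candidates = select_candidates("宝玉", kb, top_k=2)
print(candidates)
```

In a real system the cheap retrieval stage would typically be backed by a search index, and the first similarity by a learned or alias-aware measure; the point of the sketch is only the retrieve, score, and truncate structure.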

Abstract

An entity disambiguation method for complex Chinese text, comprising: extracting all entity references to be disambiguated from the Chinese text to be disambiguated; using entity retrieval to select, for each entity reference to be disambiguated, a number of entities from an entity knowledge base as pre-candidate entities, then calculating the first similarity between each entity reference to be disambiguated and each of its pre-candidate entities, and selecting several pre-candidate entities as candidate entities according to the first similarity; calculating the disambiguation similarity between each entity reference to be disambiguated and each of its candidate entities, and determining whether the maximum disambiguation similarity between the entity reference and all of its candidate entities is greater than the disambiguation similarity threshold. If so, the entity reference to be disambiguated is a valid linkable entity and is linked to the candidate entity in the entity knowledge base that corresponds to the maximum disambiguation similarity. The invention belongs to the field of information technology, can effectively resolve entity ambiguity in complex Chinese text, and improves the entity recall rate and the entity link accuracy rate.
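The final linking decision described in the abstract (take the candidate with the maximum disambiguation similarity and link only if that maximum exceeds the threshold) could be sketched as below. The function name `link_mention`, the example scores, and the threshold value are hypothetical; how the disambiguation similarity itself is computed is not shown in the visible text, so the scores are simply passed in.

```python
from typing import Dict, Optional


def link_mention(disambiguation_scores: Dict[str, float],
                 threshold: float) -> Optional[str]:
    """Return the candidate entity to link to, or None if no candidate's
    disambiguation similarity is strictly greater than the threshold.

    disambiguation_scores maps each candidate entity to its (precomputed)
    disambiguation similarity with the entity reference.
    """
    if not disambiguation_scores:
        return None  # no candidates, so nothing can be linked
    best = max(disambiguation_scores, key=disambiguation_scores.get)
    if disambiguation_scores[best] > threshold:
        return best
    return None  # the reference is treated as not linkable


# Hypothetical scores for two references; 0.6 is an arbitrary threshold.
linked = link_mention({"贾宝玉": 0.91, "甄宝玉": 0.42}, threshold=0.6)
unlinked = link_mention({"贾宝玉": 0.35, "甄宝玉": 0.30}, threshold=0.6)
print(linked, unlinked)
```

Returning `None` below the threshold is what lets the method decline to link a reference that has no suitable entry in the knowledge base, rather than forcing a link to the least-bad candidate.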

Description

Technical field

[0001] The invention relates to an entity disambiguation method for complex Chinese text and belongs to the field of information technology.

Background

[0002] Ambiguity and irregularity are widespread in natural language. For example, irregularities such as shortened forms, abbreviations, and habitual word usage cause the same words to express different meanings, or different words to express the same meaning, in different language environments. Chinese has especially rich semantics and forms of expression, particularly in literary works such as novels, whose texts often contain large numbers of characters, scenes, and intricate organizational structures. Entities such as characters, scenes, and organizations in these novels introduce many ambiguity problems, which pose great challenges for downstream natural language processing tasks based on novels.

[0003] Although ther...

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F40/279, G06F40/216, G06F40/30, G06N3/04, G06N3/08
CPC: G06F40/279, G06F40/216, G06F40/30, G06N3/084, G06N3/044, G06N3/045
Inventor: 王玉龙, 王闯, 刘同存, 王纯, 张乐剑, 王晶
Owner: BEIJING UNIV OF POSTS & TELECOMM