Name disambiguation method and apparatus

A name disambiguation technology, applied in special-purpose data processing, network data retrieval, instruments, etc., addresses problems such as unsuitability for large-scale data processing, poor disambiguation quality, and cascading information errors. It achieves a small storage footprint, lower comparison complexity, and high execution speed.

Active Publication Date: 2016-10-26
INST OF SCI & TECHN INFORMATION OF CHINA
8 Cites, 12 Cited by

AI Technical Summary

Problems solved by technology

Traditional name disambiguation in the field of entity disambiguation consists of two stages: feature collection and selection, followed by a clustering algorithm. In the feature selection stage, Internet knowledge resources must be combined to extract more person-entity features, or a social network must be constructed for the name to be disambiguated. Extracting these features depends on network resources, which are scattered and not necessarily accurate; their erroneous information can easily cascade, so the disambiguation effect is poor.

The clustering stage is implemented by improving the clustering algorithm or by combining multiple clustering methods. However, this requires manually setting a threshold or the number of categories, and clustering takes a long time to run on large-scale text data, which makes it difficult to apply in a real system. In addition, whenever text information is added, all texts in the database must be re-clustered, so the approach is unsuitable for large-scale data processing.



Embodiment Construction

[0076] The embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions. The embodiments described below with reference to the accompanying drawings are exemplary; they are used only to explain the present invention and should not be construed as limiting it.

[0077] Those skilled in the art will understand that, unless specifically stated otherwise, the singular forms "a", "an", "said" and "the" used herein may also include plural forms. It should be further understood that the term "comprising" used in the specification of the present invention refers to the presence of the described features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, eleme...


Abstract

The invention provides a name disambiguation method and apparatus. The method comprises the following steps: preprocessing the full-text information of a name to be disambiguated so as to extract semantic features of the full-text information; generating, according to the semantic features, semantic fingerprints of the full-text information, including email fingerprints, coauthor fingerprints, institution fingerprints and text fingerprints; comparing these semantic fingerprints with the semantic fingerprints of same-name full-text information in a preset semantic fingerprint database to determine the semantic fingerprint similarity between them; and determining, according to the semantic fingerprint similarity, the disambiguated name group to which the semantic fingerprints of the full-text information belong. The method improves name disambiguation speed while ensuring accuracy, and supports incremental name disambiguation.
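
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how fingerprints are computed or compared, so the sketch assumes a SimHash-style 64-bit text fingerprint, hashed sets for the email, coauthor and institution fingerprints, and a simple hypothetical matching rule. All names (`make_fingerprint`, `same_person`, `assign`) and both thresholds are illustrative assumptions.

```python
import hashlib
import re
from dataclasses import dataclass


def _hash64(token: str) -> int:
    """Stable 64-bit hash of a string (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(token.encode("utf-8")).digest()[:8], "big")


def simhash(tokens, bits: int = 64) -> int:
    """SimHash-style fingerprint: per-bit majority vote over token hashes."""
    weights = [0] * bits
    for tok in tokens:
        h = _hash64(tok)
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i, w in enumerate(weights) if w > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


@dataclass(frozen=True)
class Fingerprint:
    email: frozenset        # hashes of normalized email addresses
    coauthors: frozenset    # hashes of normalized coauthor names
    institution: frozenset  # hashes of normalized institution names
    text: int               # SimHash of the document text


def make_fingerprint(emails, coauthors, institutions, text) -> Fingerprint:
    """Preprocess one record and build its four fingerprints."""
    def norm(items):
        return frozenset(_hash64(s.strip().lower()) for s in items)
    tokens = re.findall(r"\w+", text.lower())
    return Fingerprint(norm(emails), norm(coauthors), norm(institutions),
                       simhash(tokens))


def same_person(a: Fingerprint, b: Fingerprint,
                max_text_distance: int = 3,
                min_shared_coauthors: int = 2) -> bool:
    """Hypothetical similarity rule; both thresholds are assumptions."""
    if a.email & b.email:
        return True
    if len(a.coauthors & b.coauthors) >= min_shared_coauthors:
        return True
    return bool(a.institution & b.institution) and \
        hamming(a.text, b.text) <= max_text_distance


def assign(name: str, fp: Fingerprint, database: dict) -> int:
    """Incrementally place fp into an existing same-name group or a new one.

    Only fingerprints stored under the same name are compared, so adding a
    record does not require re-clustering the whole database.
    """
    groups = database.setdefault(name, [])
    for group_id, members in enumerate(groups):
        if any(same_person(fp, m) for m in members):
            members.append(fp)
            return group_id
    groups.append([fp])
    return len(groups) - 1
```

In this sketch, a new document is compared only against the fingerprints already stored for the same name, and each comparison is a set intersection or a bit count. This is one way to obtain the incremental behavior and low comparison cost that the abstract claims; the actual fingerprint construction and decision rule in the patent may differ.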

Description

Technical field

[0001] The present invention relates to the field of entity disambiguation. Specifically, the present invention relates to a method and device for name disambiguation.

Background art

[0002] In recent years, with the development of computer technology and the widespread adoption of the Internet, more and more information has become available online. The rapid growth of information gives us access to a wealth of content, but it also creates the problem of how to quickly find the information we need. As users' demand for high-quality search continues to grow, searches for information about people are also increasing. Users hope to obtain basic information about the person they are interested in through search. Because many people share the same name, personal names are often highly ambiguous, and the quality of current search results is unsatisfactory. Therefore, it becomes mo...


Application Information

IPC(8): G06F17/27, G06F17/30
CPC: G06F16/951, G06F40/30
Inventor: 韩红旗, 姚长青, 付媛, 李琳娜, 于永胜
Owner: INST OF SCI & TECHN INFORMATION OF CHINA