Method and device for name disambiguation

A name disambiguation technology, applied in instruments, network data indexing, and other database retrieval. It addresses problems such as unsuitability for large-scale data processing, scattered network resources, and cascading information errors, and it achieves small storage space, high speed, and improved comparison efficiency.

Active Publication Date: 2018-12-28
INST OF SCI & TECHN INFORMATION OF CHINA

AI Technical Summary

Problems solved by technology

Traditional name disambiguation in the field of entity disambiguation consists of two stages: feature collection and selection, followed by a clustering algorithm. In the feature selection stage, Internet knowledge resources must be combined to extract more person-entity features, or a social network must be constructed for the name to be disambiguated. Extracting these features depends on network resources, which are scattered and not necessarily accurate, and their erroneous information can easily cause cascading errors, so the disambiguation effect is poor.
In the clustering stage, existing approaches improve the clustering algorithm or combine multiple clustering methods, but the threshold or the number of categories must be set manually, and clustering large-scale text data takes a long time to run, which makes these approaches difficult to apply in a real system. In addition, whenever text information is added, all of the text in the database must be re-clustered, so these approaches are not suitable for large-scale data processing.



Embodiment Construction

[0076] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary only for explaining the present invention and should not be construed as limiting the present invention.

[0077] Those skilled in the art will understand that, unless otherwise stated, the singular forms "a", "an", "said" and "the" used herein may also include plural forms. It should be further understood that the word "comprising" used in the description of the present invention refers to the presence of said features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be unders...


Abstract

The invention provides a name disambiguation method and apparatus. The method comprises the following steps: preprocessing the full-text information of a name to be disambiguated so as to extract its semantic features; generating, from those semantic features, semantic fingerprints of the full-text information, including email fingerprints, coauthor fingerprints, institution fingerprints and text fingerprints; comparing these semantic fingerprints with the semantic fingerprints of same-name full-text information stored in a preset semantic fingerprint database, so as to determine the similarity between them; and determining, according to the semantic fingerprint similarity, the disambiguated name group to which the semantic fingerprints of the full-text information belong. With this method, name disambiguation accuracy is preserved while disambiguation speed is improved, and incremental name disambiguation is supported.
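
The steps in the abstract can be illustrated with a minimal sketch. This excerpt does not say how the fingerprints are computed or how similarity is turned into a group decision, so everything below is an assumption made for illustration: SimHash over MD5 token hashes stands in for the fingerprint function, similarity is one minus the normalized Hamming distance, the four fields are weighted equally, and the `cutoff` value, the field names and the function names are hypothetical.

```python
import hashlib

BITS = 64  # assumed fingerprint width
FIELDS = ("email", "coauthors", "institution", "text")  # the four fingerprint types


def simhash(tokens, bits=BITS):
    """SimHash of a token list (a stand-in technique, not taken from the patent)."""
    tokens = list(tokens)
    if not tokens:
        return None  # no feature available, so no fingerprint
    counts = [0] * bits
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            # Each token votes +1 or -1 on every bit position.
            counts[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i, c in enumerate(counts) if c > 0)


def similarity(a, b, bits=BITS):
    """1.0 for identical fingerprints, 0.0 when every bit differs."""
    return 1.0 - bin(a ^ b).count("1") / bits


def fingerprints(record):
    """One fingerprint per field; record maps field name to a token list."""
    return {f: simhash(record.get(f, [])) for f in FIELDS}


def record_similarity(fp_a, fp_b):
    """Equal-weight mean over the fields present in both records (assumption)."""
    scores = [
        similarity(fp_a[f], fp_b[f])
        for f in FIELDS
        if fp_a[f] is not None and fp_b[f] is not None
    ]
    return sum(scores) / len(scores) if scores else 0.0


def assign_group(name, record, database, cutoff=0.8):
    """Compare only against same-name groups; join the best one or open a new one."""
    fp = fingerprints(record)
    groups = database.setdefault(name, [])
    best, best_score = None, 0.0
    for idx, group in enumerate(groups):
        score = max(record_similarity(fp, other) for other in group)
        if score > best_score:
            best, best_score = idx, score
    if best is not None and best_score >= cutoff:
        groups[best].append(fp)
        return best
    groups.append([fp])
    return len(groups) - 1


db = {}  # hypothetical in-memory fingerprint database: name -> list of groups
paper_a = {
    "email": ["li.wei@example.org"],
    "coauthors": ["zhang san", "wang wu", "chen liu"],
    "institution": ["example university"],
    "text": ["semantic", "fingerprint", "retrieval", "index", "hashing"],
}
print(assign_group("Li Wei", paper_a, db))  # the first record opens group 0
```

Because a new record is compared only with stored fingerprints that carry the same name, adding a document does not require re-clustering the whole database, which is the incremental behavior the abstract refers to. Storing fixed-width integers rather than full text is also consistent with the small storage footprint mentioned in the summary.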

Description

Technical field

[0001] The present invention relates to the field of entity disambiguation, and in particular to a name disambiguation method and device.

Background technique

[0002] In recent years, with the development of computer technology and the widespread adoption of the Internet, more and more information has become available online. The rapid growth of information gives us rich content, but it also creates the problem of how to quickly obtain the information we need. Users' demand for high-quality search continues to increase, and searches for information about people are also growing. Users hope to obtain basic information about the person they are interested in through search. Because many people share the same name, names are often highly ambiguous, and the quality of current search results is unsatisfactory, so it becomes more difficult to obtain information about a specific person. And name disambigua...


Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F17/27, G06F17/30
CPC: G06F16/951, G06F40/30
Inventors: 韩红旗, 姚长青, 付媛, 李琳娜, 于永胜
Owner: INST OF SCI & TECHN INFORMATION OF CHINA