Method, system, and device for disambiguating persons with duplicate names that improve the efficiency of article-by-article filing

A technique for disambiguating persons with duplicate names. It addresses the low feature-collection efficiency and low computational efficiency of existing approaches, and improves the robustness, accuracy, and computational efficiency of disambiguation.

Active Publication Date: 2021-08-17
GLOBAL TONE COMM TECH
7 cites, 0 cited by

AI Technical Summary

Problems solved by technology

[0004] The analysis above shows the defect of the existing technology: existing methods for disambiguating persons with duplicate names suffer from low feature-collection efficiency and low computational efficiency during incremental article-by-article filing.

Detailed Description of the Embodiments

[0059] To make the object, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the embodiments. The specific embodiments described here serve only to explain the invention and do not limit it.

[0060] To address the problems in the prior art, the present invention provides a method, system, and device for disambiguating persons with duplicate names that improve the efficiency of article-by-article filing. The invention is described in detail below with reference to the accompanying drawings.

[0061] As shown in Figure 1, the method provided by this embodiment for disambiguating persons with duplicate names, which improves the efficiency of article-by-article filing, includes the following steps:

[0062] S101: Form name groups according to the names;

[0063] S102: Obtain the ...
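
Step S101 above could be sketched roughly as follows. This is only an illustration, not the patent's implementation: the record layout (`name`, `org`, `doc_id`), the function name `group_by_name`, and the normalization rule (trim whitespace, lowercase) are all assumptions made for the example.

```python
from collections import defaultdict


def group_by_name(records):
    """Map each normalized name to the list of records that carry it.

    The normalization (strip + lowercase) is an assumed rule, chosen only
    so that trivially different spellings land in the same name group.
    """
    groups = defaultdict(list)
    for record in records:
        key = record["name"].strip().lower()
        groups[key].append(record)
    return dict(groups)


# Hypothetical sample records; the field names are illustrative only.
records = [
    {"name": "Wei Zhang", "org": "University A", "doc_id": 1},
    {"name": "wei zhang ", "org": "Company B", "doc_id": 2},
    {"name": "Li Na", "org": "University A", "doc_id": 3},
]

name_groups = group_by_name(records)
```

Each name group can then be processed independently, so later steps never need to compare records that carry different names.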

Abstract

The invention belongs to the technical field of duplicate-name disambiguation and discloses a method, system, and device for disambiguating persons with duplicate names that improve the efficiency of article-by-article filing. The method divides name groups according to names; obtains the text set corresponding to each name group; splits the text set based on rules into text groups; and calculates the similarity between the text groups. The system includes: a name group division module, a text set acquisition module, a large group splitting module, a storage module, a training set construction module, a model training and optimization module, a sub-model discriminator construction module, a prediction module, an ID selector construction module, an ID selector training and optimization module, and a classification prediction module. The invention combines rules with a model to improve the computational efficiency of the whole model; reduces the number of candidate matches by working in units of groups; and uses multi-model fusion, in which each sub-model uses different features, to improve the robustness and prediction accuracy of the model.
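
The rule-plus-model idea in the abstract can be illustrated with a small sketch. Everything concrete here is an assumption for the sake of the example: the splitting rule (same name and same organization), the feature fields (`keywords`, `coauthors`), Jaccard similarity as a stand-in for each sub-model, and simple averaging as the fusion step. The abstract does not specify any of these.

```python
from collections import defaultdict


def split_by_rule(records):
    """Rule step (assumed rule): same name and same org form one text group."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["name"], r["org"])].append(r)
    return list(groups.values())


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def feature_set(group, field):
    """Union of one feature field over every record in a text group."""
    return set().union(*(r[field] for r in group))


def fused_similarity(g1, g2, fields=("keywords", "coauthors")):
    """Average the per-feature scores; each field stands in for one sub-model."""
    scores = [jaccard(feature_set(g1, f), feature_set(g2, f)) for f in fields]
    return sum(scores) / len(scores)


# Hypothetical records that all share one name group.
records = [
    {"name": "wei zhang", "org": "A", "keywords": {"nlp", "parsing"}, "coauthors": {"li na"}},
    {"name": "wei zhang", "org": "A", "keywords": {"nlp"}, "coauthors": {"chen bo"}},
    {"name": "wei zhang", "org": "B", "keywords": {"nlp", "parsing"}, "coauthors": {"li na", "chen bo"}},
    {"name": "wei zhang", "org": "C", "keywords": {"polymers"}, "coauthors": {"sun yi"}},
]

groups = split_by_rule(records)          # three text groups: orgs A, B, C
sim_ab = fused_similarity(groups[0], groups[1])
sim_ac = fused_similarity(groups[0], groups[2])
```

With four records there would be six record-to-record comparisons, whereas the three text groups need only three group-to-group comparisons. This is the sense in which matching in units of groups reduces the number of candidate matches.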

Description

Technical field

[0001] The invention belongs to the technical field of disambiguating persons with duplicate names, and in particular relates to a method, system, and device for disambiguating persons with duplicate names that improve the efficiency of article-by-article filing.

Background

[0002] At present, people sharing the same name are common in every country, which makes person-based big data analysis difficult. Even the simple task of counting the patent inventors in a given city cannot be carried out easily, and analyses that build on person-level clustering, such as analysis of personnel flow, are harder still. Disambiguating duplicate names in literature databases is therefore particularly important.

[0003] Commonly used technical means at present include: rule-based disambiguation of persons with duplicate names, dynamic programming-based disambiguation of persons with duplicate nam...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/11, G06F16/31, G06F16/35, G06N20/00
CPC: G06F16/113, G06F16/31, G06F16/35, G06N20/00
Inventor: 杨万征, 蔡超, 程国艮
Owner: GLOBAL TONE COMM TECH