
Method for disambiguating inventor names in patent data

A technique for inventor name disambiguation that improves computing efficiency, lowers computing cost, and reduces the amount of computation.

Active Publication Date: 2021-08-13
西安循数信息科技有限公司

AI Technical Summary

Problems solved by technology

Because of the particular characteristics of Chinese characters, existing algorithms are of limited use, and a new algorithm is needed to disambiguate inventor names written in Chinese characters.



Examples


Embodiment 1

[0052] The present invention provides a method for disambiguating inventor names in patent data, as shown in Figure 1, including the following steps:

[0053] Step 1. Extract from the patent data the inventor name set, the inventor's collaborator set (i.e. the patent applicant set), the inventor's application unit set and the knowledge classification number set. In this embodiment, the inventor may also be referred to by an equivalent term;
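
As a rough illustration of Step 1, the four sets could be gathered per inventor name as sketched below. The record layout (the `inventors`, `applicants` and `ipc_codes` keys), the function name `extract_sets` and the sample records are assumptions made for this sketch; the source does not specify a data format. Treating collaborators as the other names listed on the same patent is also an assumption.

```python
from collections import defaultdict

def extract_sets(records):
    """Collect, per inventor name, the collaborators, application units and
    classification numbers seen across all patents listing that name."""
    names = set()
    collaborators = defaultdict(set)
    units = defaultdict(set)
    class_numbers = defaultdict(set)
    for rec in records:
        listed = rec["inventors"]
        for name in listed:
            names.add(name)
            # Assumption: collaborators are the other names on the same patent.
            collaborators[name].update(n for n in listed if n != name)
            units[name].update(rec["applicants"])
            class_numbers[name].update(rec["ipc_codes"])
    return names, collaborators, units, class_numbers

# Hypothetical sample records, for illustration only.
records = [
    {"inventors": ["Zhang Ming", "Li Wei"],
     "applicants": ["Unit A"],
     "ipc_codes": ["G06F40/216"]},
    {"inventors": ["Zhang Riyue"],
     "applicants": ["Unit A"],
     "ipc_codes": ["G06F40/216", "G06K9/62"]},
]
names, collaborators, units, class_numbers = extract_sets(records)
```

Keying every set by inventor name keeps the later pairwise comparisons simple: each comparison only needs to look up two names.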

[0054] Step 2. The subsequent calculations compare the elements of the set pairwise, so their complexity depends on the size of the set; if the set is too large, the calculation takes too long. The set is therefore reduced as circumstances require: if the number of elements in the deduplicated inventor name set in the patent data is greater than 10,000, the inventor name set is preliminarily filtered; the degree of similarity of the ...
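
To see why the set size matters, the short sketch below counts the unordered pairs that a full pairwise comparison would visit and checks the 10,000-name threshold from Step 2. The function and constant names are hypothetical, and the filtering rule itself is not shown because the source text is truncated at that point.

```python
from math import comb

MAX_NAMES = 10_000  # threshold stated in Step 2

def pair_count(n):
    """Number of unordered pairs among n distinct names: n * (n - 1) / 2."""
    return comb(n, 2)

def needs_prefilter(names):
    """True when the deduplicated name set exceeds the Step 2 threshold."""
    return len(set(names)) > MAX_NAMES

print(pair_count(1_000))   # 499500
print(pair_count(10_000))  # 49995000
```

A tenfold increase in names yields roughly a hundredfold increase in comparisons, which is the cost the preliminary filter is meant to avoid.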



Abstract

The invention provides a method for disambiguating inventor names in patent data, belonging to the field of information processing. The method comprises the following steps: extracting an inventor name set, a collaborator set, an application unit set and a knowledge classification number set from the patent data; preliminarily filtering the inventor name set; calculating the similarity of knowledge classification numbers between inventors in the inventor name set and, if that similarity is greater than a set threshold a, adding the name pair to a potential similar set Pi; extracting elements from the potential similar set Pi by traversal and calculating the name similarity between each two elements; further judging the resulting data with a random forest algorithm; and displaying the result data to the user in a visual interface, where the user chooses whether changes are needed, and replacing the data in the result set after the user submits. The method quickly screens out most useless data and reduces the amount of calculation.
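
The two screening stages described in the abstract can be sketched as below. The abstract does not say how classification-number similarity or name similarity is computed, so this sketch substitutes Jaccard similarity for the former and `difflib.SequenceMatcher` from the Python standard library for the latter. The threshold value, the function names and the sample data are likewise assumptions. The random forest judgment and the user review step are omitted.

```python
from difflib import SequenceMatcher
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two sets (assumed stand-in for the patent's measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def potential_similar_pairs(class_numbers, threshold_a):
    """Keep name pairs whose classification-number similarity exceeds threshold a."""
    return [
        (x, y)
        for x, y in combinations(sorted(class_numbers), 2)
        if jaccard(class_numbers[x], class_numbers[y]) > threshold_a
    ]

def name_similarity(x, y):
    """String similarity in [0, 1] (assumed stand-in for the patent's measure)."""
    return SequenceMatcher(None, x, y).ratio()

# Hypothetical sample data, for illustration only.
class_numbers = {
    "Zhang Ming": {"G06F40/216", "G06K9/62"},
    "Zhang Riyue": {"G06F40/216", "G06K9/62"},
    "Li Wei": {"A01B1/00"},
}
pairs = potential_similar_pairs(class_numbers, threshold_a=0.5)
scored = [(x, y, name_similarity(x, y)) for x, y in pairs]
print(pairs)  # [('Zhang Ming', 'Zhang Riyue')]
```

Only the pairs that survive the classification-number check reach the more expensive name comparison, which is how the method avoids scoring every pair of names.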

Description

Technical field

[0001] The invention belongs to the field of information processing, and in particular relates to a method for disambiguating inventor names in patent data.

Background art

[0002] Inventor name disambiguation mainly deals with ambiguity in inventor names caused by input or encoding errors in patent data. For example, the patent data may list Zhang Ming and Zhang Riyue; they are in fact the same inventor, recorded as two people because of a data-entry error. Such errors distort network analysis of the inventors in the patent data, so an inventor name disambiguation algorithm is needed to handle them.

[0003] Inventor disambiguation within the existing patent data of a single company is mainly done with the Bayesian disambiguation model developed by the team of Professor Fleming of the University of California, USA, which uses the prior probability and the poster...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/216, G06F40/268, G06K9/62
CPC: G06F40/216, G06F40/268, G06F18/22, G06F18/214, G06F18/24323
Inventors: 孙笑明, 熊旺, 王雅兰, 马浩智, 刘斌
Owner: 西安循数信息科技有限公司