A Disambiguation Method for Author Names Based on Rule Matching and Machine Learning

An author name disambiguation technique that combines rule matching with machine learning, applied in the field of data processing. It addresses the inability of existing methods to handle large-scale data sets and improves the effect and accuracy of author name disambiguation for papers.

Active Publication Date: 2021-02-19
PEKING UNIV

AI Technical Summary

Problems solved by technology

However, the prior method also requires rules to be formulated manually at the initial stage, so it cannot be applied to large-scale data sets.

Embodiment Construction

[0044] In order to make the technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0045] To solve the above-mentioned problems in the prior art, the technical solution of the present invention first determines the candidate authors based on manually constructed name matching rules. Features are then extracted from the paper information (author, title, abstract, keywords, publication name, etc.), an appropriate machine learning algorithm is selected for disambiguation, and the author of the paper to be processed is determined.
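The feature extraction step described in [0045] could be sketched as follows. This is a minimal illustration, not the patented implementation: the text names the metadata fields but not the features or the learning algorithm, so the Jaccard overlap features, the dictionary keys, and the function names below are all assumptions made for the example.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def lower_set(items):
    """Return a set of stripped, lowercased strings."""
    return {s.strip().lower() for s in items}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def extract_features(paper, profile):
    """Compare one paper with one candidate author's profile.

    Both arguments are dicts with hypothetical keys: 'coauthors' and
    'keywords' (lists of str), 'title', 'abstract' and 'venue' (str).
    The profile is assumed to aggregate the candidate's known papers.
    """
    return {
        "coauthor_overlap": jaccard(lower_set(paper["coauthors"]),
                                    lower_set(profile["coauthors"])),
        "title_overlap": jaccard(tokenize(paper["title"]),
                                 tokenize(profile["title"])),
        "abstract_overlap": jaccard(tokenize(paper["abstract"]),
                                    tokenize(profile["abstract"])),
        "keyword_overlap": jaccard(lower_set(paper["keywords"]),
                                   lower_set(profile["keywords"])),
        "same_venue": 1.0 if paper["venue"].strip().lower()
        == profile["venue"].strip().lower() else 0.0,
    }
```

One such feature vector per (paper, candidate) pair could then be fed to whichever classifier is selected for the disambiguation step.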

[0046] Figure 1 is a flowchart of a method for disambiguating author names of papers based on rule matching and machine learning in an embodiment of the present invention. As shown in Figure 1, the author name disambiguation method based on rule matching and machine learning in the embodiment ...

Abstract

The invention provides a method for disambiguating author names of papers based on rule matching and machine learning. The method includes: preprocessing the information of the paper to be processed; matching the author name in the preprocessed paper information against pre-built name matching rules to obtain a set of candidate authors; and determining the author of the paper to be processed according to the number of candidate authors in that set. The invention can improve the disambiguation of papers and effectively improve the accuracy of author name disambiguation.
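The matching and decision steps in the abstract can be illustrated with a short sketch. It rests on several assumptions: the name variant rules, the roster format, and the three-way split (no candidate, exactly one, several) are hypothetical, since the abstract says only that the decision depends on the number of candidates.

```python
import re


def normalize(name):
    """Lowercase a name, replace non-letters with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z\s]", " ", name.lower())
    return " ".join(cleaned.split())


def name_variants(given, family):
    """Generate the written forms accepted for one roster entry.

    Hypothetical rule set: full given name, joined given name and
    initials, each in both 'given family' and 'family given' order.
    """
    parts = normalize(given).split()
    family = normalize(family)
    initials = " ".join(p[0] for p in parts)
    forms = (" ".join(parts), "".join(parts), initials, initials.replace(" ", ""))
    variants = set()
    for g in forms:
        variants.add(f"{g} {family}")
        variants.add(f"{family} {g}")
    return variants


def find_candidates(author_name, roster):
    """Return ids of roster entries whose variants match the name (roster order)."""
    target = normalize(author_name)
    return [p["id"] for p in roster
            if target in name_variants(p["given"], p["family"])]


def resolve(author_name, roster):
    """Branch on the number of candidates (assumed three-way split)."""
    candidates = find_candidates(author_name, roster)
    if not candidates:
        return ("no_match", None)
    if len(candidates) == 1:
        return ("unique", candidates[0])
    # Several candidates: hand over to the machine learning stage.
    return ("ambiguous", candidates)


roster = [
    {"id": "a1", "given": "Wei", "family": "Zhang"},
    {"id": "a2", "given": "Wen", "family": "Zhang"},
    {"id": "a3", "given": "Li", "family": "Wang"},
]
print(resolve("Zhang Wei", roster))
print(resolve("W. Zhang", roster))
```

In this sketch only the ambiguous case needs a learned model; unique and empty matches are settled by the rules alone.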

Description

Technical field

[0001] The present application relates to the technical field of data processing, in particular to a method for disambiguating author names of papers based on rule matching and machine learning.

Background

[0002] Universities and research institutions need to collect the publication records of their affiliated authors and archive the institution's papers in order to build an institutional literature database. However, current methods for organizing the papers of an institution's staff are imperfect: generally only the title and the listed authors of each paper are recorded, and papers are not archived by individual author. This makes it difficult to evaluate the research output and level of the institution's researchers, and difficult to offer external search support for a specific scholar's papers.

[0003] Author name disambiguation is a thorny problem in the automatic archiving of papers. On the one hand, there...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/33, G06F16/35
Inventor: 邓可君华凯邓昌明姜宁袁玲彭一明张治坤
Owner: PEKING UNIV