Name disambiguation method orienting Chinese authors in English literature

A name disambiguation technique for Chinese authors in English-language documents. It addresses the higher probability and greater complexity of duplicate author names that arise when the distinguishing features of Chinese characters are lost in romanization.

Status: Inactive
Publication Date: 2017-01-04
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

After a Chinese name is converted into pinyin (or an English name), the distinguishing features of the Chinese characters are lost and the probability of duplicate names rises sharply, which makes the duplicate-name problem for Chinese authors in English documents more complicated and difficult to solve.


Embodiment Construction

[0021] The technical solution of the present invention will be further described below with reference to the accompanying drawings.

[0022] Two Chinese authors appearing in two English documents can share the same name (without necessarily being the same person) in the following main situations:

[0023] (1) The full pinyin spellings of both names are identical, for example "LI, JIANG" and "LI, JIANG";

[0024] (2) One of the two names gives only an initial for the given name, for example "LI, J." and "LI, JIANG";

[0025] (3) Both names give only an initial for the given name, for example "LI, J." and "LI, J.".
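
The three situations can be illustrated with a small classifier. This is only a sketch: the helper names `parse_name` and `duplicate_case` are hypothetical, names are assumed to follow the "SURNAME, GIVEN" pattern of the examples, and a given name of exactly one letter is assumed to be an initial.

```python
from typing import Optional, Tuple


def parse_name(raw: str) -> Tuple[str, str]:
    """Split 'SURNAME, GIVEN' into upper-cased (surname, given) parts."""
    surname, _, given = raw.partition(",")
    # Drop surrounding spaces and a trailing period such as the one in "J."
    return surname.strip().upper(), given.strip().rstrip(".").upper()


def duplicate_case(a: str, b: str) -> Optional[int]:
    """Return 1, 2 or 3 for the situations listed above, or None if the
    two names cannot refer to the same spelling."""
    surname_a, given_a = parse_name(a)
    surname_b, given_b = parse_name(b)
    if surname_a != surname_b or not given_a or not given_b:
        return None
    # Assumption: a one-letter given name is an initial.
    a_is_initial = len(given_a) == 1
    b_is_initial = len(given_b) == 1
    if not a_is_initial and not b_is_initial:
        return 1 if given_a == given_b else None
    if a_is_initial and b_is_initial:
        return 3 if given_a == given_b else None
    # Exactly one side is an initial: it must match the other's first letter.
    return 2 if given_a[0] == given_b[0] else None
```

A pair that falls into any of the three cases is only a candidate duplicate; whether the two records belong to one person still has to be decided from other evidence. Multi-letter initials such as "J.H." are not handled by this sketch.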

[0026] To address the large number of duplicate Chinese author names in English documents caused by one pinyin spelling corresponding to multiple Chinese characters, the present invention provides a method that jointly considers the similarity of the authors' affiliations and subject areas, of their co-authorship relationships, and of their citation relationships. The name disambi...
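
As a rough illustration of the three kinds of evidence named above, the sketch below scores a pair of author records on each one. The text visible here does not give the similarity formulas, so Jaccard set overlap is an assumption, and `AuthorRecord`, its fields, and `similarity_vector` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class AuthorRecord:
    """One author occurrence in one document (hypothetical structure)."""
    name: str
    affiliation_terms: Set[str] = field(default_factory=set)  # organization and subject words
    coauthors: Set[str] = field(default_factory=set)          # names of co-authors
    cited_works: Set[str] = field(default_factory=set)        # identifiers of cited documents


def jaccard(x: Set[str], y: Set[str]) -> float:
    """Set overlap in [0, 1]; two empty sets score 0 (no evidence)."""
    if not x and not y:
        return 0.0
    return len(x & y) / len(x | y)


def similarity_vector(a: AuthorRecord, b: AuthorRecord) -> Tuple[float, float, float]:
    """(affiliation/subject, co-authorship, citation) similarity.
    Jaccard is an assumed stand-in, not the formula from the patent."""
    return (
        jaccard(a.affiliation_terms, b.affiliation_terms),
        jaccard(a.coauthors, b.coauthors),
        jaccard(a.cited_works, b.cited_works),
    )
```

Keeping the three scores separate rather than collapsing them immediately leaves room to weight or threshold each kind of evidence differently.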


Abstract

The invention relates to a name disambiguation method for Chinese authors in English literature. The method mainly comprises the following steps:
(1) extracting author information from the bibliographic records of the English literature, and building the co-authorship relationships, citation relationships and the like between the authors;
(2) comparing the e-mail addresses of authors with the same name;
(3) calculating the similarity of the affiliations and subjects of authors with the same name;
(4) calculating the similarity of the co-authorship relationships of authors with the same name;
(5) calculating the similarity of the citation relationships of authors with the same name;
(6) performing name disambiguation by clustering on the three similarities calculated in steps (3) to (5).
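
Steps (2) and (6) can be sketched as follows. The abstract does not say which clustering algorithm is used or how an e-mail match is treated, so this example makes three labeled assumptions: identical e-mail addresses are taken to indicate the same person, the three similarities are averaged with equal weight, and records are grouped by single-link merging above a threshold. The function name `cluster_same_name` and the default threshold are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def cluster_same_name(
    emails: Sequence[Optional[str]],
    sims: Callable[[int, int], Tuple[float, float, float]],
    threshold: float = 0.5,  # assumed value, not taken from the patent
) -> List[List[int]]:
    """Group record indices that are judged to belong to one person.

    emails[i] is the e-mail of record i (or None); sims(i, j) returns the
    three similarities from steps (3) to (5) for records i and j.
    """
    n = len(emails)
    parent = list(range(n))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i: int, j: int) -> None:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    for i in range(n):
        for j in range(i + 1, n):
            # Step (2), assumed rule: equal e-mail means same person.
            same_email = bool(emails[i]) and bool(emails[j]) and \
                emails[i].lower() == emails[j].lower()
            # Step (6), assumed rule: equal-weight average over a threshold.
            if same_email or sum(sims(i, j)) / 3 >= threshold:
                union(i, j)

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Single-link merging is transitive, so one strong link can chain otherwise dissimilar records together; a stricter linkage or per-feature thresholds may be preferable in practice.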

Description

Technical field

[0001] The invention relates to a method for disambiguating Chinese author names in English documents.

[0002] Technical background

[0003] The problem of duplicate author names has a long history and has always been a hot topic in information science and computer science. In recent years, as the number of international papers published by China has risen sharply, the attention paid to Chinese authors in international academic circles has continued to grow. At the same time, the problem of duplicate names of Chinese authors in English academic literature databases has become increasingly prominent. After Chinese names are converted into pinyin (or English names), the distinguishing features of the Chinese characters are lost and the probability of duplicate names rises sharply, which makes the duplicate-name problem for Chinese authors in English documents more complicated and difficult to solve. The problem of the same name of the author ...


Application Information

IPC(8): G06F17/30
CPC: G06F16/285
Inventors: 李江 (LI, Jiang), 杨斯杰 (YANG, Sijie)
Owner: ZHEJIANG UNIV