A Name Disambiguation Method for Chinese Authors in English Documents

A method relating authors to documents, applied to the disambiguation of Chinese author names. It addresses a problem that is hard to solve and time-consuming to check by hand, and that arises because the distinguishing features of Chinese characters are lost when names are romanized.

Inactive Publication Date: 2019-08-16
ZHEJIANG UNIV
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

After a Chinese name is converted into pinyin (or an English name), the distinguishing features of the Chinese characters are lost and the probability of duplicate names rises sharply. This makes the duplicate-name problem for Chinese authors in English documents more complicated and harder to solve.
Duplicate author names have become an important factor that reduces search accuracy. Judging whether the same-named authors of two papers are the same person often takes a lot of time.



Examples


Embodiment Construction

[0021] The technical solution of the present invention will be further described below with reference to the drawings.

[0022] Across two English documents, two Chinese author names that appear to match (but may in fact belong to different people) mainly fall into the following cases:

[0023] (1) The pinyin spellings of both authors' full names are identical, for example, "LI,JIANG" and "LI,JIANG".

[0024] (2) One of the two authors is listed with initials only, for example, "LI,J." and "LI,JIANG".

[0025] (3) Both authors are listed with initials only, for example, "LI,J." and "LI,J.".
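
The three cases above can be illustrated with a small sketch. It assumes names arrive in the "SURNAME,GIVEN" form shown in the examples and that a given name ending in a period (or consisting of a single letter) is an initial. The function names `parse_name` and `classify_pair` are hypothetical and do not come from the patent text.

```python
def parse_name(name):
    """Split 'SURNAME,GIVEN' into (surname, given, is_initial)."""
    surname, _, given = name.partition(",")
    surname = surname.strip().upper()
    given = given.strip().upper()
    # Assumption: a trailing period or a single letter marks an initial.
    is_initial = given.endswith(".") or len(given) == 1
    return surname, given.rstrip("."), is_initial


def classify_pair(name_a, name_b):
    """Return 1, 2 or 3 for the cases in [0023]-[0025], or None if the
    two names cannot refer to the same-named author."""
    sa, ga, ia = parse_name(name_a)
    sb, gb, ib = parse_name(name_b)
    # Surnames and the first letter of the given name must agree.
    if sa != sb or not ga or not gb or ga[0] != gb[0]:
        return None
    if ia and ib:
        return 3 if ga == gb else None   # both initials only
    if ia or ib:
        return 2                         # one initial, one full name
    return 1 if ga == gb else None       # both full pinyin spellings


print(classify_pair("LI,JIANG", "LI,JIANG"))
print(classify_pair("LI,J.", "LI,JIANG"))
print(classify_pair("LI,J.", "LI,J."))
```

A pair that returns `None` needs no disambiguation; any other pair is only a candidate match and still has to be checked against further evidence.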

[0026] To address the large number of duplicate Chinese author names in English documents caused by "one pinyin spelling, many Chinese characters", the present invention provides a name disambiguation method that jointly considers the similarity of the authors' institutions and disciplines, their co-authorship relationships, and their citation relationships...


Abstract

The invention relates to a name disambiguation method for Chinese authors in English literature. The method mainly comprises the following steps: (1) extracting author information from the bibliographic records of the English literature and building the co-authorship relationships, citation relationships, and so on between authors; (2) comparing the e-mail addresses of authors with the same name; (3) calculating the similarity of the institutions and disciplines of authors with the same name; (4) calculating the similarity of the co-authorship relationships of authors with the same name; (5) calculating the similarity of the citation relationships of authors with the same name; (6) performing name disambiguation by clustering on the three similarities calculated in steps (3) to (5).
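
A minimal sketch of steps (2) to (6) follows. The abstract does not say how the similarities are computed or how the clustering is performed. The Jaccard overlap, the averaged score, the 0.3 threshold and the union-find grouping are therefore all assumptions made for illustration, as are the names `AuthorRecord`, `same_person` and `cluster`. Treating identical e-mail addresses as a direct match is likewise an assumption.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class AuthorRecord:
    """One same-named author occurrence, as extracted in step (1)."""
    paper_id: str
    email: str = ""
    affiliation: set = field(default_factory=set)  # institution and discipline terms
    coauthors: set = field(default_factory=set)
    cited: set = field(default_factory=set)        # identifiers of cited works


def jaccard(a, b):
    """Jaccard overlap of two sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def same_person(r1, r2, threshold=0.3):
    # Step (2): identical non-empty e-mail addresses count as a match (assumption).
    if r1.email and r1.email.lower() == r2.email.lower():
        return True
    # Steps (3)-(5): three similarities, averaged here for simplicity (assumption).
    sims = (
        jaccard(r1.affiliation, r2.affiliation),
        jaccard(r1.coauthors, r2.coauthors),
        jaccard(r1.cited, r2.cited),
    )
    return sum(sims) / 3 >= threshold


def cluster(records, threshold=0.3):
    """Step (6): group records linked by same_person, using union-find."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if same_person(records[i], records[j], threshold):
            parent[find(i)] = find(j)

    groups = {}
    for i, rec in enumerate(records):
        groups.setdefault(find(i), []).append(rec.paper_id)
    return sorted(groups.values())


records = [
    AuthorRecord("p1", email="a@example.org"),
    AuthorRecord("p2", email="A@example.org"),
    AuthorRecord("p3", affiliation={"univ x"}, coauthors={"WANG,L."}, cited={"r1"}),
    AuthorRecord("p4", affiliation={"univ x"}, coauthors={"WANG,L."}, cited={"r1"}),
]
print(cluster(records))
```

Each returned group lists the papers attributed to one inferred person. In practice the threshold and the weighting of the three similarities would need to be tuned on labeled data.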

Description

Technical field

[0001] The invention relates to a method for disambiguating the names of Chinese authors in English documents.

[0002] Technical background

[0003] The problem of duplicate author names has a long history and has always been a hot topic in information science and computer science. In recent years, as the number of international papers from China has risen sharply, the attention Chinese authors receive in the international academic community has continued to grow. At the same time, the problem of duplicate names of Chinese authors has become increasingly prominent in English academic literature databases. After Chinese names are converted into pinyin (or English names), the distinguishing features of the Chinese characters are lost and the probability of duplicate names rises sharply. This makes the duplicate-name problem for Chinese authors in English literature more complicated and harder to solve. The problem of the same author name becom...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F16/28
CPC: G06F16/285
Inventor: 李江, 杨斯杰
Owner: ZHEJIANG UNIV