Method for extracting author affiliation information of English literature published by Chinese authors

An information extraction technique for English-language literature, applicable to computing and electrical digital data processing. It addresses problems such as irregular institution names and achieves accurate results with high recall and precision.

Active Publication Date: 2015-09-02
PEKING UNIV

AI Technical Summary

Problems solved by technology

[0009] At present, research on the irregularity of institution names in English literature focuses on how to avoid its influence by constructing retrieval formulas, and on the causes of the irregularity and how to reduce it. No scholars have discussed how to technically convert irregular institution names into standardized ones.


Examples


Embodiment 1

[0070] The following takes the WOS literature database as an example to elaborate on the specific operation process.

[0071] Step 1: Download the bibliographic information of all English papers published by Chinese authors.

[0072] (1) First, retrieve the English papers published by Chinese authors by constructing a search formula: search for "CU=Peoples R China" in the advanced search interface of WOS.

[0073] Use the export function provided by WOS: select the "Save as other file format" option, select "Full Record" for "Record Content", and select "Tab Delimited" for "File Format" to export the bibliographic information in batches. In the exported file, each line is a record corresponding to the bibliographic information of one paper, including the paper title (TI), author names (AU), source publication (SO), author affiliation (C1), publication date (PD) and other fields, each of which has a corresponding field ID. Different fields in the s...
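As a rough illustration of reading such an export, the sketch below parses tab-delimited text into one dictionary per paper. It assumes that the first line of the file is a header row of field IDs and that author names in AU are separated by semicolons; neither detail is stated in the text above. The function name `parse_wos_export` and the sample record are hypothetical.

```python
import csv
import io


def parse_wos_export(text):
    """Parse tab-delimited bibliographic text into a list of dicts.

    Assumption: the first line is a header row of field IDs (TI, AU, SO, C1, PD).
    """
    reader = csv.DictReader(
        io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    records = []
    for row in reader:
        # Assumption: multiple authors are separated by semicolons.
        authors = [a.strip() for a in (row.get("AU") or "").split(";") if a.strip()]
        records.append({
            "title": (row.get("TI") or "").strip(),
            "authors": authors,
            "source": (row.get("SO") or "").strip(),
            "affiliation": (row.get("C1") or "").strip(),
            "date": (row.get("PD") or "").strip(),
        })
    return records


# Hypothetical sample: one header line and one record.
SAMPLE = (
    "TI\tAU\tSO\tC1\tPD\n"
    "A study of X\tWang, J; Guo, X\tEXAMPLE JOURNAL\t"
    "Peking Univ, Dept Informat Management, Beijing, Peoples R China\tSEP\n"
)

if __name__ == "__main__":
    for record in parse_wos_export(SAMPLE):
        print(record["title"], "|", record["affiliation"])
```

Keeping each record as a dictionary keyed by descriptive names makes the later steps (affiliation processing, database storage) independent of the two-letter field IDs.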


Abstract

The invention relates to a method for extracting author affiliation information of English literature published by Chinese authors. The method extracts the Chinese names of the Chinese authors' affiliations from an English literature database and includes: using a web crawler to acquire the bibliographic data of all related English papers published by Chinese authors from the English literature database; extracting paper titles, author affiliation information and publication dates from the bibliographic data; processing the author affiliation information so that it corresponds to the standard Chinese names of the affiliations; and saving the extracted paper titles, author affiliation information, publication dates and standard Chinese affiliation names into a self-built database for subsequent querying and statistics. The method largely guarantees the accuracy of search results, avoids manually looking up and checking affiliation information, lets a user query and count the English literature published by an affiliation, and achieves high recall and accuracy.
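The last two steps of the abstract (mapping affiliation strings to standard Chinese names, then storing and counting) can be sketched as follows. This is a minimal illustration and not the patented procedure: the lookup table `STANDARD_NAMES`, the rule of treating the text before the first comma as the institution, the table layout, and all function names are assumptions made for the example.

```python
import re
import sqlite3

# Hypothetical lookup table: normalized English variant -> standard Chinese name.
STANDARD_NAMES = {
    "peking univ": "北京大学",
    "peking university": "北京大学",
    "tsinghua univ": "清华大学",
}


def normalize(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return re.sub(r"\s+", " ", cleaned).strip()


def to_standard_chinese(affiliation):
    """Return the standard Chinese name, or None if the variant is unknown.

    Assumption: the institution is the text before the first comma.
    """
    institution = affiliation.split(",")[0]
    return STANDARD_NAMES.get(normalize(institution))


def build_database(records):
    """Store (title, affiliation, published) tuples in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE papers "
        "(title TEXT, affiliation TEXT, published TEXT, standard_name TEXT)"
    )
    for title, affiliation, published in records:
        conn.execute(
            "INSERT INTO papers VALUES (?, ?, ?, ?)",
            (title, affiliation, published, to_standard_chinese(affiliation)),
        )
    conn.commit()
    return conn


def count_by_institution(conn, standard_name):
    """Count the stored papers whose affiliation maps to the given Chinese name."""
    row = conn.execute(
        "SELECT COUNT(*) FROM papers WHERE standard_name = ?", (standard_name,)
    ).fetchone()
    return row[0]
```

With this layout, a query by Chinese name reaches every English spelling variant that the lookup table covers, which is the property the abstract attributes to the self-built database. Affiliations that match no variant are stored with an empty standard name so they can be reviewed later.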

Description

Technical field

[0001] The invention relates to a technique and method for extracting information from text, in particular to a method for accurately searching and counting English documents according to the Chinese name of the author's institution.

Background technique

[0002] Web of Science (WOS for short) is a web-based database product developed by the American company Thomson Scientific, including three major citation databases (SCI, SSCI and A&HCI) and two chemical databases (CCR, IC). Most of the excellent academic papers in various fields published by researchers from all over the world are included in this database, and many scholars also use the number of their papers included in it as one measure of their own level. Engineering Index (EI for short) is another well-known literature database retrieval system, which mainly collects literature in the field of engineering technology.

[0003] In document databases such as WOS or EI, the name of the...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F17/30, G06F17/28
Inventor: 王继民, 郭鑫, 姜庆远, 王一博, 程煜华
Owner: PEKING UNIV