Technical literature information extraction method and system and storage medium

A technique for extracting technical literature information, applied in digital data information retrieval, unstructured text data retrieval, text database browsing/visualization, and related fields. It addresses the problems of cumbersome steps and complicated operation, and achieves concise operating steps.

Active Publication Date: 2021-11-30
北京市科学技术研究院

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to provide a method, system and storage medium for extracting technical document information, which solve the problems of the traditional document information extraction method: multiple software tools must be used together, the steps are cumbersome, and the operation is complicated.

Examples

Embodiment 1

[0068] As shown in Figure 1, this embodiment provides a method for extracting technical literature information, including the following steps:

[0069] S1. According to the search text input by the user, retrieve several pieces of technical literature information corresponding to the search text on the literature search website;

[0070] S2. Preprocess the several pieces of technical literature information to obtain a technical information summary list;

[0071] Different literature websites have different rules for exporting technical literature information; for example, a HowNet export contains a maximum of 500 pieces of technical literature information. To facilitate the extraction of all technical literature information in the current technical field, all of it needs to be summarized into one list file. As shown in Figure 2, the preprocessing of the several pieces of technical literature information in step S2 specifically inclu...
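The need for a single summary list can be illustrated with a minimal Python sketch. It assumes only the 500-record export cap mentioned above; the names `EXPORT_LIMIT`, `batches_needed`, `split_into_exports` and `summarize` are hypothetical and do not come from the patent, and the truncated step list of S2 is not reproduced here.

```python
import math
from typing import List, Sequence, TypeVar

T = TypeVar("T")

EXPORT_LIMIT = 500  # per-export cap cited in the text (assumed to apply per file)


def batches_needed(total_records: int, limit: int = EXPORT_LIMIT) -> int:
    """Number of export files required to cover total_records."""
    return math.ceil(total_records / limit)


def split_into_exports(records: Sequence[T], limit: int = EXPORT_LIMIT) -> List[List[T]]:
    """Simulate a site that hands back at most `limit` records per export."""
    return [list(records[i:i + limit]) for i in range(0, len(records), limit)]


def summarize(exports: List[List[T]]) -> List[T]:
    """Concatenate all exports, in order, into one summary list."""
    return [record for export in exports for record in export]
```

For a search that returns 1234 records, this sketch yields three export batches (500, 500 and 234 records), and `summarize` restores the original list.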

Embodiment 2

[0111] In addition, the method of Embodiment 1 of the present invention can also be implemented by means of the architecture of the technical literature information extraction system shown in Figure 10. As shown in Figure 10, the technical document information extraction system may include a document information retrieval module 1, a document information processing module 2, a comprehensive chart generation module 3 and a specific chart generation module 4. Some modules may also have subunits for realizing their functions. For example, document information processing module 2 also includes merging unit 21, deduplication unit 22 and normalization unit 23, and comprehensive chart generation module 3 also includes list generation unit 31, information list generation unit 32, histogram generation unit 33 and relationship diagram generation unit 34; it also includes a scholar resea...
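One way to picture how the four modules might be wired together is the following Python sketch. It is an illustration under assumptions, not the patented architecture: the class name `ExtractionSystem`, the field names, and the idea that each module is a plain callable are all hypothetical; only the module numbering (1 to 4) follows the text.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, str]
Batches = List[List[Record]]


@dataclass
class ExtractionSystem:
    """Hypothetical wiring of modules 1-4; each module is an injected callable."""
    retrieval: Callable[[str], Batches]                  # module 1
    processing: Callable[[Batches], List[Record]]        # module 2 (units 21-23)
    comprehensive_charts: Callable[[List[Record]], Any]  # module 3 (units 31-34)
    specific_charts: Callable[[List[Record], str], Any]  # module 4

    def run(self, search_text: str, focus: str) -> Dict[str, Any]:
        """Retrieve, preprocess, then build both chart data sets."""
        batches = self.retrieval(search_text)
        summary = self.processing(batches)
        return {
            "comprehensive": self.comprehensive_charts(summary),
            "specific": self.specific_charts(summary, focus),
        }
```

Passing the modules in as callables keeps the sketch independent of any particular literature website or charting library.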

Embodiment 3

[0117] The bibliographic information files of several documents downloaded from HowNet are stored in the bibliographic information database. The bibliographic information files to be analyzed in that database are then processed by the bibliographic processing module to obtain a bibliographic information summary xls file, which includes the title, author, institution, literature source, keywords, abstract, publication time and other information of the literature to be analyzed.

[0118] Because HowNet can only store 500 pieces of bibliographic information in one file, and because relevant personnel continually update the search formula, repeated, redundant entries appear in the bibliographic information and increase its processing time. The bibliographic processing module is therefore used to merge and deduplicate the bibliographic information files to be analyzed.
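The merge-and-deduplicate step can be sketched in Python as follows. The source does not state how duplicates are identified, so the matching rule here (same title and author after collapsing whitespace and ignoring case) is an assumption, as are the function names `dedup_key` and `merge_and_deduplicate` and the field names `title` and `author`.

```python
from typing import Dict, Iterable, List, Set, Tuple

Record = Dict[str, str]


def dedup_key(record: Record) -> Tuple[str, str]:
    """Assumed identity of a record: whitespace-collapsed, case-folded title and author."""
    title = " ".join(record.get("title", "").split()).casefold()
    author = " ".join(record.get("author", "").split()).casefold()
    return title, author


def merge_and_deduplicate(files: Iterable[List[Record]]) -> List[Record]:
    """Merge several exported files into one list, keeping the first copy of each record."""
    seen: Set[Tuple[str, str]] = set()
    merged: List[Record] = []
    for records in files:
        for record in records:
            key = dedup_key(record)
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged
```

Keeping the first occurrence preserves the order of the earliest export, which is convenient when later searches partly overlap earlier ones.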

[0119] The bibliography processing module mai...

Abstract

The invention provides a technical literature information extraction method and system and a storage medium. Technical literature is retrieved according to the technical field that a technician wants to research and is then preprocessed. A comprehensive chart data set can be generated from the summary list obtained by preprocessing, and a specific chart data set can be generated for the specific scholars, institutions or keywords of interest to the technician. Both data sets are displayed as data tables and visual graphs, so that technicians can conveniently analyze the development direction of the current field or of specific research scholars, research institutions or keywords. Throughout the process, a technician only needs to input the direction to be studied, or the scholars, institutions or keywords to be studied; multiple tools are not needed, the operation steps are simple, and researchers are well supported in carrying out subject field development analysis.
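The distinction between the comprehensive and the specific chart data set can be illustrated with a short Python sketch. The statistics chosen (papers per year and keyword counts), the field names `author`, `keywords` and `publication_time`, the `;` separator, and the assumption that the publication time starts with a four-digit year are all hypothetical and not taken from the patent.

```python
from collections import Counter
from typing import Dict, List

Record = Dict[str, str]


def split_field(value: str, sep: str = ";") -> List[str]:
    """Split a multi-valued field such as authors or keywords (separator is assumed)."""
    return [part.strip() for part in value.split(sep) if part.strip()]


def comprehensive_dataset(records: List[Record]) -> Dict[str, Dict[str, int]]:
    """Aggregate over the whole summary list: papers per year and keyword counts."""
    years = Counter(r["publication_time"][:4] for r in records)
    keywords = Counter(k for r in records for k in split_field(r["keywords"]))
    return {
        "papers_per_year": dict(sorted(years.items())),
        "keyword_counts": dict(keywords.most_common()),
    }


def specific_dataset(records: List[Record], scholar: str) -> Dict[str, Dict[str, int]]:
    """Same aggregation, restricted to records that list the given scholar."""
    own = [r for r in records if scholar in split_field(r["author"])]
    return comprehensive_dataset(own)
```

Either result is a plain dictionary that could back a data table or be handed to a plotting library for the visual graphs.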

Description

Technical field

[0001] The present invention relates to the technical field of document information extraction, in particular to a method, system and storage medium for extracting technical document information.

Background art

[0002] At present, development analysis of subject areas is one of the research focuses of technical workers. It enables technical personnel and industry decision makers to grasp the progress, dynamics and trends of a field in a relatively short period of time, thereby providing decision support: it helps relevant researchers and research institutions grasp the context and opportunities of development in a timely and accurate manner, and assists decision-makers or decision-making departments in making decisions.

[0003] In order to meet the dual needs of users in various disciplines to obtain what they need from massive literature information and to process this information in batches, domestic and foreign ...

Application Information

IPC(8): G06F16/34, G06F16/36, G06F16/383
CPC: G06F16/345, G06F16/367, G06F16/383
Inventor: 熊蕊
Owner: 北京市科学技术研究院