Technical literature information extraction method, system, and storage medium
A technology for technical literature information extraction, applied in digital data information retrieval, unstructured text data retrieval, text database browsing/visualization, and similar fields. It addresses the problems of cumbersome steps and complicated operations and achieves concise operation steps.
Examples
Embodiment 1
[0068] As shown in Figure 1, this embodiment provides a method for extracting technical literature information, including the following steps:
[0069] S1. According to the search text input by the user, retrieve several pieces of technical literature information corresponding to the search text from a literature search website;
[0070] S2. Preprocess the pieces of technical literature information to obtain a summary list of technical information;
[0071] Different literature websites have different rules for exporting technical literature information; for example, HowNet allows a maximum of 500 pieces of technical literature information per export. To facilitate the extraction of all technical literature information in the current technical field, all of it must be summarized into one list file. As shown in Figure 2, the preprocessing of the several pieces of technical literature information in step S2 specifically inclu...
Embodiment 2
[0111] In addition, the method of Embodiment 1 of the present invention can also be implemented by means of the architecture of the technical literature information extraction system shown in Figure 10. As shown in Figure 10, the technical literature information extraction system may include a document information retrieval module 1, a document information processing module 2, a comprehensive chart generation module 3, and a specific chart generation module 4. Some modules may also have subunits for realizing their functions; for example, document information processing module 2 also includes a merging unit 21, a deduplication unit 22, and a normalization unit 23, and comprehensive chart generation module 3 also includes a list generation unit 31, an information list generation unit 32, a histogram generation unit 33, and a relationship diagram generation unit 34. The unit 34 also includes a scholar resea...
Embodiment 3
[0117] The bibliographic information files of several documents downloaded from HowNet are stored in the bibliographic information database. The bibliographic information files to be analyzed are then processed by the bibliographic processing module to obtain a bibliographic information summary xls file, which includes the title, author, unit, literature source, keywords, abstract, publication time, and other information of the literature to be analyzed.
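The summary file described above can be pictured as one row per document with a fixed set of columns. The following is a minimal sketch only: the `BibRecord` class, its field names, and the `write_summary` and `summary_text` helpers are hypothetical, and CSV is used here as a stand-in for the xls format so that the example needs only the Python standard library.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class BibRecord:
    # Columns mirror the fields listed in the text; the identifiers are assumed.
    title: str
    author: str
    unit: str
    source: str
    keywords: str
    abstract: str
    publication_time: str


def write_summary(records, out):
    """Write records to a file-like object as CSV with a single header row."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(BibRecord)])
    writer.writeheader()
    for rec in records:
        writer.writerow(asdict(rec))


def summary_text(records):
    """Return the summary as a string (convenient for inspection)."""
    buf = io.StringIO()
    write_summary(records, buf)
    return buf.getvalue()
```

Calling `summary_text([...])` with a list of `BibRecord` objects returns the header line followed by one line per record; writing to a real file works the same way by passing an open file object to `write_summary`.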
[0118] Because HowNet can store only 500 pieces of bibliographic information in one file, and because relevant personnel continually update the search formula, redundant duplicate entries appear in the bibliographic information and increase processing time. The bibliographic processing module therefore merges and deduplicates the bibliographic information files to be analyzed.
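The merge-and-deduplicate step can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: each export file is represented as a list of dicts, and two records are treated as duplicates when their normalized title and author match. The source does not state the deduplication key, so that choice, like the `normalize` and `merge_and_dedupe` names, is hypothetical.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivially different strings compare equal."""
    return " ".join(text.lower().split())


def merge_and_dedupe(batches):
    """Merge several export batches into one list, keeping the first copy of each record.

    Each record is a dict with at least "title" and "author" keys (assumed schema).
    The duplicate key (normalized title + author) is an assumption for illustration.
    """
    seen = set()
    merged = []
    for batch in batches:
        for rec in batch:
            key = (normalize(rec["title"]), normalize(rec["author"]))
            if key in seen:
                continue  # an earlier batch already contributed this record
            seen.add(key)
            merged.append(rec)
    return merged
```

Keeping the first occurrence preserves the order of the earliest export, so records re-exported after a later update to the search formula are simply skipped.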
[0119] The bibliography processing module mai...