An incomplete patent automatic indexing method

An automatic patent indexing technology, applied in metadata text retrieval, instrumentation, unstructured text data retrieval, etc. It addresses the low efficiency and high labor cost of manual indexing methods, as well as their unsatisfactory accuracy and duplicate-checking rate.

Active Publication Date: 2019-05-07
CHONGQING INST OF GREEN & INTELLIGENT TECH CHINESE ACADEMY OF SCI

AI Technical Summary

Problems solved by technology

Although these methods can achieve the purpose of indexing and classifying patents to a certain extent, the manual indexing method is inefficient and the cost of human res




Embodiment

[0028] In view of the lack of dedicated knowledge discovery and mining models and methods for pharmaceutical patents, this embodiment provides an automatic labeling and classification method for English-language pharmaceutical patents. With reference to Figure 2, the method consists of the following steps:

[0029] Step 1:

[0030] Because manually indexed data are scarce, the manually indexed data plus Thomson Reuters data are used together as the experimental set. The indexing results are shown in Table 1. The experimental set is divided into a training set and a validation set at a ratio of 8:2. No strict completeness constraints are imposed on the patents themselves: a patent only needs to have at least one of the three items (abstract, claims, description) to be used as training data.
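The filtering and 8:2 split described in this step can be sketched as follows. This is a minimal illustration and not the patent's implementation: the dictionary field names (`abstract`, `claims`, `description`), the helper names, the fixed random seed, and the toy records are all assumptions introduced for the example.

```python
import random

# Hypothetical field names; the source only says "abstract, claims, description".
TEXT_FIELDS = ("abstract", "claims", "description")


def usable(patent):
    """Return True if at least one of the three text fields is non-empty."""
    return any((patent.get(field) or "").strip() for field in TEXT_FIELDS)


def split_experimental_set(patents, train_ratio=0.8, seed=0):
    """Drop patents with no text at all, shuffle, and split 8:2."""
    kept = [p for p in patents if usable(p)]
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    rng.shuffle(kept)
    cut = int(len(kept) * train_ratio)
    return kept[:cut], kept[cut:]


# Toy records, invented for illustration only.
patents = [{"id": i, "abstract": f"text {i}"} for i in range(8)]
patents.append({"id": 8, "claims": "claim text"})     # incomplete but usable
patents.append({"id": 9, "description": "body"})      # incomplete but usable
patents.append({"id": 10})                            # no text: dropped

train, val = split_experimental_set(patents)
```

An incomplete patent is kept as long as one field carries text, which mirrors the relaxed completeness requirement stated above.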

[0031] Table 1: Indexing results of the training set

[0032] NME | DDD | NCP | NAM | BLA | NFP | BTN | NUS | NDT | NCF | MIPs | NSP | ...



Abstract

The invention discloses an automatic indexing method for incomplete patents, and belongs to the field of big data, artificial intelligence and deep learning. The method comprises the following steps: S1, selecting a patent data source, and reading the patent abstract, claims, description and other related text data; S2, carrying out vector training using the Word2vec and GloVe word vector technologies to generate a lexicon; S3, preprocessing the data using an ISRI stemmer; S4, extracting patent features of the experimental set using CNN and LSTM respectively in combination with the lexicon, establishing a feature model, and verifying and selecting the lexicon and the feature model; and S5, indexing the patents in the test set one by one in combination with the selected lexicon and feature model. The method can complete a patent classification task accurately, comprehensively and quickly, supports the construction of an intelligent analysis and decision system for patent big data, and supports the effective integration, in-depth analysis and mining of patent resources as well as innovation research on application modes.
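The selection part of step S4 (choosing a lexicon and a feature model after verification) can be sketched as a small grid search. This is only an illustration under stated assumptions: the source does not say which metric is used or how candidates are compared, so the `evaluate` callback, the "higher is better" convention, and the placeholder scores below are all hypothetical and are not results from the patent.

```python
from itertools import product

LEXICONS = ("Word2vec", "GloVe")  # word-vector techniques named in S2
MODELS = ("CNN", "LSTM")          # feature extractors named in S4


def select_best(evaluate):
    """Score every (lexicon, model) pair and return the best pair and all scores.

    `evaluate(lexicon, model)` is a caller-supplied function (an assumption)
    that trains on the training set and returns a validation score,
    where a higher score is better.
    """
    scores = {pair: evaluate(*pair) for pair in product(LEXICONS, MODELS)}
    best = max(scores, key=scores.get)
    return best, scores


# Placeholder scores for demonstration only; NOT results from the patent.
DUMMY_SCORES = {
    ("Word2vec", "CNN"): 0.1,
    ("Word2vec", "LSTM"): 0.2,
    ("GloVe", "CNN"): 0.3,
    ("GloVe", "LSTM"): 0.4,
}

best_pair, all_scores = select_best(lambda lex, model: DUMMY_SCORES[(lex, model)])
```

In practice the callback would wrap real training and validation; the selected pair is then the one used to index the test set in S5.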

Description

Technical field

[0001] The invention relates to an automatic patent indexing method, which belongs to the field of big data and artificial intelligence and is especially suitable for large-scale patent indexing.

Background

[0002] Countries around the world attach great importance to cultivating and developing strategic emerging industries, strive to seize opportunities in a new round of higher-level competition, and actively create and make effective use of intellectual property rights. Especially in the biomedical industry, intellectual property big data has become a focus of competition among countries. Big data usually contains a wealth of knowledge and value; through in-depth analysis and mining, it can provide precise scientific analysis and decision support for various industries or fields. As an important carrier of intellectual property rights, patents have become an important strategic resource of big da...


Application Information

IPC(8): G06F16/38, G06F16/35, G06Q50/18
CPC: Y02D10/00
Inventor: 史晓雨, 冀倩倩, 尚明生
Owner: CHONGQING INST OF GREEN & INTELLIGENT TECH CHINESE ACADEMY OF SCI