Patent document abstract-based automatic patent classification method

Automatic patent document classification technology, applied in the field of computer analysis of patent documents, addresses the problems of overly narrow classification methods, growing demands on manpower and material resources, and difficulty in ensuring classification consistency and accuracy. It aims to improve the classification effect and reduce the amount of calculation.

Status: Inactive
Publication Date: 2016-07-27
ZHENJIANG CHANGYUAN INFORMATION TECH
Cites: 2; Cited by: 35

AI Technical Summary

Problems solved by technology

Under normal circumstances, such a classification method is often too narrow, and it needs to rely on relevant experts to manually read the content of the application. The rapid increase of patent documents i

Embodiment Construction

[0056] In this example, 1,000 patent samples are extracted from sections A-H of the IPC category table, with the samples selected evenly across the actual categories at the same category level. The specific steps are as follows:

[0057] (1) Obtain the description of the IPC category, perform word segmentation and part-of-speech tagging on the description, and remove stop words. Here, the ICTCLAS word segmentation tool of the Chinese Academy of Sciences is used. After manually correcting the word segmentation results, a user dictionary is built.

[0058] (2) Perform format conversion, title and abstract extraction on patent samples, and perform Chinese word segmentation and part-of-speech tagging on titles and abstracts. Use regular expressions to remove stop words, function words, conjunctions and other words that are not very useful for patent classification, and only keep nouns, adjectives, and verbs.
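The part-of-speech filtering in step (2) can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the segmenter emits whitespace-separated `word/tag` tokens whose tags begin with `n` (noun), `a` (adjective), or `v` (verb), and the stop-word list, the sample string, and the function name `filter_tokens` are all hypothetical. English placeholder tokens stand in for segmented Chinese text.

```python
import re

# Assumed token format: "word/tag", separated by whitespace.
TOKEN_RE = re.compile(r"(\S+)/([a-z]+\d*)")
# Keep only tags that start with n (noun), a (adjective) or v (verb).
KEEP_TAG_RE = re.compile(r"^[nav]")
# Hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"invention"}


def filter_tokens(tagged_text, stop_words=STOP_WORDS):
    """Return (word, tag) pairs that survive stop-word and POS filtering."""
    kept = []
    for word, tag in TOKEN_RE.findall(tagged_text):
        if word in stop_words:
            continue  # drop stop words regardless of tag
        if KEEP_TAG_RE.match(tag):
            kept.append((word, tag))
    return kept


sample = "invention/n discloses/v a/m patent/n of/u automatic/a classify/v method/n"
tokens = filter_tokens(sample)
```

Here the numeral `a/m` and the auxiliary `of/u` are dropped by the tag filter, while `invention/n` is dropped by the stop-word list even though it is a noun.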

[0059] (3) Construct an inverted index file. ...
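The source elides the details of step (3), so the following is only a generic sketch of an inverted index: a mapping from each term to the documents that contain it, together with the term's count in each document. The document identifiers, the function names, and the in-memory dictionary layout (rather than a file) are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: occurrence count}.

    docs: dict of doc_id -> list of terms (already segmented and filtered).
    """
    index = defaultdict(dict)
    for doc_id, terms in docs.items():
        for term, count in Counter(terms).items():
            index[term][doc_id] = count
    return dict(index)


def document_frequency(index, term):
    """Number of documents that contain the term (0 if unseen)."""
    return len(index.get(term, {}))


docs = {
    "P1": ["patent", "classification", "patent"],
    "P2": ["classification", "vector"],
}
index = build_inverted_index(docs)
```

An index of this shape gives both term frequency (the stored count) and document frequency (the number of postings) without rescanning the documents, which is what a TF-IDF style weighting needs.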

Abstract

The invention discloses an automatic patent classification method based on patent document abstracts. The method comprises the following steps: dictionary construction, generation of feature vectors for the classes at each level of the IPC, patent text feature selection, patent text vectorization, construction of an SVM-based classification model, and classification of the patents to be classified. In patent text feature selection, the information contained in the patent titles and abstracts is fully utilized, so that the amount of calculation for feature selection is markedly reduced. In patent text vectorization, the TF-IDF calculation is improved by adding part-of-speech weights and position weights for words, which further improves the classification effect. In patent classification, a layer-based classification method is adopted that draws on the advantages of both the SVM classifier and the KNN classifier, overcoming the SVM classifier's excessive training requirements and the KNN classifier's large amount of calculation.
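The abstract does not give the improved TF-IDF formula, so the following is only a sketch of one way part-of-speech and position weights could enter the calculation. It assumes each occurrence of a term contributes the product of a part-of-speech weight and a position weight to the term frequency, and that the result is multiplied by a common IDF variant, `log(N / df) + 1`. The weight tables, the formula, and all function names are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical weights; the patent's actual values are not given in the source.
POS_WEIGHT = {"n": 1.0, "v": 0.6, "a": 0.4}
POSITION_WEIGHT = {"title": 2.0, "abstract": 1.0}


def weighted_tf(doc):
    """doc: list of (term, pos_tag, field). Sum per-occurrence weights."""
    tf = defaultdict(float)
    for term, tag, field in doc:
        pos_w = POS_WEIGHT.get(tag[:1], 0.0)       # unknown tags contribute 0
        loc_w = POSITION_WEIGHT.get(field, 1.0)    # unknown fields are neutral
        tf[term] += pos_w * loc_w
    return dict(tf)


def weighted_tfidf(docs):
    """Return one {term: weight} vector per document."""
    n_docs = len(docs)
    df = defaultdict(int)
    for doc in docs:
        for term in {t for t, _, _ in doc}:
            df[term] += 1
    vectors = []
    for doc in docs:
        tf = weighted_tf(doc)
        vectors.append(
            {t: w * (math.log(n_docs / df[t]) + 1.0) for t, w in tf.items()}
        )
    return vectors


docs = [
    [("patent", "n", "title"), ("patent", "n", "abstract"), ("classify", "v", "abstract")],
    [("patent", "n", "abstract"), ("fabric", "n", "title")],
]
vecs = weighted_tfidf(docs)
```

Under these assumed weights, a noun in the title counts twice as much as the same noun in the abstract, and a term that appears in every document keeps only its weighted frequency because its IDF factor is 1.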

Description

Technical field

[0001] The invention belongs to the field of computer analysis of patent documents, and in particular relates to a patent classification method that uses patent abstracts.

Background art

[0002] "Patent" here is the general term for the various types of specifications published by national patent offices or international patent organizations, which form the main body of patent documents. According to statistics from the World Intellectual Property Organization (WIPO), 70% to 90% of the world's inventions first appear in patent documents rather than in other media such as magazines and papers. In addition, enterprises apply for patents as early as possible in order to protect their own interests. Patents therefore often concentrate the most active and advanced technologies and contain 90% to 95% of the world's technical information. At the same time, for the convenience of examination, patent documents are often written in more detail. Compared...

Application Information

IPC(8): G06F17/27, G06K9/62
CPC: G06F40/216, G06F40/289, G06F18/2411
Inventors: 彭彦, 朱玉全, 李竞, 何峰, 余飞
Owner: ZHENJIANG CHANGYUAN INFORMATION TECH