Specification-based patent classification method

A patent classification and specification technology, applied in the fields of text database clustering/classification, instruments, electronic digital data processing, etc. It addresses the difficulty of ensuring consistent and accurate classification results and the heavy consumption of manpower and material resources. Its effects are richer content, improved classification accuracy, and reduced noise interference.

Active Publication Date: 2017-09-01
JIANGSU UNIV

AI Technical Summary

Problems solved by technology

For a patent application that is about to be submitted, the classification number is unknown and must be determined. The common practice is to determine it from the field of the object described or from the content of the patent, which relies on relevant experts manually reading the application. With the rapid increase in the number of patent applications (close to 1 million per year), this approach consumes a great deal of manpower and material resources, and the limits of each expert's own knowledge make it difficult to ensure consistent and accurate classification results.


Examples

Embodiment Construction

[0059] The patent classification method of the present invention is described in detail below, taking patent literature as an example. The specific execution process is as follows:

[0060] Step 1: Obtain the data of the patent text, and perform text preprocessing on the patent specification, mainly word segmentation and stop word removal.

[0061] ① Obtain the description of the IPC category, perform word segmentation and part-of-speech tagging on the description, remove stop words, and manually correct the word segmentation results to build a user dictionary.

[0062] ② Perform format conversion and specification extraction on the extracted patent samples, add the user dictionary constructed in ① to the word segmentation program, and then perform Chinese word segmentation and part-of-speech tagging on the specification.

[0063] ③ Use regular expressions to remove stop words, function words, conjunctions and other words that are not very useful for patent cla...
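The filtering in sub-step ③ can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: it assumes the specification has already been segmented and tagged (the output of sub-step ②) into `(word, pos)` pairs, and the stop-word list `STOP_WORDS`, the tag set `FUNCTION_POS`, and the function name `filter_tokens` are all hypothetical.

```python
import re

# Hypothetical stop-word list; the patent text does not enumerate one.
STOP_WORDS = ["的", "了", "和", "与", "所述", "一种"]

# Assumed part-of-speech tags for function words
# (u = auxiliary, c = conjunction, p = preposition).
FUNCTION_POS = {"u", "c", "p"}

# One regular expression that matches a token only if it is exactly a stop word.
STOP_PATTERN = re.compile("^(?:" + "|".join(map(re.escape, STOP_WORDS)) + ")$")


def filter_tokens(tagged_tokens):
    """Keep tokens that are neither stop words nor function words."""
    kept = []
    for word, pos in tagged_tokens:
        if STOP_PATTERN.match(word):
            continue  # dropped: listed stop word
        if pos in FUNCTION_POS:
            continue  # dropped: function word or conjunction by POS tag
        kept.append(word)
    return kept


# Toy input standing in for a segmented and tagged sentence.
tagged = [
    ("一种", "m"), ("基于", "p"), ("说明书", "n"), ("的", "u"),
    ("专利", "n"), ("分类", "vn"), ("方法", "n"),
]
print(filter_tokens(tagged))
```

Working on `(word, pos)` pairs keeps the sketch independent of any particular segmentation library; a real pipeline would feed in the tagger's output instead of the toy list.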


Abstract

The invention discloses a specification-based patent classification method and belongs to the field of text processing and data mining. The method comprises the following steps: first, performing text preprocessing on patent specifications; second, building inverted index files and selecting feature words with a feature selection method that combines information gain with word frequency; third, calculating the weights of the feature words with an improved TF-IDF formula and creating patent feature vectors; fourth, establishing a training patent field set; and finally, classifying patents with an optimized KNN classifier. This work provides a new idea for patent literature classification and lays a foundation for further research on intelligent retrieval of patent literature and related topics.
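The pipeline in the abstract can be illustrated with a small sketch. The improved TF-IDF formula and the optimized KNN classifier are not given in this excerpt, so the code below substitutes standard TF-IDF and plain cosine-similarity KNN. The way information gain is combined with word frequency (a minimum-frequency filter followed by ranking on information gain) is also an assumption, and all function names and the toy documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: term frequency in that document}."""
    index = defaultdict(dict)
    for doc_id, tokens in enumerate(docs):
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index


def entropy(counts):
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def information_gain(term, index, labels):
    """Information gain of a term's presence/absence with respect to the labels."""
    n = len(labels)
    with_term = Counter(labels[d] for d in index[term])
    without_term = Counter(labels) - with_term
    gain = entropy(list(Counter(labels).values()))
    for part in (with_term, without_term):
        size = sum(part.values())
        if size:
            gain -= size / n * entropy(list(part.values()))
    return gain


def select_features(index, labels, k, min_tf=2):
    """Assumed combination: drop rare terms, then rank by information gain."""
    candidates = [t for t, post in index.items() if sum(post.values()) >= min_tf]
    candidates.sort(key=lambda t: (-information_gain(t, index, labels), t))
    return candidates[:k]


def tfidf_vector(tokens, features, index, n_docs):
    """Standard TF-IDF (not the patent's improved formula)."""
    tf = Counter(tokens)
    return [tf[t] * math.log(n_docs / len(index[t])) for t in features]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn_classify(query_vec, train_vecs, labels, k=3):
    """Plain majority-vote KNN over cosine similarity (not the optimized variant)."""
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: -cosine(query_vec, train_vecs[i]))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, already-preprocessed training documents with illustrative class labels.
docs = [
    ["search", "index", "query", "index"],
    ["query", "retrieval", "index", "text"],
    ["image", "pixel", "recognition", "image"],
    ["pixel", "image", "classifier", "feature"],
]
labels = ["G06F", "G06F", "G06K", "G06K"]

index = build_inverted_index(docs)
features = select_features(index, labels, k=4)
train_vecs = [tfidf_vector(d, features, index, len(docs)) for d in docs]

new_doc = ["pixel", "image", "segmentation", "image"]
predicted = knn_classify(tfidf_vector(new_doc, features, index, len(docs)),
                         train_vecs, labels, k=3)
print(features, predicted)
```

The inverted index is reused for three purposes here: total word frequency, document frequency for the IDF term, and the per-class presence counts needed for information gain.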

Description

Technical field

[0001] The invention belongs to the application of computer analysis technology in patent documents, and in particular relates to a patent classification method using patent specifications.

Background technique

[0002] A patent is a concrete manifestation of technological innovation and corporate value, and is one of the important carriers, achievements and sources of knowledge development and innovation. Many inventions and creations only appear in patent documents. According to the statistics of the World Intellectual Property Organization (WIPO), 70% to 90% of the world's inventions first appear in patent documents, rather than in other media such as magazines and papers. In addition, in order to protect their own interests, enterprises will apply for patents as early as possible. Patents often concentrate the most active and advanced technologies, including 90% to 95% of the world's technical information. At the same time, for the convenience of examina...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27, G06K9/62
CPC: G06F16/35, G06F40/289, G06F18/24147
Inventor: 朱玉全, 金健, 佘远程, 石亮
Owner: JIANGSU UNIV