
Method for automatically extracting key phrases of patent documents

Key phrase and patent document technology, applied in the field of text information processing. It addresses the lack of indexing tools, the dependence on manual labor, and high construction costs.

Active Publication Date: 2014-06-25
China Patent Information Center (中国专利信息中心)
Cites: 3; Cited by: 44

AI Technical Summary

Problems solved by technology

However, data processing requires a great deal of time, manpower, and money, so construction costs are very high and processing efficiency is currently unsatisfactory.
[0004] However, professional and accurate indexing tools are currently lacking. Most indexing is done manually to improve accuracy, which makes it even harder for indexing work to keep up with the growing number of patent applications.
Chinese invention patent CN1818906A provides an indexing method for patent documents. It establishes a correspondence between technical classifications and keywords and provides corrections to improve accuracy. However, it still relies on manual work and is not fully automatic, and the volume of data it must process is large, which makes it hard to put into practice.



Examples


Example 1

[0116] The following takes a fragment of a specific patent document as an example to illustrate the method involved in the present invention, but the following examples only illustrate the present invention, and are not intended to limit the present invention.

[0117] 【example】

[0118] Title of Invention: Transmission method of random access channel in time division duplex system

[0119] Main classification number: H04L1/18

[0120] Abstract: The present invention provides a random access channel transmission method in a time division duplex system, which includes the following steps: determining the number of RACHs in the UpPTS of the time division duplex system;...

[0121] Claims:

[0122] 1. A method for transmitting random access channel RACH in a time division duplex system, characterized in that...

[0123] Technical Field: The present invention relates to the field of communications, and in particular, to a method for transmitting a random access channel in a time divis...

Detailed description of embodiments

[0128] Hereinafter, the present invention will be described in detail with reference to the accompanying drawings and the embodiments.

[0129] …

[0130] First, read the above text from the patent document and mark the text fields. For example, the beginning and end of the abstract are marked with a pair of tags, the starting position of the claims is marked with a tag, and so on. The text fields can be marked by reading the document's existing XML tags or by using existing methods such as template matching. After the text fields are marked, the IPC main classification number is obtained and the position information of the text is recognized. Position information recognition mainly relies on preset rule templates and similar means.
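The field-marking step can be illustrated with a small template-matching sketch. It is not the patent's implementation: the label list, the function names `mark_fields` and `normalize_ipc`, and the sample text are assumptions made for illustration, with labels borrowed from the headings in the example fragment above.

```python
import re

# Hypothetical field labels, borrowed from the headings in the example fragment.
FIELD_LABELS = [
    "Title of Invention",
    "Main classification number",
    "Abstract",
    "Claims",
    "Technical Field",
]

# One template: a known label at the start of a line, followed by a colon.
FIELD_PATTERN = re.compile(
    r"^(%s):[ \t]*" % "|".join(re.escape(label) for label in FIELD_LABELS),
    re.MULTILINE,
)


def mark_fields(text):
    """Split a plain-text patent fragment into {label: field text}."""
    matches = list(FIELD_PATTERN.finditer(text))
    fields = {}
    for i, match in enumerate(matches):
        # A field runs from the end of its label to the start of the next label.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        fields[match.group(1)] = text[match.end():end].strip()
    return fields


def normalize_ipc(code):
    """Remove the spaces around the slash in an IPC symbol such as 'H04L1 / 18'."""
    return re.sub(r"\s*/\s*", "/", code.strip())


sample = (
    "Title of Invention: Transmission method of random access channel\n"
    "Main classification number: H04L1 / 18\n"
    "Abstract: The present invention provides a transmission method.\n"
    "Claims: 1. A method for transmitting random access channel RACH.\n"
)
fields = mark_fields(sample)
ipc_main = normalize_ipc(fields["Main classification number"])
```

When the source document already carries XML tags, reading those tags directly would replace the regular-expression template used here.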

[0131] Use existing tools to perform sentence segmentation, word segmentation, and part-of-speech tagging on the above text. Common word segmentation tools include ICTCLAS and CWS; common part-of-speech tagging methods are based on SVMs, conditional random fields, and HMMs. For...
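To show the shape of the data that this preprocessing step produces, here is a toy sketch on English text. It does not call ICTCLAS or CWS and does not train an SVM, CRF, or HMM tagger. The regular expressions, the tiny lookup table `TOY_LEXICON`, and the default `NOUN` tag are all stand-in assumptions.

```python
import re

# Toy lexicon (hypothetical). A real system would use a trained tagger instead.
TOY_LEXICON = {
    "the": "DET", "a": "DET",
    "provides": "VERB", "includes": "VERB",
    "in": "PREP", "of": "PREP",
}


def split_sentences(text):
    """Split after sentence-ending punctuation that is followed by whitespace."""
    parts = re.split(r"(?<=[.;!?])\s+", text.strip())
    return [part for part in parts if part]


def tokenize(sentence):
    """Return alphanumeric tokens, keeping hyphenated words together."""
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", sentence)


def tag(tokens):
    """Look each token up in the toy lexicon; unknown words default to NOUN."""
    return [(token, TOY_LEXICON.get(token.lower(), "NOUN")) for token in tokens]


text = "The method provides a channel. It includes steps."
tagged = [tag(tokenize(sentence)) for sentence in split_sentences(text)]
```

The output is a list of sentences, each a list of `(token, tag)` pairs, which is the form that later phrase extraction steps typically consume.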

Example 2

[0135] 【example】

[0136] Invention Title: Combined structure of heterogeneous shells based on inserts and grooves

[0137] Main classification number: G06F1/18

[0138] Abstract: A heterogeneous shell combination structure, including a first member, a second member, and an adhesive. …

[0139] Claims: 1. A heterogeneous shell combination structure, comprising: a first member having at least one groove;...

[0140] Technical Field: The present invention relates to a shell coupling structure, and more particularly, to a structure that strengthens the coupling strength of a shell of a heterogeneous material.

[0141] BACKGROUND OF THE INVENTION: In order to meet the requirements of today's consumers, the shells of current notebook computers emphasize the characteristics of good heat dissipation performance, light weight, sturdiness and wear resistance, and diverse colors.

[0142] …

[0143] SUMMARY OF THE INVENTION In view of the above problems, the present invention provides a heterogeneo...


Abstract

The invention provides a method for automatically extracting key phrases from patent documents. The method comprises the following steps: 1, preprocessing the text; 2, recognizing the topic type of the patented invention; 3, extracting candidate key phrases and filtering them; 4, calculating weights for the candidate key phrases and selecting the key phrases.
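Steps 3 and 4 can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the filtering rules or the weighting formula, so the stopword list, the per-field weights in `FIELD_WEIGHTS`, and the n-gram candidate rule are all hypothetical. Step 2 (topic type recognition) is omitted because the text available here does not describe it.

```python
import re
from collections import Counter

# Assumed stopword filter and per-field weights; neither comes from the patent text.
STOPWORDS = {"a", "an", "the", "of", "in", "for", "and", "to", "is",
             "which", "present", "invention", "provides", "includes"}
FIELD_WEIGHTS = {"title": 3.0, "abstract": 2.0, "claims": 1.0}


def candidate_phrases(text, max_len=3):
    """Return 2..max_len word n-grams taken from runs of non-stopword tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    phrases, run = [], []
    for token in tokens + [None]:  # None is a sentinel that flushes the last run
        if token is not None and token not in STOPWORDS:
            run.append(token)
            continue
        for n in range(2, max_len + 1):
            for i in range(len(run) - n + 1):
                phrases.append(" ".join(run[i:i + n]))
        run = []
    return phrases


def score_phrases(fields):
    """Add the field weight to a phrase's score each time the phrase occurs."""
    scores = Counter()
    for name, text in fields.items():
        weight = FIELD_WEIGHTS.get(name, 1.0)
        for phrase in candidate_phrases(text):
            scores[phrase] += weight
    return scores


def select_key_phrases(scores, top_k):
    """Pick the top_k phrases by score, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [phrase for phrase, _ in ranked[:top_k]]


fields = {
    "title": "Transmission method of random access channel in time division duplex system",
    "abstract": "The present invention provides a random access channel transmission "
                "method in a time division duplex system",
}
scores = score_phrases(fields)
top = select_key_phrases(scores, top_k=9)
```

Phrases that occur in both the title and the abstract, such as `random access channel`, accumulate both field weights and outrank phrases found only in the abstract.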

Description

Technical field

[0001] The present invention relates to text information processing technology, and more specifically to a method for automatically extracting key phrases from patent documents.

Background

[0002] With the rapid growth in the number of patent documents, both professional and public searching of patent documents has become increasingly common. Achieving good recall and precision on patent document data has become the central difficulty of patent information retrieval. For a long time, retrieval over original patent data has often yielded poor recall and precision, and the two goals often conflict. Because the original information in patent documents comes from the applicant's original submissions, it often contains a large amount of related technical information and cited techniques, so a search that tries to preserve recall brings in too many documents, a lot of noise data or Noise li...


Application Information

IPC(8): G06F17/27
Inventor: 任智军, 张威, 李进, 杨婧, 张江涛, 肖湘
Owner: 中国专利信息中心