Automatic industry classification method and system
An automatic industry classification technology, applied in the field of document analysis, that addresses problems such as lost word-order information and missed in-depth information in patent texts
Active Publication Date: 2020-05-08
BEIJING BENYING TECH CO LTD
6 Cites, 0 Cited by
AI Technical Summary
Problems solved by technology
The disadvantage of existing methods is that the natural language processing they use loses word-order information and does not use hierarchical vectors generated from the abstract, claims and specification, so it misses the in-depth information contained in the patent text.
Embodiment 1
[0084] As shown in Figures 1 and 2, step 1000 is executed: the industry tree generation module 200 is used to define the target industry tree, and the scope of patents to be classified is determined manually as required.
[0085] Step 1100 is executed to determine the target patent scope using the confirmation module 210. Define the industry tree as needed: $I = \{i_1, \ldots, i_j, \ldots, i_n\}$, where $i_j \in I$ is a first-level industry, $j$ is the number of the first-level industry, $1 \le j \le n$, $n$ is the number of first-level industries, and $N$ is the number of all leaf nodes under $I$. Any non-leaf node of $I$ is written $i_{jkl\ldots} = \{i_{jkl\ldots 1}, \ldots, i_{jkl\ldots t}\}$, and every node other than a leaf node has degree $\ge 2$, where $k$ is the second-level industry number, $l$ is the third-level industry number, and $t$ is the second-to-last industry number.
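The numbering scheme in this paragraph can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the nested-dict representation, the toy industry names, and the helper names `index_paths` and `label` are hypothetical and do not come from the patent.

```python
# Hypothetical toy industry tree: each key is a node, each value holds its children.
# An empty dict marks a leaf node. Industry names are illustrative only.
INDUSTRY_TREE = {
    "Energy": {
        "Solar": {},
        "Wind": {},
    },
    "Materials": {
        "Polymers": {},
        "Metals": {"Steel": {}, "Aluminium": {}},
    },
}


def index_paths(tree, prefix=()):
    """Yield (index path, name) for every node, numbering siblings from 1.

    The path (j, k, l, ...) mirrors the subscripts of i_{jkl...} in the text.
    """
    for position, (name, children) in enumerate(tree.items(), start=1):
        path = prefix + (position,)
        yield path, name
        yield from index_paths(children, path)


def label(path):
    """Render an index path such as (2, 2, 1) as the string 'i_2.2.1'."""
    return "i_" + ".".join(str(p) for p in path)


# Map every subscripted label to the industry it denotes.
labels = {label(path): name for path, name in index_paths(INDUSTRY_TREE)}

for key, name in labels.items():
    print(key, name)
```

Dots separate the subscripts here only to keep multi-digit indices unambiguous; the patent text writes them run together.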
[0086] Step 1200 is executed: the tag generation module 220 is used to generate tags on the target industry tree. The number p of patents that can be tagged is determined according to resource constraints, with p ≥ N, and tag at leas...
Embodiment 2
[0115] A method for automatic industry classification, comprising the following steps:
[0116] 1. Define the target industry tree. Define the industry tree as needed: $I = \{i_1, \ldots, i_n\}$, where $i_j \in I$ is a primary industry, which can be further divided into secondary industries, $i_j = \{i_{j1}, \ldots, i_{jm}\}$, and so on, so that any non-leaf node of $I$ is $i_{jkl\ldots} = \{i_{jkl\ldots 1}, \ldots, i_{jkl\ldots t}\}$. According to the general practice of industry division, every node other than a leaf node has degree $\ge 2$. Let $N$ be the number of all leaf nodes under $I$.
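A minimal sketch of how the two structural facts in this step could be checked: the degree constraint on non-leaf nodes and the leaf count N. The nested-dict representation, the toy industries, and the function names `count_leaves` and `check_degrees` are assumptions made for illustration, not part of the patent.

```python
def count_leaves(node):
    """Return the number of leaf nodes under `node` (an empty dict is a leaf)."""
    if not node:
        return 1
    return sum(count_leaves(child) for child in node.values())


def check_degrees(node, name="I"):
    """Raise ValueError if any non-leaf node has fewer than 2 children."""
    if node and len(node) < 2:
        raise ValueError(f"non-leaf node {name!r} has degree {len(node)}, expected >= 2")
    for child_name, child in node.items():
        check_degrees(child, child_name)


# Hypothetical tree with two primary industries, each split into two leaves.
tree = {
    "Energy": {"Solar": {}, "Wind": {}},
    "Materials": {"Polymers": {}, "Metals": {}},
}

check_degrees(tree)          # passes silently: every non-leaf node has degree >= 2
N = count_leaves(tree)       # number of leaf industries under I
print("N =", N)

# A tree that violates the constraint: "Energy" has a single child.
bad_tree = {"Energy": {"Solar": {}}, "Materials": {"Polymers": {}, "Metals": {}}}
try:
    check_degrees(bad_tree)
    violation = None
except ValueError as err:
    violation = str(err)
print(violation)
```

The degree constraint is what makes every split meaningful: a non-leaf node with a single child would add a level to the tree without distinguishing anything.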
Abstract
The invention provides an automatic industry classification method and system. The method comprises the following steps: determining a target patent range; defining a target industry tree; generating marks on the target industry tree; performing rough classification of the target patents using the marks; and performing fine classification of the target patents according to the rough classification result. The method and system use a transductive ("direct-push") learning method to fully mine the information in a small amount of annotation; they use IPC information, which enriches the information dimensions and reduces the amount of calculation; and they use hierarchical vectors generated from the abstracts, claims and specifications, which preserves word-order information and mines the patent texts more deeply.
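The rough-then-fine ordering described in the abstract can be sketched as follows. This is a toy illustration only: the abstract does not say which classifier is used, so a nearest-centroid rule stands in for it, and the sketch does not implement the transductive learning or the use of IPC information. The vectors, industry names, and function names are all hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def nearest(vector, centroids):
    """Return the key whose centroid is closest to `vector` (Euclidean distance)."""
    return min(centroids, key=lambda key: math.dist(vector, centroids[key]))


# Hypothetical marked patents: (first-level industry, leaf industry) -> text vectors.
LABELED = {
    ("Energy", "Solar"): [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    ("Energy", "Wind"): [[0.7, 0.0, 0.4], [0.6, 0.1, 0.5]],
    ("Materials", "Polymers"): [[0.1, 0.9, 0.1], [0.0, 0.8, 0.2]],
    ("Materials", "Metals"): [[0.1, 0.6, 0.7], [0.2, 0.5, 0.8]],
}


def classify(vector):
    """Rough classification into a first-level industry, then fine classification."""
    # Rough step: pool the marked vectors of each first-level industry.
    pooled = {}
    for (top, _leaf), vectors in LABELED.items():
        pooled.setdefault(top, []).extend(vectors)
    top = nearest(vector, {name: centroid(vs) for name, vs in pooled.items()})

    # Fine step: compare only against the leaves under the chosen industry.
    fine = {leaf: centroid(vs) for (t, leaf), vs in LABELED.items() if t == top}
    return top, nearest(vector, fine)


print(classify([0.85, 0.1, 0.05]))
print(classify([0.15, 0.55, 0.75]))
```

Restricting the fine step to the leaves under the chosen first-level industry is what reduces the number of comparisons per patent; the trade-off is that an error in the rough step cannot be corrected in the fine step.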
Description
Technical field
[0001] The invention relates to the technical field of document analysis, in particular to an automatic industry classification method and system.
Background art
[0002] The rapid development of science and technology has brought a surge of patent texts and the continuous emergence of new industries. To analyze technological development in the context of an industry, patents must be labeled with industry labels. Manual labeling is accurate but slow and expensive. Therefore, there is a need for an automatic classification method that requires only a small amount of annotation, is computationally efficient, and mines the annotation information more fully.
[0003] Existing methods either require a large amount of manual labeling or use no manual labeling at all, so that the correspondence with the target industry cannot be established directly. Existing methods generally use patent texts for natura...