An industry automatic classification method and system

An automatic industry classification technology, applied in the field of document analysis, that addresses the loss of word-order information and the omission of in-depth information in patent text, and achieves a smaller annotation workload, higher computing efficiency, and higher classification accuracy.

Active Publication Date: 2021-09-24
BEIJING BENYING TECH CO LTD
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The disadvantage of the existing method is that its natural language processing loses word-order information, does not use hierarchical vectors generated from the abstract, claims, and description, and therefore misses the in-depth information contained in the patent text.



Examples


Embodiment 1

[0084] As shown in Figures 1 and 2, in step 1000 the industry tree generation module 200 defines the target industry tree. The scope of the patents to be classified needs to be determined manually.

[0085] Step 1100: the validation module 210 determines the target patent scope. The industry tree to be defined is $I = \{i_1, \ldots, i_j, \ldots, i_n\}$, where $i_j \in I$ is a first-level industry, $j$ is the first-level industry index, $1 \le j \le n$, and $N$ is the number of all leaf nodes of $I$. Any non-leaf node of $I$ is itself a set, $i_{jkl\ldots} = \{i_{jkl\ldots 1}, \ldots, i_{jkl\ldots t}\}$, and every node other than a leaf node has at least 2 child nodes, where $k$ is the second-level industry index, $l$ is the third-level industry index, and $t$ is the penultimate-level industry index.
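The tree definition above can be illustrated with a small sketch. It assumes the industry tree is stored as nested dictionaries, with `None` marking a leaf; the industry names and the helper functions `leaf_paths` and `undersized_nodes` are hypothetical and do not come from the patent.

```python
def leaf_paths(tree, prefix=()):
    """Return the path (tuple of names) from the root to every leaf node."""
    paths = []
    for name, children in tree.items():
        path = prefix + (name,)
        if not children:  # None (or an empty dict) marks a leaf
            paths.append(path)
        else:
            paths.extend(leaf_paths(children, path))
    return paths


def undersized_nodes(tree, prefix=()):
    """Return paths of non-leaf nodes that have fewer than 2 children."""
    bad = []
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            if len(children) < 2:
                bad.append(path)
            bad.extend(undersized_nodes(children, path))
    return bad


# Hypothetical industry tree I with n = 2 first-level industries.
industry_tree = {
    "energy": {"solar": None, "wind": None},
    "materials": {
        "polymers": None,
        "ceramics": {"structural": None, "functional": None},
    },
}

N = len(leaf_paths(industry_tree))  # number of leaf nodes of I
print(N, undersized_nodes(industry_tree))
```

Each path returned by `leaf_paths` plays the role of an index sequence such as j, k, l in the notation above, and an empty result from `undersized_nodes` means every non-leaf node satisfies the "at least 2 children" condition.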

[0086] Step 1200: the tag generation module 220 generates marks on the target industry tree. The number $p$ of patents that can be marked is determined by resource constraints, with $p \ge N$; for each leaf node of the industry tree, at least one patent belonging to that node is marked, where $N$ is the number of last-level industries.
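The two conditions in this step (p ≥ N, and at least one marked patent per leaf node) can be checked mechanically. The following is a minimal sketch, assuming marks are stored as a mapping from a patent identifier to a leaf industry; the function name `check_marks`, the patent identifiers, and the leaf names are all hypothetical.

```python
from collections import Counter


def check_marks(leaf_ids, marks):
    """Check the marking constraints for a set of marked patents.

    leaf_ids: list of leaf-node identifiers of the industry tree.
    marks:    dict mapping patent id -> leaf id it was marked with.
    Returns (ok, uncovered), where uncovered lists leaves with no marked patent.
    """
    p = len(marks)      # number of marked patents
    n_leaves = len(leaf_ids)  # N, the number of last-level industries
    counts = Counter(marks.values())
    uncovered = [leaf for leaf in leaf_ids if counts[leaf] == 0]
    return (p >= n_leaves and not uncovered), uncovered


# Hypothetical data: three leaves, three marked patents, one leaf uncovered.
leaves = ["solar", "wind", "polymers"]
marks = {"patent-A": "solar", "patent-B": "wind", "patent-C": "solar"}
print(check_marks(leaves, marks))
```

In this example p equals N, yet the check fails because no marked patent belongs to `polymers`, which shows that p ≥ N alone does not guarantee coverage of every leaf.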

[0087] Step 1300, c...

Embodiment 2

[0115] A method for automatic classification of industries, including the following steps:

[0116] 1. Define the target industry tree. The industry tree to be defined is $I = \{i_1, \ldots, i_n\}$, where $i_j \in I$ is a first-level industry that can be further divided into second-level industries, $i_j = \{i_{j1}, \ldots, i_{jm}\}$, and so on, so that any non-leaf node of $I$ is $i_{jkl\ldots} = \{i_{jkl\ldots 1}, \ldots, i_{jkl\ldots t}\}$. Following the general practice of industry division, every node other than a leaf node has at least 2 child nodes. Let $N$ be the number of all leaf nodes of $I$.



Abstract

The present invention provides an automatic industry classification method and system. The method includes determining the target patent scope and further includes the following steps: defining a target industry tree; generating marks on the target industry tree; using the marks to perform coarse classification of the target patents; and performing fine classification of the target patents according to the coarse classification results. The proposed method and system use transductive learning to fully exploit a small amount of label information; use IPC information to enrich the information dimensions and reduce the amount of computation; and use hierarchical vectors generated from the abstract, claims, and description to retain word-order information and mine the patent text more deeply.
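The coarse-then-fine order of the steps can be sketched as follows. This is only an illustration under strong assumptions: it substitutes a simple nearest-centroid rule for the patent's classifiers, does not implement transductive learning or IPC features, and takes each patent as an already computed numeric vector. The function names and the toy vectors are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def nearest(vec, centroids):
    """Return the key whose centroid is closest to vec (Euclidean distance)."""
    return min(centroids, key=lambda key: math.dist(vec, centroids[key]))


def classify(vec, marked):
    """Coarse-to-fine assignment of one patent vector.

    marked: list of (vector, (first_level, leaf)) pairs for marked patents.
    """
    # Coarse step: choose the first-level industry.
    by_top = {}
    for v, (top, _leaf) in marked:
        by_top.setdefault(top, []).append(v)
    top = nearest(vec, {t: centroid(vs) for t, vs in by_top.items()})

    # Fine step: choose only among the leaves under that industry.
    by_leaf = {}
    for v, (t, leaf) in marked:
        if t == top:
            by_leaf.setdefault(leaf, []).append(v)
    leaf = nearest(vec, {l: centroid(vs) for l, vs in by_leaf.items()})
    return top, leaf


# Hypothetical 2-D vectors standing in for patent text representations.
marked = [
    ([0.0, 0.0], ("energy", "solar")),
    ([0.0, 1.0], ("energy", "wind")),
    ([5.0, 5.0], ("materials", "polymers")),
    ([5.0, 6.0], ("materials", "ceramics")),
]
print(classify([0.1, 0.9], marked))
```

Restricting the fine step to the leaves of the chosen first-level industry is what keeps the second comparison small, which is one plausible reading of how a coarse pass can reduce computation.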

Description

Technical field

[0001] The present invention relates to the field of document analysis, and more particularly to an automatic industry classification method and system.

Background technique

[0002] The rapid development of science and technology has brought a surge of emerging industries and new patent texts. To analyze technological development against an industry background, patents need to be marked with industry labels. Manual annotation is highly accurate but slow and costly. There is therefore a need for an automatic classification method that requires less labeling, computes more efficiently, and exploits the labeled information more fully.

[0003] Existing methods either require a large amount of manual annotation, or use no manual annotation at all and therefore cannot directly establish a correspondence with the target industries. Conventional patent methods generally apply natural language processing to the text, while the informative and important IPC dimension is omitt...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/289, G06K9/62
CPC: G06F18/23, G06F18/24323, G06V10/7625, G06F18/23213, G06F40/289, G06F40/30, G06Q50/184, G06F2216/11, G06F16/353, G06Q10/0637
Inventor: 李卫宁
Owner: BEIJING BENYING TECH CO LTD