Patent text classification method combining ALBERT and BiGRU

A patent text classification technology, applied in the field of computer analysis of patent documents, that addresses the exploding and vanishing gradients of RNNs and their unsatisfactory processing of long text sequences, and that improves both the representation ability of word vectors and the classification effect.

Status: Inactive
Publication Date: 2021-02-12
HUBEI UNIV
Cites: 0, Cited by: 1

AI Technical Summary

Problems solved by technology

Traditional RNNs suffer from exploding and vanishing gradients, so they process long text sequences poorly.
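The gradient problem can be illustrated with a deliberately simplified sketch. It assumes a scalar linear recurrence h_t = w * h_{t-1}, which is not the model in the patent; the function name `gradient_through_time` and the values of `w` are hypothetical and chosen only for illustration. In this toy case the gradient of the last state with respect to the first is w raised to the number of steps, so it shrinks toward zero when |w| < 1 and grows without bound when |w| > 1.

```python
def gradient_through_time(w, steps):
    """Gradient d h_T / d h_0 for the toy recurrence h_t = w * h_{t-1}."""
    grad = 1.0
    for _ in range(steps):
        # Each time step contributes one factor of w via the chain rule.
        grad *= w
    return grad


# Hypothetical recurrent weights just below and just above 1.
vanishing = gradient_through_time(0.9, 100)
exploding = gradient_through_time(1.1, 100)

print(f"w=0.9, 100 steps: {vanishing:.2e}")
print(f"w=1.1, 100 steps: {exploding:.2e}")
```

Gated units such as the GRU mitigate the vanishing case by letting the update gate carry the previous state forward largely unchanged.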


Examples


Detailed Description of the Embodiments

[0034] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention.

[0035] Figure 1 is a flow chart of the patent text classification algorithm combining ALBERT and BiGRU according to the present invention.

[0036] As shown in Figure 1, the method for classifying patent texts includes the following steps:

[0037] Step 1. Clean the patent data set released by the State Information Center: remove patent texts whose classification numbers are not strict IPC classifications, and label the remaining patent data according to their IPC classification numbers. After cleaning, about 2.32 million original records remain, covering all IPC sections (A to H) and 124 categories ...
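The cleaning and labeling in Step 1 might look roughly like the sketch below. Several details are assumptions, not taken from the patent: the regular expression is one plausible reading of a "strict" IPC symbol (section letter A to H, two-digit class, subclass letter, main group, slash, subgroup), labeling at the class level (for example `G06`) is a guess at how the categories are formed, and the names `ipc_label`, `clean_and_label`, and the sample records are hypothetical.

```python
import re

# Assumed shape of a full IPC symbol such as "G06F16/35":
# section (A-H), class (2 digits), subclass (letter), main group "/" subgroup.
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def ipc_label(code, level="class"):
    """Return a label derived from an IPC symbol, or None if it is malformed."""
    match = IPC_PATTERN.match(code.strip())
    if match is None:
        return None
    section, cls, subclass = match.group(1), match.group(2), match.group(3)
    if level == "section":
        return section
    if level == "class":
        return section + cls
    return section + cls + subclass


def clean_and_label(records, level="class"):
    """Keep only records with a well-formed IPC symbol and attach a label."""
    labeled = []
    for text, code in records:
        label = ipc_label(code, level)
        if label is not None:
            labeled.append((text, label))
    return labeled


# Hypothetical sample records: (patent text, classification number).
records = [
    ("A patent text classification method", "G06F16/35"),
    ("A neural network architecture", "G06N3/04"),
    ("A record with a malformed classification number", "06-01"),
]

for text, label in clean_and_label(records):
    print(label, text)
```

Changing `level` to `"section"` or `"subclass"` yields coarser or finer labels; which granularity matches the categories in the patent is not stated in the visible text.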


Abstract

The invention belongs to the technical field of computer analysis of patent literature, and in particular relates to a patent text classification method combining ALBERT and BiGRU. Most existing patent text classification algorithms obtain word vector representations of a text with Word2vec or similar methods, which discard the position information of many words and cannot represent the complete semantics of the text. To solve this problem, a patent text classification method combining ALBERT and BiGRU is provided. Dynamic word vectors pre-trained by ALBERT replace the static word vectors trained by traditional methods such as Word2vec, which improves the representation capacity of the word vectors. A BiGRU neural network model is then used for training, which preserves the semantic associations between long-distance words in the patent text to the greatest extent and improves the patent text classification effect. The method performs well on multiple evaluation metrics.
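The BiGRU stage described in the abstract can be illustrated with a minimal forward-pass sketch in pure Python. It is not the patented implementation. The assumptions are: random vectors stand in for ALBERT's contextual token embeddings, the weights are untrained, the sentence representation is the concatenation of the final forward and backward hidden states, and the state update follows the common convention h_t = (1 - z) * h_{t-1} + z * candidate. All names and dimensions (`GRUCell`, `bigru_encode`, `EMBED_DIM`, and so on) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    def __init__(self, input_dim, hidden_dim, rng):
        # One (W, U, b) triple per gate: update z, reset r, candidate h.
        self.params = {
            gate: (rand_matrix(hidden_dim, input_dim, rng),
                   rand_matrix(hidden_dim, hidden_dim, rng),
                   [0.0] * hidden_dim)
            for gate in ("z", "r", "h")
        }

    def _affine(self, gate, x, h):
        W, U, b = self.params[gate]
        return [a + c + d for a, c, d in zip(matvec(W, x), matvec(U, h), b)]

    def step(self, x, h_prev):
        z = [sigmoid(v) for v in self._affine("z", x, h_prev)]
        r = [sigmoid(v) for v in self._affine("r", x, h_prev)]
        # The reset gate scales the previous state before the candidate is computed.
        gated = [ri * hi for ri, hi in zip(r, h_prev)]
        cand = [math.tanh(v) for v in self._affine("h", x, gated)]
        # The update gate interpolates between the old state and the candidate.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, cand)]


def run_gru(cell, sequence, hidden_dim):
    h = [0.0] * hidden_dim
    for x in sequence:
        h = cell.step(x, h)
    return h


def bigru_encode(fwd, bwd, sequence, hidden_dim):
    # Read the sequence left-to-right and right-to-left, then concatenate.
    return run_gru(fwd, sequence, hidden_dim) + run_gru(bwd, sequence[::-1], hidden_dim)


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
EMBED_DIM, HIDDEN_DIM, NUM_CLASSES = 8, 4, 3  # toy sizes, not from the patent

# Stand-in for ALBERT output: one vector per token of a 5-token text.
token_vectors = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(5)]

fwd = GRUCell(EMBED_DIM, HIDDEN_DIM, rng)
bwd = GRUCell(EMBED_DIM, HIDDEN_DIM, rng)
features = bigru_encode(fwd, bwd, token_vectors, HIDDEN_DIM)

# Untrained linear layer plus softmax over the hypothetical classes.
W_out = rand_matrix(NUM_CLASSES, 2 * HIDDEN_DIM, rng)
probs = softmax(matvec(W_out, features))
print(len(features), probs)
```

In practice the token vectors would come from a pre-trained ALBERT encoder and the GRU and output weights would be learned from the labeled patent data; the sketch only shows how the two directions are combined into one feature vector for classification.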

Description

Technical field

[0001] The invention belongs to the technical field of computer analysis of patent documents, and in particular relates to a patent text classification method combining ALBERT and BiGRU.

Background

[0002] With the rapid development of science and information technology, the number of patent applications is increasing year by year. In 2018, innovators worldwide filed a total of 3.3 million invention patent applications, an increase of 5.2% and the ninth consecutive year of growth. Among them, the State Intellectual Property Office of China accepted the largest number of applications, 1.54 million, accounting for 46.7% of the global total. To facilitate the retrieval and management of patent documents, they need to be classified by professional technical field. At present, the task of patent classification is still mainly completed by patent examiners, which not only consume...


Application Information

IPC(8): G06F16/35, G06N3/04
CPC: G06F16/353, G06F16/355, G06N3/045, G06N3/044
Inventor: 曾诚, 温超东, 任俊伟, 何鹏, 马传香, 肖奎
Owner: HUBEI UNIV