A multi-task-assisted extreme multi-label short text classification method using co-occurrence information

A text classification method applied in the field of information technology. It addresses problems of existing approaches, such as results that depend on preprocessing, difficulty of effective practical application, and output layers that grow with the number of labels. It reduces maintenance costs and improves robustness.

Active Publication Date: 2022-07-19
TIANJIN UNIV OF SCI & TECH
Cites: 5. Cited by: 0.

AI Technical Summary

Problems solved by technology

Existing approaches rely mostly on embedding, multi-classifier, tree, and deep learning methods. The embedding method has high time complexity, and its results depend heavily on the quality of clustering during preprocessing. The multi-classifier method ignores the information between labels and treats labels as independent; because every label requires training its own classifier, deployment cost is huge and the method is hard to apply effectively in real business scenarios. The tree method cannot solve the long-tail problem in the data set; it is large in scale, costly, and low in accuracy, which makes it hard to use stably in industrial scenarios.
Existing deep learning methods do not address the long-tail problem but simply increase the number of neurons in the output layer, so they usually perform worse than the other three methods.



Examples


Detailed Description of the Embodiments

[0034] The present invention will be described in further detail below in conjunction with the accompanying drawings.

[0035] The design idea of the present invention is to use co-occurrence information and multi-task learning to improve prediction in extreme multi-label text classification tasks. Co-occurrence information has an explicit association with the tags in the tag set. Inspired by this association, the present invention constructs co-occurrence information from the relevant feature information of the account itself, which effectively improves the method's predictions on both high-frequency and low-frequency tags. Further, inspired by parameter sharing in multi-task learning, the present invention uses the information learned from the multi-label text classification task to assist the extreme multi-label text classification task when it is difficult to simply solve p...
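The idea of building co-occurrence information from account features can be illustrated with a minimal sketch. The text above does not specify how the co-occurrence information is computed, so everything here is an assumption made for illustration: the function names `build_cooccurrence` and `cooccurrence_input`, the representation of account features as strings, and the use of raw counts. The toy data is invented.

```python
from collections import Counter, defaultdict


def build_cooccurrence(samples):
    """Count how often each account feature appears together with each label.

    `samples` is an iterable of (account_features, labels) pairs. Both field
    names are hypothetical; they are not taken from the patent text.
    """
    counts = defaultdict(Counter)
    for features, labels in samples:
        for feature in set(features):
            for label in set(labels):
                counts[feature][label] += 1
    return counts


def cooccurrence_input(features, counts, top_k=3):
    """Return the top_k labels that co-occur most often with the given features."""
    scores = Counter()
    for feature in set(features):
        scores.update(counts.get(feature, {}))
    # Sort by descending count, then by label name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [label for label, _ in ranked[:top_k]]


# Toy data, invented for illustration only.
train = [
    (["sports_fan", "beijing"], ["football", "news"]),
    (["sports_fan"], ["football", "basketball"]),
    (["foodie", "beijing"], ["restaurants", "news"]),
]
counts = build_cooccurrence(train)
hints = cooccurrence_input(["sports_fan", "beijing"], counts, top_k=2)
print(hints)
```

In a setup like this, the returned label hints would be encoded and passed to the classifier as an explicit extra input alongside the short text, which is one plausible reading of "explicit model input co-occurrence information".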


Abstract

The invention relates to a multi-task-assisted extreme multi-label short text classification method using co-occurrence information. The method comprises: [...] modeling the feature information as explicit co-occurrence information for model input; constructing a multi-label text classification task and an extreme multi-label text classification task for microblog short texts; building a multi-task learning model; pre-training the multi-task learning model on microblog short text data; fine-tuning the multi-task learning model; and quantizing the neural network output to produce the final multi-task prediction result. The invention uses co-occurrence information to design a multi-task learning architecture and achieves multi-label classification of large-scale short texts. The method provides stable, accurate, and real-time multi-label prediction on large-scale short text data sets at low industrial deployment cost.
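The multi-task architecture named in the abstract can be sketched as a shared encoder with one output head per task. This is a minimal pure-Python illustration under stated assumptions, not the patented model: the class `MultiTaskSketch`, the layer sizes, the random untrained weights, the weighted sum of binary cross-entropy losses in `joint_loss`, and the reading of the final output step as thresholding at 0.5 are all hypothetical. Pre-training and fine-tuning are not shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class MultiTaskSketch:
    """Shared encoder with two task heads (random weights, never trained)."""

    def __init__(self, input_dim, hidden_dim, n_coarse, n_extreme, seed=0):
        rng = random.Random(seed)
        self.shared = make_matrix(hidden_dim, input_dim, rng)        # shared parameters
        self.coarse_head = make_matrix(n_coarse, hidden_dim, rng)    # multi-label task
        self.extreme_head = make_matrix(n_extreme, hidden_dim, rng)  # extreme multi-label task

    def forward(self, text_vec, cooc_vec):
        # Assumption: co-occurrence features are concatenated with text features.
        x = list(text_vec) + list(cooc_vec)
        hidden = [math.tanh(h) for h in matvec(self.shared, x)]
        coarse = [sigmoid(z) for z in matvec(self.coarse_head, hidden)]
        extreme = [sigmoid(z) for z in matvec(self.extreme_head, hidden)]
        return coarse, extreme


def bce(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over one label vector."""
    total = 0.0
    for p, t in zip(probs, targets):
        total += t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
    return -total / len(probs)


def joint_loss(coarse, extreme, coarse_y, extreme_y, alpha=1.0):
    """Hypothetical multi-task objective: weighted sum of the two task losses."""
    return bce(coarse, coarse_y) + alpha * bce(extreme, extreme_y)


def binarize(probs, threshold=0.5):
    """Turn per-label probabilities into 0/1 label decisions."""
    return [1 if p >= threshold else 0 for p in probs]


model = MultiTaskSketch(input_dim=6, hidden_dim=4, n_coarse=3, n_extreme=10)
coarse_probs, extreme_probs = model.forward([1, 0, 1, 0], [0, 1])
loss = joint_loss(coarse_probs, extreme_probs, [1, 0, 0], [0] * 9 + [1])
coarse_pred = binarize(coarse_probs)
extreme_pred = binarize(extreme_probs)
```

Because both heads read the same hidden representation, gradients from the smaller multi-label task would also update the shared layer, which is the mechanism by which one task can assist the other in a design like this.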

Description

Technical field

[0001] The invention belongs to the field of information technology and relates to natural language processing and text classification, in particular to a multi-task-assisted extreme multi-label short text classification method using co-occurrence information.

Background

[0002] As text data grows in production speed, diversity, and semantic complexity, traditional multi-label text classification methods can no longer meet everyday industrial needs for accuracy and real-time performance, while demand for multi-label text classification keeps increasing.

[0003] To solve these problems, existing technologies mostly rely on embedding, multi-classifier, tree, and deep learning methods. Among them, the embedding method has high time complexity, and its results depend heavily on the quality of clustering during preprocessing; the multi-classifier method ignores ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06N3/08
CPC: G06F16/35, G06N3/08
Inventors: 王嫄, 徐涛, 王世龙, 周宇博, 王欢, 杨巨成, 赵婷婷, 陈亚瑞
Owner TIANJIN UNIV OF SCI & TECH