Text classification method and device, computer equipment and storage medium

A text classification technology, applied to computer equipment and storage media. It addresses the problem that sentence vectors carrying too little information reduce the accuracy of text classification, and it improves accuracy by expanding the information basis on which classification is made.

Active Publication Date: 2019-12-20
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, solutions in the related art use multiple sets of encoders to encode multiple texts in parallel, so the sentence vector of each text represents only the features of that text. Each sentence vector therefore carries relatively little information, which limits the accuracy of text classification.
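The limitation described in [0004] can be illustrated with a minimal sketch. It assumes a mean-of-word-vectors encoder and pseudo-random word vectors; neither is taken from the patent, and the names `word_vector`, `encode`, and `encode_all` are hypothetical. The point it shows is only that, when each text is encoded in isolation, a text's sentence vector is the same no matter which other texts accompany it.

```python
import random


def word_vector(word, dim=4):
    """Deterministic stand-in for a trained word embedding (assumption)."""
    rng = random.Random(word)  # seeding with the word makes the vector repeatable
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def encode(text, dim=4):
    """Encoder for a single text: the mean of its word vectors (assumption)."""
    vecs = [word_vector(w, dim) for w in text.split()]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def encode_all(texts):
    """Parallel-encoder baseline: every text is encoded without seeing the others."""
    return [encode(t) for t in texts]


# The first text is identical in both batches; only its companion differs.
batch_a = encode_all(["where is the station", "the station is north"])
batch_b = encode_all(["where is the station", "i like tea"])

# The sentence vector of the first text carries no trace of its companion.
unchanged = batch_a[0] == batch_b[0]
```

Here `unchanged` is true because the encoder never reads the other texts, which is the thin-information problem the related art suffers from.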



Examples


Detailed Description of Embodiments

[0064] Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with aspects of the present application as recited in the appended claims.

[0065] This application proposes a text classification scheme, which can extract correlation features between multiple texts through a self-attention mechanism in the process of multi-text classification, so as to improve the accuracy of multi-text classification. For ease of understanding, several terms involved in the embodiments of the present application are explained below.

[0066...


Abstract

The invention relates to a text classification method in the technical field of natural language processing. The method comprises: generating a long text containing at least two texts to be classified; processing the long text through a self-attention sub-model to obtain a fused word vector for each word in the long text, where the self-attention sub-model fuses the association relationships among the words into the original word vector of each word; and processing the fused word vector of each word in the long text through an output sub-model to obtain a classification result for the at least two texts to be classified. In artificial intelligence scenarios based on multi-text classification, this scheme fuses word association relationships across the different texts to be classified, so that the output sub-model can classify each text in light of its relationship to the other texts. This expands the information basis of text classification and improves the accuracy of multi-text classification.
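The three steps in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `[SEP]` separator, the random word vectors, the single attention head with identity projections, the mean pooling per text, and the untrained linear scoring are all hypothetical choices made to keep the example short.

```python
import math
import random

SEP = "[SEP]"  # hypothetical separator token; the source does not name one


def build_long_text(texts):
    """Step 1: join the texts into one long token sequence, tracking each token's source text."""
    tokens, owners = [], []
    for idx, text in enumerate(texts):
        for word in text.split():
            tokens.append(word)
            owners.append(idx)
        tokens.append(SEP)
        owners.append(-1)  # separators belong to no text
    return tokens, owners


def embed(tokens, dim, rng):
    """Original word vectors: one random vector per distinct token (stand-in for trained embeddings)."""
    table = {}
    for tok in tokens:
        if tok not in table:
            table[tok] = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return [table[tok] for tok in tokens]


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Step 2: scaled dot-product self-attention; every word attends to every word in the long text."""
    dim = len(vectors[0])
    fused = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in vectors]
        weights = softmax(scores)
        fused.append([sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)])
    return fused


def classify(fused, owners, n_texts, weight):
    """Step 3: mean-pool the fused vectors of each text, score linearly, normalize across texts."""
    scores = []
    for idx in range(n_texts):
        rows = [vec for vec, own in zip(fused, owners) if own == idx]
        pooled = [sum(col) / len(rows) for col in zip(*rows)]
        scores.append(sum(w * p for w, p in zip(weight, pooled)))
    return softmax(scores)


rng = random.Random(0)
texts = ["where is the station", "the station is north", "i like tea"]
tokens, owners = build_long_text(texts)
fused = self_attention(embed(tokens, 8, rng))
weight = [rng.gauss(0.0, 1.0) for _ in range(8)]  # untrained output weights (assumption)
probs = classify(fused, owners, len(texts), weight)
```

Because attention runs over the whole long text, the fused vector of a word in one text depends on the words of the other texts. The resulting `probs` list holds one score per input text; with untrained weights the values are meaningless and only show the data flow.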

Description

Technical field

[0001] The embodiments of the present application relate to the technical field of natural language processing, and in particular to a text classification method, device, computer equipment, and storage medium.

Background

[0002] Multi-text classification is an important part of natural language processing and is widely used in sentiment analysis, question-answer matching, search engines, and other scenarios.

[0003] Multi-text classification usually refers to finding a target text among multiple texts by means of a classification model. In the related art, a classification model for multi-text classification usually consists of an output layer and multiple sets of parallel encoders. When performing text classification, the multiple sets of encoders encode the multiple texts in parallel, with each set of encoders responsible for encoding one text, to obtain the sentence vector of each text, and then process the sentence vecto...


Application Information

IPC(8): G06F16/35, G06F16/33
CPC: G06F16/3347, G06F16/35
Inventor: 缪畅宇
Owner: TENCENT TECH (SHENZHEN) CO LTD