Text classification method, device, computer equipment and storage medium

A text classification technology, applied in text database clustering/classification, text database querying, and unstructured text data retrieval. It addresses the problem that sparse information in sentence vectors lowers the accuracy of text classification, and it achieves the effect of expanding the information basis for classification and improving accuracy.

Active Publication Date: 2021-08-17
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, the solutions in the related art use multiple sets of encoders to encode multiple texts in parallel, and the sentence vector of each text represents only the characteristics of that text. As a result, each sentence vector carries relatively sparse information, which affects the accuracy of text classification.

Embodiment Construction

[0064] Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numerals in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with aspects of the present application as recited in the appended claims.

[0065] This application proposes a text classification scheme, which can extract correlation features between multiple texts through a self-attention mechanism in the process of multi-text classification, so as to improve the accuracy of multi-text classification. For ease of understanding, several terms involved in the embodiments of the present application are explained below.

[0066...

Abstract

The present application relates to a text classification method in the technical field of natural language processing. The method includes: generating a long text containing at least two texts to be classified; processing the long text through a self-attention sub-model to obtain a fused word vector for each word in the long text, where the self-attention sub-model fuses the association relationships between words into each word's original word vector; and processing the fused word vectors of the words in the long text through an output sub-model to obtain classification results for the at least two texts to be classified. In artificial intelligence scenarios based on multi-text classification, this solution fuses word associations across the different texts to be classified. When the output sub-model performs classification, it can therefore draw on the association relationships between the texts to be classified, which expands the information basis of text classification and improves the accuracy of multi-text classification.
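The three steps in the abstract (build a long text, fuse word vectors with self-attention, classify with an output sub-model) can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the random word vectors, the single-head attention with identity projections, the mean pooling, and the linear scoring layer are all assumptions chosen to keep the example small.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def build_long_text(texts):
    """Concatenate tokenized texts into one sequence and record each text's span."""
    tokens, spans = [], []
    for words in texts:
        start = len(tokens)
        tokens.extend(words)
        spans.append((start, len(tokens)))
    return tokens, spans


def self_attention(vectors):
    """Scaled dot-product self-attention (assumption: Q = K = V = input, one head).

    Every word attends to every word of the long text, so a fused vector
    mixes in information from words that belong to the other texts.
    """
    scale = math.sqrt(len(vectors[0]))
    fused = []
    for q in vectors:
        weights = softmax([dot(q, k) / scale for k in vectors])
        fused.append([sum(w * v[i] for w, v in zip(weights, vectors))
                      for i in range(len(q))])
    return fused


def classify(fused, spans, w):
    """Hypothetical output sub-model: mean-pool each text, score it, softmax over texts."""
    dim = len(fused[0])
    scores = []
    for start, end in spans:
        pooled = [sum(fused[t][i] for t in range(start, end)) / (end - start)
                  for i in range(dim)]
        scores.append(dot(pooled, w))
    return softmax(scores)


random.seed(0)
DIM = 8  # assumed embedding size
texts = [["how", "to", "reset", "password"],
         ["reset", "your", "password", "here"],
         ["weather", "today"]]
tokens, spans = build_long_text(texts)
# Random vectors stand in for learned original word vectors.
vocab = {tok: [random.uniform(-1, 1) for _ in range(DIM)]
         for tok in sorted(set(tokens))}
original = [vocab[t] for t in tokens]
fused = self_attention(original)
w = [random.uniform(-1, 1) for _ in range(DIM)]  # untrained scoring weights
probs = classify(fused, spans, w)
print(probs)
```

Because the weights are random and untrained, the printed probabilities carry no meaning; the point is only that each text's score is computed from vectors that already mix in words from the other texts.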

Description

Technical field

[0001] The embodiments of the present application relate to the technical field of natural language processing, and in particular to a text classification method, device, computer equipment, and storage medium.

Background

[0002] Multi-text classification is an important part of natural language processing and is widely used in sentiment analysis, question-answer matching, search engines, and other scenarios.

[0003] Multi-text classification usually refers to finding a target text among multiple texts by means of a classification model. In the related art, a classification model for multi-text classification usually consists of an output layer and multiple sets of parallel encoders. When performing text classification, the multiple sets of encoders encode multiple texts in parallel, with each set of encoders responsible for encoding one text, to obtain the sentence vector of each text, and then process the sentence vecto...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06F16/33
CPC: G06F16/3347, G06F16/35
Inventor: 缪畅宇
Owner: TENCENT TECH (SHENZHEN) CO LTD