Text classification method and device

A text classification technology, applied in the computer field, that addresses the impact of end-to-end deep learning on traditional text classification methods and improves classification accuracy.

Active Publication Date: 2022-04-29
XIAMEN MEIYA PICO INFORMATION
5 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0003] With the development of deep learning, its end-to-end approach has had a huge impact on traditional text classification methods.


Examples

Detailed Description of the Embodiments

[0024] Application overview

[0025] The preprocessing stage of a text classification task differs across language families. Unlike English and other languages written in the Latin alphabet, Chinese text must be segmented into words during preprocessing. However, because Chinese has no explicit boundary between words, word segmentation often introduces noise; and if features are extracted only at character granularity, the semantics they carry are insufficient. Therefore, features extracted at either word granularity or character granularity have certain defects. The direct consequence is that the performance of relatively simple network models such as textCNN or FastText on Chinese text classification tasks is limited.
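The segmentation ambiguity described in [0025] can be illustrated with a minimal sketch. It assumes a hypothetical toy lexicon and uses dictionary-based maximum matching, a classic segmentation heuristic that the source does not name; the function names are illustrative only. Scanning the same sentence from the left and from the right gives different word lists, while character granularity avoids the ambiguity at the cost of losing word-level meaning.

```python
from typing import List, Set


def forward_max_match(text: str, lexicon: Set[str], max_len: int = 3) -> List[str]:
    """Greedy left-to-right segmentation: longest lexicon word at each position."""
    words, i = [], 0
    while i < len(text):
        for j in range(min(len(text), i + max_len), i, -1):
            # Fall back to a single character when no lexicon word matches.
            if text[i:j] in lexicon or j == i + 1:
                words.append(text[i:j])
                i = j
                break
    return words


def backward_max_match(text: str, lexicon: Set[str], max_len: int = 3) -> List[str]:
    """Greedy right-to-left segmentation: longest lexicon word ending at each position."""
    words, j = [], len(text)
    while j > 0:
        for i in range(max(0, j - max_len), j):
            # Fall back to a single character when no lexicon word matches.
            if text[i:j] in lexicon or i == j - 1:
                words.insert(0, text[i:j])
                j = i
                break
    return words


# Hypothetical toy lexicon, chosen only to expose the ambiguity.
LEXICON = {"研究", "研究生", "生命", "起源"}
TEXT = "研究生命起源"

forward = forward_max_match(TEXT, LEXICON)    # ['研究生', '命', '起源']
backward = backward_max_match(TEXT, LEXICON)  # ['研究', '生命', '起源']
characters = list(TEXT)                       # character granularity
```

Here the forward scan produces the noisy split "研究生 / 命 / 起源", whereas the backward scan recovers "研究 / 生命 / 起源"; neither heuristic is reliable in general, which is the kind of noise the paragraph refers to.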

[0026] Although models such as BERT can achieve the best results in various tasks after pre-training and fine-tuning, their...

Abstract

The embodiment of the present application discloses a text classification method and device. A specific implementation of the method includes: obtaining the text to be classified; performing word segmentation on the text to be classified to obtain a word list; dividing the text to be classified by tone to obtain a list of tone combinations; determining the word vector of each word in the word list, and determining the tone vector of each tone combination in the tone combination list; and inputting the obtained word vectors and tone vectors into a pre-trained text classification model to obtain a label representing the category of the text to be classified. This embodiment combines word vectors with tone vectors and extracts the semantic and intonation features of the text from the two dimensions of word and tone respectively. Using these features can effectively compensate for the shortcomings of word-level and character-level features and improve the accuracy of text classification.
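The feature-building steps in the abstract can be sketched as follows. This is a minimal illustration under several assumptions that the source does not confirm: a "tone combination" is taken to be an n-gram of per-character tone numbers, the tone table and vector tables are hypothetical toy data, and the word and tone vectors are mean-pooled and concatenated as a stand-in for whatever input layout the pre-trained classification model actually expects. The word list is assumed to come from an earlier segmentation step.

```python
from typing import Dict, List

# Hypothetical per-character tone table (0 is used for unknown characters).
TONES: Dict[str, int] = {"我": 3, "喜": 3, "欢": 1, "音": 1, "乐": 4}


def tone_combinations(text: str, n: int = 2) -> List[str]:
    """Assumed definition: n-grams over the sequence of per-character tones."""
    seq = [TONES.get(ch, 0) for ch in text]
    return ["".join(str(t) for t in seq[k:k + n]) for k in range(len(seq) - n + 1)]


def embed(items: List[str], table: Dict[str, List[float]], dim: int) -> List[List[float]]:
    """Look up one vector per item; unknown items map to a zero vector."""
    return [table.get(item, [0.0] * dim) for item in items]


def mean_pool(vectors: List[List[float]], dim: int) -> List[float]:
    """Average a list of vectors component-wise (zero vector if the list is empty)."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def build_features(words: List[str], text: str,
                   word_table: Dict[str, List[float]],
                   tone_table: Dict[str, List[float]], dim: int) -> List[float]:
    """Concatenate pooled word vectors and pooled tone vectors (an assumed layout)."""
    word_vecs = embed(words, word_table, dim)
    tone_vecs = embed(tone_combinations(text), tone_table, dim)
    return mean_pool(word_vecs, dim) + mean_pool(tone_vecs, dim)


# Toy usage with hypothetical 2-dimensional vector tables.
words = ["我", "喜欢", "音乐"]  # assumed output of word segmentation
feats = build_features(words, "我喜欢音乐",
                       {"喜欢": [1.0, 0.0]}, {"33": [0.0, 1.0]}, dim=2)
```

The resulting `feats` vector would then be passed to the pre-trained text classification model; that model is outside the scope of this sketch.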

Description

Technical field

[0001] The embodiment of the present application relates to the field of computer technology, and specifically relates to a text classification method and device.

Background

[0002] One of the classic tasks of natural language processing is text classification, also known as text categorization. The purpose of this task is to assign a predefined label to a text. The process of text classification is usually divided into two stages: feature extraction and label classification. In the first stage, specific word-combination features (such as bigrams, trigrams, term frequency, or inverse document frequency) can be extracted with the help of machine learning models; in the second stage, the information provided by these features allows the computer to form a relatively objective understanding and judgment of the attributes of the text. Traditional text classification tasks are carried out under the guidance of this framework.

[0003] Wit...
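The feature extraction stage of the traditional framework in [0002] can be sketched with word n-grams and TF-IDF weights. This is a minimal standard-library illustration, not the applicant's method; the function names and the sample documents are hypothetical, and the weight used is the plain form tf × log(N / df) without smoothing.

```python
import math
from collections import Counter
from typing import Dict, List


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return contiguous word n-grams, e.g. bigrams for n=2."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Weight each term by term frequency times inverse document frequency."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


# Toy usage: a term shared by every document gets weight 0.
docs = [["good", "movie"], ["bad", "movie"]]
w = tf_idf(docs)
bigrams = ngrams(["a", "b", "c"], 2)
```

In the second stage, weight dictionaries like `w` would be turned into fixed-length vectors and passed to a label classifier.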

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F16/35, G06F40/279, G06F40/242, G06N3/04, G06N3/08
CPC: G06F16/35, G06N3/08, G06N3/045
Inventor: 蒋卓, 赵建强, 黄剑, 张辉极
Owner: XIAMEN MEIYA PICO INFORMATION