Text classification and model training method and device, equipment and storage medium

A text classification and classification-model training technique, applied in the field of training data processing, that addresses the uneven training difficulty of training texts and the resulting degradation of the model training effect.

Active Publication Date: 2021-03-09
IFLYTEK CO LTD

AI Technical Summary

Problems solved by technology

[0004] In view of the above problems, this application provides a text classification and model training method, device, equipment and storage medium to address the unbalanced training difficulty of existing training texts, which degrades the training effect when a model is trained on them.

Examples

Embodiment approach

[0084] In an optional implementation, the set of confidences, predicted by each initial text classification model, that the training text belongs to its labeled category may be used as the classification difficulty representation of the training text.

[0085] That is, the set of k-1 confidences may be used as the classification difficulty representation of the training text.
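The set-of-confidences representation in [0084] and [0085] can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the `Model` callable interface, the `confidence_set` helper, and the toy models are all hypothetical, and k is assumed to be the number of training text subsets, so that k-1 models have not seen the text's own subset.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical model interface (an assumption): text -> {category: confidence}.
Model = Callable[[str], Dict[str, float]]


def confidence_set(text: str, label: str, own_subset: int,
                   models: Sequence[Model]) -> List[float]:
    """Collect the confidence assigned to `label` by every initial model
    except the one trained on the subset that contains `text`."""
    return [
        model(text).get(label, 0.0)
        for idx, model in enumerate(models)
        if idx != own_subset
    ]


def make_toy_model(conf: float) -> Model:
    """Toy stand-in for a trained initial model (illustration only)."""
    return lambda text: {"sports": conf, "finance": 1.0 - conf}


# k = 3 toy models; the text belongs to subset 0, so models 1 and 2 score it.
models = [make_toy_model(0.9), make_toy_model(0.6), make_toy_model(0.2)]
difficulty_repr = confidence_set("the match ended 2-1", "sports",
                                 own_subset=0, models=models)
print(difficulty_repr)  # two confidences, one per remaining model
```

With k = 3 the representation holds two values, matching the k-1 count given in [0085].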

[0086] In another optional implementation, a mathematical operation may be performed on the confidences, predicted by each initial text classification model, that the training text belongs to its labeled category to obtain a comprehensive confidence, and the comprehensive confidence is used as the classification difficulty representation of the training text.

[0087] The mathematical operation on the confidences can take many different forms, such as calculating the average value, median value, maximum value, minimum value, etc. of multiple confidence leve...
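The comprehensive confidence described in [0086] and [0087] could be computed along the following lines. This is a hedged sketch: the `comprehensive_confidence` function and the `AGGREGATORS` table are hypothetical names, and only the four operations named in the text (average, median, maximum, minimum) are covered.

```python
import statistics
from typing import Callable, Dict, Sequence

# The four operations named in the text; the table itself is an assumption.
AGGREGATORS: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": statistics.mean,
    "median": statistics.median,
    "max": max,
    "min": min,
}


def comprehensive_confidence(confidences: Sequence[float],
                             how: str = "mean") -> float:
    """Reduce a text's per-model confidences to a single value."""
    if not confidences:
        raise ValueError("at least one confidence is required")
    return AGGREGATORS[how](confidences)


confs = [0.5, 0.25, 0.75]
summary = {name: comprehensive_confidence(confs, name) for name in AGGREGATORS}
print(summary)
```

The choice of operation changes how strict the difficulty estimate is: taking the minimum marks a text as hard if any single model is unsure, while taking the maximum marks it as hard only if every model is unsure.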

Abstract

The invention discloses a text classification and model training method, device, equipment and storage medium. The method first divides a training text set into a plurality of training text subsets and trains a corresponding initial text classification model on each subset. Each subset is then taken in turn as a verification set, and every training text in the verification set is classified by the initial text classification models corresponding to all subsets other than the verification set, which yields the classification difficulty of each training text. According to the classification difficulty, the training texts in the training text set are re-divided into a plurality of training text subsets with different classification difficulties. This provides strong training data support for training a target text classification model: based on the subsets of different classification difficulty, the target model can be trained progressively from low to high difficulty, which solves the problem of a poor model training effect caused by unbalanced text training difficulty.
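The re-division step and the low-to-high training order can be sketched as below. The abstract does not say how the difficulty subsets are formed, so the equal-size ranking scheme here is an assumption, as are the `split_by_difficulty` name and the convention that a higher comprehensive confidence means an easier text.

```python
from typing import List, Sequence, Tuple


def split_by_difficulty(scored: Sequence[Tuple[str, float]],
                        n_buckets: int) -> List[List[str]]:
    """Re-divide (text, confidence) pairs into `n_buckets` subsets,
    ordered from easiest (highest confidence) to hardest."""
    if n_buckets < 1:
        raise ValueError("n_buckets must be at least 1")
    # Assumption: higher confidence in the labeled category = easier text.
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    size, extra = divmod(len(ranked), n_buckets)
    buckets: List[List[str]] = []
    start = 0
    for i in range(n_buckets):
        # Spread any remainder over the first `extra` buckets.
        end = start + size + (1 if i < extra else 0)
        buckets.append([text for text, _ in ranked[start:end]])
        start = end
    return buckets


scored = [("a", 0.9), ("b", 0.2), ("c", 0.7), ("d", 0.4), ("e", 0.1)]
buckets = split_by_difficulty(scored, n_buckets=2)

# Progressive schedule: visit the subsets from low to high difficulty.
for stage, bucket in enumerate(buckets):
    print(f"stage {stage}: train on {bucket}")
```

A target model would be fed the stages in order; whether later stages also revisit earlier, easier subsets is not stated in the abstract and is left open here.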

Description

Technical field

[0001] The present application relates to the technical field of training data processing, and more specifically to a text classification and model training method, device, equipment and storage medium.

Background

[0002] Text classification is a fundamental and important task in the field of natural language understanding. To perform text classification, the prior art generally trains a neural network model and classifies text with that model.

[0003] In real-world scenarios, large collections of training texts often show a long-tail phenomenon (also known as the sample imbalance problem), and different training texts contain different amounts of information, so the model finds different training texts differently difficult to learn. That is, a sample imbalance arises from the unbalanced training difficulty of individual training texts. The existing technology does not d...

Application Information

IPC(8): G06F16/35
CPC: G06F16/35
Inventor: 葛学志, 刘权, 陈志刚, 王志国, 胡国平
Owner IFLYTEK CO LTD