The invention discloses a complaint short text classification method based on deep integrated learning

A text classification and ensemble learning technology, applied in text database clustering/classification, unstructured text data retrieval, special data processing applications, etc. It addresses the problems of uninformative inverse document frequency, low classification efficiency, and rapidly changing vocabulary, and achieves improved generalization ability, strong feature extraction ability, and good fault tolerance and robustness.

Inactive Publication Date: 2019-05-10
HEFEI UNIV OF TECH
5 Cites, 23 Cited by

AI Technical Summary

Problems solved by technology

Traditional text classification methods encounter great difficulties with short text classification tasks: each text carries little information, the data is sparse, and the total volume of data is very large even though each individual text is short. When the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm or the Latent Dirichlet Allocation (LDA) topic model is used to classify such text, the resulting vectors are high-dimensional and classification efficiency is low.
Customer complaint short texts contain few information units, the vocabulary is relatively open, the total number of distinct words is large, the repetition rate is low, and words are updated quickly, with new and unfamiliar words appearing frequently.
As a result, term frequencies barely differ from one word to another and inverse document frequency loses its discriminative power, which makes short text very challenging for traditional text classification methods.
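
The sparsity problem described above can be illustrated with a small sketch. The four complaint snippets below are made up for illustration (they are not from the patent), and the function names `tfidf_vectors` and `sparsity` are hypothetical helpers. Because almost no word repeats across such short texts, every term receives the same inverse document frequency and most vector entries are zero.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocabulary, dense TF-IDF vectors) for a list of tokenized docs."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([
            (tf[t] / len(doc)) * math.log(n_docs / df[t]) if t in tf else 0.0
            for t in vocab
        ])
    return vocab, vectors

def sparsity(vectors):
    """Fraction of zero entries across all vectors."""
    total = sum(len(v) for v in vectors)
    zeros = sum(1 for v in vectors for x in v if x == 0.0)
    return zeros / total

# Hypothetical, made-up complaint snippets (assumption, not patent data).
docs = [
    "network signal weak at home".split(),
    "billing charge wrong this month".split(),
    "data package not activated".split(),
    "broadband installation delayed again".split(),
]
vocab, vectors = tfidf_vectors(docs)
print(len(vocab), sparsity(vectors))  # 18 0.75
```

In this toy corpus every word occurs in exactly one text, so all non-zero weights within a text are identical: neither term frequency nor inverse document frequency separates important words from unimportant ones.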




Embodiment Construction

[0028] Referring to Figure 1 and Figure 2, the method for classifying complaint short texts based on deep integrated learning proposed by the present invention includes:

[0029] Step S1, preprocessing the customer complaint text set to obtain the preprocessed complaint text set.

[0030] This step specifically includes performing text screening, desensitization, stop word removal, and sensitive word filtering on the customer complaint text set, and establishing a custom dictionary for it, to obtain the preprocessed complaint text set.

[0031] In the specific scheme, the customer complaint text set is preprocessed first. The preprocessing includes text screening, desensitization, stop word removal, sensitive word filtering, and the establishment of a custom dictionary.
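
The preprocessing operations listed in step S1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the word lists, the 11-digit phone number pattern, the `<phone>` placeholder, the `min_tokens` threshold, and the function names `preprocess` and `preprocess_corpus` are all hypothetical, and English text stands in for the original complaint texts.

```python
import re

# All word lists below are hypothetical placeholders for illustration.
STOP_WORDS = {"the", "is", "my", "a", "to", "and", "has", "for"}
SENSITIVE_WORDS = {"damn"}
CUSTOM_DICT = {"data package", "call forwarding"}  # domain terms kept whole
PHONE_RE = re.compile(r"\b\d{11}\b")  # assumes 11-digit mobile numbers

def preprocess(text, min_tokens=2):
    """Clean one complaint text; return its tokens, or None if screened out."""
    text = text.lower()
    # Desensitization: mask phone numbers so personal data is not kept.
    text = PHONE_RE.sub("<phone>", text)
    # Custom dictionary: join multi-word domain terms into single tokens.
    for term in CUSTOM_DICT:
        text = text.replace(term, term.replace(" ", "_"))
    tokens = re.findall(r"<phone>|[a-z_]+", text)
    # Stop word removal and sensitive word filtering.
    tokens = [t for t in tokens
              if t not in STOP_WORDS and t not in SENSITIVE_WORDS]
    # Text screening: drop texts that are too short to be useful.
    return tokens if len(tokens) >= min_tokens else None

def preprocess_corpus(texts):
    """Preprocess every text and keep only those that pass screening."""
    results = (preprocess(t) for t in texts)
    return [r for r in results if r is not None]

print(preprocess("My data package for 13800000000 is damn slow"))
# ['data_package', '<phone>', 'slow']
```

Joining custom dictionary terms before tokenization keeps domain nouns such as "data package" from being split into unrelated words, which is the usual reason for building such a dictionary.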

[0032] For example, the customer complaint texts mainly concern the mobile communication industry, so the specific nouns of the mobile communication indust...



Abstract

The invention discloses a complaint short text classification method based on deep integrated learning, which comprises the following steps: preprocessing a customer complaint text set to obtain a preprocessed complaint text set; designing complaint classification labels according to the preset topic categories of complaint texts, and labeling the preprocessed complaint text set with the corresponding complaint classification labels to obtain a training sample set; performing text feature extraction on the training sample set with a BTM (Biterm Topic Model) topic model to obtain text feature vectors; performing text feature extraction on the training sample set with a convolutional neural network to obtain convolutional semantic feature vectors; normalizing and fusing the text feature vectors and the convolutional semantic feature vectors with a normalization combination strategy to obtain combined text feature vectors; and inputting the combined text feature vectors into a random forest model for training, combining the classification results of the decision trees with a weighting method that reflects the differences between the trees, and taking the category with the maximum probability as the text classification result for the training sample set.
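
Three of the steps in the abstract can be sketched in a few lines: the word pairs (biterms) that a BTM topic model is built on, the normalization and fusion of two feature vectors, and the weighted combination of decision tree outputs. The abstract does not specify the normalization scheme or how tree weights are derived, so L2 normalization followed by concatenation and externally supplied weights (for example, per-tree validation accuracy) are assumptions here, and all function names are hypothetical.

```python
import math
from itertools import combinations

def biterms(tokens):
    """Unordered word pairs co-occurring in one short text (the unit BTM models)."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]

def l2_normalize(vec):
    """Scale a vector to unit length (assumed normalization scheme)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def fuse(topic_vec, conv_vec):
    """Normalize each feature vector separately, then concatenate them."""
    return l2_normalize(topic_vec) + l2_normalize(conv_vec)

def weighted_vote(tree_probs, tree_weights):
    """Combine per-tree class probabilities using per-tree weights.

    tree_probs: one {label: probability} dict per decision tree.
    tree_weights: one weight per tree (assumed to be given, e.g. accuracy).
    Returns (label with maximum combined probability, combined probabilities).
    """
    total = sum(tree_weights)
    combined = {
        label: sum(w * p[label] for p, w in zip(tree_probs, tree_weights)) / total
        for label in tree_probs[0]
    }
    return max(combined, key=combined.get), combined

print(biterms(["signal", "weak", "home"]))
# [('signal', 'weak'), ('home', 'signal'), ('home', 'weak')]
print(fuse([3.0, 4.0], [0.0, 5.0]))
# [0.6, 0.8, 0.0, 1.0]
```

Normalizing each vector before concatenation keeps one feature source from dominating the other purely because of its scale, which is one plausible motivation for a normalization combination strategy.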

Description

Technical field

[0001] The invention relates to the technical field of text classification, in particular to a method for classifying complaint short texts based on deep integrated learning.

Background technique

[0002] At present, mobile communication operators classify customer complaint work orders mainly by using text mining and artificial intelligence algorithms to build a complaint identification system that classifies the work orders intelligently, so that each complaint work order is assigned to the appropriate technical support department within a short time. Customer complaint texts are short and numerous, and the reasons for complaints are varied. Traditional text classification methods encounter great difficulties with short text classification tasks: each text carries little information, the data is sparse, and the total amount of data is particularly large but each individ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06F17/27
Inventor: 岳丹阳, 方帅, 王刚, 岳学民
Owner: HEFEI UNIV OF TECH