Multi-model fused short text classification method

A text classification method, applied in character and pattern recognition, special data processing applications, instruments, etc., that addresses the problem that the adaptability and classification performance of existing methods cannot fully meet practical needs

Active Publication Date: 2016-04-06
XI AN JIAOTONG UNIV
4 Cites, 56 Cited by

AI Technical Summary

Problems solved by technology

However, the adaptability and classification effect of these single clas


Examples

Embodiment Construction

[0085] The invention provides a multi-model fusion short text classification method consisting of two parts, a learning method and a classification method, each of which serves a different function.

[0086] (1) The learning method includes the following steps:

[0087] (11) Segment and filter the short text training data to obtain a word set;

[0088] (12) Calculate the IDF value of each word in the word set;

[0089] (13) Obtain the TFIDF values of all words in each training short text from step (11), then build them into a text vector, that is, the VSM text vector;

[0090] (14) Cluster the texts based on the VSM text vectors, construct an ontology tree model from the clustering result, then build a keyword overlapping model on the basis of the ontology tree;

[0091] (15) Build a naive Bayesian model based on the VSM text vector;

[0092] (16) Build a support vector machine model based on the VSM text vector.
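Steps (11) to (13) above can be illustrated with a minimal sketch. The source does not give the exact segmentation, filtering, or weighting formulas, so everything here is an assumption: `tokenize` is a hypothetical stand-in for word segmentation and filtering (plain whitespace splitting plus a stopword list), IDF uses the common textbook form log(N / df), and TF is the relative frequency of a word within one short text.

```python
import math
from collections import Counter


def tokenize(text, stopwords=frozenset()):
    # Hypothetical stand-in for step (11): whitespace segmentation
    # followed by stopword filtering. A real system for Chinese short
    # texts would use a proper word segmenter instead.
    return [w for w in text.lower().split() if w not in stopwords]


def compute_idf(docs):
    # Step (12), assumed formula: IDF(w) = log(N / df(w)), where N is the
    # number of training texts and df(w) the number of texts containing w.
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return {w: math.log(n / c) for w, c in df.items()}


def tfidf_vector(tokens, idf):
    # Step (13), assumed formula: TF-IDF(w) = (count(w) / length) * IDF(w).
    # Words absent from the training word set are ignored.
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {w: (c / total) * idf[w] for w, c in counts.items()}


training_texts = ["the match ended in a draw", "the market ended lower"]
docs = [tokenize(t, stopwords={"the", "a", "in"}) for t in training_texts]
idf = compute_idf(docs)
vectors = [tfidf_vector(tokens, idf) for tokens in docs]
```

Each entry of `vectors` is a sparse VSM text vector stored as a word-to-weight dictionary; a word that appears in every training text (here `ended`) receives weight 0 under this IDF variant.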

[0093] The steps (...


Abstract

The invention discloses a multi-model fused short text classification method comprising a learning method and a classification method. The learning method comprises the following steps: performing word segmentation and filtering on short text training data to obtain a word set; calculating the IDF value of each word; calculating the TFIDF values of all the words and constructing a VSM text vector; and performing text learning on the basis of the vector space model to construct an ontology tree model, a keyword overlapping model, a naive Bayesian model and a support vector machine model. The classification method comprises the following steps: performing word segmentation and filtering on a short text to be classified; generating a text vector on the basis of the vector space model; classifying the text separately with the ontology tree model, the keyword overlapping model, the naive Bayesian model and the support vector machine model to obtain single-model classification results; and fusing the single-model classification results to obtain a final classification result. By fusing multiple classification models, the method improves the accuracy of short text classification.
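The visible text does not state how the single-model results are fused, so the following is only an illustrative sketch of one possible fusion rule, weighted voting. The function name `fuse_predictions`, the model names used as dictionary keys, and the weights are all hypothetical and not taken from the patent.

```python
from collections import defaultdict


def fuse_predictions(predictions, weights=None):
    # predictions: dict mapping a model name to the label it predicted.
    # weights: optional dict mapping a model name to its vote weight
    # (assumed to default to 1.0 for every model).
    scores = defaultdict(float)
    for model, label in predictions.items():
        w = 1.0 if weights is None else weights.get(model, 1.0)
        scores[label] += w
    # Iterate labels in sorted order so ties resolve deterministically
    # to the alphabetically first label.
    return max(sorted(scores), key=lambda label: scores[label])


single_model_results = {
    "ontology_tree": "sports",
    "keyword_overlap": "sports",
    "naive_bayes": "finance",
    "svm": "sports",
}
final_label = fuse_predictions(single_model_results)
```

With equal weights this reduces to majority voting; passing a `weights` dictionary lets a model that is more reliable on a given data set count for more.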

Description

【Technical field】

[0001] The invention belongs to the field of intelligent information processing and computer technology, and in particular relates to a short text classification method.

【Background art】

[0002] With the rapid development of the Internet, network applications have penetrated all aspects of social life. Social applications such as Weibo and WeChat, together with various online comment and feedback mechanisms, have become important channels for publishing and obtaining information in modern society. On Weibo, netizens can share their feelings, experiences, and perceptions, the government can release announcements and other information, and people can freely express their views and opinions on events.

[0003] Data such as Weibo posts, WeChat messages, and online comments are all text with a limited number of words. Such data are typical short texts. By mining the short text data of Weibo, it i...


Application Information

IPC(8): G06F17/30, G06K9/62
CPC: G06F16/355, G06F18/24155, G06F18/2411
Inventor: 鲍军鹏, 蒋立华, 袁瑞玉, 骆玉忠
Owner XI AN JIAOTONG UNIV