A text classification method and system based on a supervised topic model

This text classification and topic model technology, applied in the field of data classification, addresses problems such as increased model complexity and improves both classification accuracy and time efficiency.

Active Publication Date: 2020-06-02
SHANDONG INST OF BUSINESS & TECH
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0007] Regarding existing improved methods, such as the Labeled-LDA model proposed by Li et al., the inventors found that the model trains a separate LDA model for each document category, which multiplies the number of parameters to be estimated and increases the complexity of the model.

Examples

Detailed Description of the Embodiments

[0038] It should be noted that the following detailed description is exemplary and intended to provide further explanation of the present disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.

[0039] It should be noted that the terminology used herein is only for describing specific embodiments and is not intended to limit the exemplary embodiments according to the present disclosure. As used herein, unless the context clearly dictates otherwise, the singular is intended to include the plural. It should also be understood that when the terms "comprising" and/or "including" are used in this specification, they indicate the presence of features, steps, operations, means, components, and/or combinations thereof.

[0040] Explanation of terms:

[0041] Dirichlet distribution: The Dirichlet distribution is a set of continuous multivariate p...
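As a general illustration of the term (not taken from the patent text), a Dirichlet draw is a vector of non-negative values that sums to 1, which is why it is commonly used as a prior over topic proportions. The minimal sketch below assumes the standard construction of normalizing independent Gamma draws; the function name `sample_dirichlet` and the concentration values are hypothetical.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas).

    Assumption: uses the standard construction of normalizing
    independent Gamma(alpha_i, 1) draws. The name is hypothetical.
    """
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [g / total for g in draws]

rng = random.Random(0)
# Small concentration values tend to give sparse vectors,
# large values tend to give vectors close to uniform.
sparse_like = sample_dirichlet([0.5, 0.5, 0.5], rng)
smooth_like = sample_dirichlet([5.0, 5.0, 5.0], rng)
```

Each returned vector can be read as one possible set of topic proportions for a text.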

Abstract

The disclosure provides a text classification method and system based on a supervised topic model. The text classification method includes: constructing an SLDA-TC text classification model; during training of the SLDA-TC text classification model, sampling the hidden topic of each word according to the SLDA-TC-Gibbs algorithm, where hidden-topic sampling draws only on other training texts that have the same text category label as the word; after the hidden topic of each word is determined, obtaining the text-topic probability distribution, the topic-word probability distribution, and the topic-category probability distribution, thereby establishing an accurate mapping between topics and categories; and inputting the text to be tested into the trained SLDA-TC text classification model, inferring the topics of the text to be tested, and then predicting the category of the text.
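The training and prediction steps in the abstract can be sketched roughly as follows. This is a toy collapsed Gibbs sampler under stated assumptions, not the patent's SLDA-TC-Gibbs algorithm, whose exact sampling formula is not given in this excerpt. It assumes that "sampling only from texts with the same category label" can be modeled by keeping topic-word counts per category, and that the category of a test text is scored as the sum over topics of P(topic | text) times P(category | topic). All function names, hyperparameters (`alpha`, `beta`, `gamma`), and the toy data are hypothetical.

```python
import random

def train_sketch(docs, labels, num_topics, vocab_size,
                 iters=50, alpha=0.1, beta=0.01, gamma=0.01, seed=0):
    """Toy supervised Gibbs sampler; every name here is hypothetical."""
    rng = random.Random(seed)
    cats = sorted(set(labels))
    K, V = num_topics, vocab_size
    # Counts are kept per category, so a word's topic is resampled using
    # only evidence from texts that share its text's category label.
    n_ckw = {c: [[0] * V for _ in range(K)] for c in cats}
    n_ck = {c: [0] * K for c in cats}
    n_dk = [[0] * K for _ in docs]
    z = [[rng.randrange(K) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, k in zip(doc, z[d]):
            n_dk[d][k] += 1
            n_ckw[labels[d]][k][w] += 1
            n_ck[labels[d]][k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            c = labels[d]
            for i, w in enumerate(doc):
                k = z[d][i]
                # Remove the current assignment before resampling.
                n_dk[d][k] -= 1
                n_ckw[c][k][w] -= 1
                n_ck[c][k] -= 1
                weights = [(n_dk[d][t] + alpha) * (n_ckw[c][t][w] + beta)
                           / (n_ck[c][t] + V * beta) for t in range(K)]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_ckw[c][k][w] += 1
                n_ck[c][k] += 1
    # Smoothed estimates of the three distributions named in the abstract.
    theta = [[(n_dk[d][t] + alpha) / (len(doc) + K * alpha) for t in range(K)]
             for d, doc in enumerate(docs)]                      # text-topic
    phi = [[(sum(n_ckw[c][t][w] for c in cats) + beta)
            / (sum(n_ck[c][t] for c in cats) + V * beta) for w in range(V)]
           for t in range(K)]                                    # topic-word
    delta = [{c: (n_ck[c][t] + gamma)
              / (sum(n_ck[x][t] for x in cats) + len(cats) * gamma)
              for c in cats} for t in range(K)]                  # topic-category
    return theta, phi, delta

def predict_sketch(doc, phi, delta, alpha=0.1, iters=30, seed=0):
    """Infer topics of an unlabeled text with phi fixed, then score categories."""
    rng = random.Random(seed)
    K = len(phi)
    z = [rng.randrange(K) for _ in doc]
    n_k = [0] * K
    for k in z:
        n_k[k] += 1
    for _ in range(iters):
        for i, w in enumerate(doc):
            n_k[z[i]] -= 1
            weights = [(n_k[t] + alpha) * phi[t][w] for t in range(K)]
            z[i] = rng.choices(range(K), weights=weights)[0]
            n_k[z[i]] += 1
    theta = [(n_k[t] + alpha) / (len(doc) + K * alpha) for t in range(K)]
    scores = {c: sum(theta[t] * delta[t][c] for t in range(K)) for c in delta[0]}
    return max(scores, key=scores.get), scores

# Toy data: texts are lists of word ids, labels are made-up category names.
docs = [[0, 1, 0, 2], [1, 0, 2, 0], [3, 4, 5, 3], [4, 5, 3, 4]]
labels = ["A", "A", "B", "B"]
theta, phi, delta = train_sketch(docs, labels, num_topics=2, vocab_size=6)
label, scores = predict_sketch([0, 1, 2], phi, delta)
```

Keeping the counts per category is only one way to restrict sampling to same-label texts; it was chosen here because it keeps the sampler a few lines long.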

Description

Technical field

[0001] The present disclosure relates to the field of data classification, and in particular to a text classification method and system based on a supervised topic model.

Background

[0002] The statements in this section merely provide background information related to the present disclosure and do not necessarily constitute prior art.

[0003] Text representation is an important step in text mining. Currently, the most widely used text representation method is bag-of-words (BOW). The bag-of-words method regards a text as a collection of words, assumes that each word occurs independently of the others, and ignores information such as word order and syntax. Based on BOW, a text is represented by an n-dimensional vector in which each dimension corresponds to a word and usually holds a weight related to that word's frequency. This is the most commonly used vector space model (VSM). Due to the complexity of natural language, te...
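The bag-of-words representation described in the background can be illustrated with a short sketch. It assumes raw term counts as the per-dimension weight and whitespace tokenization, both simplifications chosen for illustration; the function name `bow_vector` and the three-word vocabulary are hypothetical.

```python
from collections import Counter

def bow_vector(text, vocabulary):
    """Map a text to a count vector with one dimension per vocabulary word.

    Assumptions: whitespace tokenization, lowercasing, raw counts as weights.
    """
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

vocabulary = ["model", "text", "topic"]
first = bow_vector("Topic model text topic", vocabulary)   # [1, 1, 2]
second = bow_vector("text topic topic model", vocabulary)  # [1, 1, 2]
```

Both texts map to the same vector, which shows how the representation discards word order.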

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/35, G06F16/332, G06K9/62
CPC: G06F18/2411
Inventor: 唐焕玲, 窦全胜, 于立萍, 宋英杰, 鲁眀羽
Owner: SHANDONG INST OF BUSINESS & TECH