Hybrid topic model construction method for deep learning

A topic model construction method applicable to neural learning methods, biological neural network models, and unstructured text data retrieval. It addresses problems such as insufficient feature extraction, low sample efficiency, and long training time, and achieves strong transferability, a low classification error rate, and good overall classification performance.

Active Publication Date: 2020-01-10
ANHUI POLYTECHNIC UNIV MECHANICAL & ELECTRICAL COLLEGE

AI Technical Summary

Problems solved by technology

[0002] At present, the five typical topic models (LSA, pLSA, LDA, HDP, and lda2vec) suffer from problems such as the need to preset the number of topics before training, relatively long training time, insufficient feature extraction, and low sample efficiency.


Examples


Embodiment 1

[0030] The present invention is realized by the following technical scheme. As shown in Figure 1, a method for constructing a deep learning hybrid topic model is applied to semantic analysis and text mining in the field of natural language processing. Topic modeling has also been extended to the field of bioinformatics, and topic models are often applied to text representation, dimensionality reduction, clustering text by topic, and building text recommendation systems based on user preferences.

[0031] Currently, there are five typical topic models: LSA, pLSA, LDA, HDP, and lda2vec, among which:

[0032] LSA, latent semantic analysis, is one of the foundations of topic modeling. It relies mainly on linear algebra for semantic analysis. Its core idea is to decompose the given "document-term" matrix into a "document-topic" matrix and a "topic-term" matrix that are independent of each other. The more frequently a term appears in a document, the greater its weight in the "document-term" matrix.
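The decomposition described in [0032] can be sketched in a few lines of Python. This is a minimal illustration and not the patent's implementation: the toy corpus, the function names (doc_term_matrix, lsa, top_terms), and the use of power iteration with deflation in place of a library SVD routine are all assumptions made for the example. Raw term counts are used as weights, matching the statement that more frequent terms weigh more.

```python
import math
from collections import Counter


def doc_term_matrix(docs):
    """Rows are documents, columns are terms, entries are raw term counts."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    rows = []
    for d in docs:
        counts = Counter(d.lower().split())
        rows.append([float(counts[w]) for w in vocab])
    return vocab, rows


def _matvec(m, x):
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def _norm(x):
    return math.sqrt(sum(a * a for a in x))


def lsa(matrix, k, iters=500):
    """Return up to k (strength, doc_weights, term_weights) triplets.

    Each triplet is one latent topic: doc_weights is a column of the
    "document-topic" matrix, term_weights a row of the "topic-term" matrix.
    Power iteration with deflation stands in for a library SVD (assumption).
    """
    residual = [row[:] for row in matrix]
    n_terms = len(matrix[0])
    topics = []
    for _ in range(k):
        transposed = [list(col) for col in zip(*residual)]
        v = [1.0 / math.sqrt(n_terms)] * n_terms  # fixed unit start vector
        for _ in range(iters):
            w = _matvec(transposed, _matvec(residual, v))
            n = _norm(w)
            if n < 1e-12:
                break
            v = [x / n for x in w]
        rv = _matvec(residual, v)
        sigma = _norm(rv)
        if sigma < 1e-12:
            break  # nothing left to decompose
        u = [x / sigma for x in rv]
        topics.append((sigma, u, v))
        # Deflation: remove the topic just found before looking for the next.
        residual = [[residual[i][j] - sigma * u[i] * v[j]
                     for j in range(n_terms)] for i in range(len(u))]
    return topics


def top_terms(term_weights, vocab, n):
    """Terms with the largest absolute weight in one topic."""
    order = sorted(range(len(vocab)), key=lambda j: -abs(term_weights[j]))
    return [vocab[j] for j in order[:n]]


# Hypothetical toy corpus with two obvious themes.
docs = [
    "topic model text mining",
    "text mining topic model text",
    "gene protein sequence",
    "protein sequence gene gene",
]
vocab, matrix = doc_term_matrix(docs)
for sigma, doc_weights, term_weights in lsa(matrix, k=2):
    print(round(sigma, 3), top_terms(term_weights, vocab, 3))
```

In practice a library SVD would normally replace the hand-written iteration; the pure-Python version is used here only to keep the sketch dependency-free. Note that the number of topics k has to be chosen up front, which is one of the limitations listed in [0002].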

...

Embodiment 2

[0055] The HTM hybrid topic model not only has good transfer learning ability but also strong feature extraction and resource representation capabilities, which can greatly improve sample efficiency, so that less sample data is needed to reach optimal performance. Assume that the number of hidden layers in the convolutional neural network (CNN) is set to 1, num_filtes is set to 100, the convolution kernel size filter_size is set to 3, and max_len is set to 50. Dropout is used to address overfitting, with a rate in the range [0.4, 0.6]; the experiment uses 0.5 by default. The purpose is to reduce complex co-adaptation between neurons and improve the generalization ability of the model. Each neuron is deactivated with a probability of 50%, that is, it is put in a dormant state and takes part in neither the forward propagation of scores nor the backward propagation of errors. The two groups of original data used in the present invention are respectively from the Huawei cloud commu...
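The dropout behaviour described in [0055] can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the patent's code: the CONFIG dictionary and all function names are hypothetical, the rescaling of surviving units ("inverted dropout") is a common convention that the text does not specify, and the feature-map length assumes a stride-1 convolution without padding.

```python
import random

# Hyperparameters quoted in the text; the dictionary and key names are illustrative.
CONFIG = {
    "hidden_layers": 1,
    "num_filters": 100,
    "filter_size": 3,
    "max_len": 50,
    "dropout": 0.5,  # default chosen from the stated range [0.4, 0.6]
}


def conv_output_length(max_len, filter_size):
    """Feature-map length for a stride-1, unpadded 1-D convolution (assumption)."""
    return max_len - filter_size + 1


def dropout(activations, rate, rng, training=True):
    """Zero each unit with probability `rate`; return (outputs, mask).

    Survivors are scaled by 1 / (1 - rate) so the expected activation is
    unchanged (inverted dropout, an assumption not stated in the text).
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations), [1.0] * len(activations)
    keep = 1.0 - rate
    mask = [1.0 / keep if rng.random() < keep else 0.0 for _ in activations]
    return [a * m for a, m in zip(activations, mask)], mask


def dropout_backward(grad_output, mask):
    """Dropped units receive no gradient, so they skip the backward pass too."""
    return [g * m for g, m in zip(grad_output, mask)]


rng = random.Random(0)  # seeded only to make the example repeatable
outputs, mask = dropout([0.2, 1.5, -0.7, 0.9], CONFIG["dropout"], rng)
grads = dropout_backward([1.0, 1.0, 1.0, 1.0], mask)
print(outputs, grads)
print(conv_output_length(CONFIG["max_len"], CONFIG["filter_size"]))
```

Reusing the same mask in the backward pass is what makes a dropped neuron skip both forward propagation and error backpropagation for that training step. At inference time (training=False) all units stay active.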


Abstract

The invention relates to the technical field of computer deep learning and provides a hybrid topic model construction method for deep learning. The method comprises the following steps: S1, preprocessing; S2, representing the text information; S3, supplementing a background information sub-network; and S4, dividing topics with a fully connected layer network and outputting the label classification probability. According to the invention, topics are mined from the data of a Huawei cloud platform and an intelligent learning platform, and a deep-learning-based hybrid topic model, HTM, is obtained. The model requires less data for topic classification, and texts of different lengths can be converted effectively via a Bi-LSTM framework, which gives better transfer capability. As a result, the transfer capability of the model is strong, the classification error rate is low, and the overall classification performance is good. The work is a useful attempt toward deep learning topic classification models for small-sample learning and transfer learning in the future.
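Step S4 can be pictured as a fully connected layer followed by a softmax that turns scores into label probabilities. The sketch below is only an illustration of that idea: the function names, the feature vector, and the weights are hypothetical, and the abstract does not state which activation the model actually uses on its output layer.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense_softmax(features, weights, biases):
    """One fully connected layer (one weight row per label) followed by softmax."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Hypothetical 2-dimensional text feature and 3 topic labels.
features = [1.0, 2.0]
weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
biases = [0.0, 0.0, 0.0]
print(dense_softmax(features, weights, biases))
```

In the described pipeline the feature vector would come from the earlier steps (text representation and the background information sub-network) rather than being hand-written as it is here.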

Description

Technical field

[0001] The invention relates to the technical field of computer deep learning, in particular to a method for constructing a hybrid topic model for deep learning.

Background

[0002] At present, the five typical topic models (LSA, pLSA, LDA, HDP, and lda2vec) suffer from problems such as the need to preset the number of topics before training, relatively long training time, insufficient feature extraction, and low sample efficiency.

Summary of the invention

[0003] The purpose of the present invention is to remedy the deficiencies of the prior art and provide a method for constructing a deep learning hybrid topic model.

[0004] In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings used in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present invention, so it should not be rega...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/35, G06N3/08
CPC: G06F16/35, G06N3/08
Inventor: 万家山
Owner: ANHUI POLYTECHNIC UNIV MECHANICAL & ELECTRICAL COLLEGE