Method for analyzing and predicting online public opinion based on LDA topic models

A topic-model and predictive network technology, applied in network data retrieval, website content management, natural language data processing, and related fields. It addresses the problems that traditional training does not use document generation time and that the model cannot reflect the changing trends of documents, topics, and words, and it achieves convenient subdivision and strong practicability.

Status: Inactive
Publication Date: 2016-07-13
Owner: INSPUR SOFTWARE CO LTD
2 Cites, 18 Cited by

AI Technical Summary

Problems solved by technology

The traditional training method does not use information about when each document was generated.


Examples


Embodiment 1

[0032] The method of the present invention analyzes and predicts network public opinion using an LDA topic model that incorporates time information. It obtains training results on different time slices, thereby providing dynamic analysis and prediction of network public opinion. The steps are as follows:

[0033] First, according to the time information of the LDA topic model, the documents in the corpus are discretized into their corresponding time windows along the time series, and a distributed cloud computing architecture is used to process the corpus matrices in parallel;

[0034] Then, the document collection in each time window is processed in sequence to obtain the training results on the different time slices, with the training results of the preceding corpus used as prior parameters when training the subsequent corpus;

[0035] Finally, from the training results, the trend of the strength of each LDA topic model over tim...
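The first step above, placing documents into time windows, can be sketched as follows. This is a minimal illustration only: the fixed-width window, the function name `bucket_by_time_window`, and the toy corpus are assumptions and do not come from the patent, which does not specify how window boundaries are chosen.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def bucket_by_time_window(docs, start, window):
    """Group (timestamp, text) pairs into consecutive fixed-width windows.

    Returns a dict mapping window index (0, 1, 2, ...) to the list of texts
    whose timestamp falls in [start + i*window, start + (i+1)*window).
    """
    buckets = defaultdict(list)
    for timestamp, text in docs:
        if timestamp < start:
            continue  # skip documents older than the first window
        index = (timestamp - start) // window  # timedelta // timedelta gives an int
        buckets[index].append(text)
    return dict(buckets)

# Hypothetical toy corpus: (generation time, document text)
docs = [
    (datetime(2016, 1, 1, 9), "flood warning issued"),
    (datetime(2016, 1, 3, 12), "flood relief begins"),
    (datetime(2016, 1, 9, 8), "relief funds questioned"),
]
windows = bucket_by_time_window(docs, datetime(2016, 1, 1), timedelta(days=7))
```

With a seven-day window starting on 2016-01-01, the first two documents fall into window 0 and the third into window 1. Each window's document list would then be trained on in order.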

Embodiment 2

[0037] The method of the present invention analyzes and predicts network public opinion using an LDA topic model that incorporates time information, obtaining training results on different time slices so as to provide dynamic analysis and prediction of network public opinion. Documents in different time segments of the corpus are affected by their order in time. According to the Markov principle, each state $s_t$ of a random process is directly related only to its previous state $s_{t-1}$:

[0038] $P(s_t \mid s_1, s_2, s_3, \ldots, s_{t-1}) = P(s_t \mid s_{t-1})$;
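As a worked consequence of this first-order Markov property (a standard identity, not text from the patent), the joint probability of a state sequence of length $T$ factorizes into one-step transitions. The first equality is the chain rule of probability; the second applies the property in [0038] to each factor:

```latex
\begin{aligned}
P(s_1, s_2, \ldots, s_T)
  &= P(s_1) \prod_{t=2}^{T} P(s_t \mid s_1, s_2, \ldots, s_{t-1}) \\
  &= P(s_1) \prod_{t=2}^{T} P(s_t \mid s_{t-1})
\end{aligned}
```

This is why each time slice only needs to be conditioned on the slice immediately before it.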

[0039] The concrete steps of the method are as follows:

[0040] Step 1: Segment the acquired corpus by time slice into $D_1, D_2, D_3, \ldots, D_T$;

[0041] Step 2: Perform LDA modeling on the corpus $D_t$ to obtain the doc-topic matrix $\theta_{t,m}$ and the topic-word matrix; take the column means of $\theta_{t,m}$ to obtain the vector $\alpha_t$;

[0042] Step 3:...
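The column-mean computation in the second step can be sketched in plain Python. Because the third step is truncated in this record, how $\alpha_t$ is consumed is an assumption here: following Embodiment 1, the sketch passes it on as the prior for the next time slice. The names `column_means`, `train_sequentially`, and the caller-supplied `fit_lda` are hypothetical.

```python
def column_means(theta):
    """Mean of each column of a doc-topic matrix (rows: documents, columns: topics)."""
    n_docs = len(theta)
    n_topics = len(theta[0])
    return [sum(row[k] for row in theta) / n_docs for k in range(n_topics)]

def train_sequentially(slices, fit_lda, initial_alpha):
    """Fit each time slice in order, carrying alpha_t forward as the next prior.

    fit_lda is a caller-supplied (hypothetical) function that takes
    (documents, alpha_prior) and returns the doc-topic matrix theta_t.
    """
    alpha = initial_alpha
    alphas = []
    for documents in slices:
        theta = fit_lda(documents, alpha)
        alpha = column_means(theta)  # alpha_t, assumed to be the prior for D_{t+1}
        alphas.append(alpha)
    return alphas

# Toy doc-topic matrix for one slice: two documents, two topics
theta_t = [[0.5, 0.5], [1.0, 0.0]]
alpha_t = column_means(theta_t)  # [0.75, 0.25]
```

Keeping `fit_lda` as a parameter leaves the choice of LDA implementation open; any trainer that accepts a vector-valued document-topic prior could be plugged in.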


Abstract

The invention discloses a method for analyzing and predicting online public opinion based on LDA topic models, and belongs to the technical field of big data analysis. The method addresses the problem of how to organize large-scale document collections effectively and track the evolution of topics in a text collection over time, thereby helping users follow the topics that interest them. In the technical scheme, the documents in a corpus are first discretized into their corresponding time windows along a time sequence according to the time information of the LDA topic model; the document sets in the time windows are then processed in sequence to obtain training results on the different time slices, with the training result of the preceding corpus used as a prior parameter when training the following corpus; finally, the change trend of the strength of each LDA topic over time is obtained from the training results, providing dynamic analysis and prediction of online public opinion.
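The final stage, reading a strength trend per topic from the per-slice training results, might look like the sketch below. The abstract does not define how topic strength is measured, so this assumes one strength value per topic per time slice (for example, the mean doc-topic proportion). The function names and the toy numbers are illustrative only.

```python
def topic_strength_series(alphas):
    """Turn per-slice strength vectors into one time series per topic."""
    return [list(series) for series in zip(*alphas)]

def strength_changes(series):
    """Slice-to-slice change in strength for a single topic."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Hypothetical strengths for two topics over three time slices
alphas = [[0.5, 0.5], [0.75, 0.25], [0.875, 0.125]]
series = topic_strength_series(alphas)
changes = [strength_changes(s) for s in series]
```

Here the first topic gains strength in every slice and the second loses it, which is the kind of rising or fading trend the method aims to expose.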

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, in particular to a method for analyzing and predicting network public opinion based on an LDA topic model.

Background

[0002] Natural language processing is an important direction in computer science and artificial intelligence. It studies the theories and methods that enable effective communication between humans and computers in natural language. It is a discipline that combines linguistics, computer science, and mathematics. Because research in this field involves natural language, the language people use every day, it is closely related to linguistics, yet differs from it in important ways. Natural language processing is not the general study of natural language; it is concerned with developing computer systems, especially software systems, that can effectively carry out natural language communication. ...


Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/958, G06F40/205
Inventor: 高峰, 王茂帅, 于文才, 柳廷娜, 甄教明
Owner: INSPUR SOFTWARE CO LTD