Method for determining the optimal number of topics of an LDA topic model based on vocabulary similarity
A topic model and determination method technology, applied in the fields of digital data processing, character and pattern recognition, and special data processing applications. It improves the clustering effect of the model and solves the problem of selecting the number of topics.
Embodiment Construction
[0023] To help those of ordinary skill in the art understand and implement the present invention, it is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the embodiments described here serve only to illustrate and explain the present invention and are not intended to limit it.
[0024] Referring to Figure 1, the method provided by the present invention for determining the optimal number of topics of an LDA topic model based on lexical similarity comprises the following steps:
[0025] Step 1: Select an initial value k as the initial number of topics of the LDA topic model;
[0026] Step 2: Perform document-topic separation, sampling topics until convergence;
[0027] In this embodiment, the text data to be analyzed is first preprocessed: it is segmented into words and stop words are removed. The LDA model is then applied, according to Gibbs sampling...
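Steps 1 and 2 can be illustrated with a minimal sketch of preprocessing followed by standard collapsed Gibbs sampling for LDA. The source text is truncated at this point, so this is a generic textbook formulation and not necessarily the exact procedure of this embodiment. The function names `preprocess` and `gibbs_lda`, the stop-word list, the hyperparameters `alpha` and `beta`, and the fixed iteration count standing in for a convergence test are all assumptions made for illustration.

```python
import random
import re

# Illustrative stop-word list (an assumption; a real list would be much larger).
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "to", "in", "on"}

def preprocess(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def gibbs_lda(docs, k, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA with k topics.

    docs is a list of token lists. A fixed number of sweeps (iters) stands in
    for a convergence test. Returns (n_dk, n_kw, vocab).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    v = len(vocab)
    n_dk = [[0] * k for _ in docs]       # tokens of document d assigned to topic t
    n_kw = [[0] * v for _ in range(k)]   # times word w is assigned to topic t
    n_k = [0] * k                        # total tokens assigned to topic t
    z = []                               # topic assignment of every token

    # Random initial assignment of a topic to every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            t = rng.randrange(k)
            z_d.append(t)
            n_dk[d][t] += 1
            n_kw[t][word_id[w]] += 1
            n_k[t] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                t = z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][t] -= 1
                n_kw[t][wid] -= 1
                n_k[t] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (n_dk[d][j] + alpha) * (n_kw[j][wid] + beta) / (n_k[j] + v * beta)
                    for j in range(k)
                ]
                t = rng.choices(range(k), weights=weights)[0]
                # Record the newly sampled assignment.
                z[d][i] = t
                n_dk[d][t] += 1
                n_kw[t][wid] += 1
                n_k[t] += 1
    return n_dk, n_kw, vocab

# Usage: k = 2 plays the role of the initial topic number chosen in Step 1.
texts = ["The cat sat on the mat", "A dog and a cat", "Stocks and bonds in the market"]
docs = [preprocess(t) for t in texts]
n_dk, n_kw, vocab = gibbs_lda(docs, k=2, iters=50)
```

The count matrices `n_dk` and `n_kw` are the raw material from which document-topic and topic-word distributions are usually estimated. In practice a library implementation would normally replace this hand-written sampler.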