Traditional Chinese medicine data mining method based on LDA (Latent Dirichlet Allocation) topic model

A topic model and data mining technology, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses problems such as the inability to effectively obtain relationship information from traditional Chinese medicine prescriptions, and achieves the effects of easy visual operation, a simplified derivation process, and reduced data processing time.

Active Publication Date: 2013-10-23
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

[0009] The present invention addresses the shortcoming that existing methods cannot effectively obtain the implicit relationship information in traditional Chinese medicine prescriptions, and provides a novel traditional Chinese medicine data mining method based on the LDA topic model.



Examples


Embodiment 1

[0030] The present invention uses a data mining system based on a B/S (browser/server) structure. As shown in Figure 3, the application system includes a server and a client. The client is the application layer and includes a data mining application module of a third-party platform, a data mining scheme formulation module, and a scheme execution module. The server includes a service layer, an aggregation layer, and a resource layer. The service layer includes a public data mining interface and a DartSpora system call interface. The aggregation layer includes a resource management module, an authority management module, and a mining scheme management module. The resource layer includes a database, a local file system, a distributed file system, a data mining algorithm library, a parallel distributed data mining algorithm library, and a domain-related data mining algorithm library.

[0031] On the server side, the data transmission format between the resource layer and the aggregation layer is JDBC, JSDL, ExampleSet and oth...


Abstract

The invention relates to the field of traditional Chinese medicine information search and discloses a traditional Chinese medicine data mining method based on an LDA (Latent Dirichlet Allocation) topic model. The method comprises the following steps: 1) determining the two groups of priors in the LDA model, namely prescription-topic and topic-medicament, which are governed by Alpha and Beta respectively, and making prior assumptions for both groups by the AS (Asymmetry Symmetry) method; 2) determining the number of topics in the LDA model; 3) solving the LDA model by Gibbs sampling; 4) generating a semantic RDF (Resource Description Framework) document of the LDA model by mapping the result of the LDA model to a tetrad (four-tuple) and expressing it as a semantic RDF document; 5) associating medicaments with prescriptions to build a visually structured prescription-topic-medicament network G. The method is suitable for handling and mining a large number of traditional Chinese medicine prescriptions and yields a visual structure model.
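Steps 1 to 3 and step 5 can be illustrated with a minimal sketch of collapsed Gibbs sampling for LDA. This is not the patent's implementation: the function names `gibbs_lda` and `build_network`, the toy prescriptions, the fixed asymmetric `alpha` vector, the symmetric `beta` value, and the edge format are all assumptions made for illustration. In particular, the edges below are plain (source, target, weight) triples, not the tetrad or RDF form mentioned in step 4, whose exact layout the abstract does not give.

```python
import random

def gibbs_lda(docs, num_topics, alpha, beta, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs:  list of prescriptions, each a list of medicament names
    alpha: list of num_topics positive floats (asymmetric prior allowed)
    beta:  one positive float (symmetric topic-medicament prior)
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    w2id = {w: i for i, w in enumerate(vocab)}
    V, K = len(vocab), num_topics
    n_dk = [[0] * K for _ in docs]       # topic counts per prescription
    n_kw = [[0] * V for _ in range(K)]   # medicament counts per topic
    n_k = [0] * K                        # total tokens per topic
    z = []                               # topic assignment of every token
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:                    # random initial assignment
            k = rng.randrange(K)
            z[d].append(k)
            n_dk[d][k] += 1
            n_kw[k][w2id[w]] += 1
            n_k[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = w2id[w], z[d][i]
                # remove the current token from the counts
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # full conditional p(z = j | all other assignments)
                weights = [
                    (n_dk[d][j] + alpha[j]) * (n_kw[j][v] + beta)
                    / (n_k[j] + V * beta)
                    for j in range(K)
                ]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1
    a_sum = sum(alpha)
    theta = [[(n_dk[d][k] + alpha[k]) / (len(doc) + a_sum) for k in range(K)]
             for d, doc in enumerate(docs)]
    phi = [[(n_kw[k][v] + beta) / (n_k[k] + V * beta) for v in range(V)]
           for k in range(K)]
    return theta, phi, vocab

def build_network(theta, phi, vocab, top_n=2):
    """Link each prescription to its strongest topic, and each topic to
    its top_n medicaments, as (source, target, weight) edges."""
    edges = []
    for d, row in enumerate(theta):
        k = max(range(len(row)), key=row.__getitem__)
        edges.append((f"prescription_{d}", f"topic_{k}", row[k]))
    for k, row in enumerate(phi):
        for v in sorted(range(len(row)), key=lambda j: -row[j])[:top_n]:
            edges.append((f"topic_{k}", vocab[v], row[v]))
    return edges

# Hypothetical toy prescriptions with placeholder medicament names.
docs = [
    ["herb_a", "herb_b", "herb_c"],
    ["herb_a", "herb_b", "herb_d"],
    ["herb_e", "herb_f", "herb_g"],
    ["herb_e", "herb_f", "herb_c"],
]
theta, phi, vocab = gibbs_lda(docs, num_topics=2, alpha=[0.5, 0.1], beta=0.01)
edges = build_network(theta, phi, vocab, top_n=2)
```

Here `theta` holds the prescription-topic distributions and `phi` the topic-medicament distributions; each row of either sums to 1. A real corpus would need many more iterations and a principled choice of the number of topics (step 2), which this sketch simply fixes at 2.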

Description

Technical field

[0001] The invention relates to the field of traditional Chinese medicine information retrieval, in particular to a traditional Chinese medicine data mining method based on an LDA topic model.

Background technique

[0002] The present invention relates to topic models in the field of machine learning, mainly including vector space models, singular value decomposition and LSA, probabilistic latent semantic analysis (pLSA), latent Dirichlet allocation (LDA), and the like.

[0003] The vector space model is widely used in the field of information retrieval. Salton initially used the BOW (bag-of-words) model in the TREC project; that is, the words in a document are treated as exchangeable when describing the relationship between words and text. In his model, the semantics of a word are independent of the text, each word is one dimension of the word space, and in this way the entire corpus (the collection of documents) can be described.

[0004] Laten...
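The bag-of-words representation described in paragraph [0003] can be sketched as follows. The function names `build_vocab` and `vectorize` and the toy documents are assumptions made for illustration; the point is only that each word is one dimension and that word order does not affect the vector (exchangeability).

```python
from collections import Counter

def build_vocab(docs):
    """Collect every distinct word in the corpus; each becomes one dimension."""
    return sorted({w for doc in docs for w in doc})

def vectorize(doc, vocab):
    """Map a document to its word-count vector over the vocabulary."""
    counts = Counter(doc)
    return [counts[w] for w in vocab]

# Illustrative toy corpus: each document is a list of tokens.
docs = [
    ["ginseng", "licorice", "ginseng"],
    ["licorice", "ginger"],
]
vocab = build_vocab(docs)
vectors = [vectorize(doc, vocab) for doc in docs]

# Reordering the words of a document leaves its vector unchanged.
reordered = vectorize(list(reversed(docs[0])), vocab)
```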


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 姜晓红, 严海明, 商任翔, 吴朝晖, 陈英芝
Owner: ZHEJIANG UNIV