Topic map generation method, device, and equipment suitable for text analysis or data mining, and computer storage medium
A text analysis and data mining technology, applied in the computer field, that addresses problems such as insufficient visualization and weak topic relevance, and achieves the effects of improving depth and breadth, improving efficiency and accuracy, and enabling accurate searching.
Examples
Embodiment 1
[0062] As shown in Figures 1-7, in the method for generating a topic map suitable for text analysis or data mining provided in this embodiment, the executing software entity is a topic map engine, and the method may include, but is not limited to, the following steps.
[0063] S101. Acquire a corpus containing a large number of documents.
[0064] In step S101, the corpus provides a sufficient amount of training material for the training process of the LDA topic model. The training corpus may be provided by the user or may consist of document data collected by existing acquisition software. Each document may, but is not limited to, consist of one or several of the following fields: title, abstract, keywords, text, attachment title, attachment content, and author information. In addition, a large number of documents generally means more than 10,000 documents; for example, 100,000 documents are selected to form the corpus.
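As a minimal sketch of step S101, the snippet below assembles a corpus from documents made up of the fields listed in [0064]. The names `Document`, `build_corpus`, and `is_large_enough` are hypothetical and do not come from the patent; the default threshold of 10,000 only reflects the "generally more than 10,000 documents" guideline and is not a stated requirement.

```python
from dataclasses import dataclass, fields


@dataclass
class Document:
    # Field names mirror the list in [0064]. Every field is optional because
    # a document may consist of only one or several of them.
    title: str = ""
    abstract: str = ""
    keywords: str = ""
    text: str = ""
    attachment_title: str = ""
    attachment_content: str = ""
    author_information: str = ""

    def to_training_text(self) -> str:
        """Join the non-empty fields into one string for later processing."""
        parts = [getattr(self, f.name) for f in fields(self)]
        return " ".join(p for p in parts if p)


def build_corpus(documents):
    """Return one training text per document, skipping empty documents."""
    texts = [d.to_training_text() for d in documents]
    return [t for t in texts if t]


def is_large_enough(corpus, threshold=10_000):
    """Assumed check: the corpus holds more than `threshold` documents."""
    return len(corpus) > threshold


docs = [
    Document(title="Topic maps", abstract="LDA overview"),
    Document(),  # an empty document is dropped
    Document(text="body"),
]
corpus = build_corpus(docs)
```

Whether the fields are concatenated or weighted separately is not specified in the text shown here; plain concatenation is used only to keep the sketch short.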
[0065] S102. Carry out numerical processing on the...
Embodiment 2
[0087] As shown in Figure 8, this embodiment provides a hardware device that implements the method for generating a topic map suitable for text analysis or data mining described in Embodiment 1, including an acquisition module, a training module, an analysis module, a search module, and a generation module. The acquisition module is used to obtain a corpus containing a large number of documents. The training module is used to carry out numerical processing on the word set of each document in the corpus and then import the numerical processing results into the LDA topic model as training samples for training, so as to obtain a topic-word matrix and a document-topic matrix, wherein the topic-word matrix represents the probability of each word appearing in each topic, and the document-topic matrix represents the probability of each topic appearing in each document. The analysis module is used to obtain the feature word set of each topic accordin...
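To illustrate what the training module produces, the sketch below encodes each document's word set as integer IDs and fits a small LDA model, returning a topic-word matrix and a document-topic matrix whose rows are probability distributions. Several things here are assumptions rather than the patent's method: the text shown does not say what the numerical processing is (integer word IDs are one common choice), it does not name an inference algorithm (collapsed Gibbs sampling is swapped in for brevity), and the function names and the hyperparameters `alpha`, `beta`, `iters`, and `seed` are hypothetical.

```python
import random


def encode(docs):
    """Map each distinct word to an integer ID (assumed numerical processing)."""
    vocab = {}
    encoded = [[vocab.setdefault(w, len(vocab)) for w in words] for words in docs]
    return encoded, vocab


def train_lda(encoded, n_words, n_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (topic_word, doc_topic)."""
    rng = random.Random(seed)
    n_dk = [[0] * n_topics for _ in encoded]          # topic counts per document
    n_kw = [[0] * n_words for _ in range(n_topics)]   # word counts per topic
    n_k = [0] * n_topics                              # total words per topic

    def update(d, k, w, delta):
        n_dk[d][k] += delta
        n_kw[k][w] += delta
        n_k[k] += delta

    # Random initial topic assignment for every word occurrence.
    assign = []
    for d, doc in enumerate(encoded):
        zs = [rng.randrange(n_topics) for _ in doc]
        for w, k in zip(doc, zs):
            update(d, k, w, +1)
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(encoded):
            for i, w in enumerate(doc):
                update(d, assign[d][i], w, -1)  # remove the current assignment
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta) / (n_k[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                update(d, k, w, +1)

    # Smoothed estimates: P(word | topic) and P(topic | document).
    topic_word = [
        [(n_kw[k][w] + beta) / (n_k[k] + n_words * beta) for w in range(n_words)]
        for k in range(n_topics)
    ]
    doc_topic = [
        [(n_dk[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(encoded)
    ]
    return topic_word, doc_topic


docs = [["apple", "banana", "apple"], ["cpu", "gpu", "cpu", "gpu"], ["banana", "apple"]]
encoded, vocab = encode(docs)
topic_word, doc_topic = train_lda(encoded, len(vocab), n_topics=2, iters=20)
```

Here `topic_word[k][w]` plays the role of the topic-word matrix entry for word `w` in topic `k`, and `doc_topic[d][k]` the document-topic matrix entry for topic `k` in document `d`. A corpus of the size described in Embodiment 1 would call for an optimized library implementation rather than this pure-Python loop.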
Embodiment 3
[0090] As shown in Figure 9, this embodiment provides a hardware device for implementing the method for generating a topic map suitable for text analysis or data mining described in Embodiment 1, including a memory and a processor that are communicatively connected, wherein the memory is used to store a computer program, and the processor is used to execute the computer program to realize the steps of the method for generating a topic map suitable for text analysis or data mining as described in Embodiment 1.
[0091] For the working process, working details, and technical effects of the topic map generating device provided in this embodiment, please refer to Embodiment 1; the details are not repeated here.