Short text topic distribution reasoning method and system, computer equipment and storage medium

A topic distribution reasoning method, applied in the field of big data, that addresses problems such as topic models ignoring important information.

Active Publication Date: 2021-01-05
HARBIN INSTITUTE OF TECHNOLOGY SHENZHEN (INSTITUTE OF SCIENCE AND TECHNOLOGY INNOVATION HARBIN INSTITUTE OF TECHNOLOGY SHENZHEN)

AI Technical Summary

Problems solved by technology

This strategy raises two problems: 1. how to reasonably use external information to guide the topic model; 2. existing topic models usually treat only semantic information as external information and ignore other important information.

Embodiment Construction

[0052] In order to make the purpose, technical solution and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application.

[0053] A topic model is an unsupervised method that mines and extracts topics from text data while also obtaining the topic distribution of each individual text. In a topic model, a topic is represented as a probability distribution over the vocabulary, and each text can be represented as a probability distribution over a set of topics.
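
As an illustration of the two distributions described in [0053], the following minimal sketch mixes topic-word distributions by a text's topic weights to obtain the text's word distribution. The topic names, words and probabilities are hypothetical toy values chosen for the example, not data from the application.

```python
# Hypothetical toy data: each topic is a probability distribution over the vocabulary.
topics = {
    "sports": {"match": 0.5, "team": 0.4, "market": 0.1},
    "finance": {"match": 0.1, "team": 0.1, "market": 0.8},
}

# Hypothetical toy data: one text as a probability distribution over the topics.
doc_topic = {"sports": 0.7, "finance": 0.3}


def word_distribution(doc_topic, topics):
    """Return p(word | text) = sum over topics of p(word | topic) * p(topic | text)."""
    mixed = {}
    for topic, weight in doc_topic.items():
        for word, prob in topics[topic].items():
            mixed[word] = mixed.get(word, 0.0) + weight * prob
    return mixed


doc_words = word_distribution(doc_topic, topics)
for word, prob in sorted(doc_words.items()):
    print(f"{word}: {prob:.2f}")
```

Because every topic and the topic mixture each sum to 1, the resulting word distribution also sums to 1.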

[0054] In one embodiment, as shown in Figure 1, a reasoning method for short text topic distribution is provided. The method is described by way of example and includes the following steps:

[0055] Step 102, extracting ...

Abstract

The invention relates to a short text topic distribution reasoning method and system, computer equipment and a storage medium. The method comprises the steps of: extracting the co-occurrence word pairs that appear in short texts within a unit of time, and integrating the co-occurrence word pairs to obtain a phrase set; associating the phrase set according to semantic similarity and historical co-occurrence degree to obtain a dynamic association degree for the phrase set, and storing the dynamic association degree in the form of a phrase matrix; extracting topic names from the phrase set, and correcting the topic names according to the dynamic association degree; and counting the topic names in the corrected short text to obtain the topic distribution of the short text. Through the designed dynamic association degree index, each co-occurrence word pair is given a different importance. Moreover, topic names are extracted with a biased topic model, so more coherent and compact topic names can be extracted, and the topic distribution of each short text can be inferred more accurately.
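
The steps in the abstract can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions: the abstract does not give the formula for the dynamic association degree, so the weighted sum with `alpha` below is an assumption, as are all function names, the sample texts, the similarity value and the topic labels. The topic-name extraction and correction steps are not specified in this excerpt and are left out.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_pairs(texts):
    """Count unordered word pairs that co-occur inside each short text."""
    pairs = Counter()
    for text in texts:
        words = sorted(set(text.lower().split()))
        pairs.update(combinations(words, 2))
    return pairs


def dynamic_association(current, history, similarity, alpha=0.5):
    """ASSUMED formula, not from the source:
    alpha * semantic similarity + (1 - alpha) * normalized historical count."""
    max_hist = max(history.values(), default=0)
    scores = {}
    for pair in current:
        hist = history.get(pair, 0) / max_hist if max_hist else 0.0
        scores[pair] = alpha * similarity.get(pair, 0.0) + (1 - alpha) * hist
    return scores


def to_matrix(scores):
    """Store the pair scores as a symmetric word-by-word matrix."""
    vocab = sorted({word for pair in scores for word in pair})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = [[0.0] * len(vocab) for _ in vocab]
    for (a, b), score in scores.items():
        matrix[index[a]][index[b]] = score
        matrix[index[b]][index[a]] = score
    return vocab, matrix


def topic_distribution(topic_names):
    """Turn the topic names counted in one short text into a distribution."""
    counts = Counter(topic_names)
    total = sum(counts.values())
    return {topic: count / total for topic, count in counts.items()}


# Hypothetical sample data for two time units.
history = cooccurrence_pairs(["stock market falls", "stock market rally"])
current = cooccurrence_pairs(["stock market news", "team wins match"])
similarity = {("market", "stock"): 0.8}  # hypothetical semantic similarity

scores = dynamic_association(current, history, similarity)
vocab, matrix = to_matrix(scores)
distribution = topic_distribution(["finance", "finance", "sports"])
```

In this sketch a pair that is both semantically similar and frequent in earlier time units, such as `("market", "stock")`, gets a higher score than a pair seen for the first time, which mirrors the idea of giving each co-occurrence word pair a different importance.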

Description

Technical field

[0001] This application relates to the field of big data, and in particular to a reasoning method, system, computer equipment and storage medium for the topic distribution of short texts.

Background

[0002] A topic model is a method for mining and extracting topics from text data. To design a topic model suited to short text data, researchers usually use several strategies. The first is to limit the number of topics in each short text, as in the Dirichlet Multinomial Mixture (DMM) model, which assumes that each short text contains only one topic. This strategy simplifies the topic model by limiting the topic information in the target data, in order to mine, extract and assign topics more accurately. The second is to build topic models on word patterns that contain sufficient topic information. A typical representative is the Attentional Segments Topic Model (ASTM). ASTM will extract th...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/30, G06F16/36, G06F16/383
CPC: G06F16/36, G06F16/383, G06F40/30
Inventor: 廖清, 郭颐冰, 黄裕涛, 漆舒汉, 刘洋
Owner: HARBIN INSTITUTE OF TECHNOLOGY SHENZHEN (INSTITUTE OF SCIENCE AND TECHNOLOGY INNOVATION HARBIN INSTITUTE OF TECHNOLOGY SHENZHEN)