Reasoning method, system, computer equipment and storage medium for topic distribution of short texts

A topic-distribution reasoning technique, applied in the field of big data, that addresses problems such as topic models ignoring important external information

Active Publication Date: 2021-06-22
HARBIN INSTITUTE OF TECHNOLOGY SHENZHEN (INSTITUTE OF SCIENCE AND TECHNOLOGY INNOVATION HARBIN INSTITUTE OF TECHNOLOGY SHENZHEN)

AI Technical Summary

Problems solved by technology

This strategy has two problems: 1. how to make reasonable use of external information to guide the topic model; 2. existing topic models usually treat only semantic information as external information and ignore other important information.



Examples


Embodiment Construction

[0052] In order to make the purpose, technical solution and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application.

[0053] A topic model is an unsupervised method that mines and extracts topics from text data and, at the same time, obtains the topic distribution of each individual text. In a topic model, a topic is represented as a probability distribution over the vocabulary, and each text can be represented as a probability distribution over the set of topics.
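
The two distributions described in [0053] can be illustrated with a minimal sketch. The vocabulary, the topic labels and every probability below are invented for illustration and do not come from the application; the helper name word_probabilities is likewise hypothetical.

```python
# Toy vocabulary and two hypothetical topics; all numbers are invented.
vocab = ["goal", "match", "vote", "election"]

# Each topic is a probability distribution over the vocabulary (each row sums to 1).
topic_word = {
    "sports":   [0.5, 0.4, 0.05, 0.05],
    "politics": [0.05, 0.05, 0.4, 0.5],
}

# Each text is a probability distribution over the set of topics.
doc_topic = {"sports": 0.75, "politics": 0.25}


def word_probabilities(doc_topic, topic_word):
    """Mix the topic-word distributions using the text's topic weights."""
    size = len(next(iter(topic_word.values())))
    mixed = [0.0] * size
    for topic, weight in doc_topic.items():
        for i, p in enumerate(topic_word[topic]):
            mixed[i] += weight * p
    return mixed


# Probability of each vocabulary word under this text's topic mixture.
word_probs = word_probabilities(doc_topic, topic_word)
```

Because both inputs are proper distributions, the mixed word probabilities also sum to 1, which is a quick sanity check when experimenting with such a representation.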

[0054] In one embodiment, as shown in Figure 1, a reasoning method for short-text topic distribution is provided. Taking this method as an example, it includes the following steps:

[0055] Step 102, extracting ...


Abstract

The present application relates to a reasoning method, system, computer equipment and storage medium for topic distribution of short texts. The method includes: extracting the co-occurrence word pairs that appear in the short texts within a unit of time and integrating them into a phrase set; associating the phrase set according to semantic similarity and historical co-occurrence degree to obtain the dynamic association degree of the phrase set, and storing the dynamic association degree in the form of a phrase matrix; extracting topic names from the phrase set and correcting them according to the dynamic association degree; and counting the corrected topic names in each short text to obtain the topic distribution of that short text. The dynamic association degree assigns a different importance to each co-occurrence word pair. In addition, topic-name extraction in this method uses a biased topic model, so that more continuous and compact topic names can be extracted and the topic distribution of each short text can be inferred more accurately.
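
The data flow in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the formula for the dynamic association degree, so the linear blend and the alpha weight below are placeholders, the function names are hypothetical, and the biased topic model that extracts and corrects topic names is not reproduced (the final step simply takes already-corrected topic names as input). The phrase matrix is stored sparsely as a dictionary keyed by word pair.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_pairs(text):
    """Return the unordered word pairs that co-occur in one short text."""
    words = sorted(set(text.lower().split()))
    return list(combinations(words, 2))


def dynamic_association(pairs, history_counts, similarity, alpha=0.5):
    """ASSUMPTION: blend semantic similarity with normalised historical
    co-occurrence. The source does not state the formula; the linear blend
    and alpha are placeholders for illustration only."""
    max_hist = max(history_counts.values(), default=0)
    matrix = {}
    for pair in pairs:
        hist = history_counts.get(pair, 0) / max_hist if max_hist else 0.0
        sem = similarity.get(pair, 0.0)
        matrix[pair] = alpha * sem + (1 - alpha) * hist
    return matrix


def topic_distribution(topic_names):
    """Count the (already corrected) topic names of one text and normalise."""
    counts = Counter(topic_names)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


# Invented inputs: one short text, earlier co-occurrence counts, similarity scores.
pairs = cooccurrence_pairs("stock market rally")
history_counts = {("market", "rally"): 4, ("market", "stock"): 2}
similarity = {("market", "stock"): 0.8, ("market", "rally"): 0.2}

assoc = dynamic_association(pairs, history_counts, similarity)
dist = topic_distribution(["finance", "finance", "sports", "finance"])
```

In this toy run, word pairs that were never seen before and have no similarity score receive an association of 0, which reflects the idea that each co-occurrence word pair is given a different importance.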

Description

Technical field

[0001] This application relates to the field of big data, and in particular to a reasoning method, system, computer equipment and storage medium for topic distribution of short texts.

Background

[0002] A topic model is a method for mining and extracting topics from text data. To design a topic model suitable for short text data, researchers usually use several strategies. The first is to limit the number of topics in each short text, as in the Dirichlet Multinomial Mixture (DMM) model, which assumes that each short text contains only one topic. This strategy simplifies the topic model by limiting the topic information in the target data, so that topics can be mined, extracted and assigned more accurately. The second is to build topic models on word patterns that contain sufficient topic information. A typical representative is the Attentional Segments Topic Model (ASTM). ASTM will extract th...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/30, G06F16/36, G06F16/383
CPC: G06F16/36, G06F16/383, G06F40/30
Inventor: 廖清, 郭颐冰, 黄裕涛, 漆舒汉, 刘洋
Owner: HARBIN INSTITUTE OF TECHNOLOGY SHENZHEN (INSTITUTE OF SCIENCE AND TECHNOLOGY INNOVATION HARBIN INSTITUTE OF TECHNOLOGY SHENZHEN)