Short text recommendation method based on a user-based biterm topic model

A topic model and recommendation method technology, applied in the fields of special data processing applications, instruments, and electronic digital data processing. It addresses the problems that existing methods do not consider the author information of short texts and that the quality of their topic analysis falls short of what short text recommendation requires, among other issues.

Status: Inactive
Publication Date: 2016-05-25
NANJING UNIV
0 Cites, 23 Cited by

AI Technical Summary

Problems solved by technology

However, this type of method has an obvious defect: it ignores the author information of the short text and relies only on the co-occurrence of two words in the text t

Examples

Example 1

[0089] Example 1: Quantitative evaluation of the topic analysis ability of the UBTM of the present invention

[0090] 1. Input and output data description

[0091] We apply the method of the present invention to anonymized data from a real microblog service. The input is a set of microblog data whose statistics are shown in Table 1: the data set contains 101212 short texts, divided into 738 groups by user; each group has an average of 137.14 documents, and the average document length is 29 words. Several samples of the data are listed below.

[0092] A few samples of short text data

[0093]

[0094]

[0095] The output is the topic analysis quality metric of the UBTM topic model of the present invention.

[0096] 2. Model learning and parameter inference

[0097] First, read all microblogs and the user corresponding to each microblog, along with a list of Chinese stop words. For each microblog, use the stop word l...
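The preprocessing described above can be illustrated with a minimal sketch. It assumes the microblogs are already word-segmented and arrive as hypothetical `(user_id, tokens)` pairs; the function name `extract_user_biterms` and the sample data are illustrative only and do not come from the patent. A biterm is taken here in the usual BTM sense, an unordered pair of words that co-occur in one short text, and grouping the biterms by user is one plausible reading of the user-based aggregation the abstract mentions.

```python
from collections import defaultdict
from itertools import combinations


def extract_user_biterms(posts, stop_words):
    """Remove stop words from each post and collect its biterms per user.

    posts: iterable of (user_id, tokens); tokens are assumed pre-segmented.
    stop_words: set of words to discard before forming biterms.
    """
    biterms_by_user = defaultdict(list)
    for user_id, tokens in posts:
        kept = [t for t in tokens if t not in stop_words]
        # Every unordered pair of remaining words in one post is a biterm.
        for w1, w2 in combinations(kept, 2):
            biterms_by_user[user_id].append(tuple(sorted((w1, w2))))
    return dict(biterms_by_user)


# Hypothetical toy data (English tokens stand in for segmented Chinese text).
posts = [
    ("u1", ["the", "topic", "model", "works"]),
    ("u1", ["short", "text", "the"]),
    ("u2", ["topic", "of", "text"]),
]
stop_words = {"the", "of"}
result = extract_user_biterms(posts, stop_words)
```

Sorting each pair makes `("topic", "model")` and `("model", "topic")` count as the same biterm, which matches the unordered definition assumed above.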

Example 2

[0111] Example 2, application evaluation in the Weibo recommendation scenario

[0112] 1. Input and output data description

[0113] In this example, we apply the topic analysis of the present invention to the practical scenario of microblog recommendation. From 6 months of Weibo data, we selected more than 7,000 relatively popular Weibo posts and observed 380,000 records of whether more than 20,000 users reposted or did not repost these 7,000 posts. A repost can be taken as factual evidence that the user likes the post. Predicting repost behavior is the purpose of this experiment: we recommend Weibo posts to users based on UBTM and measure the recommendation accuracy and recall rate according to whether users repost them.
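As an illustration of scoring recommendations against repost behavior, the sketch below computes precision and recall for one user. Reading the "accuracy" above as precision (the share of recommended posts that were actually reposted) is an assumption, and the function name `precision_recall` and the post identifiers are hypothetical.

```python
def precision_recall(recommended, reposted):
    """Compare the posts recommended to a user with the posts the user reposted.

    precision = hits / number of recommended posts
    recall    = hits / number of reposted posts
    """
    recommended, reposted = set(recommended), set(reposted)
    hits = len(recommended & reposted)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(reposted) if reposted else 0.0
    return precision, recall


# Hypothetical example: 4 posts recommended, 3 posts actually reposted, 2 in common.
p, r = precision_recall(["m1", "m2", "m3", "m4"], ["m2", "m4", "m9"])
print(p, r)
```

Here two of the four recommended posts were reposted, giving a precision of 0.5, and two of the three reposted posts were recommended, giving a recall of 2/3.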

[0114] The 380,000 records were selected as follows. First, we divide the data into a training set and a test set by time. For each user, we sort the microblogs forwarded by the user by time and select th...

Abstract

The invention discloses a short text recommendation method based on text topic analysis technology. Information forwarded or published by a user is subjected to topic analysis with a text topic model to obtain the user's topic preferences, and information matching those preferences is recommended from large amounts of unread information, which better addresses the information overload problem of a system. Based on the biterm topic model (BTM) and a short text aggregation method, a new topic model for short text topic analysis, the user-based biterm topic model (UBTM), is proposed. An experiment on a real data set from a microblog service shows that the UBTM obtains higher-quality topics than conventional short text topic analysis methods. A UBTM-based short text recommendation experiment also shows that the short text recommendation method proposed by the invention has a better recommendation effect.
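The recommendation step summarized above can be sketched as ranking unread texts by how well their topic distribution matches the user's topic preferences. The dot-product score, the function name `rank_unread`, and the toy topic vectors are all assumptions for illustration; the source does not state which matching function the invention uses.

```python
def rank_unread(user_topics, unread_docs, top_n=2):
    """Return the ids of the top_n unread documents for one user.

    user_topics: the user's topic preference vector (one weight per topic).
    unread_docs: list of (doc_id, doc_topics) with the same number of topics.
    The score is an assumed dot product between the two topic vectors.
    """
    def score(doc_topics):
        return sum(u * d for u, d in zip(user_topics, doc_topics))

    ranked = sorted(unread_docs, key=lambda item: score(item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_n]]


# Hypothetical three-topic example.
user_topics = [0.7, 0.2, 0.1]
unread_docs = [
    ("a", [0.1, 0.8, 0.1]),
    ("b", [0.9, 0.05, 0.05]),
    ("c", [0.3, 0.3, 0.4]),
]
top = rank_unread(user_topics, unread_docs)
print(top)  # ['b', 'c']
```

Document "b" ranks first because most of its weight falls on the topic the user prefers most.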

Description

Technical field

[0001] The present invention relates to text recommendation, in particular short text recommendation. Building on topic analysis technology, it extends the biterm model and uses the author information of each text to strengthen topic extraction in short text scenarios and improve prediction accuracy in short text recommendation systems.

Background technique

[0002] In recent years, with the rapid development of the Internet and smart mobile devices, social media applications such as Twitter and Weibo have become increasingly popular. Personal websites, blogs, social networking sites and other applications generate a large amount of information every day, which makes it difficult for users to find the content they are interested in and leads to a serious information overload problem. Text recommendat...

Application Information

IPC(8): G06F17/30
CPC: G06F16/9535
Inventor: 吕建, 徐锋, 魏杰
Owner: NANJING UNIV