Similar article recommendation method based on a topic model

A topic-model-based recommendation method, applied in special data processing applications, instruments, electronic digital data processing, etc. It addresses the problems of high manual labeling cost, narrow scope of application, and poor recommendation diversity, and achieves low manual labeling cost, good recommendation diversity, and a wide range of application.

Status: Inactive; Publication Date: 2018-05-04
SUN YAT SEN UNIV

AI Technical Summary

Problems solved by technology

Although existing research achieves reasonable results in some application scenarios, problems such as high complexity, narrow scope of application, high manual labeling cost, and poor recommendation diversity limit the application of article recommendation algorithms.


Examples


Embodiment Construction

[0028] The present invention is further described below in conjunction with a specific embodiment:

[0029] As shown in Figure 1, the method for recommending similar articles based on a topic model described in this embodiment includes the following steps:

[0030] A. Preprocess the original text of the article with a parsing technique to extract the pure article content;

[0031] B. Segment the article content into words with the Jieba ("stammer") word segmentation tool;

[0032] C. Apply part-of-speech analysis and keep only the words tagged as nouns;

[0033] D. Extract a bag of words from the word set of the articles;

[0034] E. Represent the word set of each article by the frequency with which each dictionary word appears in that article, yielding the article's bag-of-words word feature vector;

[0035] F. Train the TFIDF model on the word feature vectors of all articles, and calculate the word feature vectors of ...
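Steps D to F can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: the sample articles are hypothetical and assumed to be already segmented and noun-filtered (steps A to C), the helper names are invented for this sketch, and the weighting tf × ln(N / df) is an assumed TFIDF variant, since the source text does not state which formula is used.

```python
import math
from collections import Counter


def build_dictionary(docs):
    """Step D: map each distinct word to an integer id (sorted for stable ids)."""
    vocab = sorted({word for doc in docs for word in doc})
    return {word: idx for idx, word in enumerate(vocab)}


def count_vector(doc, dictionary):
    """Step E: count how often each dictionary word occurs in one article."""
    counts = Counter(doc)
    # Dict iteration follows insertion order, which matches the assigned ids.
    return [counts.get(word, 0) for word in dictionary]


def tfidf_vectors(count_vectors):
    """Step F: reweight counts by an assumed idf = ln(N / df)."""
    n_docs = len(count_vectors)
    n_terms = len(count_vectors[0])
    # Document frequency: number of articles containing each term.
    df = [sum(1 for vec in count_vectors if vec[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) if d else 0.0 for d in df]
    return [[tf * w for tf, w in zip(vec, idf)] for vec in count_vectors]


# Hypothetical articles, already segmented and reduced to nouns.
docs = [
    ["network", "news", "user"],
    ["news", "search", "engine"],
    ["user", "recommendation", "news", "news"],
]
dictionary = build_dictionary(docs)
counts = [count_vector(doc, dictionary) for doc in docs]
tfidf = tfidf_vectors(counts)
```

Under this assumed weighting, a word that occurs in every article (here "news") receives weight zero, so it does not contribute to distinguishing one article from another.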


Abstract

The invention relates to a method for recommending similar articles based on a topic model. First, the original text is preprocessed to extract the pure content of the article. Then the content is segmented into words, part-of-speech analysis is performed, the nouns are retained, and a bag of words is extracted to form a word feature vector. Next, a TFIDF model is trained with the word feature vectors of all articles, and the TFIDF model is applied to the word feature vector of each article to form a TFIDF feature vector. An LSI topic model is then trained with the TFIDF feature vectors of all articles. Finally, the LSI model is used to obtain the latent topic feature vector of each article, and similar articles are obtained by calculating vector similarity. The method can help Internet users find articles of interest effectively, and has advantages such as a wider range of application, lower manual labeling cost, and better recommendation diversity.
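The final step, ranking articles by vector similarity, might look like the sketch below. It assumes cosine similarity as the measure (the abstract says only "vector similarity"), and the two-dimensional topic vectors, article ids, and function names are hypothetical placeholders standing in for the output of a trained LSI model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_id, topic_vectors, top_n=2):
    """Rank all other articles by similarity to the query article, best first."""
    query = topic_vectors[query_id]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in topic_vectors.items()
        if other_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical latent topic vectors, as an LSI model might produce them.
topic_vectors = {
    "article_a": [0.9, 0.1],
    "article_b": [0.8, 0.2],
    "article_c": [-0.1, 0.95],
}
ranking = most_similar("article_a", topic_vectors)
```

Here `ranking` lists `article_b` ahead of `article_c`, because its topic vector points in nearly the same direction as that of `article_a`.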

Description

Technical field

[0001] The invention relates to the technical field of Internet information mining, in particular to a method for recommending similar articles based on a topic model.

Background

[0002] With the continuous development of the Internet, people's living habits and lifestyles are undergoing revolutionary changes. The development of the Internet not only makes people's lives more convenient, but also greatly increases the channels through which people obtain information. The China Internet Network Information Center (CNNIC) stated in the "36th Statistical Report on Internet Development in China" that, as of June 2015, the number of online news users in China was 555 million, of which 460 million were mobile online news users. As an important application for information acquisition, online news ranks second in usage rate after instant messaging.

[0003] In the social context of big data, search engines represented by Google and Baidu allow use...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/2462, G06F16/9535, G06F40/284
Inventors: 郑子彬, 黄炼楷
Owner: SUN YAT SEN UNIV