Text label recommendation method based on supervised topic model

A text label recommendation technique based on a topic model, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses the low efficiency and low accuracy of text label recommendation and improves both.

Active Publication Date: 2017-10-10
NANJING UNIV
6 Cites, 9 Cited by

AI Technical Summary

Problems solved by technology

For this reason, the present invention proposes a new text label recommendation method on the basis of utilizing the observation infor...

Examples

Example 1

[0038] Example 1: quantitative evaluation of the label recommendation ability of the Sim2Word method of the present invention

[0039] 1. Input and output data description

[0040] We apply the method of the present invention to real datasets such as StackOverflow. The input is a set of text data in which each text has a title, a body, labels, and so on. The statistics are shown in Table 1. We randomly select 90% of the data as training data and use the remaining 10% as test data. Because the full StackOverflow dataset is too large for some of the comparison methods to produce results, we also randomly sample sub-datasets (SO-10K and SO-100K) from the full StackOverflow dataset and use them to compare the results of the different methods.
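
The sampling and 90%/10% split described in [0040] can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the patent: the function names `sample_subset` and `split_dataset`, the fixed seed, and the toy `posts` records are all hypothetical, and only the Python standard library is used.

```python
import random


def sample_subset(records, size, seed=0):
    """Draw a random sub-dataset of `size` records without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(records), size)


def split_dataset(records, train_fraction=0.9, seed=0):
    """Shuffle a copy of `records` and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy records standing in for StackOverflow posts.
posts = [
    {"id": i, "title": f"title {i}", "body": f"body {i}", "tags": ["python"]}
    for i in range(1000)
]

subset = sample_subset(posts, 100)   # analogous to drawing a smaller sub-dataset
train, test = split_dataset(subset)  # 90% training data, 10% test data
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, which matters when several methods are compared on the same sub-dataset.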

[0041] The output is the tag recommendation quality evaluation index of the Sim2Word method of the present invention.
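
The excerpt does not name the evaluation index. As an assumed stand-in, the sketch below computes recall@k and precision@k, two measures commonly used for tag recommendation. The function names and the example tags are hypothetical.

```python
def recall_at_k(recommended, true_tags, k):
    """Fraction of the true tags that appear in the top-k recommendations."""
    if not true_tags:
        return 0.0
    hits = sum(1 for tag in recommended[:k] if tag in true_tags)
    return hits / len(true_tags)


def precision_at_k(recommended, true_tags, k):
    """Fraction of the top-k recommendations that are true tags (k > 0)."""
    hits = sum(1 for tag in recommended[:k] if tag in true_tags)
    return hits / k


# Hypothetical ranked recommendations and ground-truth tags for one text.
recommended = ["python", "list", "java", "sorting", "numpy"]
true_tags = {"python", "numpy", "pandas"}

r5 = recall_at_k(recommended, true_tags, 5)
p5 = precision_at_k(recommended, true_tags, 5)
print(r5, p5)
```

In practice these per-text scores would be averaged over all test texts.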

[0042] 2. Model learning, parameter inference ...

Abstract

The invention discloses a text label recommendation method based on a supervised topic model. Based on the observation that tags and their related words frequently appear in the corresponding texts, the method proposes a new supervised text topic model, Sim2Word, which addresses the low prediction efficiency of text keyword extraction methods and the low prediction accuracy of text topic analysis methods. The method has two main steps: first, related-word data for the existing tags is acquired using word vector technology, and the tags and their related words are used to train a tag prediction model; second, the tags of new texts are predicted with the model. Experiments on real datasets such as StackOverflow show that the method achieves higher recognition accuracy than traditional text tag recommendation techniques.
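
Finding related words for a tag from word vectors, as the abstract describes, might look like the following minimal sketch. The excerpt does not say which similarity measure or embedding model is used, so cosine similarity and the small hand-written `vectors` table are assumptions for illustration only. `related_words` is a hypothetical helper, not a name from the patent.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def related_words(tag, vectors, top_n=2):
    """Return the top_n words whose vectors are closest to the tag's vector."""
    tag_vec = vectors[tag]
    scored = [
        (cosine(tag_vec, vec), word)
        for word, vec in vectors.items()
        if word != tag
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:top_n]]


# Made-up 3-dimensional vectors; real word vectors would be learned from text.
vectors = {
    "python": [0.9, 0.1, 0.0],
    "pandas": [0.8, 0.2, 0.1],
    "numpy": [0.85, 0.15, 0.05],
    "css": [0.0, 0.1, 0.9],
}

print(related_words("python", vectors))
```

The related words returned for each tag would then be combined with the tags themselves as supervision when training the tag prediction model; that training step is not shown here because the excerpt does not give its details.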

Description

Technical field

[0001] The present invention relates to tag recommendation, in particular text tag recommendation. On many websites with large amounts of text, tags and their related words appear frequently or repeatedly in the text content that the tags correspond to, and these words often occupy an important position in that content. Building on this observation, the related words of each tag are obtained using word vector technology, and a supervised topic model is trained on the tags and their related words. This effectively strengthens tag recommendation for new texts and improves the recommendation accuracy of the text tag recommendation system.

Background art

[0002] In recent years, with the rapid development of Internet technology, tagging systems have been widely used on the Internet. On the one hand, tags usually represent keywords and are used to describe and summarize online content, making it easier to organi...

Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/9535, G06F40/30
Inventor: 吕建, 徐锋, 姚远, 吴勇
Owner: NANJING UNIV