Time-based microblog document expansion method oriented to microblog retrieval

An extension method and microblog technology, applied in the field of microblog retrieval, can solve problems such as weakening

Active Publication Date: 2016-08-31
HEILONGJIANG INST OF TECH

AI Technical Summary

Problems solved by technology

[0008] In order to solve the problem that short microblogs have adverse effects on document expansion, which will weaken the effect of document expansion on estimating microblog document mod

Examples

Specific Embodiment 1

[0077] Specific embodiment 1: As shown in Figures 1 to 3, this embodiment describes the time-based microblog document expansion method oriented to microblog retrieval in detail as follows:

[0078] The first part of this embodiment analyzes the temporal characteristics of relevant microblogs. The second part proposes a time-based document expansion model, presents a time-based document expansion and a temporal-proximity-based document expansion, and gives a machine-learning-based method for selecting document expansion words. This embodiment also provides two temporal document language models that reduce time overhead, introduces the experimental data, evaluation metrics, baseline methods, and model training, and reports the experimental results and analysis.

[0079] 1. Analysis of the temporal characteristics of relevant microblogs

[0080] This section takes the 9251 relevant microblogs of 110 queries in the TREC microblog evaluation as an example, and ana...

Abstract

The invention discloses a time-based microblog document expansion method oriented to microblog retrieval, and relates to the technical field of microblog retrieval. The method addresses the following problem: because microblogs are short, document expansion is adversely affected, which weakens its contribution to estimating the microblog document model and limits the improvement in retrieval performance. The method expands documents using the temporal characteristics of relevant microblogs and proposes a time-based microblog document model. The model jointly considers two properties: the burstiness that relevant microblogs exhibit over time as a whole, and the temporal proximity that individual microblogs exhibit. It uses the distribution of words in microblogs from the burst period, and in microblogs that are temporal neighbors of the microblog being expanded, to obtain the weights of document expansion words, and it selects expansion words with a machine-learning-based method so as to estimate an accurate document model. In this way the method better avoids the influence of short microblogs on document expansion.
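The idea in the abstract can be illustrated with a minimal sketch. It is not the patent's model: the excerpt does not give the actual weighting formulas, so the exponential time decay, the `burst_boost` factor, the interpolation weight `lam`, and all function names below are assumptions chosen only for illustration. The burst period is assumed to be supplied as a known time window, and the machine-learning-based word selection step is not shown.

```python
import math
from collections import Counter


def unigram_model(tokens):
    """Maximum-likelihood unigram distribution over a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


def temporal_weight(t_doc, t_other, decay):
    """Assumed proximity weight: decays exponentially with the time gap (seconds)."""
    return math.exp(-decay * abs(t_doc - t_other))


def expand_document(doc, neighbors, burst_window,
                    lam=0.5, decay=1.0 / 3600, burst_boost=2.0):
    """Interpolate a microblog's own word distribution with a time-weighted
    distribution built from other microblogs.

    doc and each neighbor are (timestamp_seconds, tokens) pairs.
    burst_window is an assumed, externally supplied (start, end) interval.
    """
    t_doc, tokens = doc
    original = unigram_model(tokens)
    start, end = burst_window

    expansion = Counter()
    for t_other, other_tokens in neighbors:
        weight = temporal_weight(t_doc, t_other, decay)
        if start <= t_other <= end:
            # Hypothetical choice: posts inside the burst period count more.
            weight *= burst_boost
        for word, prob in unigram_model(other_tokens).items():
            expansion[word] += weight * prob

    norm = sum(expansion.values())
    if norm == 0:
        return dict(original)

    vocab = set(original) | set(expansion)
    return {
        word: (1 - lam) * original.get(word, 0.0) + lam * expansion[word] / norm
        for word in vocab
    }


doc = (0, ["earthquake", "japan"])
neighbors = [
    (60, ["earthquake", "tsunami"]),   # close in time, inside the burst window
    (86400, ["football"]),             # a day later, outside the burst window
]
model = expand_document(doc, neighbors, burst_window=(0, 3600))
```

In this toy run, "tsunami" enters the expanded model with far more probability mass than "football", because its source post is both a temporal neighbor of the original microblog and inside the assumed burst window.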

Description

Technical field

[0001] The invention relates to a microblog document expansion method, and to the technical field of microblog retrieval.

Background

[0002] With the rapid development of social media and the mobile Internet, technology for processing short-text information streams, of which microblogs are representative, has become increasingly important. Given the massive volume of microblogs, the large number of users, and their diverse information needs, short-text social media retrieval has become an indispensable part of Internet applications.

[0003] The main problem in short-text retrieval is that documents contain too little content. For example, a microblog normally consists of at most 140 characters. This makes it difficult to retrieve microblogs relevant to a query merely by matching the words in the original microblog against the query words. How to add words related to a microblog to the original microblog, enrich the microblog document model, and alleviate the problem of word mism...

Application Information

IPC(8): G06F17/30
CPC: G06F16/951, G06F16/9535
Inventor: 韩中元, 孔蕾蕾, 杨沐昀, 齐浩亮, 李生
Owner: HEILONGJIANG INST OF TECH