Multi-dimensional search ranking optimization algorithm and tool based on microblog data

A retrieval ranking optimization technology, applied in the fields of electrical digital data processing, special data processing applications, computing, etc.

Inactive Publication Date: 2014-05-28
BEIJING UNIV OF POSTS & TELECOMM
Cites: 3; Cited by: 15

AI Technical Summary

Problems solved by technology

Obviously, a single word frequency position weighted retr


Embodiment Construction

[0144] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings.

[0145] Referring to Figure 1, the user enters a query sentence that includes the query keywords and the query tags indicating the user's query intent. Lucene parses the query keywords, searches the index file, and returns the query result, a list of Weibo IDs (weibo_list), together with the original Lucene ranking score. The ranking optimization part mainly includes:

[0146] (1) Attr optimization (microblog data feature optimization): based on weibo_list, obtain the Attr score of each Weibo post;

[0147] (2) Tag optimization (search tag optimization): based on weibo_list and the query tags, obtain the Tag score of each Weibo post in the list;

[0148] (3) Log optimization (search log optimization): query the user's search log, then optimize and score the microblogs that have records in the log, ...
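The flow in [0145] to [0148] can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: the visible text does not say how the scores are combined, so the weighted sum, the weight values, and the names `rerank`, `lucene_score`, `attr_score`, `tag_score`, and `log_score` are all hypothetical. Only `weibo_list` and the existence of an original Lucene score plus Attr, Tag, and Log scores come from the text.

```python
from typing import Dict, List

# Hypothetical weights: the source does not state how the scores are combined.
WEIGHTS = {"lucene": 1.0, "attr": 0.5, "tag": 0.5, "log": 0.5}


def rerank(
    weibo_list: List[str],
    lucene_score: Dict[str, float],
    attr_score: Dict[str, float],
    tag_score: Dict[str, float],
    log_score: Dict[str, float],
) -> List[str]:
    """Re-sort Weibo IDs by an assumed weighted sum of the four scores."""

    def total(weibo_id: str) -> float:
        # A post with no entry in a score table contributes 0 for that table,
        # e.g. a post that never appears in the user's search log.
        return (
            WEIGHTS["lucene"] * lucene_score.get(weibo_id, 0.0)
            + WEIGHTS["attr"] * attr_score.get(weibo_id, 0.0)
            + WEIGHTS["tag"] * tag_score.get(weibo_id, 0.0)
            + WEIGHTS["log"] * log_score.get(weibo_id, 0.0)
        )

    # sorted() is stable, so posts with equal totals keep their input order.
    return sorted(weibo_list, key=total, reverse=True)


# Toy data, invented for illustration only.
weibo_list = ["w1", "w2", "w3"]
ranked = rerank(
    weibo_list,
    lucene_score={"w1": 3.0, "w2": 2.0, "w3": 1.0},
    attr_score={"w3": 2.0},
    tag_score={"w2": 4.0},
    log_score={"w3": 1.0},
)
print(ranked)
```

With these toy numbers the tag match lifts `w2` above `w1`, while `w3` stays last despite its Attr and Log boosts. A real system would need to normalize the four scores to comparable ranges before combining them.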


Abstract

The invention discloses a multi-dimensional search ranking optimization algorithm and tool based on microblog data. The algorithm optimizes Lucene's original search ranking results along three dimensions: data characteristics, user characteristics, and application characteristics. The optimized ranking better reflects the data characteristics, matches the user's query intent, and fits the application theme. The multi-dimensional search ranking optimization tool is implemented on financial microblog data and is divided into three modules, each of which performs the optimization for one dimension. The ranking optimization module for data characteristics implements dimension I. The ranking optimization module for user characteristics implements dimension II and comprises two sub-modules: a search tag optimization sub-module and a search log optimization sub-module. The ranking optimization module for application characteristics implements dimension III. The tool is suitable for optimizing any basic Lucene search results; it can re-rank Lucene's original search results within a system's search module and provide users with a better search experience.

Description

Technical field

[0001] The invention relates to a multi-dimensional search ranking optimization algorithm that optimizes Lucene search ranking results along three dimensions (search data characteristics, search user characteristics, and system application characteristics), and implements a search optimization tool for a Web system based on financial microblog data.

Background art

[0002] Social network data is becoming an important source of information, and Weibo data occupies a very important position within social data. Given the massive amount of data on the Internet, a data retrieval system is necessary for information retrieval. For given keywords, the data retrieval system searches the index and returns result data containing those keywords. Generally, the data retrieval system uses certain formulas to calculate the relevance of the result data to the search keywords, and results with higher relevance are returned to the user...
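As a rough illustration of the kind of keyword-relevance formula the background refers to, the sketch below scores documents with a textbook TF-IDF sum. It is an assumption chosen for illustration: it is not Lucene's actual scoring formula, and the function name `tf_idf_scores` and the toy documents are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf_scores(query: List[str], docs: Dict[str, List[str]]) -> Dict[str, float]:
    """Score each tokenized document against the query with a plain TF-IDF sum."""
    n_docs = len(docs)
    terms = set(query)
    # Document frequency: how many documents contain each query term.
    df = {t: sum(1 for words in docs.values() if t in words) for t in terms}

    scores: Dict[str, float] = {}
    for doc_id, words in docs.items():
        counts = Counter(words)
        score = 0.0
        for term in terms:
            if df[term] == 0 or not words:
                continue  # term absent from the whole collection, or empty document
            tf = counts[term] / len(words)              # term frequency in this document
            idf = math.log(n_docs / df[term]) + 1.0     # rarer terms weigh more
            score += tf * idf
        scores[doc_id] = score
    return scores


# Toy collection, invented for illustration only.
docs = {
    "d1": ["stock", "market", "rises"],
    "d2": ["stock", "stock", "falls"],
    "d3": ["weather", "today"],
}
scores = tf_idf_scores(["stock"], docs)
for doc_id in sorted(scores, key=scores.get, reverse=True):
    print(doc_id, round(scores[doc_id], 3))
```

A score of this kind depends only on term frequency, which is why a ranking built on it alone cannot reflect data features, query tags, or search logs.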


Application Information

IPC(8): G06F17/30
CPC: G06F16/3331, G06F16/313
Inventors: 闫丹凤, 张丽莹, 徐佳
Owner: BEIJING UNIV OF POSTS & TELECOMM