A Search Dimension Mining Method for Query Words in Massive Data

The technology concerns massive data and query words and is applied in network data retrieval, network data indexing, and other database retrieval fields. It addresses the problems that existing methods are unsuitable for some data and extract few or no term lists, and it yields more complete query dimensions.

Active Publication Date: 2018-09-04
北京一览群智数据科技有限责任公司

AI Technical Summary

Problems solved by technology

[0002] In our previous research work, the search dimension mining method for query words in massive data has four main steps: (1) extract term lists (Lists); (2) score each term list to evaluate its importance; (3) merge similar term lists to form a query dimension; (4) calculate the importance of the different query facets and terms. This scheme has the following main problem: many web pages (news data, Weibo blog posts, etc.) have no repeated regions or HTML tags, so the existing methods are not suitable for such data; for news data in particular, few or no term lists can be extracted.

Examples

Embodiment Construction

[0035] The application will be further described below in conjunction with the accompanying drawings.

[0036] With the rapid development of the Internet, the amount of information online keeps growing. Faced with so much information, users often find it difficult to quickly obtain what they want. To help them, we process a large amount of search information, classify it according to the query dimensions of the information, and then present it to the user. A query dimension is a series of words that describes one important aspect of a query word; this series of words is a group of semantically related parallel terms, which is called a term list (List) in the present invention. For example, for the query word "watches", a large amount of retrieved information can be classified according to query dimensions such as brand, feature, performance, and model. The TV series "Lost" can be classified according to the dimensions of...
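As a minimal illustration of the definition above, a query dimension can be modeled as a named group of parallel terms, and retrieved snippets can be grouped by the term they mention. The names `QueryDimension` and `classify`, the substring-matching rule, and the sample brand terms are assumptions made for this sketch and do not come from the patent text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryDimension:
    """One aspect of a query word, e.g. "brand" for the query "watches"."""
    name: str
    terms: tuple  # parallel, semantically related terms (a term list)


def classify(results, dimension):
    """Group result snippets by the dimension term they mention.

    Matching is a case-insensitive substring test, which is an
    assumption of this sketch rather than the patent's method.
    """
    groups = defaultdict(list)
    for snippet in results:
        lowered = snippet.lower()
        for term in dimension.terms:
            if term.lower() in lowered:
                groups[term].append(snippet)
    return dict(groups)


# Hypothetical sample data for the "watches" example.
brand = QueryDimension("brand", ("Rolex", "Seiko"))
snippets = ["Rolex Submariner review", "Seiko 5 price", "watch repair tips"]
grouped = classify(snippets, brand)
```

Snippets that mention no term of the dimension are simply left out of the grouping.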

Abstract

The present invention discloses a search dimension mining method for query words in massive data. The method comprises the following steps: (1) extracting Lists from each web page in a crawled data set based on patterns such as text, HTML tags, and repeated regions; (2) adding extraction mechanisms to expand the Lists extracted in step (1); (3) evaluating the importance of each extracted List; (4) clustering the Lists, that is, merging similar Lists to form a search dimension; and (5) ranking the search dimensions and the Lists, that is, calculating the importance of the different search facets and terms. With this method, more effective Lists can be obtained; once the supplemented Lists are available, the new Lists are scored, similar Lists are merged and classified, and the importance of the different search facets and Lists is calculated. As a result, the mined search dimensions are more complete, and users can obtain more complete information.
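The steps in the abstract can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the patented implementation: the enumeration regex, the page-frequency score, the greedy Jaccard merge, and its 0.5 threshold are all choices made for this example, and step (2) is omitted because the abstract does not say which extra extraction mechanisms are added.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class ListTagExtractor(HTMLParser):
    """Collect <li> texts, one term list per <ul>/<ol> element."""

    def __init__(self):
        super().__init__()
        self.lists = []
        self._open = []      # item buffers for currently open <ul>/<ol>
        self._in_li = False

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self._open.append([])
        elif tag == "li" and self._open:
            self._in_li = True
            self._open[-1].append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_li = False
        elif tag in ("ul", "ol") and self._open:
            items = [t.strip().lower() for t in self._open.pop() if t.strip()]
            if len(items) >= 2:
                self.lists.append(items)

    def handle_data(self, data):
        if self._in_li and self._open and self._open[-1]:
            self._open[-1][-1] += data


# Assumed text pattern: "a, b, c and d" (does not cover every phrasing).
ENUMERATION = re.compile(r"\w+(?:, \w+)+(?: (?:and|or) \w+)?")


def extract_lists(page_html):
    """Step 1: term lists from HTML list tags and plain-text enumerations."""
    parser = ListTagExtractor()
    parser.feed(page_html)
    lists = list(parser.lists)
    text = re.sub(r"<[^>]+>", " ", page_html)
    for match in ENUMERATION.finditer(text):
        items = re.split(r", | (?:and|or) ", match.group(0).lower())
        if len(items) >= 3:
            lists.append(items)
    return lists


def score_lists(lists_per_page):
    """Step 3 (assumed scoring): mean number of pages each item occurs on."""
    page_freq = Counter()
    for lists in lists_per_page:
        page_freq.update({item for lst in lists for item in lst})
    return [(lst, sum(page_freq[i] for i in lst) / len(lst))
            for lists in lists_per_page for lst in lists]


def cluster_lists(scored, threshold=0.5):
    """Step 4 (assumed rule): greedy merge when Jaccard overlap >= threshold."""
    clusters = []
    for lst, score in sorted(scored, key=lambda pair: -pair[1]):
        items = set(lst)
        for cluster in clusters:
            members = set(cluster["items"])
            if len(items & members) / len(items | members) >= threshold:
                cluster["items"].update(items)  # count lists containing each item
                cluster["score"] += score
                break
        else:
            clusters.append({"items": Counter(items), "score": score})
    return clusters


def rank_dimensions(clusters):
    """Step 5: order dimensions by score and items by supporting-list count."""
    ranked = sorted(clusters, key=lambda c: -c["score"])
    return [[item for item, _ in c["items"].most_common()] for c in ranked]


# Hypothetical pages for the query "watches".
pages = [
    "<ul><li>Rolex</li><li>Omega</li><li>Seiko</li></ul>",
    "<p>Brands include Rolex, Omega, Seiko and Casio.</p>",
    "<ol><li>quartz</li><li>mechanical</li></ol>",
]
dimensions = rank_dimensions(cluster_lists(score_lists([extract_lists(p) for p in pages])))
```

In this toy run the tagged list and the plain-text enumeration merge into a single brand-like dimension, while the movement types stay a separate, lower-ranked dimension. The second page has no list tags at all, which is the kind of page the background section says tag-based extraction misses.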

Description

Technical field

[0001] The invention relates to a search dimension mining method for query words in massive data.

Background technique

[0002] In our previous research work, the search dimension mining method for query words in massive data has four main steps: (1) extract term lists (Lists); (2) score each term list to evaluate its importance; (3) merge similar term lists to form a query dimension; (4) calculate the importance of the different query facets and terms. This scheme has the following main problem: many web pages (news data, Weibo blog posts, etc.) have no repeated regions or HTML tags, so the existing methods are not suitable for such data; for news data in particular, few or no term lists can be extracted.

[0003] Therefore, how to solve the above problems has become an urgent technical problem for those skilled in the art.

Contents of the invention

[0004] For the problems exist...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
CPC: G06F16/285, G06F16/35, G06F16/951
Inventor: 窦志成, 文继荣, 李谨秀
Owner: 北京一览群智数据科技有限责任公司