
Big data query method based on data distribution

The query method and its data distribution technique apply to the field of big data query. They address the high cost of maintaining accurate partitions and indexes, query response times that exceed what users will tolerate, and query algorithms that are difficult to optimize further. The method ensures randomness, offers good scalability and maintainability, and improves query efficiency.

Active Publication Date: 2018-09-11
NORTHEASTERN UNIV
3 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

However, if traditional accurate query technology is still used to process massive data, it is difficult to optimize the query algorithm further.
In addition, narrowing the search scope is the main idea behind query optimization. Traditional data partitioning and indexing technologies can narrow the search scope accurately and improve the query hit rate. However, they rely on a good division of the data value domain and on fine-grained data structures, and maintaining an accurate data partition and index is expensive in a big data environment.



Embodiment Construction

[0046] The specific embodiments of the present invention will be described in further detail below with reference to the accompanying drawings and embodiments. The following examples are intended to illustrate the present invention, but not to limit the scope of the present invention.

[0047] This embodiment takes the access request data of the 1998 World Cup website (WorldCup98, at http://ita.ee.lbl.gov/html/contrib/WorldCup.html) as an example and queries it with the big data query method based on data distribution of the present invention. The data set is massive, and each record has multiple attributes, including the timestamp, server, visitor IP address, and request type.

[0048] A big data query method based on data distribution, as shown in Figure 1, includes the following steps:

[0049] Step 1. Divide the data to be queried into data segments according to the amount of data, and cal...
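Step 1 is truncated in this listing, so its exact rule is not available here. The abstract only states that the segment sizes follow a quantitative proportion derived from an acceleration ratio and that the method preserves randomness. The following is a minimal sketch under two assumptions that are not from the patent: the proportion is geometric (each segment is `ratio` times the size of the previous one), and records are assigned to segments by a uniform shuffle. The names `segment_sizes` and `load_randomly` are hypothetical.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def segment_sizes(total: int, num_segments: int, ratio: float) -> List[int]:
    """Split `total` records into sizes proportional to 1, ratio, ratio**2, ...

    The geometric proportion is an assumption made for illustration only.
    """
    if num_segments < 1 or total < 0 or ratio <= 0:
        raise ValueError("need num_segments >= 1, total >= 0, ratio > 0")
    weights = [ratio ** i for i in range(num_segments)]
    scale = total / sum(weights)
    sizes = [int(w * scale) for w in weights]
    # Records lost to rounding down are added to the last segment.
    sizes[-1] += total - sum(sizes)
    return sizes


def load_randomly(records: Sequence[T], sizes: List[int], seed: int = 0) -> List[List[T]]:
    """Shuffle the records, then cut the shuffled list into the given sizes."""
    if sum(sizes) != len(records):
        raise ValueError("sizes must sum to the number of records")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    segments, start = [], 0
    for size in sizes:
        segments.append(shuffled[start:start + size])
        start += size
    return segments
```

Because every record is equally likely to land in any position after the shuffle, each segment is a uniform random sample of the data, which is the property an approximate query over a subset of segments relies on.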


Abstract

The invention provides a big data query method based on data distribution and relates to the technical field of big data query. The method first divides the data to be queried into data segments. It then calculates a segment potential distribution function over all data segments according to the acceleration ratio, which determines the quantitative proportion among the data segments, and loads the data to be queried into each data segment according to a constraint condition. Finally, according to the time limit of a user's query request, it uses a data segment selection algorithm to determine which data segments participate in the query, and returns the query result together with the actual recall rate and confidence. The method ensures the randomness, performance, and approximation evaluation of various queries in a distributed environment and is compatible with accurate queries. Newly added data does not affect the query effect, and the method has good extensibility and maintainability.
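The last step of the abstract, choosing segments under a time limit and reporting a recall rate, can be illustrated as follows. This is a minimal sketch, not the patent's data segment selection algorithm, which this listing does not show. It assumes that each segment carries an estimated scan time, that segments are picked greedily from largest to smallest while they fit the remaining time budget, and that recall can be approximated by the fraction of records scanned (reasonable only if records were loaded into segments at random). The confidence value is omitted because its formula is not given. `Segment`, `select_segments`, and `approximate_query` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Segment:
    records: Sequence[dict]
    est_seconds: float  # assumed: estimated time needed to scan this segment


def select_segments(segments: List[Segment], time_limit: float) -> List[Segment]:
    """Greedily pick segments, largest first, that fit the remaining time budget."""
    chosen, remaining = [], time_limit
    for seg in sorted(segments, key=lambda s: len(s.records), reverse=True):
        if seg.est_seconds <= remaining:
            chosen.append(seg)
            remaining -= seg.est_seconds
    return chosen


def approximate_query(
    segments: List[Segment],
    predicate: Callable[[dict], bool],
    time_limit: float,
) -> Tuple[List[dict], float]:
    """Scan only the selected segments; return the matches and the scanned fraction."""
    chosen = select_segments(segments, time_limit)
    hits = [r for seg in chosen for r in seg.records if predicate(r)]
    total = sum(len(s.records) for s in segments)
    scanned = sum(len(s.records) for s in chosen)
    recall = scanned / total if total else 1.0
    return hits, recall
```

When the time limit is large enough to cover every segment, all segments are scanned and the recall is 1.0, so the same code path also serves an accurate query.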

Description

Technical field

[0001] The invention relates to the technical field of big data query, in particular to a big data query method based on data distribution.

Background technique

[0002] The high integration of the ternary world of human, machine, and matter has led to the explosive growth of data scale and the high complexity of data models. The world has entered the era of networked big data (Big Data). Faced with such a huge amount of data, how to find the target data within a tolerable time frame is crucial.

[0003] Early research on query processing technology mainly focused on the optimization and scheduling of precise queries, and a lot of results have been achieved. However, if the traditional precise query technology is still used today to process massive data, on the one hand, it is difficult to further optimize the query algorithm, and on the other hand, the precise query task will be extremely heavy, resulting in the response time of the entire query exceeding t...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 宋杰, 董伟, 徐超, 王蓓蕾
Owner: NORTHEASTERN UNIV