A Big Data Query Method Based on Data Distribution

A query method based on data distribution, applied in the field of big data query. It addresses problems such as high cost, query response times that exceed what users can tolerate, and query algorithms that are difficult to optimize further. Its effects are to ensure randomness, provide good scalability and maintainability, and improve query efficiency.

Active Publication Date: 2020-03-31
NORTHEASTERN UNIV LIAONING

AI Technical Summary

Problems solved by technology

However, if traditional exact query techniques are still used to process massive data, the query algorithm is difficult to optimize further.
In addition, narrowing the search scope is the main idea in query optimization. Traditional data partitioning and indexing techniques can narrow the search scope accurately and improve the query hit rate. However, these techniques rely on a good division of the data value domain and on fine-grained data structures, and maintaining an accurate data partition and index is expensive in a big data environment.



Examples


Embodiment Construction

[0046] The specific implementation manners of the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. The following examples are used to illustrate the present invention, but are not intended to limit the scope of the present invention.

[0047] This embodiment uses the access-request data of the 1998 World Cup web site (WorldCup98, available at http://ita.ee.lbl.gov/html/contrib/WorldCup.html) as an example and queries it with the big data query method based on data distribution of the present invention. The data set is massive and has multiple attributes, including timestamp, server, visitor IP address, and request type.

[0048] A big data query method based on data distribution, as shown in Figure 1, includes the following steps:

[0049] Step 1. Divide the data to be queried into data segments according to the data volume, and ...



Abstract

The invention provides a big data query method based on data distribution and relates to the technical field of big data query. First, the data to be queried is divided into data segments, and the segment potential distribution function of the overall set of data segments is calculated according to the acceleration ratio (speedup) to determine the proportion of data held by each segment. Then the data to be queried is loaded into the data segments according to the constraints. Finally, according to the time limit of the user's query request, a data segment selection algorithm determines which data segments participate in the query, and the query results, actual recall rate, and confidence are returned. The method ensures the randomness, performance, and approximation evaluation of various queries in a distributed environment, is compatible with exact queries, is not affected by newly added data, and has good scalability and maintainability.
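The flow described in the abstract can be illustrated with a small Python sketch. The excerpt gives neither the segment distribution function nor the segment selection algorithm, so everything specific here is an assumption made for illustration: the segment proportions are passed in by hand, rows are placed by a plain random shuffle, selection is a greedy largest-first fit against a scan budget, and "recall" is taken to be the fraction of all rows actually scanned. The names `build_segments`, `select_segments`, `approximate_count`, and `rows_per_second` are hypothetical, and the confidence value is omitted because the excerpt does not define it.

```python
import random
from dataclasses import dataclass


@dataclass
class QueryResult:
    count: int          # matching rows found in the scanned segments
    recall: float       # assumption: fraction of all rows that were scanned
    segments_used: int  # number of segments that took part in the query


def build_segments(rows, proportions, seed=0):
    """Shuffle rows, then cut them into segments sized by `proportions`."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)  # random placement makes each segment a random sample
    total = sum(proportions)
    segments, start = [], 0
    for i, p in enumerate(proportions):
        if i == len(proportions) - 1:
            end = len(shuffled)  # last segment takes the remainder
        else:
            end = min(len(shuffled), start + round(len(shuffled) * p / total))
        segments.append(shuffled[start:end])
        start = end
    return segments


def select_segments(segments, max_rows):
    """Greedy largest-first choice of segments that fit the scan budget."""
    chosen, budget = [], max_rows
    for seg in sorted(segments, key=len, reverse=True):
        if len(seg) <= budget:
            chosen.append(seg)
            budget -= len(seg)
    return chosen


def approximate_count(segments, predicate, time_limit_s, rows_per_second):
    """Count matching rows in as many segments as the time limit allows."""
    total_rows = sum(len(s) for s in segments)
    max_rows = int(time_limit_s * rows_per_second)  # assumed linear scan cost
    chosen = select_segments(segments, max_rows)
    scanned = sum(len(s) for s in chosen)
    hits = sum(1 for seg in chosen for row in seg if predicate(row))
    recall = scanned / total_rows if total_rows else 0.0
    return QueryResult(hits, recall, len(chosen))
```

With a generous time limit every segment is selected, so the sketch degenerates to an exact query with recall 1.0, which mirrors the abstract's statement that the method is compatible with exact queries. With a tight limit only some segments are scanned, and the caller can scale the partial count by the reported recall to estimate the full answer.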

Description

Technical field

[0001] The invention relates to the technical field of big data query, in particular to a big data query method based on data distribution.

Background

[0002] The deep integration of the ternary world of humans, machines, and things has triggered explosive growth in data scale and a high degree of complexity in data patterns. The world has entered the era of networked big data. Facing such a huge amount of data, finding the target data within a tolerable time is crucial.

[0003] Early research on query processing technology focused mainly on the optimization and scheduling of exact queries, and it produced many results. However, if traditional exact query techniques are still used to process massive data, the query algorithm is difficult to optimize further. In addition, narrowing the search scope is the main idea in query optimization. Traditional data partition...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F16/24
Inventor: 宋杰, 董伟, 徐超, 王蓓蕾
Owner: NORTHEASTERN UNIV LIAONING