
A data query model and method for the MapReduce paradigm

A data query model and method for the MapReduce paradigm, applied in the field of information technology processing, that addresses problems such as unsatisfactory processing efficiency.

Active Publication Date: 2018-07-24
HOHAI UNIV
7 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

However, in order to support a wide range of query requirements, the compiled processing programs are unsatisfactory in terms of efficiency.
For example, the Hadoop program generated by the Pig Latin interpreter is only half as efficient as an optimal processing program.


Embodiment Construction

[0036] The present invention is further illustrated below in conjunction with specific embodiments. It should be understood that these embodiments are only used to illustrate the present invention and are not intended to limit its scope. After reading the present invention, modifications in various equivalent forms made by those skilled in the art all fall within the scope defined by the appended claims of the present application.

[0037] A data query model oriented to the MapReduce paradigm includes Map and Reduce data processing procedures that are defined explicitly using UDFs. The UDF procedures are called subquery statements, and the other statements are called the main query statement; both are described in the format of the existing database query language. In subsequent query processing, the Hadoop platform distributes the subquery statements to each cluster computer according to the MapReduce...
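The split between UDF subqueries and a main query described in [0037] can be sketched roughly as follows. This is only an illustrative, in-memory approximation and not the patented method: the names `map_udf`, `reduce_udf`, `run_subquery` and `main_query`, the station and level fields, and the threshold filter are all hypothetical, and each list in `partitions` merely stands in for the data held by one cluster computer.

```python
from collections import defaultdict


def map_udf(record):
    """Hypothetical Map UDF: emit one (station, level) pair per record."""
    yield record["station"], record["level"]


def reduce_udf(station, levels):
    """Hypothetical Reduce UDF: keep the highest level seen per station."""
    return station, max(levels)


def run_subquery(partitions, map_fn, reduce_fn):
    """Apply the Map UDF to every partition, group by key, then reduce.

    Each partition is an assumption standing in for one cluster node;
    a real platform would run these loops in parallel.
    """
    grouped = defaultdict(list)
    for partition in partitions:
        for record in partition:
            for key, value in map_fn(record):
                grouped[key].append(value)
    # Sort keys so the output order is deterministic.
    return [reduce_fn(key, values) for key, values in sorted(grouped.items())]


def main_query(rows, threshold):
    """Hypothetical serial main query: filter the subquery result."""
    return [row for row in rows if row[1] > threshold]


partitions = [
    [{"station": "A", "level": 3.2}, {"station": "B", "level": 1.1}],
    [{"station": "A", "level": 4.0}, {"station": "C", "level": 2.5}],
]
peaks = run_subquery(partitions, map_udf, reduce_udf)
result = main_query(peaks, threshold=2.0)
print(result)
```

In this sketch the parallel part is confined to `run_subquery`, while `main_query` runs once over the combined result, which mirrors the distinction the paragraph draws between subquery statements and the main query statement.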

Abstract

The invention discloses a data query model and a data query method for the MapReduce paradigm, and belongs to the field of information technology processing. The MapReduce paradigm is organically combined with the existing database query description language so that parallel and serial query statements are processed uniformly. To meet different parallel query requirements, the invention establishes the basic flow of standard description and processing in an environment where the Hadoop platform is combined with an existing integrated database. With this method, the locations of distributed data stored in different systems are represented uniformly, and the query processes in the Map and Reduce stages are described by a UDF (User Defined Function), so that distributed storage and parallel processing of various data objects can be carried out according to the description format of the existing database query language.

Description

Technical field

[0001] The invention relates to a data query model and method oriented to the MapReduce paradigm, belonging to the field of information technology processing.

Background technique

[0002] The Hadoop platform is an open source software platform implemented by the Apache Foundation based on the MapReduce parallel processing model. It has good scalability and can be deployed easily and quickly on a cluster platform consisting of dozens or even thousands of computers, so as to perform efficient parallel batch processing of massive data. The Hadoop platform shields the important but cumbersome details of task scheduling, data storage and transmission in the parallel processing process, and requires users to write the corresponding processing programs according to the MapReduce programming paradigm, so as to process data that is distributed and stored in key-value form.

[0003] However, compared to database query description langua...
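The MapReduce programming paradigm over key-value data mentioned in [0002] can be illustrated with a minimal single-process sketch. It does not use Hadoop itself: the function names `map_phase`, `shuffle` and `reduce_phase` and the word-count task are assumptions chosen for illustration, and the scheduling, storage and transmission details that the platform hides are simply absent here.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: turn one input line into (word, 1) key-value pairs."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, as the platform would do between Map and Reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


lines = ["hadoop runs mapreduce", "mapreduce scales out"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

The user writes only the map and reduce functions; everything in `shuffle` and the driver lines corresponds to work a platform such as Hadoop would perform on the user's behalf.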

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
CPC: G06F16/182, G06F16/2425, G06F16/24532, G06F16/2471
Inventor: 陆佳民, 冯钧
Owner: HOHAI UNIV