Parallel task execution method and device based on Hive

A technique for executing tasks in parallel, with support for simulated execution, applied in the field of computer communication. It improves the execution efficiency of Hive scripts and makes fuller use of the computing capacity of a Hadoop cluster.

Active Publication Date: 2014-07-23
GUANGZHOU PINWEI SOFTWARE

AI Technical Summary

Problems solved by technology

[0024] The purpose of the present invention is to propose a Hive-based parallel task execution method that solves the problem that traditional Hive can only execute tasks serially.




Embodiment Construction

[0056] In the following, the present invention will be further described in conjunction with the drawings and specific embodiments.

[0057] As shown in Figure 1, a Hive-based parallel task execution method includes the following steps:

[0058] Step S1: run the Hive script, which contains multiple code segments. A code segment includes at least one SQL statement. Multiple SQL statements can also be packaged into an SQL script, so a code segment can itself be an SQL script. In practice, a code segment can also be empty, and an empty code segment does not affect the running of the program.

[0059] Step S2: determine whether the Hive script contains a startup execution command or a simulated execution command. If it is a startup execution command, steps S3 to S5 are executed, that is, the code segments are actually executed; if it is a simulated execution command, step S7 is executed.

[0060] Step S3, identifying sequ...
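Steps S1 and S2 above can be illustrated with a minimal Python sketch. It is not the patented implementation: the command names `"start"` and `"simulate"`, the function `run_script`, and the `execute` callback are all hypothetical. Step S7 is not shown in this excerpt, so the sketch assumes that simulated execution only reports which segments would run. The ordering and parallelism of steps S3 to S5 are left out here.

```python
def run_script(segments, command, execute):
    """Dispatch between real and simulated execution (hypothetical names).

    segments: list of code-segment strings; empty segments are skipped,
              matching the statement that they do not affect the run.
    command:  "start" or "simulate" (assumed spellings, not from the patent).
    execute:  callback that actually runs one code segment.
    """
    non_empty = [s for s in segments if s.strip()]
    if command == "start":
        # Real execution: every non-empty segment is passed to the callback.
        return [execute(s) for s in non_empty]
    if command == "simulate":
        # Assumed behaviour of simulation: report the plan, run nothing.
        return [f"would run: {s}" for s in non_empty]
    raise ValueError(f"unknown command: {command!r}")


if __name__ == "__main__":
    demo = ["segment A", "", "segment B"]
    print(run_script(demo, "simulate", execute=str.upper))
    print(run_script(demo, "start", execute=str.upper))
```

Passing the executor as a callback keeps the dispatch logic independent of how a segment is actually submitted to Hive.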


Abstract

The invention relates to a Hive-based parallel task execution method and device. The method includes the following steps: a Hive script is run; the sequence marks of its process control labels are recognized; all the sequence marks are compared to obtain the execution order of the code segments; and the code segments are executed in that order, with code segments that share the same sequence mark executed in parallel. With this method and device, a developer can freely control the parallel and serial relations between the code segments in a Hive script, the execution efficiency of the Hive script is greatly improved, and the computing capacity of a Hadoop cluster can be used more fully.
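The scheduling idea in the abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the excerpt does not give the label syntax, so each code segment is represented as a `(sequence_mark, code)` pair that has already been parsed; marks are assumed to run in ascending order; and `plan_batches`, `run_batches`, and `run_segment` are hypothetical names. A thread pool stands in for whatever mechanism submits the segments to Hive.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def plan_batches(segments):
    """Group (sequence_mark, code) pairs into batches ordered by mark.

    Segments with the same mark land in the same batch.
    Ascending mark order is an assumption of this sketch.
    """
    by_mark = defaultdict(list)
    for mark, code in segments:
        by_mark[mark].append(code)
    return [by_mark[mark] for mark in sorted(by_mark)]


def run_batches(batches, run_segment):
    """Run batches serially; segments inside one batch run concurrently."""
    results = []
    for batch in batches:
        # Leaving the with-block waits for the whole batch to finish,
        # so the next batch starts only after this one is done.
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            results.append(list(pool.map(run_segment, batch)))
    return results


if __name__ == "__main__":
    parsed = [(2, "segment C"), (1, "segment A"), (1, "segment B")]
    batches = plan_batches(parsed)
    print(batches)
    print(run_batches(batches, run_segment=str.upper))
```

Here segments A and B share mark 1 and run together, while segment C waits for both to finish.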

Description

[0001] Technical field

[0002] The invention relates to computer communication technology, in particular to Hive data processing technology.

[0003] Background

[0004] The rapid development of the mobile Internet has led to a rapid increase in the data generated and used by users. The emergence of massive data and changes in data structures have brought huge challenges to telecom operators in managing, analyzing and processing data. Traditional processing methods based on relational databases can no longer effectively store and process growing volumes and new types of business data. The development of Hadoop distributed technology provides the technical means to solve these problems.

[0005] Hadoop is an open source project managed by the Apache organization. It is a software implementation based on Google's cloud computing theories: Bigtable, MapReduce and GFS. Hadoop enables users to develop MapReduce programs without knowing the underlying details, a...


Application Information

Patent Type & Authority Applications (China)
IPC IPC(8): G06F9/46, G06F17/30
Inventor 张永亮
Owner GUANGZHOU PINWEI SOFTWARE