A data parallel processing method and system

A data parallel processing technology, applied in the field of big data processing, that addresses the lack of SQL-like descriptive semantics in MapReduce while preserving scalability.

Active Publication Date: 2017-09-29
上海浪潮云计算服务有限公司

AI Technical Summary

Problems solved by technology

However, MapReduce lacks SQL-like descriptive semantics. Developers must implement the algorithm details themselves and handle issues such as query optimization, load balancing, and data merging and sorting.

Embodiment Construction

[0057] To make the purpose, technical solution, and advantages of the present invention clearer, the embodiments of the present invention are described in detail below with reference to the accompanying drawings. Where no conflict arises, the embodiments in the present application and the features in those embodiments may be combined with one another arbitrarily.

[0058] As shown in Figure 1, an embodiment of the present invention provides a data parallel processing method, the method comprising:

[0059] S10: one or more Map nodes read fragments of the account log data, select from the fragments they read the candidate data records whose state duration meets the query date requirement, and generate a first output parameter and a second output parameter for each selected candidate data record; wherein the first output parameter of the candidate data record includes at least an account ID, and ...

Abstract

The invention discloses a data parallel processing method. One or more Map nodes read fragments of the account log data, select from the fragments the candidate data records whose state duration satisfies the query date requirement, and generate a first output parameter and a second output parameter for each selected candidate data record. The first output parameter comprises at least the account ID; the second output parameter comprises at least a state start date, a state end date, and a state value. One or more Reduce nodes read the candidate data records processed by the Map nodes and, according to the first and second output parameters of those records, generate a complete historical state record for each account within the query date range. Candidate data records with the same account ID in the first output parameter are read by the same Reduce node. The method can improve the processing efficiency of large-scale log data. The invention further discloses a data parallel processing system.
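The Map and Reduce roles described in the abstract can be illustrated with a minimal single-process sketch. It is not the patented implementation: the names `LogRecord`, `map_shard`, `shuffle`, `reduce_account`, and `run` are hypothetical, "satisfies the query date requirement" is assumed here to mean that the state interval overlaps the query range, and clipping each interval to the query range in the Reduce step is likewise an assumption.

```python
from collections import defaultdict
from datetime import date
from typing import NamedTuple


class LogRecord(NamedTuple):
    """One account log entry (hypothetical layout)."""
    account_id: str
    start: date   # state start date
    end: date     # state end date
    state: str    # state value


def map_shard(shard, query_start, query_end):
    """Map step: emit (account ID, (start, end, state)) for candidate records.

    Assumption: a record is a candidate when its state interval
    overlaps the query date range.
    """
    for rec in shard:
        if rec.start <= query_end and rec.end >= query_start:
            yield rec.account_id, (rec.start, rec.end, rec.state)


def shuffle(mapped):
    """Group values by key so each account ID reaches exactly one reducer."""
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups


def reduce_account(account_id, values, query_start, query_end):
    """Reduce step: order one account's states by start date and
    clip them to the query range (clipping is an assumption)."""
    history = [
        (max(start, query_start), min(end, query_end), state)
        for start, end, state in sorted(values)
    ]
    return account_id, history


def run(shards, query_start, query_end):
    """Drive the map, shuffle, and reduce phases sequentially."""
    mapped = (
        kv
        for shard in shards
        for kv in map_shard(shard, query_start, query_end)
    )
    grouped = shuffle(mapped)
    return dict(
        reduce_account(key, values, query_start, query_end)
        for key, values in grouped.items()
    )
```

In a real cluster, each call to `map_shard` would run on a separate Map node and the grouping in `shuffle` would be performed by the framework's partitioner; keying the Map output on the account ID is what guarantees that all records for one account reach the same Reduce node.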

Description

Technical field

[0001] The present invention relates to the technical field of big data processing, and in particular to a data parallel processing method and system.

Background

[0002] As human society enters the information age, data has become a strategic resource as important as water and oil. Mining massive amounts of data puts the operational decisions of governments and enterprises on a more scientific basis, improving decision-making efficiency, crisis response capabilities, and public service levels.

[0003] However, although big data is extremely valuable, its complex types and huge scale expose specific shortcomings in traditional data warehouses and distributed processing technologies, which face problems such as sustaining scalability and prohibitively high costs. For example, the historical status data commonly used in data warehouses to record the behavior of an object, in the era of big data, with the rapid increase in the number of o...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/24532
Inventor: 亓开元, 赵仁明, 辛国茂, 房体盈
Owner: 上海浪潮云计算服务有限公司