Data processing method and device based on MapReduce
A data processing and data collection technology, applied in the field of cloud computing, that addresses problems such as low connection (join) efficiency and achieves the effects of reducing data reads and transmission, saving IO overhead, and improving operation efficiency.
Examples
Embodiment 1
[0032] As shown in Figure 2, the flow of the MapReduce-based data processing method provided by the embodiment of the present application includes the following steps:
[0033] S201: Receive multi-way data sets and the connection field information used to perform an associated query on the multi-way data sets.
[0034] Here, the multi-way data sets are, for example, R, S, and T, and the connection fields used in the associated query are, for example, age and name.
[0035] S202: Perform a mapping operation on each data set to obtain multiple intermediate result sets. For each intermediate result set, determine at least one Reduce node corresponding to that intermediate result set according to the partition function set for each connection field, and send the intermediate result set to each determined Reduce node.
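Step S202 can be illustrated with a minimal sketch. The source does not give the partition functions or the Reduce node layout, so everything named here is an assumption for illustration: the Reduce nodes are imagined as a grid with one axis per connection field, `SHARES` holds hypothetical bucket counts, and `partition` and `target_reducers` are hypothetical helpers. A record that carries a connection field is pinned to one bucket on that axis; a record that lacks the field is copied along the whole axis, which is one way a record can end up with "at least one" Reduce node.

```python
import zlib
from itertools import product

# Assumed layout (not from the source): one partition function per
# connection field, Reduce nodes arranged as a grid with one axis per field.
SHARES = {"age": 2, "name": 3}   # hypothetical bucket counts per field
FIELDS = list(SHARES)            # fixed axis order: age, then name


def partition(field, value):
    """Partition function for one connection field (stable hash mod buckets)."""
    return zlib.crc32(repr(value).encode()) % SHARES[field]


def target_reducers(record):
    """Return the sorted ids of every Reduce node that should receive record.

    A field present in the record pins that axis to a single bucket; a field
    absent from the record means the record is copied along that whole axis.
    """
    axes = []
    for field in FIELDS:
        if field in record:
            axes.append([partition(field, record[field])])
        else:
            axes.append(range(SHARES[field]))
    ids = []
    for coords in product(*axes):
        node = 0
        for field, c in zip(FIELDS, coords):
            node = node * SHARES[field] + c   # row-major index into the grid
        ids.append(node)
    return sorted(ids)


# Hypothetical records from data sets R, S, and T.
r = {"age": 30, "city": "X"}        # R carries only the age field
s = {"age": 30, "name": "Li"}       # S carries both connection fields
t = {"name": "Li", "dept": "ops"}   # T carries only the name field

print(target_reducers(r), target_reducers(s), target_reducers(t))
```

Under these assumptions the S record goes to exactly one node, and that node is also among the targets of the matching R and T records, so the three records meet on a single Reduce node.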
[0036] In a specific implementation, the multiple intermediate result sets obtained by performing the mapping operation are key-value (key-val...
Embodiment 2
[0051] In the traditional MapReduce framework, connecting multi-way data sets incurs large disk I/O overhead and high network communication cost. To solve this problem, the embodiment of the present application modifies the partition function interface of the existing MapReduce framework. After the modification, a single MapReduce task can complete the connection of multiple data sets: the intermediate result sets satisfying all connection fields are sent to the same Reduce node, which saves IO overhead and network resources and significantly improves algorithm efficiency.
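The single-task connection described above can be simulated in memory to check that one map, shuffle, and reduce pass yields the full three-way result. This is a sketch under stated assumptions, not the patented implementation: the 2 x 2 Reduce node grid (`K_AGE`, `K_NAME`), the `bucket` hash, the record layouts, and the condition R.age = S.age and S.name = T.name are all hypothetical choices made for the example.

```python
import zlib
from collections import defaultdict
from itertools import product

K_AGE, K_NAME = 2, 2   # assumed Reduce node grid: K_AGE x K_NAME nodes


def bucket(value, k):
    """Stable hash of a field value into one of k buckets."""
    return zlib.crc32(repr(value).encode()) % k


def map_phase(tagged_records):
    """Emit (node_id, (tag, record)); copy records that lack a field."""
    for tag, rec in tagged_records:
        ages = [bucket(rec["age"], K_AGE)] if "age" in rec else range(K_AGE)
        names = [bucket(rec["name"], K_NAME)] if "name" in rec else range(K_NAME)
        for a, n in product(ages, names):
            yield a * K_NAME + n, (tag, rec)


def reduce_phase(values):
    """Connect R, S, and T locally on one Reduce node."""
    by_tag = defaultdict(list)
    for tag, rec in values:
        by_tag[tag].append(rec)
    for r, s, t in product(by_tag["R"], by_tag["S"], by_tag["T"]):
        if r["age"] == s["age"] and s["name"] == t["name"]:
            yield (r["id"], s["id"], t["id"])


def single_job_join(R, S, T):
    """Run map, shuffle, and reduce once and return the sorted result."""
    tagged = [("R", r) for r in R] + [("S", s) for s in S] + [("T", t) for t in T]
    shuffled = defaultdict(list)
    for node, value in map_phase(tagged):
        shuffled[node].append(value)
    out = []
    for node in sorted(shuffled):
        out.extend(reduce_phase(shuffled[node]))
    return sorted(out)


# Hypothetical sample data.
R = [{"id": "r1", "age": 30}, {"id": "r2", "age": 41}]
S = [{"id": "s1", "age": 30, "name": "Li"},
     {"id": "s2", "age": 41, "name": "Wu"},
     {"id": "s3", "age": 30, "name": "Wu"}]
T = [{"id": "t1", "name": "Li"}, {"id": "t2", "name": "Wu"},
     {"id": "t3", "name": "Zhao"}]

result = single_job_join(R, S, T)
print(result)
```

Because each S record lands on exactly one node in this layout, every matching (R, S, T) triple is produced once, with no intermediate two-way result written out between jobs. The trade-off in this sketch is that R and T records are copied to several nodes.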
[0052] The basic idea of the embodiment of the present application is: when the MapReduce framework is used to connect multiple data sets, the intermediate result sets in those data sets that meet the connection conditions can be sent to the same Reduce node for connection processing, rather than having to split the connection task of this multi-way da...
Embodiment 3
[0070] Based on the same inventive concept, the embodiment of this application also provides a MapReduce-based data processing device corresponding to the MapReduce-based data processing method. Because the principle by which the device solves the problem is similar to that of the data processing method of the embodiment of this application, the implementation of the device can refer to the implementation of the method, and repeated details are omitted.
[0071] As shown in Figure 5, the structure of the MapReduce-based data processing device provided by the embodiment of the present application includes:
[0072] A receiving module 501, configured to receive multi-way data sets and connection field information for performing associated queries on the multi-way data sets;
[0073] A sending module 502, configured to perform a mapping operation on each data set to obtain multiple intermediate result sets, and for each intermediate result set, determine at least one Reduce node corresponding to the intermediate result set according...



