Streamed data processing method in big data environment

A streaming data processing technology, applied in the field of cloud computing, that addresses the lack of effective methods for streaming data processing and achieves optimized parallel processing, finer processing granularity, and higher throughput.

Inactive Publication Date: 2013-10-09
FOCUS TECH +1

AI Technical Summary

Benefits of technology

The method lets a streaming data platform keep working on large amounts of data without slowing down or stalling for lack of computing power. It also improves efficiency when handling fast data streams while keeping the load balanced across tasks. Together, these improvements make streaming data platforms more efficient at serving needs such as recommending products based on users' preferences over time.

Problems solved by technology

The technical problem addressed by this patent is how to handle massively growing web content that requires quick analysis and short response times, while still processing streamed data rapidly without exhausting memory resources. Existing techniques such as MapReduce do not work effectively on massive amounts of data because of their slowness and complexity.

Embodiment Construction

[0031] Streaming data processing in the big data environment builds on the existing Hadoop, a framework that implements MapReduce, and adds three modules to its existing functions to support streaming data: a data localization module, a pipeline scheduling module, and a memory management module. It is especially suitable for real-time streaming data processing based on a large amount of complex historical data. The specific implementation is as follows:

[0032] In the data localization module, the key value range is generally divided into intervals by a hash function, so that each node stores its data without redundancy. To keep the division balanced, the data is divided using a method based on probability and statistics, so that the divided data approximately follows a uniform distribution. The steps are as follows:

[0033] Step 1: Randomly collect some historical data or streaming data as samples and sort them by key; if the keys are strings...
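The remaining steps are not shown in this excerpt, so the following is only a minimal sketch of one common way to turn a sorted sample into balanced key intervals: taking split points at evenly spaced quantiles of the sample. The quantile rule and the names `choose_boundaries` and `node_for_key` are assumptions made for illustration, not the patent's stated procedure.

```python
import bisect


def choose_boundaries(sample_keys, num_nodes):
    """Return num_nodes - 1 split points taken at evenly spaced
    positions of the sorted sample (an assumed splitting rule)."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    ordered = sorted(sample_keys)  # Step 1: sort the sampled keys
    if not ordered:
        raise ValueError("sample_keys must not be empty")
    return [ordered[(i * len(ordered)) // num_nodes]
            for i in range(1, num_nodes)]


def node_for_key(key, boundaries):
    """Map a key to the index of the interval (node) that owns it.
    Keys equal to a boundary go to the interval on its right."""
    return bisect.bisect_right(boundaries, key)


if __name__ == "__main__":
    # Hypothetical sample of integer keys, split across 4 nodes.
    boundaries = choose_boundaries(range(100), 4)
    for key in (0, 24, 25, 99):
        print(key, "-> node", node_for_key(key, boundaries))
```

Because every key falls in exactly one interval, each node stores only its own range, which matches the non-redundant storage goal described above. How closely the real data follows the sampled distribution determines how balanced the intervals are in practice.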

Abstract

The invention discloses a streamed data processing method in a big data environment and mainly relates to an improvement of the MapReduce computing model. Specifically, under a localized, non-redundant data storage and processing mechanism, every computing node stores and processes only the data in its corresponding interval; the Map and Reduce threads are scheduled as a pipeline to accelerate processing; and an in-memory storage mechanism for intermediate results ensures that data localization and the pipeline work effectively and provides fast, convenient memory access. Through these three modules, the method ensures that, in the big data environment, the reliability and efficiency of data stream processing meet the data processing requirements of practical applications.
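As a rough illustration of the pipeline and in-memory intermediate results described in the abstract, the sketch below runs one map thread and one reduce thread concurrently and passes intermediate key-value pairs through a bounded in-memory queue instead of writing them to disk. It is plain Python rather than Hadoop code; `run_pipeline`, `buffer_size`, and the single-thread-per-stage layout are assumptions made for illustration only.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the map output


def run_pipeline(records, map_fn, reduce_fn, buffer_size=1024):
    """Run map and reduce concurrently; intermediate pairs stay in memory."""
    intermediate = queue.Queue(maxsize=buffer_size)  # bounded in-memory buffer
    results = {}

    def map_worker():
        try:
            for record in records:
                for key, value in map_fn(record):
                    intermediate.put((key, value))  # blocks while the buffer is full
        finally:
            intermediate.put(_DONE)  # always let the reducer finish

    def reduce_worker():
        while True:
            item = intermediate.get()
            if item is _DONE:
                break
            key, value = item
            # Fold each pair into the running result as soon as it arrives.
            results[key] = reduce_fn(results[key], value) if key in results else value

    threads = [threading.Thread(target=map_worker),
               threading.Thread(target=reduce_worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    lines = ["a b", "b c b"]  # hypothetical input records
    counts = run_pipeline(lines,
                          lambda line: [(w, 1) for w in line.split()],
                          lambda x, y: x + y)
    print(counts)
```

The reducer starts consuming as soon as the first pair is produced, rather than waiting for the whole map phase to finish, which is the point of pipelining the two stages. The bounded queue keeps memory use in check by making the mapper wait when the reducer falls behind.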

Application Information

Owner FOCUS TECH