Hadoop platform time series data incremental computation method and system

The technology concerns the incremental calculation of time series data and is applied in the computer field. It addresses three problems: incremental calculation of time series data is rarely supported, time series data is calculated repeatedly, and data processing efficiency is reduced as a result. By avoiding repeated calculation, it improves efficiency.

Active Publication Date: 2014-12-10
UNIV OF SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

[0004] At present, the Hadoop platform does not provide good support for time series data processing, and there are relatively few studies on the incremental calculation of time series data. As a result, calculations must be repeated whenever new time series data is added, which reduces the efficiency of data processing.

Examples

Embodiment 1

[0025] Figure 1 is a flowchart of a method for incremental calculation of time series data on a Hadoop platform provided in the first embodiment of the present invention. As shown in Figure 1, the method mainly includes the following steps:

[0026] Step 11. When the time-series data incremental calculation task is started, obtain the historical calculation state of the time-series data from the cache server.

[0027] The continuous time series data is divided into multiple segments, with a certain time period as the unit. The operation on the time series data within each unit time period is then a sub-operation, and the segmented time series data must satisfy the monoid property.
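The segmentation and monoid requirement in [0027] can be illustrated with a minimal sketch. It is not taken from the patent: the one-hour unit period, the `Stats` aggregate, and the helper names `split_into_segments` and `summarize` are all assumptions chosen for illustration. A monoid here means an associative merge operation with an identity element, which is what allows per-segment results to be merged in any grouping.

```python
from collections import defaultdict
from dataclasses import dataclass

SEGMENT_SECONDS = 3600  # assumed unit time period: one hour (hypothetical)


@dataclass(frozen=True)
class Stats:
    """Partial aggregate for one segment; forms a monoid under `combine`."""
    count: int = 0
    total: float = 0.0

    def combine(self, other: "Stats") -> "Stats":
        # Associative merge; Stats() is the identity element.
        return Stats(self.count + other.count, self.total + other.total)


def split_into_segments(points):
    """Group (timestamp, value) pairs by the unit time period they fall in."""
    segments = defaultdict(list)
    for timestamp, value in points:
        segments[timestamp // SEGMENT_SECONDS].append(value)
    return dict(segments)


def summarize(values):
    """Sub-operation on one segment: reduce its values to a Stats record."""
    return Stats(len(values), float(sum(values)))


points = [(10, 1.0), (20, 3.0), (3700, 5.0), (7300, 7.0)]
segments = split_into_segments(points)
a, b, c = (summarize(segments[k]) for k in sorted(segments))

# Associativity: the grouping of merges does not change the result.
assert a.combine(b).combine(c) == a.combine(b.combine(c))
# Identity: merging with the empty aggregate changes nothing.
assert Stats().combine(a) == a == a.combine(Stats())
```

Operations such as count, sum, minimum, and maximum fit this pattern directly; an average can be derived from the merged count and sum rather than merged itself.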

[0028] A time series data incremental calculation task indicates that newly added segmented time series data is present.

[0029] Step 12: Perform incremental calculation using a segmented time series data incremental calculation method including SubCp and ReduceCP...

Embodiment 2

[0047] Figure 6 is a schematic diagram of a system for incremental calculation of time series data on a Hadoop platform provided in the second embodiment of the present invention. As shown in Figure 6, the system mainly includes:

[0048] The time series data increment processing module TSI11 is used to obtain the historical calculation state of the time series data from the cache server when the time series data incremental calculation task is started and, according to that historical calculation state, to perform incremental calculation using a segmented time series data incremental calculation method that includes the SubCp and ReduceCP sub-operations. The SubCp sub-operation performs a custom sub-operation on the segmented time series data and saves the intermediate results. The ReduceCP sub-operation is the operation merging stage, in which the calculation results of the segmented time series data are merged according to the custom operation, and the ca...
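The interplay of the two sub-operations and the cache can be sketched as follows. This is a single-process illustration under stated assumptions, not the patented Hadoop implementation: a plain dictionary stands in for the cache server, the custom operation is assumed to be a (count, sum) aggregate, and the names `sub_cp`, `reduce_cp`, `incremental_compute`, and `subcp_calls` are hypothetical.

```python
from functools import reduce

# Stand-in for the cache server: segment id -> saved intermediate result.
cache = {}
subcp_calls = []  # records which segments were actually computed


def sub_cp(segment_id, values):
    """Custom sub-operation on one segment; here a (count, sum) pair."""
    subcp_calls.append(segment_id)
    return (len(values), sum(values))


def reduce_cp(partials):
    """Merge stage: combine per-segment results with the custom operation."""
    return reduce(lambda x, y: (x[0] + y[0], x[1] + y[1]), partials, (0, 0))


def incremental_compute(segments):
    """Compute only segments missing from the cache, then merge all results."""
    for segment_id, values in segments.items():
        if segment_id not in cache:
            cache[segment_id] = sub_cp(segment_id, values)
    return reduce_cp(cache[k] for k in sorted(segments))


history = {0: [1, 3], 1: [5]}
first = incremental_compute(history)   # computes segments 0 and 1

history[2] = [7, 9]                    # a newly added segment arrives
second = incremental_compute(history)  # computes only segment 2
```

After the second call, `subcp_calls` lists each segment exactly once, which is the saving the text describes: historical segments are read back from the cache instead of being recomputed.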

Abstract

The invention discloses a method and system for incremental computation of time series data on the Hadoop platform. In the method, when a time series data incremental computation task is started, the historical computational state of the time series data is obtained from a cache server. Incremental computation is then carried out, according to the historical computational state, by a segmented time series data incremental computation method containing a SubCp sub-operation and a ReduceCp sub-operation. The SubCp sub-operation performs a user-defined sub-operation on the segmented time series data and stores the intermediate results. The ReduceCp sub-operation is the operation merging stage, which merges the computed results of the segmented time series data according to the user-defined operation. The computational states of the SubCp and ReduceCp sub-operations are maintained through the cache server. Because incremental computation avoids a large amount of unnecessary repeated computation, data processing efficiency is improved.

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a method for incremental calculation of time series data on a Hadoop platform.

Background

[0002] With the rapid development of Internet technology and the wide application of information collection technology, a large amount of data in the form of time series has been generated and accumulated in many scientific and industrial fields such as telecommunications, meteorology, geology, electricity, and finance. The traditional approach to time series processing is generally to use Matlab or similar mathematical calculation tools, but as the scale of the problem grows, the calculation time often becomes unacceptable.

[0003] At present, as big data processing is gradually gaining attention, some companies and research institutions have also begun research in this area, and related work is mainly concentrated on the Hadoop ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/21
Inventor: 孙广中, 王丹
Owner: UNIV OF SCI & TECH OF CHINA