A Parallel Loading Method of Power Grid Time Series Big Data

A big-data time-series technique for the parallel loading of massive historical time series data. It addresses the inability to load in parallel, long loading times, and network communication overhead, and achieves efficient parallel loading with reduced loading time.

Active Publication Date: 2018-02-02
CHINA REALTIME DATABASE

AI Technical Summary

Problems solved by technology

Loading a large volume of historical time series data from a single client cannot exploit distributed concurrent processing and takes a long time to complete. Conventional multi-client parallel loading has its own problems: multiple clients read and write the index mapping table file at the same time, which produces a large number of disk IO conflicts, and data loading incurs network communication overhead between different nodes of the cluster. As a result the clients cannot actually load in parallel and end up waiting. A preliminary search found no existing technical solution to these problems.



Examples


Embodiment Construction

[0018] The present invention will be described in further detail below in conjunction with the accompanying drawings and embodiments.

[0019] This embodiment illustrates the invention with an application example from a power grid business scenario. Assume a cluster based on Hadoop and HBase that consists of 5 machines and is configured for high availability (HA). The configuration of each machine is shown in Table 1. In this scenario there are 600,000 measurement points, the data collection frequency is 60 frames/min, and each collected data record is about 70 bytes, so the 600,000 measurement points generate about 3.3T bytes of data per day (24 hours). The following describes the implementation of the method, taking the loading of this 3.3T of data into the big data system as an example.
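The daily volume quoted in [0019] can be checked with a few lines of arithmetic. The sketch below uses only the figures from the paragraph. The constant and function names are illustrative, and it assumes that "60 frames/min" means 60 records per measurement point per minute and that storage overhead is ignored.

```python
# Inputs quoted in paragraph [0019].
MEASUREMENT_POINTS = 600_000
RECORDS_PER_POINT_PER_MINUTE = 60  # assumption: "60 frames/min" is per point
BYTES_PER_RECORD = 70              # "about 70 bytes", so the result is approximate
MINUTES_PER_DAY = 24 * 60


def daily_volume_bytes(points: int, per_minute: int, record_bytes: int) -> int:
    """Raw bytes produced in 24 hours, ignoring any storage overhead."""
    return points * per_minute * MINUTES_PER_DAY * record_bytes


total = daily_volume_bytes(
    MEASUREMENT_POINTS, RECORDS_PER_POINT_PER_MINUTE, BYTES_PER_RECORD
)
print(f"{total:,} bytes per day")
print(f"{total / 10**12:.2f} TB (decimal)")
print(f"{total / 2**40:.2f} TiB (binary)")
```

Under these assumptions the total is 3,628,800,000,000 bytes, which is about 3.63 TB in decimal units and about 3.30 TiB in binary units. The "3.3T" in the text therefore matches a binary reading of the unit.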

[0020]

[0021] Table 1 Configuration of each machine in the cluster

[0022] The flow chart of the method of the invention is shown in Figure 2...


Abstract

The invention discloses a method for the parallel loading of power grid time-series big data. It solves the problem that multiple clients cannot load massive historical time-series data in parallel and are left waiting. The method partitions the index mapping table, pre-partitions the historical time-series data storage table according to the volume of data to be loaded, and processes the massive historical time-series data to be loaded according to the partition ranges of the storage table allocated to each data node, so that data locality is maintained. This processing effectively reduces the disk IO conflicts that multiple clients encounter when reading and writing index mapping table files, as well as the network communication overhead between different nodes of the cluster during parallel loading. It also avoids the performance problems caused by heavy load when a single node loads massive historical time-series data. The method makes full use of distributed parallel processing capability and greatly reduces the loading time of massive historical time-series data.
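The pre-partitioning and locality steps in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. It assumes integer row keys in a known key space, equal-width partitions, and a fixed partition-to-node assignment. All function names, node names, and sizes are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def num_partitions_for(total_bytes: int, target_partition_bytes: int) -> int:
    """Choose the partition count from the data volume (ceiling division)."""
    return max(1, -(-total_bytes // target_partition_bytes))


def split_points(key_space: int, num_partitions: int) -> list[int]:
    """Start keys of partitions 1..n-1 for equal-width ranges over [0, key_space)."""
    return [(i * key_space) // num_partitions for i in range(1, num_partitions)]


def partition_of(key: int, splits: list[int]) -> int:
    """Index of the partition whose key range contains `key`."""
    return bisect_right(splits, key)


def group_by_node(records, splits, partition_to_node):
    """Batch records by the node that hosts their partition, so each
    loader client only writes data that is local to its own node."""
    batches = defaultdict(list)
    for key, value in records:
        node = partition_to_node[partition_of(key, splits)]
        batches[node].append((key, value))
    return dict(batches)


# Hypothetical demo: 100 keys, 4 partitions, 2 data nodes.
splits = split_points(100, 4)
partition_to_node = {0: "node-a", 1: "node-b", 2: "node-a", 3: "node-b"}
records = [(3, "r0"), (30, "r1"), (60, "r2"), (99, "r3")]
batches = group_by_node(records, splits, partition_to_node)
print(splits)
print(batches)
```

In this demo the split points are `[25, 50, 75]`, so keys 3 and 60 are batched for `node-a` and keys 30 and 99 for `node-b`. Each client can then load its own batch without sending rows to another node.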

Description

Technical field

[0001] The invention relates to a parallel data loading method. It belongs to the field of big data processing and distributed real-time databases and is particularly suited to the parallel loading of massive historical time series data in smart grids and the Internet of Things.

Background technique

[0002] With the continuous development of industrialization and informatization, large-scale process industry enterprises generate more and more massive historical time series data in the course of production informatization. Taking the power system as an example, the number of measurement points keeps growing and is expected to reach tens of millions or even more than one hundred million. This places higher demands on the processing scale and processing speed of the real-time database.

[0003] Constrained by their traditional software architecture, traditional real-time databases can no longer meet the ...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F9/445
Inventor: 王远, 袁军, 包建国, 胡健, 张珂珩
Owner: CHINA REALTIME DATABASE