
A power grid timing sequence large data parallel loading method

A method applying time-series and big-data technology, classified under program loading/starting and program-control devices. It addresses problems such as the inability to load in parallel, excessive loading time, and client waiting, and achieves reduced loading time and efficient parallel loading.

Active Publication Date: 2015-03-11
CHINA REALTIME DATABASE
Cites 5 · Cited by 16

AI Technical Summary

Problems solved by technology

Loading a large volume of historical time-series data from a single client cannot exploit the cluster's distributed concurrent-processing capability and takes a long time to complete. With typical multi-client parallel loading, on the other hand, multiple clients read and write the index mapping table file simultaneously during loading, producing a large number of disk IO conflicts; together with the network communication overhead between different cluster nodes, this prevents genuinely parallel loading and causes waiting. A preliminary search found no existing technical solution to these problems.
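A minimal illustration of the remedy implied above: if every client consults one shared index-mapping file, all reads and writes funnel through it and collide on disk. Partitioning the index so each client works against its own disjoint shard removes the contention. The function name and the hash-based sharding scheme below are assumptions for illustration, not details taken from the patent.

```python
# Hypothetical sketch: map each measurement point to an index-mapping
# shard so that clients loading disjoint point ranges never touch the
# same index file. The modulo scheme is an assumed, simple choice.

def shard_for_point(point_id: int, num_shards: int) -> int:
    """Return the index-mapping shard responsible for a measurement point."""
    return point_id % num_shards

# Example: a 5-shard index on a 5-node cluster.
print(shard_for_point(123_456, 5))  # each client reads/writes only its shard
```

With disjoint shards, the disk IO conflicts described above are confined to within a single client's workload rather than shared across all of them.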




Detailed Description of the Embodiments

[0018] The present invention will be described in further detail below in conjunction with the accompanying drawings and embodiments.

[0019] This embodiment describes the present invention through an application example in a power-grid business scenario. Assume a cluster based on Hadoop and HBase consisting of 5 machines, with a high-availability (HA) configuration applied to the cluster. The configuration of each machine is shown in Table 1. In this scenario there are 600,000 measurement points, the data collection frequency is 60 frames/min, and each collected data record is about 70 bytes, so the 600,000 measurement points generate about 3.3 TB of data per day (24 hours). The implementation of the method is described below, taking the loading of this 3.3 TB of data into the big data system as an example.
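The 3.3 TB figure above can be checked by back-of-envelope arithmetic from the stated parameters (it matches when the "T" is read as tebibytes, i.e. 2^40 bytes):

```python
# Verify the daily data volume quoted in the embodiment.
POINTS = 600_000        # measurement points
RECORDS_PER_MIN = 60    # collection frequency: 60 frames/min per point
RECORD_BYTES = 70       # approximate size of one collected record

records_per_day = POINTS * RECORDS_PER_MIN * 60 * 24
daily_bytes = records_per_day * RECORD_BYTES
daily_tib = daily_bytes / 2**40   # tebibytes

print(f"{records_per_day:,} records/day, {daily_tib:.1f} TiB/day")
# ≈ 51.8 billion records/day, 3.3 TiB/day
```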

[0020]

[0021] Table 1 Configuration of each machine in the cluster

[0022] The flow chart of the inventive method is shown in Figure 2...



Abstract

The present invention discloses a parallel loading method for power-grid time-series big data, solving the waiting that occurs when multiple clients fail to load a large volume of historical time-series data in parallel. The method partitions the index mapping table, pre-partitions the historical time-series data storage table according to the volume of data to be loaded, and preserves the data locality of the data to be loaded according to the partition ranges of the storage table distributed across the data nodes. These steps effectively reduce both the disk IO conflicts encountered when multiple clients read the index mapping data file during parallel loading and the network communication overhead between different cluster nodes, and they avoid the performance problems caused by overloading a single node. The method fully exploits distributed parallel processing to greatly reduce the time needed to load large volumes of historical time-series data.
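The pre-partitioning step in the abstract can be sketched as computing split keys for the storage table before any data is loaded, so each client writes into a disjoint, pre-created region range instead of hot-spotting a single region. The key format, function names, and the even split over the point-ID key space below are assumptions for illustration, not details from the patent's claims.

```python
# Hypothetical sketch: pre-compute region split keys for an HBase-style
# storage table so the historical data table is partitioned before the
# parallel load begins. Zero-padded point IDs as row-key prefixes are an
# assumed key design.

def compute_split_keys(num_points: int, num_regions: int) -> list[bytes]:
    """Evenly partition the point-ID key space into num_regions ranges."""
    step = num_points // num_regions
    # Region i covers point IDs [i*step, (i+1)*step); the boundaries
    # between regions become the table's split keys.
    return [f"{i * step:08d}".encode() for i in range(1, num_regions)]

# Example: 600,000 points spread over 15 pre-split regions (e.g. 3 per
# node on the 5-node cluster from the embodiment).
splits = compute_split_keys(num_points=600_000, num_regions=15)
print(len(splits), splits[0])  # 14 boundary keys; first is b'00040000'
```

Each client is then assigned one or more region ranges that are local to the node holding them, which is how the method keeps both the disk IO and the cross-node network traffic disjoint per client.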

Description

Technical Field

[0001] The invention relates to a data parallel loading method. It belongs to the fields of big data processing and distributed real-time databases, and is particularly suited to the parallel loading of massive historical time-series data in smart grids and the Internet of Things.

Background

[0002] With the continuing development of industrialization and informatization, large-scale process-industry enterprises generate ever larger volumes of historical time-series data as production becomes information-driven. Taking the power system as an example, the number of measurement points keeps growing and is expected to reach tens of millions or even more than one hundred million; this places higher demands on the processing scale and processing speed of the real-time database.

[0003] Constrained by its traditional software architecture, the traditional real-time database can no longer meet the ...


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06F9/445
Inventors: 王远, 袁军, 包建国, 胡健, 张珂珩
Owner CHINA REALTIME DATABASE