
Offline analysis method for massive data

An offline analysis technology for massive data, applied in the field of offline analysis. It addresses problems such as improving data robustness and cleanliness, and has the effect of removing the data collection bottleneck, improving collection efficiency, and improving overall efficiency.

Inactive Publication Date: 2016-07-20
STATE GRID CORP OF CHINA +3
3 Cites, 40 Cited by

AI Technical Summary

Problems solved by technology

Most of these data are stored in Oracle systems that are tightly coupled with the business. How to analyze and use these data effectively, and how to increase their robustness and cleanliness, are the problems and challenges facing distribution network data analysis.


Embodiment

[0042] The present invention uses the Hadoop platform cluster's distributed storage of massive data and its fast, efficient parallel computing capability. Specifically:

[0043] Data acquisition and preprocessing. Different collection modes can be adopted for different types of electric power big data. Streaming data is aggregated into a Kafka cluster using the Kafka collection tool, and is then stored and processed by HBase. Relational databases are handled with the Sqoop data exchange tool, combined with a customizable and configurable data cleaning module; the relational data is imported in a distributed manner into HDFS or HBase through the Map-Reduce distributed computing framework. Sqoop also provides incremental data import. Large data files are imported into Hadoop over the FTP protocol or by local upload. After the power big data is stored in HDFS, ETL tools can be used to perform da...
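The collection modes described in [0043] can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the `Source` type, the `kind` labels, and both helper functions are hypothetical names introduced here. The Sqoop flags shown (`--connect`, `--table`, `--target-dir`, `--num-mappers`, `--incremental append`, `--check-column`, `--last-value`) are standard Sqoop 1 import options, but the connection string, table, and paths are placeholders, and the command is only built, never executed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Source:
    name: str
    kind: str  # "stream", "relational", or "file" (labels assumed for this sketch)


def collection_route(source: Source) -> List[str]:
    """Return the chain of tools and stores a source passes through, per [0043]."""
    routes = {
        "stream": ["Kafka cluster", "HBase"],
        "relational": ["Sqoop", "data cleaning module", "HDFS or HBase"],
        "file": ["FTP or local upload", "Hadoop"],
    }
    if source.kind not in routes:
        raise ValueError(f"unknown source kind: {source.kind}")
    return routes[source.kind]


def sqoop_import_args(
    jdbc_url: str,
    table: str,
    target_dir: str,
    check_column: Optional[str] = None,
    last_value: int = 0,
    mappers: int = 4,
) -> List[str]:
    """Build (but do not run) a Sqoop import command line.

    When check_column is given, the import is incremental: only rows whose
    check column exceeds last_value are fetched.
    """
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
    ]
    if check_column is not None:
        args += [
            "--incremental", "append",
            "--check-column", check_column,
            "--last-value", str(last_value),
        ]
    return args
```

For example, `collection_route(Source("meter_feed", "stream"))` yields the Kafka-then-HBase path, and calling `sqoop_import_args(...)` with a `check_column` produces the argument list for an incremental import that could be handed to a process launcher.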

Abstract

The invention provides an offline analysis method for massive data. The method comprises the following steps: (1) collecting massive power data; (2) preprocessing the massive power data; (3) performing offline analysis of the massive power data; and (4) displaying the analysis result. With this method, power data resources from different times and locations, different businesses, and different scenarios can be stored centrally, managed uniformly, and shared; analyzing and mining the massive historical data gives management a scientific and reasonable reference for power-related decisions.
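The four steps in the abstract can be pictured as a simple pipeline. The sketch below is only an assumption-laden illustration: the record layout (feeder id, load reading), the sample values, the rule of dropping missing readings, and the per-feeder mean are all placeholders chosen here, since the abstract does not specify the cleaning rules or the analysis performed.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (feeder id, load reading); this layout is hypothetical
Record = Tuple[str, Optional[float]]


def collect() -> List[Record]:
    """Step 1: collect power data (hard-coded sample values for illustration)."""
    return [("feeder_a", 10.0), ("feeder_b", None), ("feeder_a", 14.0), ("feeder_b", 6.0)]


def preprocess(records: List[Record]) -> List[Tuple[str, float]]:
    """Step 2: clean the data; here, drop records with a missing reading."""
    return [(key, value) for key, value in records if value is not None]


def analyze(records: List[Tuple[str, float]]) -> Dict[str, float]:
    """Step 3: offline analysis; here, the mean reading per feeder."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for key, value in records:
        totals[key] += value
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


def display(result: Dict[str, float]) -> List[str]:
    """Step 4: present the result as sorted text lines."""
    return [f"{key}: {value:.1f}" for key, value in sorted(result.items())]


report = display(analyze(preprocess(collect())))
```

Keeping each step as a separate function mirrors the abstract's ordering and lets any one stage be swapped for a distributed equivalent without changing the others.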

Description

Technical field

[0001] The invention relates to a method for offline analysis, in particular to a method for offline analysis of massive data.

Background

[0002] Electric power information technology is developing toward the intelligent integration of data and information applications, and its operating model is moving toward a new, service-centered, and more intelligent stage. Big data is the concentrated expression of integrated technology development and intelligent application concepts under these new conditions. Its application model targets value-added content and value-added services, and its core is the development and utilization of information resources.

[0003] With the continuing advance of smart grid construction, the amount of data generated by power grid operation and equipment inspection/monitoring is increasing exponentially, especially the massive heterogeneous and polymorphic data generated in the process of distributi...


Application Information

IPC(8): G06F17/30
Inventor: 潘森, 周爱华, 朱力鹏, 饶伟, 黄进, 蔡皓
Owner: STATE GRID CORP OF CHINA