Large-data storage and optimization method

A big data storage and optimization method, applied in the field of data processing, that addresses the cumbersomeness and insufficient performance response of existing approaches and achieves good response speed, good reliability, and efficient big data storage optimization.

Inactive Publication Date: 2013-12-11
GUANGDONG ELECTRONICS IND INST

AI Technical Summary

Problems solved by technology

However, opportunities and challenges coexist: the open source distributed architecture is particularly cumbersome for distributed applications, especially for large data storage and frequent file write and read operations, where its performance response is insufficient.

Examples

Embodiment Construction

[0049] The present invention proposes a big data storage optimization method based on the G-cloud cloud platform. The JobClient client submits data to the data acquisition system, and data preprocessing standardizes the submitted data. For compression, the efficient storage structure RCFile splits the data horizontally using a block-and-slice mechanism: the data is first divided into blocks and then into slices, with blocks stored by rows and slices stored by columns. Mass data processing optimization introduces CCIndex, which turns random traversal of the data into efficient row-by-row traversal by index, and CCT, which copies records horizontally, row by row, to complete incremental data backup. The parallel computing components optimize the configuration of the HDFS file system and the Map / Reduce computing model, providing highly fault-tolerant and high-throughput mass da...
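The block-and-slice layout described in [0049] can be illustrated with a minimal sketch. It is not the RCFile format itself and uses no Hadoop API: the function names `to_row_groups`, `to_columns`, `layout`, and `read_column` are hypothetical, rows are assumed to be plain dictionaries, and compression is left out. It only shows the idea of cutting a table into blocks of rows and then storing each block column by column.

```python
from typing import Dict, List, Sequence

Row = Dict[str, object]


def to_row_groups(rows: Sequence[Row], group_size: int) -> List[List[Row]]:
    """Horizontal split: cut the table into consecutive blocks of rows."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return [list(rows[i:i + group_size]) for i in range(0, len(rows), group_size)]


def to_columns(group: Sequence[Row], columns: Sequence[str]) -> Dict[str, List[object]]:
    """Vertical split inside one block: keep each column's values together."""
    return {col: [row[col] for row in group] for col in columns}


def layout(rows: Sequence[Row], columns: Sequence[str], group_size: int):
    """Blocks first, then column slices within every block."""
    return [to_columns(group, columns) for group in to_row_groups(rows, group_size)]


def read_column(blocks, column: str) -> List[object]:
    """Read one column by touching only that column's slice in each block."""
    return [value for block in blocks for value in block[column]]


rows = [{"id": i, "name": f"n{i}"} for i in range(5)]
blocks = layout(rows, ["id", "name"], group_size=2)
ids = read_column(blocks, "id")
```

Because every row stays inside a single block, a whole record can be rebuilt without looking at other blocks, while a query on one column skips the remaining columns of each block.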

Abstract

The invention relates to the technical field of data processing, in particular to a big data storage and optimization method oriented to sea-cloud collaboration. The method comprises three steps: data preprocessing, calculation optimization, and mass data optimization. Data preprocessing comprises data collection, multi-source data organization and aggregation, data redundancy processing, and data compression storage. Calculation optimization comprises HDFS (Hadoop Distributed File System) file transmission optimization and Map / Reduce parallel calculation optimization. Mass data optimization comprises data backup for disaster recovery, data encryption, CC index, and CCT backup. The disclosed method can be applied to big data storage on a cloud platform.
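The CC index step is described in the embodiment as turning random traversal into ordered traversal by index. The sketch below is not the CCIndex implementation and does not touch HBase or Hadoop. It illustrates only the general idea of a clustered secondary index, in which full rows are kept sorted by the indexed column so that a range query becomes one contiguous scan. The class name `ClusteredIndex` and its `range` method are hypothetical.

```python
import bisect
from typing import Dict, List


class ClusteredIndex:
    """Rows kept sorted by one column, with the full row stored in the index."""

    def __init__(self, rows: Dict[str, dict], column: str):
        # Sort by (indexed value, row key) so equal values stay adjacent.
        self._entries = sorted(
            ((row[column], key, row) for key, row in rows.items()),
            key=lambda entry: (entry[0], entry[1]),
        )
        # Parallel list of indexed values for binary search.
        self._values = [entry[0] for entry in self._entries]

    def range(self, low, high) -> List[dict]:
        """Return rows with low <= value <= high as one contiguous slice."""
        start = bisect.bisect_left(self._values, low)
        stop = bisect.bisect_right(self._values, high)
        return [row for _, _, row in self._entries[start:stop]]


table = {
    "r1": {"age": 30},
    "r2": {"age": 25},
    "r3": {"age": 41},
    "r4": {"age": 30},
}
index = ClusteredIndex(table, "age")
matches = index.range(26, 40)
```

Storing the full row next to the indexed value costs extra space, but a range lookup then needs no per-row jump back to the base table.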

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a big data storage optimization method for sea-cloud collaboration.

Background technique

[0002] With the rapid development of information technology, the architecture of traditional persistent storage schemes has become increasingly difficult to adapt to the growth of information services. The Hadoop distributed system spreads data access and storage across a large number of servers through distributed algorithms. By reliably storing multiple backups while distributing access to each server in the cluster, it is a disruptive development of the traditional storage architecture. However, opportunities and challenges coexist: the open source distributed architecture is particularly cumbersome for distributed applications, especially for large data storage and frequent file write and read operations, where its performance response is insufficient. ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F11/14
Inventor: 安宏伟, 季统凯
Owner GUANGDONG ELECTRONICS IND INST