Data storage system based on Hadoop architecture

A data storage system using Hadoop cluster technology, applied in the field of big data storage. It addresses the problems that MapReduce-led data processing cannot meet real-time requirements and that parallel-database-led systems have poor scalability and fault tolerance, among other issues.

Status: Inactive
Publication Date: 2018-03-13
GUANGDONG AOFEI DATA TECHNOLOGY CO LTD

AI Technical Summary

Problems solved by technology

Parallel-database-led systems, such as EMC's Greenplum and Aster Data, use MapReduce to enhance the data processing capabilities of a parallel database, but their scalability and fault tolerance remain unchanged. MapReduce-led systems, such as Hive and Pig Latin, improve the ease of use of MapReduce through an SQL (Structured Query Language) interface and schema support, but they still cannot meet the demand for real-time data processing. Systems that integrate a parallel database with MapReduce build on the Hadoop framework to obtain better fault tolerance and support for heterogeneous environments while keeping the performance advantages of relational databases, but there are currently no application cases, because work cannot be pushed to a suitable execution engine.
[0004] To sum up, among existing big data storage technologies, parallel-database-led systems have poor scalability and fault tolerance; MapReduce-led data processing still cannot meet real-time requirements; and integrated parallel-database and MapReduce systems cannot push work to the appropriate execution engine.

Embodiment Construction

[0030] To make the purpose, technical solution, and advantages of the present invention clearer, the technical solution is described in detail below. The described embodiments are only some of the embodiments of the present invention, not all of them. All other implementations obtained from these embodiments by persons of ordinary skill in the art, without creative effort, fall within the protection scope of the present invention.

[0031] A data storage system based on Hadoop architecture includes at least one application server, a backup server, a database cluster, and at least one core layer switch.

[0032] The database cluster includes a first sub-storage cluster and a second sub-storage cluster. The basic data of the structured data is stored in the first sub-storage cluster, and loose unstructured and semi-structured data is stored in the second sub-storage cluster...
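
The split described in [0032] can be illustrated with a minimal sketch. It assumes each record has already been tagged with its data type; the names `DataKind`, `Record`, `route`, and the two cluster labels are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from enum import Enum


class DataKind(Enum):
    """The three data types named in the background section."""
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


# Hypothetical labels: the source only speaks of a "first" and a "second"
# sub-storage cluster and does not name them.
FIRST_SUB_CLUSTER = "first-sub-storage-cluster"
SECOND_SUB_CLUSTER = "second-sub-storage-cluster"


@dataclass(frozen=True)
class Record:
    """A unit of data tagged with its type (assumed to be known up front)."""
    name: str
    kind: DataKind


def route(record: Record) -> str:
    """Return the sub-storage cluster that should hold the record."""
    if record.kind is DataKind.STRUCTURED:
        return FIRST_SUB_CLUSTER
    # Semi-structured and unstructured data share the second sub-cluster.
    return SECOND_SUB_CLUSTER
```

For example, `route(Record("clip.mp4", DataKind.UNSTRUCTURED))` returns the second sub-cluster label. How the real system decides a record's type is not stated in the available text, so that step is left outside the sketch.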

Abstract

The invention relates to a data storage system based on a Hadoop architecture. The storage system comprises at least one application server, a backup server, a database cluster, and at least one core layer switch. The database cluster comprises a first sub-storage cluster and a second sub-storage cluster; basic data of structured data is stored in the first sub-storage cluster, and loose unstructured and semi-structured data is stored in the second sub-storage cluster. The application server, the backup server, and the database cluster are each connected to the core layer switch. The application server is connected to the backup server and the database cluster and manages their real-time data. The data storage system based on the Hadoop architecture provided by the invention stores data in a distributed manner and uses redundant storage to ensure data reliability. An HDFS (Hadoop Distributed File System) module can reliably store massive files across machines, and stores each file as a sequence of data blocks of the same size.
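
The block-and-replica idea in the abstract can be sketched as follows. This is an illustration, not the patented implementation: the 128 MiB block size and replication factor of 3 are the stock HDFS defaults in Hadoop 2.x and later, assumed here because the source does not state the values this system uses, and `plan_blocks` and `raw_bytes_stored` are hypothetical helper names. Note that in HDFS every block of a file has the same size except possibly the last, which holds the remainder.

```python
# Assumed values: stock HDFS defaults (dfs.blocksize, dfs.replication).
# The source does not say which settings the described system uses.
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MiB
REPLICATION = 3


def plan_blocks(file_size: int, block_size: int = BLOCK_SIZE) -> list[int]:
    """Split a file of file_size bytes into a sequence of block lengths."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    blocks = [block_size] * full_blocks
    if remainder:
        # Only the final block may be shorter than block_size.
        blocks.append(remainder)
    return blocks


def raw_bytes_stored(file_size: int, replication: int = REPLICATION) -> int:
    """Total bytes written across the cluster once every block is replicated."""
    return file_size * replication
```

Under these assumptions, a 300 MiB file becomes two 128 MiB blocks plus one 44 MiB block, and redundant storage with three replicas occupies 900 MiB of raw capacity.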

Description

Technical field

[0001] The invention belongs to the technical field of big data storage, and in particular relates to a data storage system based on Hadoop architecture.

Background art

[0002] Data can be divided by type into structured, semi-structured, and unstructured data. Structured data is data that can be expressed in a two-dimensional structure and stored in a relational database. Semi-structured data has a certain structure but no clear semantics, such as emails and HTML web pages; some of its fields are fixed and others are not. Unstructured data cannot be represented in a two-dimensional structure; it mainly includes office documents, text, pictures, and audio and video files, and cannot be processed by relational databases. With the rise and development of social networks, a large amount of UGC (User Generated Content) ha...

Application Information

IPC(8): H04L29/08, G06F17/30
CPC: H04L67/1095, H04L67/1097, G06F16/182, H04L67/10, H04L67/145, H04L67/1001, H04L67/561
Inventor: 何烈军, 杨培锋, 苏灿廷
Owner: GUANGDONG AOFEI DATA TECHNOLOGY CO LTD