Big data storage method and device

A big data storage technology, applied in the field of big data storage methods and devices. It addresses the poor scalability and fault tolerance of parallel-database-led systems and the inability of integrated parallel database and MapReduce systems to push work to a suitable execution engine, with the effects of improving flexibility and ease of use, reducing management costs, and reducing learning costs.

Inactive Publication Date: 2013-12-11
SUGON INFORMATION IND
4 Cites, 34 Cited by

AI Technical Summary

Problems solved by technology

The parallel-database-led type, such as EMC's Greenplum and Aster Data, uses MapReduce to enhance the data processing capabilities of the parallel database, but its scalability and fault tolerance are not improved. The MapReduce-led type, such as Hive and Pig Latin, uses an SQL (Structured Query Language) interface and schema support to improve the ease of use of MapReduce, but it still cannot meet the demand for real-time data processing. The integrated parallel database and MapReduce type builds on the Hadoop framework to obtain better fault tolerance and support for heterogeneous environments while retaining the performance advantages of relational databases, but there are currently no application cases, because the work cannot be pushed to a suitable execution engine.
[0004] To sum up, among existing big data storage technologies, the parallel-database-led type has poor scalability and fault tolerance; the MapReduce-led type still cannot meet real-time data processing requirements; and the integrated parallel database and MapReduce type cannot push work to a suitable execution engine.

Examples

Embodiment Construction

[0023] As shown in Figure 1, the present invention provides a method for storing big data. First, object data is received. Next, the attribute information of the object data is identified, that is, whether the data is structured, semi-structured, or unstructured. Finally, the data is stored in the corresponding storage unit according to its attribute information. The object data may be large batches of user-generated data with a relatively complex composition, including structured data, semi-structured data, and possibly unstructured data. Storing such data with existing technologies leads to problems with scalability and fault tolerance, poor real-time processing, or the inability to work with a suitable execution engine.
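The three steps of paragraph [0023] (receive, identify the attribute, store in the matching unit) can be sketched roughly as follows. This is only an illustration: the classification heuristic, the `Attribute` enum, and the in-memory `units` mapping are assumptions made for the example and are not taken from the patent text.

```python
import json
from enum import Enum


class Attribute(Enum):
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


def identify_attribute(obj):
    """Hypothetical heuristic for the attribute of one object (an assumption)."""
    # A tuple of scalar fields stands in for a row of a two-dimensional table.
    if isinstance(obj, tuple) and all(isinstance(v, (int, float, str)) for v in obj):
        return Attribute.STRUCTURED
    # Text that parses as a JSON object or array has some structure, but no fixed schema.
    if isinstance(obj, str):
        try:
            parsed = json.loads(obj)
        except ValueError:
            return Attribute.UNSTRUCTURED
        if isinstance(parsed, (dict, list)):
            return Attribute.SEMI_STRUCTURED
    # Everything else (raw bytes, free text, ...) is treated as unstructured.
    return Attribute.UNSTRUCTURED


def store(obj, units):
    """Receive an object, identify its attribute, and append it to the matching unit."""
    attribute = identify_attribute(obj)
    units[attribute].append(obj)
    return attribute
```

Here `units` would be a mapping such as `{a: [] for a in Attribute}`; in a real deployment each entry would be backed by a database or Hadoop storage unit rather than a list.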

[0024] As shown in Figure 2, the storage system includes a first storage subsystem for storing object data and a second storage subsystem for storing the association relationships and schemas of the object data. Wherein the f...
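The split described in paragraph [0024] can be pictured with a small in-memory model. It is a sketch under stated assumptions: the class names, the `store` and `register` methods, and the representation of relationships as pairs of object identifiers are all hypothetical and chosen only to show the division of responsibilities.

```python
from dataclasses import dataclass, field


@dataclass
class FirstStorageSubsystem:
    """Holds the object data itself (hypothetical in-memory stand-in)."""
    objects: dict = field(default_factory=dict)

    def put(self, object_id, data):
        self.objects[object_id] = data


@dataclass
class SecondStorageSubsystem:
    """Holds schemas and association relationships of the stored objects."""
    schemas: dict = field(default_factory=dict)
    relations: set = field(default_factory=set)

    def register(self, object_id, schema, related_to=()):
        self.schemas[object_id] = schema
        # Each relationship is recorded as an (object, related object) pair.
        for other_id in related_to:
            self.relations.add((object_id, other_id))


@dataclass
class StorageSystem:
    first: FirstStorageSubsystem = field(default_factory=FirstStorageSubsystem)
    second: SecondStorageSubsystem = field(default_factory=SecondStorageSubsystem)

    def store(self, object_id, data, schema, related_to=()):
        # Data goes to the first subsystem, its description to the second.
        self.first.put(object_id, data)
        self.second.register(object_id, schema, related_to)
```

Keeping the schema and relationships apart from the data means a query planner can inspect the second subsystem without touching the bulk object data.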

Abstract

The invention provides a big data storage method and device. The method includes receiving object data, identifying attribute information of the object data, and storing the object data in a first storage subsystem of a storage system according to that attribute information. Structured, semi-structured, and unstructured data are stored uniformly as objects in a database platform and a Hadoop platform, which makes effective use of the performance advantages of a relational database, the fault tolerance of the Hadoop platform and the MapReduce framework, and their support for dynamic data models. The schemas of the objects and the corresponding attribute information are stored in metadata, so that during data analysis the data can be conveniently sent to a suitable execution engine to complete a query. This achieves unified management of large quantities of structured and unstructured data, reduces management cost, improves the flexibility and usability of data processing, and lowers the learning cost for users of the big data storage device.
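The idea of using stored metadata to pick an execution engine can be illustrated with a minimal sketch. The abstract does not say which attribute maps to which engine, so the mapping below (structured data to a relational engine, everything else to MapReduce) is an assumption, as are the function name, the engine labels, and the metadata layout.

```python
def choose_engine(metadata, object_id):
    """Pick an execution engine from stored metadata (assumed mapping)."""
    attribute = metadata[object_id]["attribute"]
    # Assumption: structured objects benefit from the relational engine,
    # while semi-structured and unstructured objects go to MapReduce.
    if attribute == "structured":
        return "relational"
    return "mapreduce"


# Hypothetical metadata entries: a schema plus the identified attribute.
metadata = {
    "orders": {"attribute": "structured", "schema": ["id", "amount"]},
    "emails": {"attribute": "semi-structured", "schema": ["subject"]},
    "videos": {"attribute": "unstructured", "schema": []},
}
```

A planner built this way only needs the metadata to decide where a query runs; the object data itself is not read until the chosen engine executes the query.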

Description

Technical field

[0001] The invention relates to the field of data storage, in particular to a big data storage method and device.

Background

[0002] Data can be divided by type into structured data, semi-structured data, and unstructured data. Structured data refers to a data type that can be expressed in a two-dimensional structure and stored in a relational database. Semi-structured data refers to a data type with a certain structure but unclear semantics, such as emails and HTML web pages, in which some fields are definite and others are not. Unstructured data refers to a data type that cannot be represented by a two-dimensional structure; it mainly includes office documents, text, pictures, and audio and video files, and cannot be processed by relational databases. With the rise and development of social networks, a large amount of UGC (User Generated Content) has been produced, including unstructure...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 狄静舒, 王颖, 宋怀明, 苗艳超, 刘新春, 邵宗有
Owner: SUGON INFORMATION IND