Row-type and column-type storage method and system for tree data

Row-based and column-based storage technology, applied in the field of data processing, can solve problems such as insufficient encoding and query efficiency and restricted system functionality and usability, and achieves the effect of improving efficiency.

Active Publication Date: 2020-04-03
INST OF COMPUTING TECH CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

[0015] 5. In the key-value pairs mentioned in item 2 above, the value of a key can only be of string type.
[0045] 2) NoSQL data processing systems do not encode and query data efficiently enough.
These limitations place additional restrictions on the functionality and use of such systems and make their execution inefficient.

Method used



Examples


Embodiment Construction

[0105] In view of the above deficiencies in the prior art, the present invention redesigns and implements a semi-structured data processing system, STEED. The following first introduces the overall architecture of the STEED system and the functional requirements of each module, then analyzes the interface definitions between the modules, and briefly explains how STEED processes and stores data internally.

[0106] As shown in Figure 3, STEED mainly consists of three modules:

[0107] (1) Data parsing module:

[0108] This module reads text data and parses it into row-format or column-format binary data, which is stored in the data storage module. During parsing, a syntax tree is generated dynamically to store the definition of the semi-structured data. JSON data does not come with a corresponding data format definition (syntax tree, schema tree), so when parsing JSON the present invention can only dynamically generate the definition of data format in the pro...
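
The dynamic generation of a syntax tree during parsing can be illustrated with a minimal Python sketch. It is not STEED's implementation: the node layout and the names `new_node`, `update_schema`, and `infer_schema` are hypothetical, and the input is assumed to be one JSON record per line.

```python
import json


def new_node():
    # Hypothetical schema-tree node: value kinds seen, repetition flag, children.
    return {"types": set(), "repeated": False, "children": {}}


def update_schema(node, value):
    """Merge the shape of one parsed value into the schema-tree node."""
    if isinstance(value, dict):
        node["types"].add("object")
        for key, child in value.items():
            update_schema(node["children"].setdefault(key, new_node()), child)
    elif isinstance(value, list):
        # An array marks the field as repeated; its items share the same node.
        node["repeated"] = True
        for item in value:
            update_schema(node, item)
    else:
        # Leaf: record the Python type name of the primitive value.
        node["types"].add(type(value).__name__)


def infer_schema(lines):
    """Build a schema tree incrementally from JSON text records."""
    root = new_node()
    for line in lines:
        update_schema(root, json.loads(line))
    return root
```

Because the tree is grown record by record, fields that first appear in later records are simply added as new children, which matches the idea that the format definition is not known before parsing.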


Abstract

The invention provides a row-type and column-type storage method and system for tree-structured data. The method reads text data with a tree structure and parses it into a row or column binary format for storage; during parsing, a syntax tree is generated dynamically to store the definition of the semi-structured data. During querying, STEED completes the query operations by combining the structure information of the original data recorded in the syntax tree with the contents of the binary data. In row storage, the record is the unit, and a nested substructure is defined inside each record to represent the nesting and repeated fields of the semi-structured data. In column storage, each path from the root to a leaf of the syntax tree is the unit, and the values of that path across all records, together with the path's structure information, are stored independently. By analyzing the storage structure of semi-structured data, the method and system simplify the data storage structure and improve storage efficiency.
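
The column layout described in the abstract, one unit per root-to-leaf path, can be sketched as follows in Python. This is a simplified illustration under stated assumptions, not the patented encoding: the function names are hypothetical, and the record index stored next to each value is only a stand-in for the per-path structure information, whose actual format the text does not give.

```python
from collections import defaultdict


def leaf_paths(value, prefix=""):
    """Yield (path, leaf_value) pairs for every root-to-leaf path in a record."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from leaf_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        # Repeated field: every item contributes to the same path.
        for item in value:
            yield from leaf_paths(item, prefix)
    else:
        yield prefix, value


def to_columns(records):
    """Group leaf values by path across all records (one list per column)."""
    columns = defaultdict(list)
    for record_id, record in enumerate(records):
        for path, value in leaf_paths(record):
            # Assumption: the record id stands in for structure information.
            columns[path].append((record_id, value))
    return dict(columns)
```

A query that touches only one path, for example `geo.lat`, would then read a single list rather than every record, which is the usual motivation for a column layout.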

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a row-type and column-type storage method and system for tree data.

Background technique

[0002] With the development of computer networks and big data processing technology, traditional relational data can no longer meet the requirements of data definition and use in network and big data environments. Semi-structured data, represented by JSON and Protocol Buffers, can fully express object data in programming languages and also allows the original data format to be modified and extended as the format of the data changes, so it is widely used in practice.

[0003] Definition of tree-structured data:

[0004] T_value = T_primitive | T_object | T_array
[0005] T_primitive = string | number | boolean | null
[0006]
[0007]
[0008] Record = T_object

[0009] As shown above, the tree structure data is defined as follows:...
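
The grammar above can be read as a recursive membership check, sketched here in Python. The rules for T_object and T_array are missing from paragraphs [0006] and [0007], so this sketch assumes the JSON convention (an object maps string keys to values, an array is a sequence of values); that assumption and the function names are not taken from the source.

```python
def is_primitive(value):
    # T_primitive = string | number | boolean | null
    return value is None or isinstance(value, (str, int, float, bool))


def is_value(value):
    # T_value = T_primitive | T_object | T_array
    if is_primitive(value):
        return True
    if isinstance(value, dict):
        # Assumed T_object: string keys mapping to values (JSON convention).
        return all(isinstance(k, str) and is_value(v) for k, v in value.items())
    if isinstance(value, list):
        # Assumed T_array: a sequence of values (JSON convention).
        return all(is_value(item) for item in value)
    return False


def is_record(value):
    # Record = T_object: the top level of every record must be an object.
    return isinstance(value, dict) and is_value(value)
```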

Claims


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/80
CPC: G06F16/80
Inventors: 陈世敏, 王智义
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI