A data model conversion and query analysis method applicable to various big data management systems

A data model technology for management systems, applied in the fields of databases and big data. It addresses problems such as existing tools being unable to provide non-relational model operations and Presto being unable to perform data analysis.

Active Publication Date: 2021-01-12
COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI

AI Technical Summary

Problems solved by technology

For example, Presto and Spark DataFrame can map arbitrary data into a relational model and run unified SQL queries over it, but they cannot provide non-relational model operations, and Presto cannot perform data analysis.


Examples

Embodiment 1

[0065] Embodiment 1: Develop a corresponding query engine based on the operations of LDM

[0066] As shown in Figure 3, developers can combine domain knowledge to define a custom structured query language based on the associated-document operations. When a query is executed, the query statement is converted into a series of operations on associated documents, and these operations are then applied to the associated documents assembled from multiple databases. As Figure 3 also shows, these databases include relational database tables, key-value databases, graph databases, document databases, and so on. The data shown for each database is only illustrative and has no special meaning, so it is not explained item by item.
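
The translation from a query statement into a series of associated-document operations can be sketched as follows. This is only an illustration under stated assumptions: the pipe-separated query syntax, the operation names (`op_match`, `op_follow`, `op_project`), and the sample data are all hypothetical and are not taken from the patent.

```python
from typing import Any, Callable, Dict, List, Tuple

Doc = Dict[str, Any]
Link = Tuple[str, str, str]  # (source id, label, target id)
Operation = Callable[[List[Doc]], List[Doc]]

# Hypothetical associated documents assembled from several databases.
DOCS: Dict[str, Doc] = {
    "user:1": {"_id": "user:1", "kind": "user", "name": "Ada"},
    "user:2": {"_id": "user:2", "kind": "user", "name": "Bo"},
    "paper:9": {"_id": "paper:9", "kind": "paper", "title": "LDM"},
}
LINKS: List[Link] = [("user:1", "wrote", "paper:9")]


def op_match(field: str, value: Any) -> Operation:
    """Keep documents whose field equals value."""
    return lambda docs: [d for d in docs if d.get(field) == value]


def op_follow(label: str) -> Operation:
    """Replace each document with the documents it links to via label."""
    def run(docs: List[Doc]) -> List[Doc]:
        ids = {d["_id"] for d in docs}
        return [DOCS[dst] for src, lab, dst in LINKS
                if src in ids and lab == label]
    return run


def op_project(*fields: str) -> Operation:
    """Keep only the named fields of each document."""
    return lambda docs: [{f: d[f] for f in fields if f in d} for d in docs]


def compile_query(query: str) -> List[Operation]:
    """Translate 'MATCH k=v | FOLLOW label | PROJECT a,b' into operations."""
    builders = {
        "MATCH": lambda arg: op_match(*arg.split("=", 1)),
        "FOLLOW": op_follow,
        "PROJECT": lambda arg: op_project(*arg.split(",")),
    }
    ops = []
    for stage in query.split("|"):
        keyword, arg = stage.strip().split(None, 1)
        ops.append(builders[keyword](arg.strip()))
    return ops


def execute(query: str) -> List[Doc]:
    """Run the compiled operations in order over all documents."""
    docs = list(DOCS.values())
    for op in compile_query(query):
        docs = op(docs)
    return docs


result = execute("MATCH name=Ada | FOLLOW wrote | PROJECT title")
print(result)
```

Keeping each operation as a function from a document list to a document list means a domain-specific language only needs its own `compile_query`; the operations themselves can be reused unchanged.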

Embodiment 2

[0067] Embodiment 2: Use LDM as an ETL tool that interacts with existing analysis tools

[0068] As shown in Figure 4, developers can implement the mapping rules between the associated document and the original data models, and between the associated document and the three data structures. According to user needs, the data in the metadata model is integrated and converted into data of a target type, and finally passed to data analysis tools such as Spark DataFrame, Spark GraphX, and TensorFlow through pipelines or drivers.
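
A minimal sketch of the conversion step follows. The text does not name the three data structures here, so this example assumes they are a table, a graph, and a numeric matrix (matching the three tools listed); the function names and sample data are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Doc = Dict[str, Any]
Link = Tuple[str, str, str]  # (source id, label, target id)

# Hypothetical associated documents and associations.
documents: List[Doc] = [
    {"_id": "u1", "age": 31, "score": 0.5},
    {"_id": "u2", "age": 45, "score": 0.9},
]
associations: List[Link] = [("u1", "follows", "u2")]


def to_table(docs: List[Doc], columns: List[str]) -> List[Tuple[Any, ...]]:
    """Rows with a fixed column order; missing fields become None."""
    return [tuple(d.get(c) for c in columns) for d in docs]


def to_graph(docs: List[Doc], links: List[Link]):
    """Vertex id list plus (source, target, label) edge list."""
    vertices = [d["_id"] for d in docs]
    edges = [(src, dst, label) for src, label, dst in links]
    return vertices, edges


def to_matrix(docs: List[Doc], features: List[str]) -> List[List[float]]:
    """Dense numeric matrix with one row per document."""
    return [[float(d[f]) for f in features] for d in docs]


rows = to_table(documents, ["_id", "age", "city"])
vertices, edges = to_graph(documents, associations)
matrix = to_matrix(documents, ["age", "score"])
print(rows, vertices, edges, matrix)
```

In a real pipeline the plain Python values would be handed to the target tool's own loader, for example `SparkSession.createDataFrame` for the rows or `tf.constant` for the matrix; that hand-off is left out so the sketch runs without either library.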

Embodiment 3

[0069] Embodiment 3: Design a distributed computing model based on LDM, and implement unified query and analysis of data on that basis

[0070] Just as Spark implements the RDD as its distributed dataset, developers can implement a distributed in-memory LDM management platform that provides the basic LDM operations and their interfaces. Developers using the platform can then write data queries and machine learning algorithms directly against those interfaces.
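
To make the idea concrete, the sketch below simulates a partitioned document collection on a single machine, with RDD-style `filter`, `map`, and `aggregate` operations. It is a local stand-in rather than a distributed implementation: the class name `PartitionedDocuments`, its methods, and the use of threads in place of cluster workers are all assumptions made for illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]


class PartitionedDocuments:
    """Documents hash-partitioned by id; a local stand-in for a cluster."""

    def __init__(self, docs: List[Doc], num_partitions: int = 2) -> None:
        self.partitions: List[List[Any]] = [[] for _ in range(num_partitions)]
        for doc in docs:
            # crc32 is stable across runs, unlike the built-in hash() on str.
            index = zlib.crc32(doc["_id"].encode("utf-8")) % num_partitions
            self.partitions[index].append(doc)

    def _derive(self, partitions: List[List[Any]]) -> "PartitionedDocuments":
        """Wrap transformed partitions without re-partitioning them."""
        clone = PartitionedDocuments([], len(partitions))
        clone.partitions = partitions
        return clone

    def filter(self, predicate: Callable[[Any], bool]) -> "PartitionedDocuments":
        return self._derive(
            [[d for d in part if predicate(d)] for part in self.partitions])

    def map(self, fn: Callable[[Any], Any]) -> "PartitionedDocuments":
        return self._derive([[fn(d) for d in part] for part in self.partitions])

    def aggregate(self, zero: Any, seq_op: Callable, comb_op: Callable) -> Any:
        """Fold each partition in a worker thread, then merge the partials."""
        def fold(partition: List[Any]) -> Any:
            return reduce(seq_op, partition, zero)
        with ThreadPoolExecutor() as pool:
            partials = list(pool.map(fold, self.partitions))
        return reduce(comb_op, partials, zero)


docs = [{"_id": f"sensor:{i}", "temp": 20 + i} for i in range(6)]
ldm = PartitionedDocuments(docs, num_partitions=3)
hot = ldm.filter(lambda d: d["temp"] >= 22)
total = hot.map(lambda d: d["temp"]).aggregate(
    0, lambda acc, t: acc + t, lambda a, b: a + b)
print(total)
```

The per-partition fold followed by a merge is the same shape a real cluster would use, so query and analysis code written against `filter`, `map`, and `aggregate` would not need to change if the partitions lived on different machines.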


Abstract

The invention relates to a data model conversion and query analysis method suitable for various big data management systems. The method comprises the following steps: 1) establishing an associated document model comprising a document set and an association set, where the association set is the set of associations formed between documents; 2) converting the semantic information of the data and the different original data models into the associated document model; and 3) converting the data in the associated document model into a data structure accepted by a distributed computing programming model, so that the associated document model is converted into the distributed computing programming model, and then using the resulting distributed computing programming model to perform unified query and analysis on data from the different original data models. The method enables unified access, query, and analysis over multi-source heterogeneous data sources.
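
Steps 1 and 2 can be illustrated with a small in-memory sketch. The class names, the identifier scheme (`table:key`), and the two conversion rules below are hypothetical; the abstract only states that the model consists of a document set and an association set.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Document:
    """One document: an identifier plus arbitrary fields."""
    doc_id: str
    fields: Dict[str, Any]


@dataclass(frozen=True)
class Association:
    """A labeled, directed link between two documents."""
    source: str
    label: str
    target: str


@dataclass
class AssociatedDocumentModel:
    """A document set plus an association set (step 1)."""
    documents: Dict[str, Document] = field(default_factory=dict)
    associations: Set[Association] = field(default_factory=set)

    def add_document(self, doc: Document) -> None:
        self.documents[doc.doc_id] = doc

    def associate(self, source: str, label: str, target: str) -> None:
        # Both endpoints must already be in the document set.
        if source not in self.documents or target not in self.documents:
            raise KeyError("both endpoints must be existing documents")
        self.associations.add(Association(source, label, target))

    def neighbors(self, doc_id: str, label: str) -> List[Document]:
        hits = [a.target for a in self.associations
                if a.source == doc_id and a.label == label]
        return [self.documents[t] for t in sorted(hits)]


def from_relational_row(table: str, key: str, row: Dict[str, Any]) -> Document:
    """Step 2, assumed rule: one relational row becomes one document."""
    return Document(f"{table}:{row[key]}", dict(row))


def from_key_value(namespace: str, key: str, value: Any) -> Document:
    """Step 2, assumed rule: one key-value pair becomes one document."""
    return Document(f"{namespace}:{key}", {"value": value})


model = AssociatedDocumentModel()
model.add_document(from_relational_row("user", "id", {"id": 1, "name": "Ada"}))
model.add_document(from_key_value("session", "abc", "2021-01-12T08:00"))
model.associate("user:1", "has_session", "session:abc")
print([d.doc_id for d in model.neighbors("user:1", "has_session")])
```

Here a relational row and a key-value pair end up in the same document set, and the association records a relationship that neither source system stored on its own.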

Description

Technical field

[0001] The invention relates to a data model, in particular to a data model conversion and query analysis method suitable for management and analysis in big data management systems, and belongs to the technical fields of big data and databases.

Background

[0002] As computers have become ubiquitous, the demand for data management and processing has grown increasingly urgent. Different data models have been proposed for different forms and characteristics of data, and corresponding data management systems have been implemented to manage and analyze that data. Influential data models such as the E-R model have largely dominated the database world for more than 40 years since they were proposed in the 1970s. In the past ten years, as Internet and Internet of Things applications have deepened, the generation of large-scale structured, semi-structured, and unstructured data has triggered the NoSQL movement [Cattell R. Scalable SQL and ...


Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F16/25
CPC: G06F16/254, G06F16/258
Inventors: 黎建辉, 李跃鹏, 沈志宏
Owner COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI