Data query method and device based on big data environment

A big data query technology, applied in the fields of electrical digital data processing and special data processing applications, that achieves the effects of expanding capacity, making up for the inability of the two environments to communicate directly, and improving query efficiency

Active Publication Date: 2016-10-05
CHINA MOBILE GROUP SICHUAN

AI Technical Summary

Problems solved by technology

[0010] In view of this, the embodiments of the present invention aim to provide a data query method and device based on a big data environment, which provide a unified query interface between a Hadoop big data environment and a relational database, and overcome the technical bottleneck that data in existing relational databases and data in the Hadoop big data environment cannot be exchanged directly.

Examples

Example 1

[0191] Example 1: Implement a left outer join (LEFT OUTER JOIN)

[0192] SELECT count(F.title)

[0193] FROM

[0194] Hbase.FactTable F LEFT JOIN DB2.Dim D

[0195] ON F.cid=D.cid

[0196] WHERE D.category_name='bingo'

[0197] Explanation: Count the records in the FactTable of HBase whose category_name='bingo'. The FactTable itself holds no category_name information; that is stored in the Dim table of the DB2 database. In other words, the title is stored in the FactTable of HBase and the category_name is stored in the Dim table of DB2. The FactTable has about 100 million rows, the Dim table has 1 million records, and only 1,000 records have category_name='bingo'.
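The semantics of this query can be illustrated with a minimal Python sketch. It is not the patent's implementation: the two lists are invented in-memory stand-ins for Hbase.FactTable and DB2.Dim, the function name is hypothetical, and cid is assumed to be unique in the Dim table. The sketch filters the small Dim side first and then counts matching FactTable rows with a non-null title, which is what count(F.title) would return.

```python
# Toy stand-ins for the two stores; every row here is invented for illustration.
fact_table = [  # plays the role of Hbase.FactTable (cid, title)
    {"cid": 1, "title": "a"},
    {"cid": 1, "title": "b"},
    {"cid": 2, "title": "c"},
    {"cid": 3, "title": None},
    {"cid": 3, "title": "d"},
]
dim_table = [  # plays the role of DB2.Dim (cid, category_name); cid assumed unique
    {"cid": 1, "category_name": "bingo"},
    {"cid": 2, "category_name": "other"},
    {"cid": 3, "category_name": "bingo"},
]


def count_titles_for_category(facts, dims, category):
    """Count fact rows with a non-null title whose cid belongs to `category`."""
    # Step 1: filter the small dimension side and keep only the join keys.
    wanted_cids = {d["cid"] for d in dims if d["category_name"] == category}
    # Step 2: scan the fact side; count(F.title) ignores NULL titles.
    return sum(
        1
        for f in facts
        if f["cid"] in wanted_cids and f["title"] is not None
    )


print(count_titles_for_category(fact_table, dim_table, "bingo"))
```

Filtering the Dim side first means only the keys of the matching categories need to be compared against the much larger fact side.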

[0198] In the prior art, there are two general processing methods. The first gathers FactTable and DimTable together and then processes them; the second manually obtains the record number of category_name='bingo' on DB2, ...

Example 2

[0200] Example 2: Insert data into a database table

[0201] INSERT INTO

[0202] DB2.FactTable1000(title, category_name)

[0203] VALUES(

[0204] SELECT F.title,D.category_name

[0205] FROM

[0206] Hbase.FactTable1 F LEFT JOIN DB2.Dim D

[0207] ON F.cid=D.cid

[0208] WHERE D.cid>1000

[0209] )

[0210] Description: For records with cid>1000, extract the title from the FactTable of HBase and the category_name from the Dim table of DB2, and store them in the DB2 database for daily queries.

[0211] Using the method of the present invention, the query is carried out through the unified interface. The unified interface parses the SQL to obtain its context relationship and learns that it must first query the DB2 database for records with cid>1000, and then use the cid to find the matching records on HBase. The retrieved records are returned to the buffer pool, where the data is merged, and the result is then returned to the client. T...
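The execution order described in [0211] can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the patented implementation: the dictionaries and lists stand in for DB2 and HBase, the buffer pool is modeled as a plain dict, and all function names and sample rows are hypothetical.

```python
# Hypothetical in-memory stand-ins; a real system would call DB2 and HBase clients.
DB2_DIM = {1001: "news", 1002: "sports", 999: "music"}  # cid -> category_name
HBASE_FACT = [  # (cid, title)
    (1001, "t1"),
    (1002, "t2"),
    (1001, "t3"),
    (999, "t4"),
    (2000, "t5"),
]


def query_db2_dim(min_cid):
    """Step 1: apply the predicate cid > min_cid on the relational side."""
    return {cid: name for cid, name in DB2_DIM.items() if cid > min_cid}


def query_hbase_by_cids(cids):
    """Step 2: fetch fact rows whose cid is among the keys found in step 1."""
    return [(cid, title) for cid, title in HBASE_FACT if cid in cids]


def merge_in_buffer_pool(dim_rows, fact_rows):
    """Step 3: merge both partial results into (title, category_name) rows."""
    return [(title, dim_rows[cid]) for cid, title in fact_rows]


def run_example_2(min_cid=1000):
    buffer_pool = {}  # holds the partial result fed back by each node
    buffer_pool["dim"] = query_db2_dim(min_cid)
    buffer_pool["fact"] = query_hbase_by_cids(buffer_pool["dim"].keys())
    return merge_in_buffer_pool(buffer_pool["dim"], buffer_pool["fact"])


for title, category_name in run_example_2():
    print(title, category_name)
```

In this sketch the rows returned by `run_example_2` are what would then be inserted into the DB2 target table; the insert step itself is omitted.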

Abstract

The invention discloses a data query method based on a big data environment. The method comprises the steps of: on receiving a query request input by a client, parsing the query request, acquiring the context relation of the query request, and forming an ordered sequence of statements; analyzing the ordered statements based on metadata information to obtain the nodes where the data sources are located and the node types; generating the data manipulation statements corresponding to the various nodes; and collecting the manipulation results fed back by the various nodes in a buffer pool, merging them according to the statement order, generating a query result, and outputting the query result to the client. The invention further discloses a data query device based on the big data environment. The technical scheme of the invention can provide a unified query interface for a Hadoop big data environment and a relational database, and overcome the technical bottleneck that data in existing relational databases and data in the Hadoop big data environment cannot be exchanged directly.
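The planning steps in the abstract (ordered statements, metadata lookup, per-node statement generation) can be sketched roughly as below. Everything in this Python sketch is an assumption made for illustration: the `METADATA` catalog, the node names, the `Step` record, and the statement formats are hypothetical, and the `SCAN ... FILTER` text is a made-up descriptor rather than real HBase syntax.

```python
from dataclasses import dataclass

# Hypothetical metadata catalog: qualified table name -> (node, node type).
METADATA = {
    "DB2.Dim": ("db2-node-1", "relational"),
    "Hbase.FactTable": ("hbase-node-1", "hbase"),
}


@dataclass
class Step:
    table: str
    node: str
    node_type: str
    statement: str


def make_statement(table, node_type, columns, condition):
    """Generate a node-specific manipulation statement (format is illustrative)."""
    name = table.split(".", 1)[1]
    if node_type == "relational":
        return f"SELECT {', '.join(columns)} FROM {name} WHERE {condition}"
    # For the HBase node, emit a scan descriptor instead of SQL.
    return f"SCAN {name} COLUMNS {', '.join(columns)} FILTER {condition}"


def plan(ordered_statements):
    """Resolve each ordered statement to its node and emit a per-node statement."""
    steps = []
    for table, columns, condition in ordered_statements:
        node, node_type = METADATA[table]  # metadata lookup: node and node type
        statement = make_statement(table, node_type, columns, condition)
        steps.append(Step(table, node, node_type, statement))
    return steps


# The order mirrors the context relation: the relational side first, then HBase.
ordered = [
    ("DB2.Dim", ["cid", "category_name"], "cid > 1000"),
    ("Hbase.FactTable", ["cid", "title"], "cid IN <cids from previous step>"),
]
for step in plan(ordered):
    print(step.node, "->", step.statement)
```

The results of each step would then be collected in the buffer pool and merged in statement order, which this sketch leaves out.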

Description

Technical field

[0001] The invention relates to the field of data storage and management, in particular to a data query method and device based on a big data environment.

Background art

[0002] The main features of the Hadoop Database (HBase) are:

1. Large: a table can have hundreds of millions of rows and millions of columns;
2. Column-oriented: column (family)-oriented storage and access control, columns (...
3. Sparse: empty (null) columns occupy no storage space, so a table can be designed to be very sparse.

[0003] The characteristics of relational databases are: relational retrieval operations are more convenient, and complex conditional queries are supported.

[0004] At present, a large amount of data warehouse data is built on relational databases. In practical applications, data warehouses that use Hadoop technology generally adopt a hybrid solution, which mainly includes the following types:

[0005] 1. M...

Application Information

IPC(8): G06F17/30
Inventors: 吴凤辉, 刘三苏
Owner: CHINA MOBILE GROUP SICHUAN