MapReduce based XML data query method and system

A data query and XML tree technology, applied in the field of XML query processing, that addresses problems such as slow query speed.

Active Publication Date: 2015-10-28
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

XML has many advantages, and its wide use has made the volume of XML data grow explosively. Processing large XML documents on a single machine is no longer fast enough to meet users' needs. Traditional memory-based query methods are very slow, and some distributed XML query methods traverse the entire document for every query.




Embodiment

[0105] As shown in Figure 1, this embodiment discloses a MapReduce-based XML data query method, whose steps are as follows:

[0106] Step 101, the server receives the XPath query request from the client;

[0107] Step 102, after receiving the XPath query request, the server checks whether the XML document to be queried has been region encoded;

[0108] If not, proceed to step 103;

[0109] If so, proceed to step 104;

[0110] Step 103, perform region encoding on the XML document to be queried, and then proceed to step 104. In this step, MapReduce is used to region-encode the nodes of the XML tree in the XML document data. The specific process is as follows: the Hadoop framework feeds the nodes of the XML tree in the XML document data to the Map function as key-value pairs. The Map function receives two kinds of input, start tags and end tags; each time the Map function ob...
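
The paragraph above is truncated, so the exact Map logic is not recoverable here. As a minimal single-process sketch of the general idea, assuming the common scheme in which one counter advances on every start tag and every end tag so that each element receives a (start, end) pair, region encoding could look like the following. The names `region_encode` and `is_ancestor` are hypothetical, and this is not the patent's MapReduce job.

```python
import io
import xml.etree.ElementTree as ET


def region_encode(xml_text):
    """Return (tag, start, end) tuples, ordered by start position.

    Assumption: a single counter is incremented on every start tag and
    every end tag; the patent's actual numbering may differ.
    """
    counter = 0
    open_nodes = []  # stack of (tag, start) for elements still awaiting their end tag
    encoded = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        counter += 1
        if event == "start":
            open_nodes.append((elem.tag, counter))
        else:
            tag, start = open_nodes.pop()
            encoded.append((tag, start, counter))
    encoded.sort(key=lambda node: node[1])
    return encoded


def is_ancestor(a, d):
    """True if node a's region strictly contains node d's region."""
    return a[1] < d[1] and d[2] < a[2]


if __name__ == "__main__":
    for node in region_encode("<a><b><c/></b><d/></a>"):
        print(node)
```

With such codes, an ancestor-descendant test reduces to two integer comparisons, which is why a document only needs to be encoded once before queries are run against it.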


Abstract

The present invention discloses a MapReduce-based XML data query method and system. The method comprises the steps of: receiving, by a server, an XPath query request from a client; checking whether the XML document to be queried has been region encoded; performing region encoding on an XML document that has not been region encoded; checking, by the server, whether the XML document to be queried has been hierarchically encoded; performing hierarchical encoding on an XML document that has not been hierarchically encoded; parsing the query statement in the query request; generating query plan trees and estimating the size of the structural join results; establishing a cost model and performing cost estimation on the query plan trees; finding the optimal query plan tree; obtaining the optimal query plan tree and determining the input files of the MapReduce task; executing the MapReduce query task; constructing the output files of the MapReduce task into an XML data result as the query result; and returning the XML data query result to the client. The method has the advantages of relatively high execution efficiency, a high speedup ratio, good query processing performance and good scalability.
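
To illustrate why the two encodings matter for the join step, here is a naive sketch of a structural join over pre-encoded nodes. It assumes that each node carries a (start, end) region code and that the hierarchical code is simply the node's depth in the tree; the abstract does not define either encoding, so both are assumptions, and `Node` and `structural_join` are hypothetical names. A nested loop is used for clarity, not efficiency, and it is not the patent's MapReduce implementation.

```python
from typing import List, NamedTuple, Tuple


class Node(NamedTuple):
    tag: str
    start: int   # assumed region code: position of the start tag
    end: int     # assumed region code: position of the end tag
    level: int   # assumed hierarchical code: depth in the tree, root = 1


def structural_join(
    ancestors: List[Node],
    descendants: List[Node],
    parent_child: bool = False,
) -> List[Tuple[Node, Node]]:
    """Pair every ancestor with the descendants its region contains.

    With parent_child=True only direct children are kept, which
    corresponds to the XPath '/' axis instead of '//'.
    """
    pairs = []
    for a in ancestors:
        for d in descendants:
            contained = a.start < d.start and d.end < a.end
            if contained and (not parent_child or d.level == a.level + 1):
                pairs.append((a, d))
    return pairs


if __name__ == "__main__":
    # Hand-encoded nodes for <a><b><c/></b><d/></a>
    a = Node("a", 1, 8, 1)
    b = Node("b", 2, 5, 2)
    c = Node("c", 3, 4, 3)
    print(structural_join([a], [c]))                     # //a//c
    print(structural_join([a], [c], parent_child=True))  # //a/c
```

The number of pairs such a join produces is the quantity that a cost model would need to estimate when comparing alternative query plan trees.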

Description

Technical field

[0001] The invention relates to the field of XML (Extensible Markup Language) query processing, in particular to a MapReduce-based XML data query method and system.

Background

[0002] XML is an extensible markup language used to mark up data, define data types, and transmit and store data. Markup is the key part: content is created and then marked with qualifying tags so that each word, phrase or chunk becomes identifiable, classifiable information. The resulting document, or document instance, consists of elements (markup) and content. Elements make a document easier to understand, whether it is read from a printout or processed electronically. The more descriptive the element, the easier it is to identify the parts of the document. Marked-up content has had one advantage since markup first emerged: even when no computer system is available, the data can still be printed out and understood through its markup.

[0003] XML currentl...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/3331, G06F16/951
Inventor: 李东, 邓泽航, 李祖立
Owner: SOUTH CHINA UNIV OF TECH