Rapid lowest common ancestor (LCA) search method for XML keyword search

A keyword search technique, applied in the database field, that addresses the high space consumption and time cost of existing node encodings and thereby reduces overhead.

Inactive Publication Date: 2009-02-11
FUDAN UNIV

AI Technical Summary

Problems solved by technology

The advantage of this encoding is that, given any two nodes, their LCA can be found simply by comparing their codes. However, the length of a code grows with the depth of the node. As a result, computing the longest common prefix becomes time-consuming, and storing such codes consumes a lot of space.
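To make the cost concrete, here is a minimal sketch, assuming that the encoding in question is the Dewey coding named in the abstract and that a code is held as a tuple of integers (one component per tree level). The function name `dewey_lca` is hypothetical and not taken from the patent.

```python
from typing import Tuple

Dewey = Tuple[int, ...]  # assumed representation: one integer per tree level


def dewey_lca(a: Dewey, b: Dewey) -> Dewey:
    """Return the Dewey code of the LCA, i.e. the longest common prefix."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return tuple(prefix)


if __name__ == "__main__":
    # Nodes 0.1.2 and 0.1.3.4 share the prefix 0.1, so their LCA is node 0.1.
    print(dewey_lca((0, 1, 2), (0, 1, 3, 4)))
```

The loop runs once per shared level, so a single comparison takes time proportional to the node depth, and each stored code also grows with depth. These are the two costs described above.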



Embodiment Construction

[0017] 1. Some concepts and definitions related to the present invention.

[0018] 1. XML document:

[0019] In the present invention, an XML document D is modeled as an ordered labeled tree T(N, E). The node set N includes all elements, attributes, and values in the document. The edge set E represents the containment relationships between them. For simplicity, we ignore possible reference edges between nodes.
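The tree model can be illustrated with a short sketch that uses Python's standard `xml.etree.ElementTree` parser. The `Node` class, the `kind` tags, and the `build_tree` helper are hypothetical names chosen for this example, and text that follows a child element (tail text) is ignored for brevity.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    label: str                      # tag name, attribute name, or text value
    kind: str                       # "element", "attribute", or "value"
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def add_child(parent: Node, label: str, kind: str) -> Node:
    child = Node(label, kind, parent)
    parent.children.append(child)   # list order preserves document order
    return child


def build_tree(elem: ET.Element, parent: Optional[Node] = None) -> Node:
    """Convert a parsed element into an ordered labeled tree (assumed model)."""
    node = Node(elem.tag, "element", parent)
    if parent is not None:
        parent.children.append(node)
    for name, value in elem.attrib.items():
        attr = add_child(node, name, "attribute")
        add_child(attr, value, "value")
    text = (elem.text or "").strip()
    if text:
        add_child(node, text, "value")
    for child in elem:              # containment edges only, no reference edges
        build_tree(child, node)
    return node


if __name__ == "__main__":
    doc = '<book id="1"><title>XML</title><author>Tom</author></book>'
    root = build_tree(ET.fromstring(doc))
    print([c.label for c in root.children])
```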

[0020] 2. LCA node set:

[0021] Given a keyword set S = {k1, k2, k3, ..., kn}, each keyword ki has a node set Si, every node of which directly contains the keyword ki. For every possible node combination {e1, e2, ..., en}, where ei ∈ Si, there is a corresponding LCA node v, namely v = lca(e1, e2, ..., en). We use lca(S1, S2, ..., Sn) to denote the set of LCA nodes obtained over all possible combinations.
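The definition can be spelled out as a brute-force sketch. It assumes, purely for illustration, that each node is identified by a tuple giving its path from the root, so that the LCA of several nodes is their longest common prefix. The names `lca_of` and `lca_set` are hypothetical. The enumeration visits every combination, so it only illustrates the definition and is not the efficient method of the invention.

```python
from itertools import product
from typing import Iterable, Sequence, Set, Tuple

Path = Tuple[int, ...]  # assumed node identifier: path from the root


def lca_of(nodes: Sequence[Path]) -> Path:
    """LCA of one combination (e1, ..., en): their longest common prefix."""
    first = nodes[0]
    length = len(first)
    for other in nodes[1:]:
        i = 0
        while i < min(length, len(other)) and first[i] == other[i]:
            i += 1
        length = i
    return first[:length]


def lca_set(*keyword_node_sets: Iterable[Path]) -> Set[Path]:
    """lca(S1, ..., Sn): LCA nodes over all combinations with ei in Si."""
    return {lca_of(combo) for combo in product(*keyword_node_sets)}


if __name__ == "__main__":
    s1 = [(0, 0, 0), (0, 1, 0)]   # nodes containing keyword k1
    s2 = [(0, 0, 1), (0, 2)]      # nodes containing keyword k2
    print(lca_set(s1, s2))
```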

[0022] 3. SLCA node set:

[0023] For the node v in the lca(S1, S2, ... Sn) set, if there is no other node u in the set that satisfies v


Abstract

The invention belongs to the technical field of databases and provides a novel keyword search method based on RMQ (range minimum query). Through effective preprocessing, the method eliminates the time factor d incurred when common ancestors are computed from Dewey codes during keyword search on XML. In addition, storing only the Euler sequence of the nodes instead of their Dewey codes effectively reduces the storage cost. The method therefore outperforms prior methods in both running time and space utilization. The invention relates to a rapid search method for the least common ancestors in XML keyword search.
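The general idea of answering LCA queries through an Euler sequence and RMQ can be sketched as follows. This is a textbook construction under stated assumptions, not the patent's own procedure: the tree is given as a hypothetical `children` dictionary of integer node ids, and a sparse table is used as the RMQ structure, which is only one of several possible choices. After preprocessing, each query takes constant time regardless of node depth.

```python
from typing import Dict, List, Tuple


def build_euler(children: Dict[int, List[int]], root: int
                ) -> Tuple[List[int], List[int], Dict[int, int]]:
    """Euler sequence of the tree, the depth of each entry, and the
    index of every node's first occurrence (iterative DFS)."""
    euler, depth, first = [root], [0], {root: 0}
    stack = [(root, 0, iter(children.get(root, [])))]
    while stack:
        node, d, it = stack[-1]
        child = next(it, None)
        if child is None:
            stack.pop()
            if stack:  # returning to the parent adds it to the sequence again
                euler.append(stack[-1][0])
                depth.append(stack[-1][1])
        else:
            first[child] = len(euler)
            euler.append(child)
            depth.append(d + 1)
            stack.append((child, d + 1, iter(children.get(child, []))))
    return euler, depth, first


def build_sparse(depth: List[int]) -> List[List[int]]:
    """table[j][i] = index of the minimum depth in depth[i : i + 2**j]."""
    n = len(depth)
    table = [list(range(n))]
    j = 1
    while (1 << j) <= n:
        prev, half = table[-1], 1 << (j - 1)
        row = []
        for i in range(n - (1 << j) + 1):
            a, b = prev[i], prev[i + half]
            row.append(a if depth[a] <= depth[b] else b)
        table.append(row)
        j += 1
    return table


def lca(u: int, v: int, euler, depth, first, table) -> int:
    """LCA = shallowest node between the first occurrences of u and v."""
    lo, hi = sorted((first[u], first[v]))
    j = (hi - lo + 1).bit_length() - 1
    a, b = table[j][lo], table[j][hi - (1 << j) + 1]
    return euler[a if depth[a] <= depth[b] else b]


if __name__ == "__main__":
    tree = {0: [1, 2], 1: [3, 4]}          # hypothetical example tree
    euler, depth, first = build_euler(tree, 0)
    table = build_sparse(depth)
    print(lca(3, 4, euler, depth, first, table))
```

The sparse table costs O(n log n) space over the Euler sequence. A structure tuned for sequences whose adjacent depths differ by exactly one could reduce this, but that refinement is left out of the sketch.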

Description

Technical field

[0001] The invention belongs to the technical field of databases, and in particular relates to an efficient keyword retrieval method for XML data.

Background

[0002] With the popularity of XML, keyword retrieval on XML is becoming a research hotspot. Keyword retrieval on XML does not require users to understand the DTD or schema of the queried XML, a complex XML query language (such as XQuery), or other related knowledge, so it is more readily accepted by users. Ordinary keyword search on the Web, such as Google or Baidu, returns the entire web page containing the keywords provided by the user. However, when searching for keywords in large XML documents, since XML documents are usually modeled as a tree structure with hierarchical nesting, users usually want the smallest result fragments. The returned result should be the set of nodes containing these keywords, and the descendants of any node in ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 周傲英, 谢涛, 王晓玲
Owner: FUDAN UNIV