Indexing structure of XML (Extensible Markup Language) document

An indexing structure for XML documents, applicable to the fields of instrumentation, computing, and electrical digital data processing. It addresses the low space efficiency of existing indexes and the low time efficiency of their retrieval algorithms, avoiding index redundancy and reducing the processing of invalid elements during retrieval.

Inactive Publication Date: 2010-09-15
PEKING UNIV

AI Technical Summary

Problems solved by technology

[0019] To address the low space efficiency of indexes based on Dewey coding and the potentially low time efficiency of the corresponding retrieval algorithms, the present invention proposes a new XML element coding method, LAF (Layer order And Father numbering) coding. The length of an LAF code is independent of the element's depth in the XML tree.
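The contrast between the two coding schemes can be illustrated with a minimal sketch. It assumes the usual form of a Dewey code (the path of 1-based sibling positions from the root) and takes the three LAF components from the abstract. The sample tree, the function names, the use of -1 as the root's parent number, and a root depth of 0 are all assumptions made for illustration and are not taken from the patent text.

```python
from collections import deque

# Hypothetical tree for illustration: each node maps to its ordered children.
TREE = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["e"], "e": []}

def dewey_codes(tree, root):
    """Dewey code: the path of 1-based sibling positions from the root."""
    codes = {root: (1,)}
    stack = [root]
    while stack:
        node = stack.pop()
        for pos, child in enumerate(tree[node], start=1):
            codes[child] = codes[node] + (pos,)
            stack.append(child)
    return codes

def laf_codes(tree, root):
    """LAF code: (level-order number, parent's level-order number, depth)."""
    codes = {}
    # Assumption: the root has parent number -1 and depth 0.
    queue = deque([(root, -1, 0)])
    next_no = 0
    while queue:
        node, parent_no, depth = queue.popleft()
        codes[node] = (next_no, parent_no, depth)
        for child in tree[node]:
            queue.append((child, next_no, depth + 1))
        next_no += 1
    return codes

dewey = dewey_codes(TREE, "a")
laf = laf_codes(TREE, "a")
for node in TREE:
    print(node, "Dewey:", dewey[node], "LAF:", laf[node])
```

In this sketch the Dewey code of a node gains one component per level, whereas every LAF code has exactly three components regardless of how deep the node sits.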


Examples

Embodiment Construction

[0037] The present invention is further described below through examples. The embodiments are disclosed to aid understanding of the invention; those skilled in the art will appreciate that various substitutions and modifications are possible without departing from the spirit and scope of the invention and the appended claims. The invention should therefore not be limited to the content disclosed in the embodiments, and its scope of protection is defined by the claims.

[0038] Figure 4 gives an example of hierarchical (level-order) traversal of a tree. The traversal visits the nodes in the order A, B, C, D, E, F, G, H, I, J, and their corresponding hierarchical traversal numbers are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
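The numbering in this example can be reproduced with a breadth-first traversal. The sketch below is illustrative only: Figure 4 is not available in this text, so the tree shape is a hypothetical one chosen so that its level-order traversal yields A through J, and the function name is likewise an assumption.

```python
from collections import deque

# Hypothetical tree shape (Figure 4 is not reproduced here); chosen so that
# a level-order traversal visits A, B, C, D, E, F, G, H, I, J.
TREE = {
    "A": ["B", "C", "D"],
    "B": ["E", "F"],
    "C": ["G"],
    "D": ["H", "I"],
    "E": ["J"],
}

def level_order_numbers(tree, root):
    """Assign 0, 1, 2, ... to nodes in breadth-first (level) order."""
    numbers = {}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        numbers[node] = len(numbers)       # next unused number
        queue.extend(tree.get(node, []))   # children are visited left to right
    return numbers

numbers = level_order_numbers(TREE, "A")
print(numbers)
```

Because the queue releases all nodes of one level before any node of the next, the assigned numbers increase level by level and, within a level, from left to right.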

[0039] The result of applying LAF encoding to the XML document of Figure 1 is shown in Figure 6. For the root node, its hierarchical traversal number is 0, i...

Abstract

The invention discloses a new indexing structure for XML (Extensible Markup Language) documents, belonging to the field of data retrieval. For each node in an XML document, the LAF (Layer order And Father numbering) code consists of three parts: the hierarchical traversal number of the node, the hierarchical traversal number of the node's parent, and the depth at which the node is located. The invention further provides a two-level indexing structure based on LAF coding. In this structure, the plain-text attributes of the XML document are stored in a primary index, the semi-structured attributes are stored in a secondary index, and the two indexes are linked by pointers. This two-level indexing technique avoids the redundancy that traditional indexing methods may introduce, supports a more efficient retrieval algorithm, and reduces how often the retrieval algorithm processes invalid elements.
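One possible reading of the two-level structure is sketched below. The abstract states only that text goes in a primary index, structure goes in a secondary index, and pointers link the two; the concrete layout here (a dictionary from terms to list positions, a list of LAF records), the sample data, and the names `LAF`, `lookup`, and `is_parent` are hypothetical and are not the patent's actual design.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LAF:
    number: int  # hierarchical (level-order) traversal number of the node
    parent: int  # traversal number of its parent (assumption: -1 for the root)
    depth: int   # depth of the node (assumption: the root has depth 0)

# Secondary index: structural information, one LAF record per element.
# Sample data is hypothetical.
secondary = [LAF(0, -1, 0), LAF(1, 0, 1), LAF(2, 0, 1), LAF(3, 1, 2)]

# Primary index: text terms mapped to pointers (positions) into `secondary`.
primary = {"xml": [3], "index": [2, 3]}

def lookup(term):
    """Follow the pointers from the primary index to the structural records."""
    return [secondary[ptr] for ptr in primary.get(term, [])]

def is_parent(a, b):
    """True if element `a` is the parent of element `b`, using LAF codes only."""
    return b.parent == a.number

hits = lookup("index")
print(hits)
print(is_parent(secondary[1], secondary[3]))
```

In this layout each element's structural record is stored once, and several terms can point to it, which is one way the redundancy mentioned in the abstract might be avoided. A parent-child test needs only a comparison of two integers.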

Description

Technical field

[0001] The invention relates to a two-level index structure for XML documents and belongs to the field of data retrieval.

Background

[0002] Since its introduction in 1998, XML has come to be widely used on the Internet, in databases, and in other fields, and has become the standard language for data exchange and integration on the Internet. With the emergence of large numbers of XML documents, quickly finding the information that meets a user's needs in large-scale XML document collections has become a research hotspot in information retrieval and databases.

[0003] XML information retrieval can be divided into two categories: keyword retrieval and "keyword + structure" retrieval. XPath and XQuery, the XML retrieval standards published by the W3C, are representative of "keyword + structure" retrieval. "Keyword + structure" retrieval gives users an effective way to express their query needs precisely, so as to obtain high-quality searc...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 向永清, 邓志鸿
Owner: PEKING UNIV