XML document approximate enquiring method based on diversity

The invention provides an approximate query method for diverse XML documents, with applications in special-purpose data processing and electronic digital data processing. It addresses two shortcomings of earlier approaches: they do not account for the diversity of XML documents during querying, and they are inefficient.

Inactive Publication Date: 2008-01-30
XI AN JIAOTONG UNIV
Cites: 0; Cited by: 21

AI Technical Summary

Problems solved by technology

In the prior approach, the cost of each mutation operation (such as inserting or renaming a node in the query structure) must be set manually, and when the number of mutated queries is large, finding the optimal K result documents is inefficient. In addition, that work does not consider the diversity of XML document queries under multiple DTDs.

Embodiment Construction

[0070] The system in Figure 1 consists of four modules. The XML document diversity processing module uses an algorithm, based on the PTO model, that automatically generates mapping rules; with these rules it rewrites the original query, which the user poses against the global query pattern, into a rewritten query tree for each of the different DTDs. The module for approximate querying of XML documents under a single DTD applies the basic mutation operations to the rewritten query tree and realizes the approximate query over the XML document set under that DTD through multiple exact embeddings of the mutated query trees. The query cost evaluation module calculates the query cost of each query result using a method based on the distribution statistics of XML sample data. Finally, the Top-K problem solving module obtains the Top-K solution to the approximate query over diverse XML documents by interval-coding the nodes and pre-estimating the optimal mutated query tree.
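The interval coding of nodes mentioned above can be illustrated with a common region-numbering scheme. This is a minimal sketch under stated assumptions, not the patent's actual encoding: the excerpt does not specify the scheme, and the function names `interval_encode` and `is_ancestor` and the sample document are hypothetical. Each element receives a `(start, end)` pair from one depth-first pass, so that an ancestor-descendant test reduces to interval containment without walking the tree.

```python
import xml.etree.ElementTree as ET


def interval_encode(root):
    """Assign every element a (start, end) interval in one depth-first pass.

    Assumption: a plain region-numbering scheme; the patent excerpt does not
    say which interval coding it uses.
    """
    intervals = {}
    counter = 0

    def visit(node):
        nonlocal counter
        start = counter          # number taken when the node is entered
        counter += 1
        for child in node:
            visit(child)
        intervals[node] = (start, counter)  # number taken when it is left
        counter += 1

    visit(root)
    return intervals


def is_ancestor(intervals, ancestor, descendant):
    """True if ancestor's interval strictly contains descendant's interval."""
    a_start, a_end = intervals[ancestor]
    d_start, d_end = intervals[descendant]
    return a_start < d_start and d_end < a_end


# Hypothetical sample document.
doc = ET.fromstring("<book><title>XML</title><author><name>A</name></author></book>")
codes = interval_encode(doc)
title = doc.find("title")
author = doc.find("author")
name = author.find("name")
```

With such a coding, checking whether a query-tree edge can be embedded as an ancestor-descendant pair costs two integer comparisons, which is why interval codes are often used to speed up structural matching.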

[0071] Figure 2 shows a global query pattern of an XML data integration system in the ...

Abstract

The invention discloses an approximate query method for diverse XML documents. The XML diversity processing module adopts an algorithm, based on the PTO model, that automatically generates mapping rules, and rewrites the original query posed by the user under the global query pattern into a separate rewritten query tree for each DTD. The approximate query method for XML documents under a single DTD applies basic mutation operations to the rewritten query tree to produce query variants. The resulting query closure, combined with multiple exact embeddings, realizes the approximate query of the XML document set. A query cost evaluation module uses a statistical method based on the data distribution of an XML sample to compute the query cost of each query result. For diverse XML documents, the invention can retrieve not only the exact query results but also, in a timely manner, a sequence of approximate results with similarity scores.
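The combination of mutation, query closure, exact embedding and cost ranking described in the abstract can be sketched as follows. This is a simplified illustration under several assumptions, not the patented algorithm: queries are linear label paths rather than trees, only two mutation operations (rename and delete an inner node) are modeled, the costs and the `RENAMES` table are hand-set placeholders (the patent derives costs from sample statistics), and the whole closure is enumerated instead of pre-estimating the optimal mutated query tree. All names are hypothetical.

```python
import heapq
import xml.etree.ElementTree as ET

# Hypothetical rename table and hand-set costs (assumptions for illustration).
RENAMES = {"author": "writer"}
RENAME_COST, DELETE_COST = 1.0, 2.0


def mutate(path):
    """Yield (mutated_path, cost) for each single basic mutation of a path."""
    for i, label in enumerate(path):
        if label in RENAMES:
            yield path[:i] + (RENAMES[label],) + path[i + 1:], RENAME_COST
        if 0 < i < len(path) - 1:  # delete an inner node only
            yield path[:i] + path[i + 1:], DELETE_COST


def closure(query, max_cost):
    """Map every query reachable by mutations to its cheapest total cost."""
    best = {query: 0.0}
    frontier = [query]
    while frontier:
        current = frontier.pop()
        for mutated, cost in mutate(current):
            total = best[current] + cost
            if total <= max_cost and total < best.get(mutated, float("inf")):
                best[mutated] = total
                frontier.append(mutated)
    return best


def label_paths(root):
    """Collect the label sequence of every root-to-node path in a document."""
    paths = set()

    def walk(node, prefix):
        path = prefix + (node.tag,)
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, ())
    return paths


def top_k(query, docs, k, max_cost=3.0):
    """Rank documents by the cheapest mutated query that embeds exactly."""
    variants = closure(query, max_cost)
    scored = []
    for name, xml_text in docs.items():
        paths = label_paths(ET.fromstring(xml_text))
        costs = [c for q, c in variants.items() if q in paths]
        if costs:
            scored.append((min(costs), name))
    return heapq.nsmallest(k, scored)


# Hypothetical document set and query.
DOCS = {
    "d1.xml": "<book><author><name/></author></book>",
    "d2.xml": "<book><writer><name/></writer></book>",
    "d3.xml": "<book><name/></book>",
    "d4.xml": "<journal><editor/></journal>",
}
QUERY = ("book", "author", "name")
print(top_k(QUERY, DOCS, 2))
```

In this sketch a cost of 0 marks an exact result and larger costs mark increasingly distant approximate results, which mirrors the abstract's distinction between exact results and a similarity-scored sequence of approximate ones.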

Description

Technical field

[0001] The invention belongs to the technical field of computer design and application. It relates to computer software, information retrieval technology, semi-structured data processing technology, artificial intelligence technology and the XML description language, and in particular to an approximate query method for diverse XML documents.

Background

[0002] In recent years, with the emergence of XML (Extensible Markup Language), research on query algorithms for semi-structured data in XML documents has gradually attracted the attention of the information retrieval community in China and abroad. XML documents are flexible in what they can express, and this flexibility makes it difficult for XML documents created by different organizations and individuals to follow a unified data model or standard and so to share exactly the same structure and markup, resulting in the diversity of XML ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 覃征, 衡星辰, 邵利平, 姜山
Owner: XI AN JIAOTONG UNIV