Semantic relevance-based XML (Extensible Markup Language) keyword top-k query method

A top-k query method applied in the database field. It addresses the problem that the TA algorithm cannot be applied, improves retrieval efficiency and quality, and avoids redundant operations.

Active Publication Date: 2011-05-18
ASIA PACIFIC LIGHT ALLOY NANTONG TECH +1

AI Technical Summary

Problems solved by technology

In XML document information retrieval, structural semantics is one of the important factors that affect the relevance of query results, but some structural s

Embodiment Construction

[0034] 1. Some concepts and definitions related to the present invention.

[0035] As shown in Figure 1, an XML document can be expressed as a tree model T = (NE, NV, E, r), where the internal node set NE corresponds to the elements and attributes of the XML document, the leaf node set NV corresponds to the text of the XML document, E is a set of directed edges representing the containment relationship between nodes, and r is the root node of the document tree. As shown in Figure 2, within a virtual document, the nodes that directly contain text are regarded as descriptions of that text content and are called annotation nodes.
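The tree model in [0035] can be illustrated with a minimal Python sketch. The function name `build_tree`, the integer node ids, the "@" prefix for attribute nodes, and the choice to treat attribute nodes as annotation nodes are assumptions made for illustration; the patent text only defines the sets NE, NV, E and the root r. Tail text after a child element is ignored to keep the sketch short.

```python
import xml.etree.ElementTree as ET

def build_tree(xml_text):
    """Return (NE, NV, E, r, annotation) for an XML string.

    NE: id -> element/attribute name (internal nodes)
    NV: id -> text value (leaf nodes)
    E:  list of directed (parent_id, child_id) containment edges
    r:  id of the root node
    annotation: ids of internal nodes that directly contain text
    """
    NE, NV, E = {}, {}, []
    annotation = set()
    counter = 0

    def new_id():
        nonlocal counter
        counter += 1
        return counter

    def visit(elem, parent):
        nid = new_id()
        NE[nid] = elem.tag
        if parent is not None:
            E.append((parent, nid))
        # Assumption: an attribute becomes an internal node whose
        # single child is a leaf holding the attribute value.
        for name, value in elem.attrib.items():
            aid = new_id()
            NE[aid] = "@" + name
            E.append((nid, aid))
            vid = new_id()
            NV[vid] = value
            E.append((aid, vid))
            annotation.add(aid)
        text = (elem.text or "").strip()
        if text:
            vid = new_id()
            NV[vid] = text
            E.append((nid, vid))
            annotation.add(nid)  # directly contains text
        for child in elem:
            visit(child, nid)
        return nid

    r = visit(ET.fromstring(xml_text), None)
    return NE, NV, E, r, annotation
```

For a document such as `<book><title>XML search</title><author>Li</author></book>`, the `title` and `author` nodes end up in `annotation`, while `book` does not because it contains no text of its own.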

[0036] The Threshold Algorithm (TA) is an efficient top-k query algorithm proposed by Fagin in 2001 that is widely used in various fields. Two conditions need to be met: 1. There is a monotonic relationship between the semantic relevance of query results and the attribute values affecting the relevance; 2. The values of the factors...
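For orientation, the general textbook form of the Threshold Algorithm can be sketched as follows. This is not the variant adapted in the patent. The function name `threshold_topk`, the use of a plain sum as the monotonic aggregate, and the convention that an object missing from a list scores 0 there are all assumptions for illustration; scores are assumed to be non-negative.

```python
import heapq

def threshold_topk(lists, k):
    """Generic Threshold Algorithm sketch.

    lists: one list per attribute, each holding (obj, score) pairs
           sorted by score in descending order.
    Returns up to k (obj, total_score) pairs, best first.
    """
    lookup = [dict(lst) for lst in lists]  # random access by object
    seen = {}                              # obj -> aggregated score
    max_depth = max((len(lst) for lst in lists), default=0)

    for depth in range(max_depth):
        last = []  # last score read under sorted access in each list
        for lst in lists:
            if depth < len(lst):
                obj, score = lst[depth]
                last.append(score)
                if obj not in seen:
                    # Random access: fetch the object's other scores.
                    seen[obj] = sum(d.get(obj, 0.0) for d in lookup)
            else:
                last.append(0.0)  # exhausted list: unseen objects score 0
        # No unseen object can score higher than this threshold.
        threshold = sum(last)
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(top) == k and top[-1][1] >= threshold:
            return top  # early termination
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
```

The early exit relies on the aggregate being monotonic: once k seen objects score at least the threshold, no object deeper in the lists can overtake them, so the remaining entries never need to be read.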

Abstract

The invention discloses a semantic relevance-based XML (Extensible Markup Language) keyword top-k query method comprising the following steps: preprocessing the XML document as a tree structure and treating each information fragment in the XML document that satisfies the specified condition as a virtual document; calculating, according to a relevance calculation model, the relevance between each virtual document and each term it contains, building for each term an inverted list of the virtual documents that contain that term, and sorting each inverted list in descending order of relevance; and performing the top-k query on the basis of the relevance between a virtual document d and the keyword query Q. Without calculating all query results, the invention can return the several most relevant query results to the user early, according to the user's needs, avoiding redundant operations and improving the efficiency and quality of retrieval.
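The indexing step in the abstract can be sketched as below. The relevance calculation model is not given in this excerpt, so `relevance` is a placeholder (normalized term frequency) and not the patent's model; `build_inverted_lists` and the input shape (a dict from virtual-document id to its list of terms) are likewise hypothetical.

```python
from collections import Counter, defaultdict

def relevance(term_count, doc_length):
    # Placeholder scoring: normalized term frequency.
    # This stands in for the patent's relevance model, which is
    # not specified in this excerpt.
    return term_count / doc_length

def build_inverted_lists(virtual_docs):
    """Build one inverted list per term.

    virtual_docs: dict mapping doc_id -> list of terms.
    Returns: dict mapping term -> list of (doc_id, relevance),
             sorted by relevance descending (ties broken by doc_id).
    """
    lists = defaultdict(list)
    for doc_id, terms in virtual_docs.items():
        if not terms:
            continue  # skip empty virtual documents
        counts = Counter(terms)
        for term, count in counts.items():
            lists[term].append((doc_id, relevance(count, len(terms))))
    for postings in lists.values():
        postings.sort(key=lambda p: (-p[1], p[0]))
    return dict(lists)
```

Keeping each list in descending order of relevance is what allows a top-k procedure to read only a prefix of each list and stop before every virtual document has been scored.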

Description

Technical field

[0001] The invention belongs to the technical field of databases, and in particular relates to a keyword top-k query method for XML documents.

Background art

[0002] Because of its simplicity, flexibility and high scalability, XML has become one of the important formats for data storage and exchange, and users have higher requirements for the efficiency and quality of XML data retrieval. With keyword queries, users do not need to understand the structure of XML documents or master a complex query language. Keyword queries have therefore gradually become an important means of retrieving XML data. As the amount of XML data grows rapidly, the number of query results grows correspondingly. As in web information retrieval, users usually care only about the most relevant results. In terms of both query efficiency and user needs, it is inadvisable to calculate all the query results and return them to the user. Using the idea of top-k ...

Application Information

IPC(8): G06F17/30
Inventor: 娄颖, 陈群, 李战怀, 张利军, 李霞, 崔海文
Owner ASIA PACIFIC LIGHT ALLOY NANTONG TECH