Method for searching and sequencing keywords of XML documents based on semantic correlation

A retrieval and sorting technology based on semantic correlation, applied in the fields of electrical digital data processing, special data processing applications, and instruments, with the effect of improving the accuracy of search results.

Active Publication Date: 2011-06-01
JIANGSU ZHONGWEI HEAVY IND MACHINERY +1
Cites: 2; Cited by: 9

AI Technical Summary

Problems solved by technology

[0006] To overcome the deficiency that the prior art cannot accurately reflect the impact of XML structure and semantics on the relevance of query results, the present invention provides a method for keyword retrieval and sorting of XML documents based on semantic correlation. It better addresses the consistency of retrieval targets and query results with the user's information needs, and ensures the information integrity of query results.

Examples

Embodiment Construction

[0024] Some concepts and definitions relevant to the present invention:

[0025] Definition 1. Topic node: For node n, if the tree T(n) rooted at n contains another subtree T(m) rooted at node m, then n is a topic node.

[0026] Definition 2. Attribute node: For node n, if the subtree rooted at n only contains the content of text values, then n is an attribute node.
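Definitions 1 and 2 can be illustrated with a minimal sketch, assuming that "contains another subtree" means the element has at least one child element and that an attribute node is a leaf element holding only text (not an XML attribute in the syntactic sense). The function name `classify` and the sample document are hypothetical and are not taken from the patent.

```python
import xml.etree.ElementTree as ET


def classify(node):
    """Label an element as 'topic', 'attribute', or 'other' (assumed reading)."""
    if len(node) > 0:
        # The tree rooted here contains at least one subtree T(m).
        return "topic"
    if node.text and node.text.strip():
        # The subtree holds nothing but a text value.
        return "attribute"
    # Empty elements are not covered by Definitions 1 and 2.
    return "other"


doc = ET.fromstring(
    "<article><title>XML search</title><author>Chen</author></article>"
)
labels = {el.tag: classify(el) for el in doc.iter()}
```

Under this reading, `article` is a topic node, while `title` and `author` are attribute nodes.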

[0027] Definition 3. Conditional attribute keywords: Conditional attribute keywords are the names of a type of attribute nodes, which indicate the user's query conditions. For example, the query Q={article, title, XML} indicates that the user wants to search for article information containing the XML keyword in the title, where title is a conditional attribute keyword.

[0028] Definition 4. Return attribute keywords: indicate the keywords returned by the user query. For example, the query Q={article, XML, author} indicates that the user wants to find the author information about the XML article, where author ...
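The two example queries in Definitions 3 and 4 can be told apart with a simple heuristic, sketched here under a loud assumption: an attribute-node name that is immediately followed by a value keyword is treated as a conditional attribute keyword, and an attribute-node name with no value after it is treated as a return attribute keyword. This ordering rule, the function `keyword_roles`, and the name sets passed to it are hypothetical; the text above does not state how the method itself distinguishes the two kinds.

```python
def keyword_roles(query, topic_names, attribute_names):
    """Assign a role to each query keyword (illustrative heuristic only)."""
    roles = {}
    for i, kw in enumerate(query):
        nxt = query[i + 1] if i + 1 < len(query) else None
        if kw in topic_names:
            roles[kw] = "topic"
        elif kw in attribute_names:
            # Assumption: a following keyword that is neither a topic name
            # nor an attribute name is a value constraining this attribute.
            followed_by_value = (
                nxt is not None
                and nxt not in topic_names
                and nxt not in attribute_names
            )
            roles[kw] = (
                "conditional attribute" if followed_by_value else "return attribute"
            )
        else:
            roles[kw] = "value"
    return roles


topics = {"article"}
attributes = {"title", "author"}
q1 = keyword_roles(["article", "title", "XML"], topics, attributes)
q2 = keyword_roles(["article", "XML", "author"], topics, attributes)
```

With these inputs, `title` in the first query is labeled a conditional attribute and `author` in the second a return attribute, matching the two examples given in the definitions.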

Abstract

The invention discloses a method for searching and sequencing keywords of extensible markup language (XML) documents based on semantic correlation, which comprises the following steps: sequentially parsing the XML documents; calculating the semantic correlation between a topic node and an attribute node, and between the attribute node and the keywords; optimizing search time; performing word stemming on the input query keywords; retrieving from an inverted index the topic node information and correlation information that correspond to each keyword; searching for the topic closest to the keywords; sequencing search results by correlation from high to low; searching for the topic second closest to the keywords; and returning an information fragment to the user according to the Dewey code in each result. Targeting the unique structural and semantic characteristics of XML data, the SRank correlation search model and method are provided, so that the accuracy of the search results can be improved.
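The steps listed in the abstract can be sketched roughly as follows. This is a simplified illustration, not the SRank model: the correlation score is replaced by a plain occurrence count, the stemmer is a toy suffix stripper, and only topic nodes that match every keyword are returned. All function names here (`naive_stem`, `build_index`, `search`, `fragment`) are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def naive_stem(word):
    # Toy stand-in for a real stemmer: lowercase and drop a plural "s".
    w = word.lower()
    return w[:-1] if len(w) > 3 and w.endswith("s") else w


def build_index(root):
    # Inverted index: term -> {Dewey code of nearest topic node: count}.
    index = defaultdict(lambda: defaultdict(int))

    def visit(node, dewey, topic):
        if len(node) > 0:  # element with children: treat as a topic node
            topic = dewey
        if topic is not None:
            for term in [node.tag] + (node.text or "").split():
                index[naive_stem(term)][topic] += 1
        for i, child in enumerate(node, start=1):
            visit(child, dewey + (i,), topic)

    visit(root, (1,), None)
    return index


def search(index, keywords):
    # Placeholder ranking (assumption): sum of counts, highest first.
    postings = [index.get(naive_stem(k), {}) for k in keywords]
    if not postings:
        return []
    common = set(postings[0]).intersection(*postings[1:])
    scored = [(sum(p[t] for p in postings), t) for t in common]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(".".join(map(str, t)), score) for score, t in scored]


def fragment(root, dewey):
    # Follow the Dewey code from the root to the matching element.
    node = root
    for i in [int(x) for x in dewey.split(".")][1:]:
        node = node[i - 1]
    return node
```

A result such as `"1.2"` can be passed to `fragment` to recover the element to return to the user. A real implementation would replace the count with the semantic correlation values that the method computes during parsing.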

Description

Technical field

[0001] The invention belongs to the technical field of extensible markup language (XML) keyword retrieval, and in particular relates to a method for keyword retrieval and sorting of XML documents.

Background technique

[0002] As an international standard for information description and exchange on the Internet and in enterprise applications, XML (eXtensible Markup Language) has many advantages such as semantic marking, easy expansion, openness and interoperability. With the promotion of XML technology and the continuous increase of XML data, the information retrieval technology for XML documents has become a research hotspot in related fields such as information retrieval and database.

[0003] Traditional information retrieval techniques are mainly aimed at text documents and HTML documents. An important feature that distinguishes XML documents from text and HTML documents is that they contain rich semantic and structural information, which helps to judge ...

Application Information

IPC(8): G06F17/30
Inventor 陈群, 王鹏, 娄颖, 崔海文, 李霞, 张立军, 李战怀
Owner JIANGSU ZHONGWEI HEAVY IND MACHINERY