Literature retrieval method based on semantic small-world model

A small-world model and document retrieval technology, applied in the computer field, that addresses problems such as the large additional cost of updating index information, unsuitability for full-text retrieval, and network load, and that improves query speed, reduces information storage, and achieves high accuracy.

Inactive Publication Date: 2007-08-15
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

However, the above methods all require exact metadata matching (such as file names or keywords) to satisfy a search request. Because the semantic information of other nodes in the network is unavailable, a large number of nodes must be searched blindly to ensure the recall rate of information retrieval, which places a severe load on the network.
Guiding query messages with improved neighbor-node index information (such as local indexes) can improve query performance, but updating the index information incurs very large additional overhead.
A structured peer-to-peer network based on a distributed hash table (such as CAN or Chord) provides good scalability and efficient search, but it supports only keyword/value lookup. It is therefore not suitable for full-text search in the information retrieval field, and the overhead of maintaining the structured peer-to-peer network is very high.
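
The keyword/value limitation can be illustrated with a toy example. The sketch below is not CAN or Chord; it is a simplified stand-in that assigns keys to nodes by hashing modulo the node count, and the names `ToyDHT` and `responsible_node` are hypothetical. It only shows that a lookup succeeds when the key matches exactly and fails for a differently worded query about the same topic.

```python
import hashlib


def responsible_node(key: str, num_nodes: int) -> int:
    """Map a key to a node index by hashing (toy stand-in for DHT routing)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_nodes


class ToyDHT:
    """Hypothetical in-memory table: each node is a plain dict."""

    def __init__(self, num_nodes: int):
        self.nodes = [dict() for _ in range(num_nodes)]

    def put(self, key: str, value: str) -> None:
        self.nodes[responsible_node(key, len(self.nodes))][key] = value

    def get(self, key: str):
        return self.nodes[responsible_node(key, len(self.nodes))].get(key)


dht = ToyDHT(num_nodes=8)
dht.put("semantic small world", "doc-17")

exact = dht.get("semantic small world")     # same key, so the value is found
related = dht.get("small world semantics")  # related wording, different key
```

Because the hash depends on the exact key string, the second query gives no hint that a closely related document exists, which is why a full-text or semantic query cannot be answered by this kind of lookup alone.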


Examples


[0047] (1) Establishing a network topology with semantic small-world characteristics includes the following steps:

[0048] (1.1) Use latent semantic indexing to extract document feature vectors, as follows:

[0049] Latent semantic indexing is an extension of the vector space model in traditional information retrieval. In the vector space model, documents and queries are expressed as weight vectors over all words in the document collection, and the similarity between a query and a document is expressed by the cosine of the angle between the two in the vector space. If there are t different words in a set of d documents, the word-document matrix $A = (a_{ij}) \in \mathbb{R}^{t \times d}$ represents the collection. Each column vector $a_j$ corresponds to document j, and $a_{ij}$ indicates the weight of word i in document j. Through singular value decomposition, the matrix A is decomposed into three matrices U, $\Sigma$ and V, where $\Sigma$ is a diagonal matrix with t...
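
As a minimal sketch of the vector space model described above, the following builds a small word-document matrix and ranks documents by cosine similarity to a query. It assumes raw term counts as the weights (the source does not state the weighting scheme), the sample documents and function names are invented for illustration, and the singular value decomposition step of latent semantic indexing is not included.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Return (vocab, A) where A[i][j] is the weight of word i in document j.

    Assumption: the weight is the raw term count.
    """
    vocab = sorted({w for d in docs for w in d.lower().split()})
    counts = [Counter(d.lower().split()) for d in docs]
    A = [[c[w] for c in counts] for w in vocab]
    return vocab, A


def query_vector(query, vocab):
    """Express a query in the same word space as the documents."""
    c = Counter(query.lower().split())
    return [c[w] for w in vocab]


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


docs = [
    "peer to peer network search",
    "latent semantic indexing of documents",
    "semantic search in a peer network",
]
vocab, A = term_document_matrix(docs)
q = query_vector("semantic indexing", vocab)

# Column j of A is the vector for document j.
columns = [[row[j] for row in A] for j in range(len(docs))]
scores = [cosine(q, col) for col in columns]
best = max(range(len(docs)), key=scores.__getitem__)
```

In a full latent semantic indexing pipeline, the columns and the query would first be projected into a lower-dimensional space obtained from the decomposition of A, and the same cosine measure would then be applied there.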


Abstract

This invention discloses a document retrieval method based on a semantic small-world model, comprising the following steps: first, latent semantic indexing is used to extract document feature vectors, which preserves the document features while lowering their dimensionality and reducing the amount of information stored; then, a support vector machine classifies all shared documents, producing classification information that records the proportion of interest in each category; finally, following the small-world principle of social networks, a small number of links connect nodes that have a high proportion of interest in a given document category, forming a network topology with small-world characteristics.
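
The last two steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented procedure: category labels are supplied directly rather than produced by a support vector machine, the node names and categories are invented, and the link rule in `pick_links` (connect to the peers with the largest share of documents in this node's top category) is a hypothetical reading of the abstract.

```python
from collections import Counter


def interest_proportions(labels):
    """Fraction of a node's shared documents that fall in each category."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}


def pick_links(node, profiles, max_links=2):
    """Hypothetical rule: link to the peers most interested in this node's top category."""
    own = profiles[node]
    top_cat = max(own, key=own.get)
    others = [
        (p.get(top_cat, 0.0), name)
        for name, p in profiles.items()
        if name != node
    ]
    others.sort(reverse=True)
    # Keep only peers that share at least some interest in the category.
    return [name for share, name in others[:max_links] if share > 0]


# Assumed input: one category label per shared document on each node.
node_labels = {
    "n1": ["networks", "networks", "networks", "retrieval"],
    "n2": ["networks", "networks", "retrieval"],
    "n3": ["retrieval", "retrieval", "retrieval"],
    "n4": ["networks", "retrieval", "retrieval", "retrieval"],
}
profiles = {n: interest_proportions(ls) for n, ls in node_labels.items()}
links = pick_links("n1", profiles)
```

Capping the number of links per node keeps the per-node index small, which matches the abstract's emphasis on a small number of links; the actual limit and selection rule used by the invention are not given in this excerpt.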

Description

Technical field

[0001] The invention belongs to distributed computing and information retrieval in the computer field, and specifically relates to a document retrieval method based on a semantic small-world model. The method mainly uses the semantic small-world model to solve the problem of efficient information storage and retrieval in a peer-to-peer network for document information sharing.

Background technique

[0002] Because of their scalability, fault tolerance, autonomy and self-organization, peer-to-peer network systems have attracted increasing attention in the field of large-scale information retrieval. However, in a peer-to-peer network for document information sharing, how to store and retrieve information effectively remains a very challenging problem.

[0003] The small-world phenomenon is widespread in social networks: everyone in the world can be connected through a short chain of social relations. The length of the chain of social relations is generall...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 金海, 宁小敏, 袁平鹏, 武浩, 余一娇
Owner: HUAZHONG UNIV OF SCI & TECH