
Information retrieval method and system based on pseudo-relevance feedback model

A pseudo-relevance feedback and information retrieval technology, applied in the field of information retrieval, that addresses the problems of low average retrieval accuracy, difficulty in finding the desired information, and information flooding.

Active Publication Date: 2017-10-13
HUAZHONG NORMAL UNIV
Cites: 4, Cited by: 22

AI Technical Summary

Problems solved by technology

[0007] With the rapid development of the Internet and the accumulation of massive amounts of information, search accuracy has become the primary concern of users. It is becoming increasingly difficult for users to find what they want through information retrieval tools. At the same time, the flood of information forces users to spend more time identifying which information is valuable to them.
The common problem of existing information retrieval methods is that their average retrieval accuracy is low. Even the best current retrieval model has an average accuracy of only 30%, so there is still considerable room to improve retrieval accuracy.



Examples


Embodiment Construction

[0054] The core problem to be solved by the present invention is to propose a kernel function that reflects the distribution of user query words and document candidate words and the degree of correlation between the two, and to use this degree of correlation as an additional weight in the pseudo-relevance feedback model, so that query expansion is implemented and retrieval accuracy is improved.
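As a rough illustration of the idea in [0054], the sketch below scores how closely a query word and a candidate word co-occur in one document by summing a kernel over their position pairs. The excerpt does not say which kernel the invention uses, so the Gaussian kernel, the `sigma` parameter, and the function names `kernel_proximity` and `positions` are assumptions made only for this example.

```python
import math


def positions(tokens, term):
    """Return every index at which `term` occurs in the token list."""
    return [i for i, tok in enumerate(tokens) if tok == term]


def kernel_proximity(query_positions, candidate_positions, sigma=5.0):
    """Sum a Gaussian kernel over all (query, candidate) position pairs.

    Assumption: a Gaussian kernel is used here purely for illustration.
    Pairs that sit close together contribute values near 1, distant
    pairs contribute values near 0.
    """
    score = 0.0
    for pq in query_positions:
        for pc in candidate_positions:
            distance = pq - pc
            score += math.exp(-(distance ** 2) / (2 * sigma ** 2))
    return score


# Hypothetical one-document example.
doc = ("kernel methods improve retrieval when query expansion "
       "uses feedback documents").split()

# "expansion" is adjacent to "query"; "kernel" is five tokens away.
near = kernel_proximity(positions(doc, "query"), positions(doc, "expansion"))
far = kernel_proximity(positions(doc, "query"), positions(doc, "kernel"))
print(near > far)
```

Under these assumptions, a candidate word that tends to appear near the query words receives a larger score, which is the kind of additional weight the paragraph describes feeding into the feedback model.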

[0055] The information retrieval method of the present invention, which integrates kernel-function-based word correlation into the pseudo-relevance feedback model, is described in detail below in conjunction with the accompanying drawings and embodiments.

[0056] To address the unreasonable assumption of term independence in classical methods, the present invention takes the correlation between words into consideration. Through the effective use of some statistical information of the data in the document collection (such as context information and o...


Abstract

The invention provides an information retrieval method based on a pseudo-relevance feedback model. The method integrates word correlation into the pseudo-relevance feedback model to perform information retrieval. When query expansion words are generated from the pseudo-relevant document set, two sets are produced separately: one that uses the importance of each candidate expansion word as its feature, and one that uses the relevance between each candidate expansion word and the query words as its feature. The two sets of expansion words are then combined with the original query to complete the final retrieval. When the relevance-based expansion words are generated, a kernel function is used to calculate the relevance between a query word and a candidate word that appear at different positions in a document. The method highlights the distribution of query words and candidate words and selects the candidate words that are most strongly correlated with the query topic words. The additional relevance information therefore helps locate accurate candidate words and improves the precision of the expanded query and of the final retrieval.
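The combination step in the abstract can be sketched as follows. The abstract does not give the weighting formula, so the linear interpolation, the parameters `alpha`, `beta`, and `top_k`, and the function name `expand_query` are all assumptions introduced for illustration; the importance and relevance scores are taken as given inputs.

```python
def expand_query(query_weights, importance, relevance,
                 alpha=0.5, beta=0.5, top_k=3):
    """Merge importance-based and relevance-based expansion terms
    into the original query (illustrative interpolation only).

    query_weights: original query term -> weight
    importance:    candidate term -> importance score in feedback docs
    relevance:     candidate term -> relevance score to the query words
    alpha:         share of the final weight given to expansion terms
    beta:          share of the candidate score taken from relevance
    """
    def normalize(scores):
        total = sum(scores.values())
        if total <= 0:
            return {}
        return {term: s / total for term, s in scores.items()}

    imp = normalize(importance)
    rel = normalize(relevance)

    # Score every candidate on both features, then keep the best top_k.
    combined = {
        term: (1 - beta) * imp.get(term, 0.0) + beta * rel.get(term, 0.0)
        for term in set(imp) | set(rel)
    }
    top = sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

    # Down-weight the original query and add the expansion terms.
    expanded = {t: (1 - alpha) * w for t, w in query_weights.items()}
    for term, score in top:
        expanded[term] = expanded.get(term, 0.0) + alpha * score
    return expanded


# Hypothetical scores for three candidate words.
expanded = expand_query(
    {"retrieval": 1.0},
    importance={"search": 3, "index": 1},
    relevance={"search": 1, "ranking": 3},
    top_k=2,
)
print(expanded)
```

In this toy run "search" scores well on both features and "ranking" only on relevance, so both enter the expanded query, while "index" is dropped by the `top_k` cut.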

Description

Technical field

[0001] The invention belongs to the technical field of information retrieval, and in particular relates to an information retrieval method and system that integrates kernel-function-based word correlation into a pseudo-relevance feedback model.

Background

[0002] In the information age, browsing and obtaining desired information with the help of search engines is an important part of people's daily lives. However, the extremely rich network resources and the rapid growth of the total amount of information make it difficult for users to obtain and identify important information efficiently and accurately. Information processing technology urgently needs more effective theories and methods to deal with the growing mass of data. As a classic text processing technology, information retrieval meets this need and has quickly become a research hotspot in the field of information processing.

[0003] Information Re...


Application Information

IPC(8): G06F17/30
CPC: G06F16/3326
Inventor: 何婷婷, 潘敏, 简芳洪, 毛智明
Owner: HUAZHONG NORMAL UNIV