Information retrieval model for modeling various related characteristics in ad-hoc retrieval task

An information retrieval technique that models multiple relevance features, applicable to network data retrieval, other database retrieval, unstructured text data retrieval, and similar fields. It addresses the problem that existing models do not consider the relevance features of the query or the interaction information between query features and document features.

Pending Publication Date: 2020-07-24
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

Existing retrieval models either use only a small number of relevance features or consider multiple relevance features only from the perspective of documents. They do not take into account the relevance features of the query or the interaction information between query features and document features.

Examples

Embodiment Construction

[0038] The technical solution of the present invention is described in further detail below with reference to the accompanying drawings, but the scope of protection of the present invention is not limited to the following description. Figure 1 shows the query-document correlation analysis process proposed by this method; Figure 2 shows the Match-Transformer model designed by the present invention; Table 3 shows the final comparison results between different information retrieval models. The specific steps are as follows:

[0039] (1): According to the topics in Web TREC (Robust-04 and ClueWeb-09-CAT-B), find 1000 related documents from the TREC dataset.

[0040] (2): From the corpus obtained in (1), randomly select 80% of the 400 topics as the training set and 20% of the 400 topics as the test set. Preprocess the training set and the test set separately, removing the stop words and punctuation marks from each text.

[0041] (3): For preprocessed queries and documents, construct...
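
Step (2) above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names `split_topics` and `preprocess`, the tiny stop-word list, the fixed random seed, the whitespace tokenization, and the lowercasing are all assumptions made for illustration, since the source specifies only the 80%/20% random split of 400 topics and the removal of stop words and punctuation.

```python
import random
import string

# Hypothetical, deliberately tiny stop-word list; the source does not say which list is used.
STOP_WORDS = frozenset(
    {"a", "an", "the", "of", "in", "on", "and", "or", "to", "is", "are", "for"}
)

# Translation table that maps every ASCII punctuation character to a space.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def split_topics(topic_ids, train_fraction=0.8, seed=0):
    """Randomly split topic ids into (train, test) lists."""
    shuffled = list(topic_ids)
    random.Random(seed).shuffle(shuffled)  # seeded only to make the sketch repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def preprocess(text):
    """Lowercase (an assumption), strip punctuation, and drop stop words."""
    tokens = text.lower().translate(_PUNCT_TABLE).split()
    return [t for t in tokens if t not in STOP_WORDS]


# 400 topics, as stated in step (2); integer ids stand in for real topic numbers.
train_topics, test_topics = split_topics(range(400))
print(len(train_topics), len(test_topics))  # 320 80
print(preprocess("The quality of the model, in practice, matters!"))
# ['quality', 'model', 'practice', 'matters']
```

The same `preprocess` function would be applied to both queries and documents, so that the vector representations built in the next step share one vocabulary.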

Abstract

The invention discloses an information retrieval model for modeling various related features in ad-hoc retrieval tasks, namely the Match-Transformer model. The method comprises the following steps: collecting a corpus according to topics, and dividing the corpus into a training set and a test set; preprocessing the queries and documents in the corpus; constructing vector representations of the queries and the documents using global information and local information; inputting the vector representations of the training-set queries and documents into the Match-Transformer model to calculate document scores, and training a final model; inputting the vector representations of the test-set queries and documents into the Match-Transformer model to calculate the final score of each document; and finally using a Learning-to-Rank model to learn the relative position information between documents, obtaining a more accurate document ranking result. The method overcomes the diversity of user needs caused by overly short queries and the diversity of text interpretation caused by overly long documents, so that the various related features of queries and documents can be better utilized and the neural information retrieval model is relatively robust.

Description

Technical field

[0001] The invention relates to the technical field of text information retrieval, in particular to an information retrieval model for modeling various related features in an ad-hoc retrieval task.

Background technique

[0002] With the continuous development of the Internet and smart technology, information retrieval is no longer limited to personal computer terminal (PC) searches, and users are increasingly relying on mobile devices to search for the information and services they need. The quality of the information retrieval model directly affects the results of information retrieval. Therefore, the information retrieval model not only has important theoretical significance, but also contains huge social value. The present invention mainly studies the ranking of documents under a given query in an ad-hoc task, that is, the correlation analysis between queries and documents.

[0003] Information retrieval model is the main research content of information ...

Application Information

IPC(8): G06F16/953, G06F16/33
CPC: G06F16/953, G06F16/3344
Inventors: 胡泽婷, 张鹏, 蒋永余
Owner: TIANJIN UNIV