Document retrieval method for searching topic type query in TED speech

A document retrieval technology for topic-type queries, applied in the field of information retrieval. It addresses the poor results of traditional retrieval methods, which stem from the lack of semantic connection between queries and documents.

Active Publication Date: 2019-04-16
UNIV OF SCI & TECH BEIJING

AI Technical Summary

Problems solved by technology

[0008] The key technical problem addressed by the present invention is that traditional retrieval methods perform poorly on topic-type queries over TED speeches, because they capture no semantic connection between the query and the document.


Examples


Embodiment Construction

[0055] Specific embodiments of the present invention are described in detail below in conjunction with the drawings. The technical features, or combinations of technical features, described in the following embodiments should not be regarded as isolated; they can be combined with each other to achieve better technical effects.

[0056] To address the problems in speech retrieval, the present invention combines deep learning with information retrieval methods and proposes a document retrieval method for searching topic-type queries in TED speeches. Because a traditional retrieval model can help users quickly filter out irrelevant documents, the present invention first uses a traditional retrieval model, the query likelihood retrieval model, to obtain preliminary retrieval results. Then, to better learn the semantic features of the text, a neural network is introduced to model queries and documents, an...
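The first stage described in [0056] can be illustrated with a minimal query likelihood ranker. This is only a sketch under stated assumptions: the source does not say how the language model is smoothed, so Dirichlet smoothing with parameter `mu` is assumed here, and the names `tokenize` and `ql_rank` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; the source names none)."""
    return text.lower().split()


def ql_rank(query, docs, mu=2000.0, top_k=10):
    """Rank docs by log P(query | doc) with Dirichlet smoothing.

    Dirichlet smoothing and the default mu are assumptions for illustration,
    not details taken from the patent text.
    Returns a list of (doc_index, score), best first.
    """
    doc_tokens = [tokenize(d) for d in docs]
    collection = Counter(t for toks in doc_tokens for t in toks)
    coll_len = sum(collection.values())
    if coll_len == 0:
        return []

    query_terms = tokenize(query)
    scored = []
    for idx, toks in enumerate(doc_tokens):
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            p_coll = collection.get(term, 0) / coll_len
            if p_coll == 0.0:
                # Term never occurs in the collection: skip it for every
                # document, so it cannot change the relative ordering.
                continue
            p_doc = (tf.get(term, 0) + mu * p_coll) / (len(toks) + mu)
            score += math.log(p_doc)
        scored.append((score, idx))

    # Highest score first; ties broken by document index for determinism.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(idx, score) for score, idx in scored[:top_k]]


if __name__ == "__main__":
    talks = [
        "climate change and the ocean",
        "the future of education",
        "ocean life and climate science",
    ]
    for doc_id, score in ql_rank("climate ocean", talks, top_k=3):
        print(doc_id, round(score, 4))
```

The preliminary list returned by such a ranker is what the neural model would then reorder.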


Abstract

The invention relates to the technical field of information retrieval and provides a semantic document retrieval method for searching topic-type queries in TED speeches. The method comprises: training a neural network model on existing queries and documents to learn the model parameters; when a user inputs a query, using a query likelihood retrieval model to obtain a preliminary retrieval result; and inputting the preliminary retrieval result into the neural network model, with its parameters fixed, for reordering to determine the final retrieval result. The method addresses the problem that traditional retrieval methods perform poorly on topic-type queries because they lack a semantic connection between queries and documents: a neural network models the topic-type query and the speech document separately and produces a semantic-level correlation between them. In the neural network part, a recurrent neural network and a convolutional neural network are connected in series, and LSTM modules are adopted to mitigate the vanishing gradient problem.
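The retrieve-then-reorder flow in the abstract can be sketched as follows. This is a hedged illustration, not the patented implementation: `rerank`, `keyword_first_stage`, `toy_semantic_score`, and the `TOY_RELATED` table are all hypothetical names, and the toy scorer merely stands in for the trained LSTM and CNN network, whose architecture details are not given in this excerpt.

```python
def keyword_first_stage(query, docs):
    """Stand-in first stage: doc indices sharing at least one query term.

    A placeholder for the query likelihood model; ordered by overlap count.
    """
    q_terms = set(query.lower().split())
    hits = []
    for i, doc in enumerate(docs):
        overlap = len(q_terms & set(doc.lower().split()))
        if overlap:
            hits.append((overlap, i))
    hits.sort(key=lambda h: (-h[0], h[1]))
    return [i for _, i in hits]


# Hypothetical related-term table that mimics "semantic" matching.
TOY_RELATED = {"climate": {"warming", "emissions"}}


def toy_semantic_score(query, doc):
    """Placeholder for the trained neural scorer with fixed parameters."""
    doc_terms = set(doc.lower().split())
    score = 0.0
    for term in query.lower().split():
        if term in doc_terms:
            score += 1.0
        score += 0.5 * len(TOY_RELATED.get(term, set()) & doc_terms)
    return score


def rerank(query, docs, first_stage, semantic_score, k=100):
    """Two-stage retrieval: take the top-k candidates, then reorder them."""
    candidates = first_stage(query, docs)[:k]
    rescored = [(i, semantic_score(query, docs[i])) for i in candidates]
    # Highest semantic score first; ties broken by document index.
    rescored.sort(key=lambda pair: (-pair[1], pair[0]))
    return rescored


if __name__ == "__main__":
    talks = [
        "climate policy talk",
        "climate warming and emissions explained",
        "a talk about music",
    ]
    print(rerank("climate", talks, keyword_first_stage, toy_semantic_score))
```

Passing the two stages as callables keeps the pipeline independent of any particular first-stage model or scorer, so either one can be swapped out without touching `rerank`.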

Description

Technical field

[0001] The invention relates to the technical field of information retrieval, in particular to a document retrieval method for searching topic-type queries in TED speeches.

Background technique

[0002] TED (Technology, Entertainment, Design) is currently the most successful speech viewing platform. Search is one of the main ways the platform makes speeches available to users. Specifically, users can enter keywords in the search box, or click on keywords the platform provides for topics, speakers, and subtitle languages, to find speeches they are interested in.

[0003] The retrieval method of the TED platform is based on the Boolean model. Other traditional retrieval models, such as Query Likelihood (QL), the Sequential Dependence Model (SDM), and BM25, are all optimized on the basis of this model. The Boolean model (Cooper W S. Getting beyond Boole[J]. Information Processing & Management, 1988, 24(3): 243-248.) is the simplest and commonly used retrieva...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F16/332
Inventor: 殷绪成, 方帆, 张博文
Owner: UNIV OF SCI & TECH BEIJING