A distributed retrieval resource library selection method based on a variational auto-encoder

This autoencoder and resource library technology, applied to distributed retrieval resource library selection based on variational autoencoders, addresses the long computation time of existing methods and achieves high efficiency and fast selection.

Active Publication Date: 2019-06-21
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

When LDA is used to train a model on the large documents of the resource libraries, the computation takes a long time.


Embodiment Construction

[0054] The present invention will be further described in detail below in conjunction with the embodiments and the accompanying drawings, but the embodiments of the present invention are not limited thereto.

[0055] As shown in Figure 1 and Figure 2, the variational autoencoder-based distributed retrieval resource library selection method provided in this embodiment includes the following steps:

[0056] Step 1: Preprocess the text in the sample document set of each resource library obtained by query-based sampling, concatenate it to obtain the "big document" of each resource library, and compute the bag-of-words representation and the one-hot encoding of each resource library's big document. The specific steps are:

[0057] Extract the snippet (short summary) of each sampled document in each resource library and concatenate the snippets to obtain the text of the resource library. A sampled document without a snippet (short summary) is replaced by the text content of ...
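The Step 1 processing described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the whitespace tokenizer, and the sample data are all hypothetical, and the fallback for sampled documents without a snippet is not modeled because the source text is truncated at that point.

```python
from collections import Counter


def tokenize(text):
    # Hypothetical preprocessing: lowercase and split on whitespace.
    return text.lower().split()


def build_big_document(snippets):
    # Concatenate the snippets of one resource library into its "big document".
    # Empty snippets are skipped; the patent's replacement rule is not modeled.
    return " ".join(s for s in snippets if s)


def build_vocabulary(big_documents):
    # Map every distinct token across all big documents to a fixed index.
    tokens = sorted({tok for doc in big_documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(big_document, vocab):
    # Count how often each vocabulary token occurs in the big document.
    counts = Counter(tokenize(big_document))
    vec = [0] * len(vocab)
    for tok, count in counts.items():
        if tok in vocab:
            vec[vocab[tok]] = count
    return vec


def one_hot(token, vocab):
    # Vector with a single 1 at the token's vocabulary index.
    vec = [0] * len(vocab)
    vec[vocab[token]] = 1
    return vec


# Made-up sample data for illustration only.
libraries = {
    "news": ["stock market news", "market update"],
    "photos": ["cat photos"],
}
big_docs = {name: build_big_document(s) for name, s in libraries.items()}
vocab = build_vocabulary(big_docs.values())
bow = {name: bag_of_words(doc, vocab) for name, doc in big_docs.items()}
```

Each resource library ends up with one count vector over a shared vocabulary, which is the kind of input a variational autoencoder encoder can consume.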

Abstract

The invention discloses a distributed retrieval resource library selection method based on a variational auto-encoder. The method comprises: building an encoder and decoder network structure using a deep neural network, learning a hidden representation of each resource library's text, and capturing the deep semantic representation of that text; running inference on the expanded text of the query terms with the model obtained by unsupervised training to obtain a hidden representation of the query; and obtaining the relevance ranking of the resource libraries by computing the similarity between the hidden representation of the query and those of the resource libraries. Because the model is trained without supervision, the hidden representation vectors of resource libraries and texts are obtained automatically, which avoids the hand-designed text features that supervised training methods require. In addition, the network structure of the variational auto-encoder is simple, and variational inference takes less computation time than an LDA topic model that relies on Markov chain Monte Carlo inference. Once model training is complete, resource library selection takes little time and is highly efficient.
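The final ranking step in the abstract can be illustrated with a short Python sketch. It assumes the hidden representations have already been produced by a trained encoder. The abstract does not name the similarity measure, so cosine similarity is used here purely as an assumption, and the function names and example vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    # Cosine of the angle between two vectors; 0.0 if either has zero norm.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_resource_libraries(query_vec, library_vecs):
    # Score every resource library against the query's hidden representation
    # and return (name, score) pairs, most similar first.
    scores = {
        name: cosine_similarity(query_vec, vec)
        for name, vec in library_vecs.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up hidden representations for illustration only.
query_hidden = [1.0, 0.0]
library_hidden = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranking = rank_resource_libraries(query_hidden, library_hidden)
```

Since the library vectors can be computed once after training, only the query has to be encoded at selection time, which is consistent with the low selection cost the abstract claims.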

Description

Technical field

[0001] The present invention relates to the technical field of distributed retrieval, in particular to a method for selecting distributed retrieval resource libraries based on variational autoencoders.

Background art

[0002] As the volume of information keeps growing, people may not be satisfied with a single source when looking for information. For a given query, users may want related photos, videos from video sites, news, Q&A, technical blogs, the latest Weibo posts, and so on. A distributed retrieval system dispatches the query to multiple search engines, merges the results returned by the resource libraries of those search engines, and presents the merged list to the user. This both combines the results of multiple search engines and reduces the user's cost of switching between them.

[0003] Resource library selection is a key problem to be solved in distributed retrieval. The goal is to match the user's query requiremen...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33, G06F16/951, G06F16/953, G06F16/9535, G06F16/332
Inventors: 董守斌, 吴天锋, 袁华, 胡金龙, 张晶
Owner: SOUTH CHINA UNIV OF TECH