A Distributed Retrieval Repository Selection Method Based on Variational Autoencoder

An autoencoder and resource library technology, applied to distributed retrieval resource library selection based on a variational autoencoder. It addresses the problem of time-consuming computation and achieves efficient, fast resource library selection.

Active Publication Date: 2021-02-12
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

Training an LDA model on the large documents of the resource libraries is computationally time-consuming.

Examples

Embodiment Construction

[0054] The present invention will be further described in detail below in conjunction with the embodiments and the accompanying drawings, but the embodiments of the present invention are not limited thereto.

[0055] As shown in Figures 1 and 2, the variational autoencoder-based distributed retrieval resource library selection method provided in this embodiment includes the following steps:

[0056] Step 1: Preprocess the text in the sample document set of each resource library obtained by query-based sampling, concatenate it to obtain the "big document" of each resource library, and compute the bag-of-words representation and the one-hot encoding of each resource library's big document. The specific steps are:

[0057] Extract the snippet (short summary) of each sampled document in a resource library and concatenate the snippets to obtain the text of that resource library. A sampled document without a snippet is replaced by the text content ...
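Step 1 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the tokenizer, the sample data, and all function names (`build_big_document`, `bag_of_words`, `one_hot_sequence`) are hypothetical, and reading "one-hot encoding" as one indicator row per token is an assumption. The sketch concatenates snippets into a big document, falls back to a document's text when no snippet exists, and builds both representations over a shared vocabulary.

```python
import re
from collections import Counter


def build_big_document(sampled_docs):
    """Concatenate snippets; use the document text when a snippet is missing."""
    parts = []
    for doc in sampled_docs:
        snippet = doc.get("snippet")
        parts.append(snippet if snippet else doc.get("text", ""))
    return " ".join(parts)


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens; real preprocessing may differ.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(big_documents):
    """Map every token seen in any big document to a column index."""
    tokens = sorted({tok for doc in big_documents.values() for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(text, vocab):
    """Term-count vector over the shared vocabulary."""
    vec = [0] * len(vocab)
    for tok, count in Counter(tokenize(text)).items():
        if tok in vocab:
            vec[vocab[tok]] = count
    return vec


def one_hot_sequence(text, vocab):
    """One row per in-vocabulary token; each row has a single 1."""
    rows = []
    for tok in tokenize(text):
        if tok in vocab:
            row = [0] * len(vocab)
            row[vocab[tok]] = 1
            rows.append(row)
    return rows


# Hypothetical sampled documents for two resource libraries.
samples = {
    "news": [{"snippet": "Election results announced"},
             {"text": "Markets react to election"}],  # no snippet: text is used
    "video": [{"snippet": "Cat video compilation"}],
}
big_docs = {name: build_big_document(docs) for name, docs in samples.items()}
vocab = build_vocabulary(big_docs)
bow = {name: bag_of_words(doc, vocab) for name, doc in big_docs.items()}
```

Sharing one vocabulary across all resource libraries keeps the bag-of-words vectors the same length, so they can be fed to a single model.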

Abstract

The invention discloses a method for selecting distributed retrieval resource libraries based on a variational autoencoder. The method uses deep neural networks to construct an encoder and a decoder, and learns a latent representation of each resource library's text that captures its deep semantics. The model, trained without supervision, performs inference on the expanded text of a query to obtain the query's latent representation. A relevance ranking of the resource libraries is then obtained by computing the similarity between the latent representation of the query and those of the resource libraries. Because training is unsupervised, the latent representation vectors of the resource libraries and of the text are obtained automatically, which avoids the hand-designed text features required by supervised training methods. In addition, the network structure of the variational autoencoder is simple, and variational inference takes less computation time than an LDA topic model based on Markov chain Monte Carlo inference. Once the model is trained, resource library selection takes little time and is highly efficient.
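The ranking step described in the abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented procedure: the abstract does not name the similarity measure, so cosine similarity is assumed here, and the latent vectors, library names, and function names (`cosine_similarity`, `rank_repositories`) are hypothetical stand-ins for the encoder's output.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_repositories(query_vec, repo_vecs):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in repo_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical latent vectors; in the method these would come from the
# trained encoder applied to each big document and to the expanded query.
repo_latents = {
    "news": [0.9, 0.1, 0.0],
    "video": [0.0, 0.2, 0.9],
    "qa": [0.5, 0.5, 0.1],
}
query_latent = [0.8, 0.2, 0.0]

ranking = rank_repositories(query_latent, repo_latents)
```

Because the latent vectors of the resource libraries can be computed once after training, answering a query in this sketch only requires encoding the query and one similarity computation per resource library.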

Description

Technical field

[0001] The present invention relates to the technical field of distributed retrieval, in particular to a method for selecting distributed retrieval resource libraries based on a variational autoencoder.

Background

[0002] As information continues to grow, a single source of information may no longer satisfy users. When issuing a query, a user may want related photos, videos from video sites, news, Q&A, technical blogs, the latest Weibo posts, and so on. A distributed retrieval system distributes a query to multiple search engines and presents the merged results from the resource libraries of those search engines to the user, which both combines the results of multiple search engines and reduces the user's cost of switching between them.

[0003] Resource library selection is a key problem to be solved in distributed retrieval. The goal is to match the user's query requiremen...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/33, G06F16/951, G06F16/953, G06F16/9535, G06F16/332
Inventor: 董守斌, 吴天锋, 袁华, 胡金龙, 张晶
Owner: SOUTH CHINA UNIV OF TECH