A neural network-based cross-modal information retrieval method and device

This information retrieval and neural network technology, applied in the fields of natural language processing and deep learning, addresses problems of manual labeling such as huge resource consumption, bias, and incompleteness.

Active Publication Date: 2021-02-19
中科人工智能创新技术研究院(青岛)有限公司
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

During the research process, the inventor found two main problems in the above scheme: first, manual labeling consumes huge resources, especially in the face of massive data; second, manual labels are usually incomplete and biased.

Examples

Embodiment Construction

[0056] The present disclosure will be further described below in conjunction with the accompanying drawings and embodiments.

[0057] It should be noted that the following detailed description is exemplary and intended to provide further explanation of the present disclosure. Unless defined otherwise, all technical and scientific terms used in this disclosure have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.

[0058] It should be noted that the terminology used here is only for describing specific implementations and is not intended to limit the exemplary implementations according to the present application. As used herein, unless the context clearly dictates otherwise, the singular is intended to include the plural. It should also be understood that when the terms "comprising" and/or "including" are used in this specification, they indicate the presence of features, steps, operations, means, components and/or combi...

Abstract

The disclosure provides a neural network-based cross-modal information retrieval method and device, which maps data of three modalities (text, voice, and image) into text data and measures the similarity between the three modalities through this mapping, thereby completing the cross-modal information retrieval task. The method comprises the following steps: receiving an input voice signal, extracting features of the voice signal, using a convolutional neural network to train on the features of the voice signal and text labels, and recognizing the text information of the voice; receiving an input image, extracting image features, encoding the text description of the image, embedding the image in the text space, pairing the image with its text description, decoding the text description, and generating the text information of the image; training a document topic generation model on existing text data; and using the trained document topic generation model to extract the topics of the text information of the speech and images, calculating the similarity between the text information, and sorting the results by similarity.
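The final step of the abstract (comparing topic representations and sorting by similarity) can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used, so cosine similarity is an assumption here. The topic vectors, the item names, and the function names `cosine_similarity` and `rank_by_topic_similarity` are likewise hypothetical; in the described method the vectors would come from the trained document topic generation model applied to the text recovered from speech and images.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("topic vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_topic_similarity(
    query_topics: Sequence[float],
    candidates: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every candidate against the query and sort by descending similarity."""
    scored = [
        (item_id, cosine_similarity(query_topics, topics))
        for item_id, topics in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored


# Hypothetical topic distributions over three topics (assumed, not from the source).
# Each entry stands for the text obtained from one modality.
candidates = {
    "speech_transcript": [0.7, 0.2, 0.1],
    "image_caption": [0.1, 0.1, 0.8],
    "text_document": [0.6, 0.3, 0.1],
}
query = [0.8, 0.1, 0.1]

ranking = rank_by_topic_similarity(query, candidates)
for item_id, score in ranking:
    print(f"{item_id}: {score:.3f}")
```

Because all three modalities are first mapped into text and then into the same topic space, a single ranking function can serve queries and candidates of any modality, which is the design choice the abstract describes.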

Description

Technical field

[0001] The invention relates to the fields of natural language processing and deep learning, and mainly relates to a neural network-based cross-modal information retrieval method and device.

Background

[0002] Multimodal information exists in all aspects of real life. With the rapid development of the Internet, multimodal information including text, voice, image, and video is growing explosively, and the retrieval of information across different modalities is becoming increasingly important.

[0003] Early research on cross-modal retrieval usually constructed associations between data of different modalities manually. Taking text-based image retrieval as an example, a popular solution in the 1970s was to manually label images with text and then use a text-based database management system to build a text-based image retrieval system. During the research process, the inventor found that there are two main problems in the above scheme: one is that manua...

Application Information

IPC(8): G06F16/55, G06F16/58, G06F16/2458, G06F40/216, G06N3/04, G06N3/08, G10L15/02, G10L15/06, G10L15/08, G10L15/26, G10L25/24, G10L25/30, G10L25/45, G10L25/54
Inventor 王亮, 黄岩, 罗怡文, 王海滨, 纪文峰
Owner 中科人工智能创新技术研究院(青岛)有限公司