An intelligent visual question answering method based on a deep neural network

The invention applies deep neural networks to intelligent visual question answering and addresses two problems: the lack of training data and the inability to explain why an answer was given.

Active Publication Date: 2020-12-08
厦门大学资产经营有限公司 +1
Cites: 4 | Cited by: 0

AI Technical Summary

Problems solved by technology

[0014] The purpose of the present invention is to provide an intelligent visual question answering method based on a deep neural network. A new deep learning network, designed within a multi-task learning framework, addresses two major problems in intelligent visual question answering: the lack of training data and the inability to explain the reasons for an answer.


Examples

Embodiment Construction

[0069] Embodiments of the present invention include the following steps:

[0070] 1. Intelligent Q&A data preprocessing

[0071] 1.1 Resize all images to a resolution of 448×448.

[0072] 1.2 Remove stop words from the text content of all training data, and convert all English words to lowercase. Then segment the text into words, select the 8,000 most frequent words as the answer dictionary, and select the 20,000 most frequent words as the image description dictionary.
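The text preprocessing in step 1.2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the stop-word list, the regular-expression tokenizer, and the names `tokenize` and `build_dictionary` are assumptions, since the source does not specify them. Only the dictionary sizes (8,000 and 20,000) come from the text.

```python
import re
from collections import Counter

ANSWER_DICT_SIZE = 8000     # from step 1.2
CAPTION_DICT_SIZE = 20000   # from step 1.2

# Hypothetical, deliberately tiny stop-word list; the source does not name one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "and"}


def tokenize(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [tok for tok in tokens if tok not in STOP_WORDS]


def build_dictionary(texts, max_size):
    """Map the `max_size` most frequent tokens to integer indices."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    return {word: idx for idx, (word, _) in enumerate(counts.most_common(max_size))}


if __name__ == "__main__":
    answers = ["The cat is on the mat", "A cat and a dog", "the dog and the cat"]
    answer_dict = build_dictionary(answers, ANSWER_DICT_SIZE)
    print(answer_dict)
```

The same function would be called with `CAPTION_DICT_SIZE` on the image description texts. Lowercasing before stop-word removal gives the same result as the order stated in the text, provided the stop-word list itself is lowercase.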

[0073] 2. Deep convolutional feature extraction for images

[0074] Use a deep residual convolutional network to extract convolutional features from each image, obtaining a feature map $F_I \in \mathbb{R}^{14 \times 14 \times 2048}$. Here 14×14 is the grid of image feature regions, and 2048 is the feature dimension of each region.
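The 14×14×2048 shape is consistent with a 448×448 input passed through a residual network with an overall stride of 32. The sketch below checks that arithmetic. It assumes the common bottleneck ResNet layout (a 7×7 stride-2 stem convolution, a 3×3 stride-2 max pool, then four stages with strides 1, 2, 2, 2); the source says only "residual deep convolutional network", so the exact variant is an assumption, and `conv_out` and `resnet_feature_shape` are illustrative names.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet_feature_shape(image_size=448, channels=2048):
    """Return (height, width, channels) of the last-stage feature map.

    Assumes the standard ResNet downsampling schedule; only the spatial
    size is tracked, and `channels` is the final-stage width stated in
    the source.
    """
    size = conv_out(image_size, kernel=7, stride=2, padding=3)  # stem convolution
    size = conv_out(size, kernel=3, stride=2, padding=1)        # max pooling
    for stride in (1, 2, 2, 2):                                 # four residual stages
        size = conv_out(size, kernel=3, stride=stride, padding=1)
    return (size, size, channels)


if __name__ == "__main__":
    height, width, dim = resnet_feature_shape()
    print(height, width, dim)   # 14 14 2048
    print(height * width)       # 196 image regions
```

Under these assumptions each of the 196 grid cells corresponds to one image region with a 2048-dimensional feature vector.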

[0075] 3. Deep feature extraction for text questions

[0076] Use a bidirectional recurrent neural network to extract the question features, and the processing ...
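As a rough illustration of bidirectional feature extraction in general, the sketch below runs a plain tanh recurrent cell over a question's word vectors in both directions and concatenates the two hidden states at each position. The cell type, the random weights, and the names `rnn_pass` and `bidirectional_features` are assumptions made for illustration; the source text is truncated here and does not state which recurrent unit or dimensions it uses.

```python
import math
import random


def rnn_pass(inputs, w_in, w_rec, hidden_size):
    """Run a plain tanh RNN over `inputs`; return the hidden state at every step."""
    h = [0.0] * hidden_size
    states = []
    for x in inputs:
        # New state depends on the current input and the previous state.
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(hidden_size))
            )
            for j in range(hidden_size)
        ]
        states.append(h)
    return states


def random_matrix(rows, cols, rng):
    """Small random weights, standing in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def bidirectional_features(inputs, hidden_size, seed=0):
    """Concatenate forward and backward hidden states for each word position."""
    rng = random.Random(seed)
    dim = len(inputs[0])
    fwd = rnn_pass(inputs, random_matrix(hidden_size, dim, rng),
                   random_matrix(hidden_size, hidden_size, rng), hidden_size)
    bwd = rnn_pass(inputs[::-1], random_matrix(hidden_size, dim, rng),
                   random_matrix(hidden_size, hidden_size, rng), hidden_size)
    bwd.reverse()  # realign backward states with the original word order
    return [f + b for f, b in zip(fwd, bwd)]


if __name__ == "__main__":
    rng = random.Random(1)
    question = [[rng.random() for _ in range(4)] for _ in range(5)]  # 5 words, 4-dim vectors
    features = bidirectional_features(question, hidden_size=3)
    print(len(features), len(features[0]))  # 5 6
```

Each position ends up with a vector of twice the hidden size, combining context from the words before it and the words after it.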

Abstract

An intelligent visual question answering method based on a deep neural network, relating to intelligent visual question answering in the field of artificial intelligence. The method includes the following steps: intelligent question answering data preprocessing; deep convolutional feature extraction for images; deep feature extraction for text questions; intelligent visual question processing; and visual intelligent question answering based on hub channels. A new deep learning network is designed within a multi-task learning framework to address two major problems in intelligent visual question answering: the lack of training data and the inability to explain the reasons for an answer. The network can explain the reasons for its answers while performing intelligent visual question answering. The network structure includes a visual description module, which describes the content of the image according to the content of the question. The network adopts a hub structure design, which allows data from image description, text question answering and other fields to be introduced into the visual intelligent question answering task.

Description

Technical field

[0001] The invention relates to intelligent visual question answering in the field of artificial intelligence, and in particular to an intelligent visual question answering method based on a deep neural network.

Background

[0002] Visual Question Answering (VQA) is an ultimate machine intelligence task proposed in computer science in recent years. The task is to answer natural language questions posed by humans based on the content of a given image. It was first proposed in 2010 by Bigham et al. of Carnegie Mellon University at "User Interface Software and Technology" [1]. In 2015, Stanislaw Antol of Virginia Tech and others released the first large-scale data set for visual question answering at the International Conference on Computer Vision (ICCV). The data set was produced manually on Amazon's online platform, and its content reflects natural human question-and-answer habits [2]. With the rel...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/332; G06N3/04; G06N3/08
CPC: G06F16/3329; G06N3/08; G06N3/045
Inventor: 纪荣嵘, 周奕毅
Owner: 厦门大学资产经营有限公司