Visual question-answering model training method and device

A visual question-answering model training technique, applied in the computer field, that addresses problems such as limited data sets, failure to consider spatial semantic context information between image regions, and overly simple extraction and processing of image features and question features.

Active Publication Date: 2019-10-18

BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD +1



Problems solved by technology

[0004] However, existing visual question answering model training methods extract and process image features and question features too simplistically and do not consider the spatial semantic context information between image regions. In addition, the current VQA question data sets are limited, so the model generally ends up in an overfitted state, which reduces the semantic context similarity between the obtained answer and the real answer.



Embodiment Construction

[0092] In the following description, numerous specific details are set forth in order to provide a thorough understanding of the application. However, the present application can be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from the spirit of the present application. Therefore, the present application is not limited by the specific implementations disclosed below.

[0093] Terms used in one or more embodiments of this specification are for the purpose of describing specific embodiments only, and are not intended to limit one or more embodiments of this specification. As used in one or more embodiments of this specification and the appended claims, the singular forms "a", "an", and "the" are also intended to include the plural forms unless the context clearly dictates otherwise. It should also be understood that the term "and/or" used in one or more embodiments of the present sp...


Abstract

The invention provides a visual question-answering model training method and device, and relates to the technical field of computers. The visual question-answering model training method comprises the steps of: obtaining a training sample and a sample label; extracting sample image feature information and sample question feature information; performing feature cross processing on the sample image feature information and the sample question feature information to obtain a sample image feature vector carrying the sample question information and a sample question feature vector carrying the sample image information; inputting the sample image feature vector carrying the sample question information and the sample question feature vector carrying the sample image information into the visual question-answering model to obtain a predicted answer through the visual question-answering model; determining a loss value of a loss function based on the real answer and the predicted answer; and updating the visual question-answering model through the loss value of the loss function.
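The steps listed in the abstract can be sketched as one training step in plain Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the text available here does not say how "feature cross processing" is computed, so the example assumes a simple attention-style crossing, and it stands in for the visual question-answering model with a linear softmax classifier trained with cross-entropy loss. All names (`attend`, `feature_cross`, `train_step`) and all sizes are hypothetical.

```python
import math
import random

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(queries, keys):
    """For each query vector, return a similarity-weighted average of the keys."""
    summaries = []
    for q in queries:
        weights = softmax([dot(q, k) for k in keys])
        summaries.append([sum(w * k[d] for w, k in zip(weights, keys))
                          for d in range(len(q))])
    return summaries

def mean_pool(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def feature_cross(image_feats, question_feats):
    # ASSUMPTION: crossing = add to each side an attention summary of the other side.
    img = [[x + y for x, y in zip(r, s)]
           for r, s in zip(image_feats, attend(image_feats, question_feats))]
    qst = [[x + y for x, y in zip(w, s)]
           for w, s in zip(question_feats, attend(question_feats, image_feats))]
    # Image vector carrying question information, question vector carrying image information.
    return mean_pool(img), mean_pool(qst)

def train_step(weights, image_feats, question_feats, label, lr=0.1):
    img_vec, q_vec = feature_cross(image_feats, question_feats)
    x = img_vec + q_vec  # list concatenation: joint input to the stand-in model
    probs = softmax([dot(w, x) for w in weights])  # predicted answer distribution
    loss = -math.log(probs[label])  # cross-entropy against the real answer
    for k, w in enumerate(weights):  # gradient step on the classifier weights only
        err = probs[k] - (1.0 if k == label else 0.0)
        for d in range(len(w)):
            w[d] -= lr * err * x[d]
    return loss

random.seed(0)
DIM, REGIONS, WORDS, ANSWERS = 4, 3, 5, 3  # hypothetical toy sizes
image_feats = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(REGIONS)]
question_feats = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(WORDS)]
weights = [[0.0] * (2 * DIM) for _ in range(ANSWERS)]
losses = [train_step(weights, image_feats, question_feats, label=1) for _ in range(20)]
print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

With all-zero initial weights the first loss equals ln 3, because the prediction is uniform over the three candidate answers, and repeated steps on the same sample lower it. In a real system the random feature lists would be replaced by the outputs of image and question feature extractors, and the update would reach every trainable part of the model rather than a single weight matrix.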

Description

Technical field

[0001] The present application relates to the field of computer technology, in particular to a visual question answering model training method and device, a computing device and a computer-readable storage medium.

Background

[0002] Visual Question Answering (VQA) is a comprehensive task involving computer vision and natural language processing. A VQA system takes a picture and a free-form, open-ended natural language question about the picture as input, and generates a natural language answer as output.

[0003] At present, existing visual question answering model training methods generally first extract features of the image to be answered through a pre-trained deep convolutional neural network (CNN) model, convert the question into several word vectors, and then input the image features together with the question words converted into word vectors into a long short-term memory (LSTM) network, which is used to generate the answer...


Application Information


IPC(8): G06K9/62

CPC: G06F18/243, G06F18/253

Inventor: 李长亮, 詹华年, 丁洪利, 唐剑波

Owner: BEIJING KINGSOFT DIGITAL ENTERTAINMENT CO LTD
