End-to-end training method and application of image feature module in visual question-answering system

An image-feature and question-answering-system technology, applied in the field of model training, that addresses problems such as implementation difficulty and model complexity.

Pending Publication Date: 2020-10-23
TONGJI UNIV

AI Technical Summary

Problems solved by technology

Existing methods usually only use more powerful image feature extraction models, such a



Examples


Embodiment Construction

[0027] The present invention is described in detail below with reference to a specific embodiment. This embodiment is implemented on the basis of the technical solution of the present invention, and a detailed implementation and a specific operating procedure are given, but the protection scope of the present invention is not limited to the following embodiment.

[0028] This embodiment provides an end-to-end training method for the image feature module in a visual question answering system, which helps the image feature module converge further on data from the actual application environment. The visual question answering system is implemented on the basis of a visual question answering model, which comprises an image feature module, a sequential neural network, a fusion reasoning module and an answer generation module. The training method is carried out through the following steps.
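The four-module structure named in [0028] can be pictured as a simple pipeline. The sketch below is illustrative only: the class name `VQAModel`, the `toy_*` stand-in functions, and the element-wise product used for fusion are all assumptions made for this example, not details taken from the patent text. In a real system each module would be a neural network.

```python
from dataclasses import dataclass
from typing import Callable, List

Vector = List[float]


@dataclass
class VQAModel:
    """Hypothetical container for the four modules named in [0028]."""
    image_feature_module: Callable[[Vector], Vector]   # image -> image features
    sequence_network: Callable[[List[int]], Vector]    # question tokens -> text features
    fusion_module: Callable[[Vector, Vector], Vector]  # (image, text) -> fused features
    answer_module: Callable[[Vector], int]             # fused features -> answer index

    def answer(self, image: Vector, question: List[int]) -> int:
        img_feat = self.image_feature_module(image)
        txt_feat = self.sequence_network(question)
        fused = self.fusion_module(img_feat, txt_feat)
        return self.answer_module(fused)


# Toy stand-ins (assumptions) so the sketch runs end to end.
def toy_image_features(image: Vector) -> Vector:
    return [2.0 * p for p in image]


def toy_sequence_network(tokens: List[int]) -> Vector:
    return [float(sum(tokens)), 1.0]


def toy_fusion(img: Vector, txt: Vector) -> Vector:
    # Element-wise product: one possible fusion choice, assumed here.
    return [a * b for a, b in zip(img, txt)]


def toy_answer(fused: Vector) -> int:
    # The index of the largest fused score is taken as the predicted answer.
    return max(range(len(fused)), key=lambda i: fused[i])


model = VQAModel(toy_image_features, toy_sequence_network, toy_fusion, toy_answer)
predicted = model.answer(image=[0.5, 3.0], question=[1, 2])
```

Keeping the image feature module behind its own interface is what makes it possible to adjust its parameters separately from the rest of the model.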

[0029] (1) Model initialization step.

[0030] Obtain th...



Abstract

The invention relates to an end-to-end training method and application of an image feature module in a visual question-answering system. The training method comprises the steps of: obtaining initial model parameters of a visual question-answering model; obtaining a training image and a corresponding training text sequence; performing image feature extraction on the training image and text feature extraction on the training text sequence; performing feature fusion on the image features and the text features to generate fusion features, and generating an output answer based on the fusion features; calculating an answer error based on the output answer and an initial answer of the training image; and, without changing the optimization methods of the other parts of the visual question-answering model, adjusting the parameters of the image feature module through a first-order optimization method based on the answer error. Compared with the prior art, the invention has advantages such as a marked effect and simple implementation.
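The final step of the abstract, adjusting only the image feature module with a first-order method driven by the answer error, can be illustrated with a scalar toy model. This is a minimal sketch under stated assumptions: each module is reduced to a single weight (`w_img`, `w_txt`, `w_ans`, all hypothetical names), fusion is a product, the answer error is a squared error, and plain gradient descent stands in for "a first-order optimization method", which the source does not specify further. The other weights are simply left untouched here; in practice they would keep whatever optimizer they already use.

```python
from typing import Dict


def forward(params: Dict[str, float], image: float, question: float) -> float:
    img_feat = params["w_img"] * image      # image feature module
    txt_feat = params["w_txt"] * question   # sequential network
    fused = img_feat * txt_feat             # fusion (assumed: product)
    return params["w_ans"] * fused          # answer score


def answer_error(params: Dict[str, float], image: float,
                 question: float, target: float) -> float:
    # Squared error between the output answer score and the target (assumed loss).
    return 0.5 * (forward(params, image, question) - target) ** 2


def image_module_step(params: Dict[str, float], image: float, question: float,
                      target: float, lr: float = 0.01) -> Dict[str, float]:
    """One gradient-descent step on w_img only; other weights are not touched."""
    residual = forward(params, image, question) - target
    # Chain rule back through the answer and fusion modules to w_img.
    grad_w_img = residual * params["w_ans"] * params["w_txt"] * question * image
    updated = dict(params)
    updated["w_img"] = params["w_img"] - lr * grad_w_img
    return updated


params = {"w_img": 0.5, "w_txt": 1.0, "w_ans": 1.0}
image, question, target = 2.0, 1.0, 3.0

error_before = answer_error(params, image, question, target)
for _ in range(50):
    params = image_module_step(params, image, question, target)
error_after = answer_error(params, image, question, target)
```

Because the gradient reaches `w_img` through the answer and fusion stages, the image feature weight is trained end to end on the answer error rather than on a separate image-only objective.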

Description

Technical field

[0001] The invention relates to a model training method, in particular to an end-to-end training method and application of an image feature module in a visual question answering system.

Background

[0002] With the spread of mobile devices and growing user demand, the volume of visual data has grown explosively, and demand for visual question answering systems that can answer users' questions continues to rise. A visual question answering system aims to interpret visual information according to a user's description of what they need, which involves question understanding, object retrieval, localization and reasoning. Compared with other cross-modal tasks such as visual description, progress on visual question answering is still limited by the contradiction between an infinite search space and incomplete training data, unclear data feature extraction, contradiction and reasoning...


Application Information

IPC(8): G06K9/62, G06N3/04, G06F16/332
CPC: G06F16/3329, G06N3/045, G06F18/253, G06F18/214
Inventor: 王瀚漓 (Wang Hanli), 龙宇 (Long Yu)
Owner: TONGJI UNIV