Visual question-answering method and system based on semantic alignment and storage medium

A visual question answering method and system based on semantic alignment, and a storage medium, in the technical field of visual question answering. The invention addresses the incomplete information used during feature fusion and the inaccurate answers that result. It highlights the importance of image information, enriches the information involved in fusion, and yields more accurate answers.

Pending Publication Date: 2020-11-17
HEFEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0006] To address the deficiencies of the prior art, the present invention provides a visual question answering method, system and storage medium based on semantic alignment. It solves the technical problem that existing visual question answering techniques involve only the original image features and the question features in the feature fusion process, so the information is incomplete and the final generated answers are inaccurate.



Embodiment

[0053] As shown in Figure 1, an embodiment of the present invention provides a visual question answering method based on semantic alignment, including:

[0054] Acquiring and preprocessing a data set to obtain preprocessed original images and the question-and-answer information corresponding to each original image, the question-and-answer information including questions and answers;

[0055] Extracting original image features and target location features from the preprocessed original image, generating image description sentences from the target location features, and obtaining image description words, question features and image description sentence features from the questions and the image description sentences;

[0056] Semantically aligning the original image features with the image description words to obtain a first image feature; according to the original image features and the image description sentence features, obtaining a second image feat...
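The alignment step in [0056] can be pictured with a small sketch. The excerpt does not say how the alignment is computed, so the dot-product attention, the element-wise sum and the mean pooling used here are all assumptions, and the names `align_regions_to_words` and `first_image_feature` are hypothetical. The toy vectors stand in for real region and word embeddings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def align_regions_to_words(region_feats, word_feats):
    """For each image region, return an attention-weighted sum of word vectors.

    Assumption: similarity is a plain dot product; the patent excerpt
    does not specify the scoring function.
    """
    dim = len(word_feats[0])
    aligned = []
    for region in region_feats:
        weights = softmax([dot(region, word) for word in word_feats])
        context = [
            sum(wt * word[k] for wt, word in zip(weights, word_feats))
            for k in range(dim)
        ]
        aligned.append(context)
    return aligned


def first_image_feature(region_feats, word_feats):
    """Combine each region with its word context, then mean-pool over regions.

    Assumption: element-wise sum followed by mean pooling.
    """
    aligned = align_regions_to_words(region_feats, word_feats)
    combined = [
        [r + c for r, c in zip(region, context)]
        for region, context in zip(region_feats, aligned)
    ]
    count = len(combined)
    dim = len(combined[0])
    return [sum(vec[k] for vec in combined) / count for k in range(dim)]


# Toy inputs: two image regions and three description words, all 2-dimensional.
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
feature = first_image_feature(regions, words)
print(feature)
```

In a real system the region and word vectors would share a learned embedding space, and the pooling would usually be learned rather than a fixed mean.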


Abstract

The invention provides a visual question answering method and system based on semantic alignment and a storage medium, and relates to the technical field of visual question answering. According to the embodiment of the invention, the method comprises the following steps: first, acquiring and preprocessing a data set; extracting original image features and target position features from an original image; generating an image description sentence from the target position features; obtaining image description words, question features and image description sentence features; semantically aligning the original image features with the image description words to obtain a first image feature; obtaining a second image feature from the original image features and the image description sentence features; obtaining a third image feature from the original image features and the question features; fusing the three image features, the image description sentence features and the question features to obtain a comprehensive feature; and predicting the final answer. As a result, the importance of the image information is highlighted, the information involved in the feature fusion process is more complete, and the final generated answer is more accurate.
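The fusion and prediction steps described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the attention form, the fusion operator or the classifier, so guided dot-product attention, plain concatenation and a single linear layer with softmax are stand-ins, and `attend`, `fuse`, `predict_answer` and the toy weights are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(region_feats, guide):
    """Pool image regions, weighting each by its similarity to a guide vector."""
    weights = softmax([dot(region, guide) for region in region_feats])
    dim = len(region_feats[0])
    return [
        sum(w * region[k] for w, region in zip(weights, region_feats))
        for k in range(dim)
    ]


def fuse(first_img, second_img, third_img, sentence_feat, question_feat):
    """Assumption: fusion is simple concatenation of the five vectors."""
    return first_img + second_img + third_img + sentence_feat + question_feat


def predict_answer(fused, weight_rows, answers):
    """Assumption: a linear layer plus softmax over candidate answers."""
    probs = softmax([dot(row, fused) for row in weight_rows])
    best = max(range(len(answers)), key=lambda i: probs[i])
    return answers[best], probs


# Toy 2-dimensional inputs standing in for real embeddings.
regions = [[1.0, 0.0], [0.0, 1.0]]
word_guide = [1.0, 0.0]   # e.g. a pooled description-word vector
sentence = [0.0, 1.0]     # image description sentence feature
question = [1.0, 1.0]     # question feature

first = attend(regions, word_guide)   # word-guided image feature
second = attend(regions, sentence)    # sentence-guided image feature
third = attend(regions, question)     # question-guided image feature

fused = fuse(first, second, third, sentence, question)

# Hypothetical classifier weights for two candidate answers.
candidate_answers = ["yes", "no"]
weights = [[1.0] * len(fused), [0.0] * len(fused)]
answer, probs = predict_answer(fused, weights, candidate_answers)
print(answer, probs)
```

Concatenation keeps each of the five inputs visible to the classifier; a trained model might instead use bilinear pooling or gated fusion, which the abstract neither confirms nor rules out.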

Description

Technical field

[0001] The present invention relates to the technical field of visual question answering, in particular to a method and system for visual question answering based on semantic alignment, and a storage medium.

Background

[0002] Visual question answering is a learning task involving computer vision and natural language processing. Given an image and a question, the computer learns to output an answer that follows the rules of natural language and is logically consistent with the content. Depending on the question, the task may focus only on certain objects in the image, and some questions require a degree of common-sense reasoning to answer. Visual question answering therefore places higher demands on the semantic understanding of images than general image captioning, and it also faces greater challenges.

[0003] At present, the existing visual question answering technology usually uses the attention mechanism t...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/78, G06F16/783, G06K9/00, G06K9/46, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/783, G06F16/7867, G06N3/049, G06N3/08, G06V20/41, G06V10/44, G06N3/045, G06F18/241
Inventor: 孙晓, 时雨涛, 汪萌
Owner: HEFEI UNIV OF TECH