A visual question answering method based on the fusion of fine-grained image features and external knowledge
The technology fuses fine-grained image features with external knowledge and is applied in the field of visual question answering. It addresses the limited application scenarios of existing methods and their poor applicability and answering performance on fine-grained visual questions, improving applicability and achieving high accuracy.
Specific Embodiment
[0069] 1. Use the NLTK part-of-speech tagging tool to tokenize the visual question sentences in the sample library, and build a dictionary from the resulting tokens. Each word in the dictionary corresponds to a unique number.
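The dictionary-building step can be sketched as follows. This is a minimal illustration, not the patent's implementation: to stay dependency-free it uses a simple regex tokenizer as a stand-in for NLTK, and the helper names `tokenize`, `build_vocab`, and `encode` are hypothetical. In practice `nltk.word_tokenize` could replace `tokenize`.

```python
import re

def tokenize(question):
    """Lowercase a question and split it into word tokens.

    Assumption: a regex stand-in for the NLTK tokenizer named in the source.
    """
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", question.lower())

def build_vocab(questions):
    """Assign each distinct word a unique integer, in first-seen order."""
    vocab = {}
    for question in questions:
        for token in tokenize(question):
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab

def encode(question, vocab):
    """Map a question to its list of word numbers, skipping unknown words."""
    return [vocab[t] for t in tokenize(question) if t in vocab]

# Hypothetical sample questions, for illustration only.
questions = ["What breed is the dog?", "What color is the bird's beak?"]
vocab = build_vocab(questions)
encoded = encode("What is the dog?", vocab)
```

The source does not say how out-of-vocabulary words or padding are handled, so this sketch simply drops unknown words and reserves no special indices.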
[0070] 2. As shown in Figure 2, the original image is first segmented with an unsupervised image segmentation algorithm. The segmentation output is an image in which each segmented region is marked with a distinct RGB color value, so the pixel coordinates of each region can be recovered from its color. These pixel coordinates determine which part of the image feature map corresponds to each segmented region of the original image. After processing, the segmentation result is resized to a uniform 224×224×3.
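Recovering per-region pixel coordinates from a color-labeled segmentation image can be sketched as below. This is an illustrative sketch under stated assumptions: the image is represented as nested lists of `(r, g, b)` tuples rather than an array type, the tiny 3×4 example stands in for a 224×224 result, and the function names `regions_from_color_map` and `bounding_box` are hypothetical.

```python
from collections import defaultdict

def regions_from_color_map(seg):
    """Group pixel coordinates by the RGB color that labels their region.

    seg is a list of rows; each row is a list of (r, g, b) tuples.
    Returns a dict mapping each color to its list of (row, col) pixels.
    """
    regions = defaultdict(list)
    for r, row in enumerate(seg):
        for c, color in enumerate(row):
            regions[tuple(color)].append((r, c))
    return dict(regions)

def bounding_box(coords):
    """Return (min_row, min_col, max_row, max_col) for a region's pixels."""
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)

# Toy segmentation result with two regions (illustrative data only).
RED, BLUE = (255, 0, 0), (0, 0, 255)
seg = [
    [RED, RED, BLUE, BLUE],
    [RED, RED, BLUE, BLUE],
    [RED, BLUE, BLUE, BLUE],
]
regions = regions_from_color_map(seg)
red_box = bounding_box(regions[RED])
```

Grouping by exact color assumes the resize to 224×224×3 does not blend colors at region borders, for example by using nearest-neighbor interpolation; the source does not specify the interpolation used.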
[0071] A VGG-16 network with weights pre-trained on ImageNet, and with the fully connected layers and the Softmax layer removed, is used as the image feature extractor. T...
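The link between region pixel coordinates and feature-map positions can be illustrated with a little arithmetic. The sketch below assumes the features are taken after all five 2×2 max-pooling layers of the standard VGG-16 convolutional stack, which maps a 224×224 input to a 7×7 grid with 512 channels, each cell covering a 32×32 pixel block. The source text is truncated and does not confirm which layer's output is used, so both that choice and the helper names are assumptions.

```python
def feature_map_size(input_size=224, num_pools=5):
    """Spatial size after num_pools stride-2 poolings (VGG-16 has five)."""
    size = input_size
    for _ in range(num_pools):
        size //= 2
    return size

def pixels_to_cells(coords, num_pools=5):
    """Map (row, col) pixel coordinates to the feature-map cells covering them.

    Assumption: each cell corresponds to a stride x stride pixel block,
    where stride = 2 ** num_pools (32 for the full VGG-16 conv stack).
    """
    stride = 2 ** num_pools
    return sorted({(r // stride, c // stride) for r, c in coords})

grid = feature_map_size()
cells = pixels_to_cells([(0, 0), (31, 31), (32, 0), (223, 223)])
```

With these defaults, a segmented region's pixel list reduces to the set of 7×7 grid cells it touches, which is one plausible way to select the feature vectors belonging to that region.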


