
An Image Description Method Based on Conditional Random Fields and Intrinsic Semantic Attention

A conditional random field and image description technology, applied in the field of semantic description, that addresses the problems of monotonous image descriptions, inaccurate results, and time-consuming processing. It achieves good results, ensures accuracy, and enhances the effect of connection.

Active Publication Date: 2021-12-10
BEIJING INSTITUTE OF TECHNOLOGY
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0003] Traditional image semantic description methods include methods based on template filling and methods based on retrieval. The results of these methods are not accurate enough, and they consume a lot of time and extra work.
In recent years, deep learning methods based on the encoder-decoder architecture have also been applied to image semantic description generation, but the inconsistency between the training process and the generation process has made the generated descriptions too monotonous and not accurate enough.
Current methods still cannot effectively solve the problems of inaccurate sentence structure and repeated phrases in the generated description.



Examples


Embodiment

[0071] This embodiment describes training and applying the method on the MS COCO dataset.

[0072] An image description method based on conditional random fields and internal semantic attention, as shown in Figure 1, includes the following steps:

[0073] Step 1: Process the training data. The MS COCO image description generation dataset is used as the training set. All descriptions in the dataset are preprocessed: each description is converted to lowercase, the occurrences of each word are counted, words that appear more than 5 times are saved into the dictionary, and words that appear fewer than 5 times, as well as blank positions, are replaced with "UNK". This yields the final dictionary Vocab. In addition, for images in the dataset that have fewer than 5 reference descriptions, existing reference descriptions are randomly copied so that each image has at least 5 corresponding descriptions. Then use the spaCy method to extr...
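The preprocessing in Step 1 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the function names (`build_vocab`, `encode`, `pad_references`), whitespace tokenization, and the choice to keep words whose count is at least 5 are all assumptions, since the text does not say how a word that appears exactly 5 times is treated.

```python
import random
from collections import Counter

UNK = "UNK"  # placeholder token named in the source text


def build_vocab(captions, min_count=5):
    """Lowercase all captions, count words, and keep the frequent ones.

    Assumption: a word is kept when its count is >= min_count.
    Assumption: captions are tokenized by whitespace.
    """
    counts = Counter(w for cap in captions for w in cap.lower().split())
    vocab = {UNK}
    vocab.update(w for w, c in counts.items() if c >= min_count)
    return vocab


def encode(caption, vocab):
    """Replace every out-of-vocabulary word in a caption with UNK."""
    return [w if w in vocab else UNK for w in caption.lower().split()]


def pad_references(refs, min_refs=5, rng=None):
    """Randomly copy existing references until there are at least min_refs."""
    if not refs:
        raise ValueError("at least one reference description is required")
    rng = rng or random.Random(0)  # fixed seed only for reproducibility
    padded = list(refs)
    while len(padded) < min_refs:
        padded.append(rng.choice(refs))
    return padded
```

For example, `pad_references(["a dog runs", "a dog is running"])` returns a list of 5 descriptions whose first two entries are the originals and whose remaining entries are random copies of them.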



Abstract

The invention relates to an image description method based on conditional random fields and internal semantic attention, and belongs to the interdisciplinary field of computer vision and natural language processing. First, the training data is processed. Then the network model is designed: an existing convolutional neural network and an object detection network extract image features, and a recurrent neural network with an internal semantic attention mechanism and an attention feature residual structure generates the description corresponding to the image. Next, a combination of the cross-entropy loss function and the conditional random field loss function is used as the training objective, and the processed training data is used to train the network model, yielding a network that can generate semantic descriptions of images. Finally, any image is input to the network to obtain the corresponding description. Compared with the prior art, this method not only ensures the accuracy of the sentence structure of the generated description, but also solves the problem of repeated phrases in the generated description, so that the generated description better captures the key information in the image.
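One common way to combine a cross-entropy loss with a conditional random field loss is a weighted sum, sketched below. The abstract does not give the exact form, so every symbol here is an assumption: $I$ is the image, $y = (y_1, \dots, y_T)$ the reference description, $p_\theta$ the decoder's word distribution, $s_\theta$ a sequence-level CRF score, and $\lambda$ a hypothetical weighting coefficient.

```latex
\mathcal{L}_{\mathrm{XE}}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(y_t \mid y_{<t}, I\right)

\mathcal{L}_{\mathrm{CRF}}(\theta) = -\left( s_\theta(y, I) - \log \sum_{y'} \exp s_\theta(y', I) \right)

\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{XE}}(\theta) + \lambda \, \mathcal{L}_{\mathrm{CRF}}(\theta)
```

Under this reading, the cross-entropy term scores each word individually, while the CRF term is the negative log-likelihood of the whole sequence, which would penalize implausible word transitions such as repeated phrases.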

Description

Technical field

[0001] The present invention relates to an image description method based on conditional random fields and internal semantic attention, and in particular to a deep network model based on conditional random fields that combines a unique internal semantic attention mechanism with an attention feature residual structure to generate the semantic description corresponding to an image. It belongs to the interdisciplinary field of computer vision and natural language processing.

Background technique

[0002] With the explosive growth of image data on the Internet, it has become unrealistic to identify and retrieve image semantic information manually. The data structure of an image is relatively abstract, yet it contains a wealth of information. Using deep learning to generate descriptions for images and mine the semantic information in them has a wide range of application scenarios in image retrieval, AI question answering, intelligent rec...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/044, G06N3/045, G06F18/2411, G06F18/25
Inventor: 宋丹丹, 骆源
Owner: BEIJING INSTITUTE OF TECHNOLOGY