Method and system for generating natural languages for describing image contents

A technology relating to image content and natural language description, applied in the field of image processing, which can solve problems such as objects in an image going unrecognized (for example, skis missing from the generated description) and image recognition errors

Active Publication Date: 2018-04-17
INST OF COMPUTING TECH CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

[0007] However, most of the proposed convolutional neural network-based methods use only the global features of an image, so some objects in the image cannot be recognized. As a result, when the text description of the image is generated, some object information in the image is lost. For example, in Figure 1(i), the sentence generated by the above prior art only …



Examples


Embodiment Construction

[0060] The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0061] In order to comprehensively obtain the various kinds of features of the image to be processed, two concepts are used in this application: "global features" and "local features". A global feature is an image feature that describes the contextual information of the objects in the image; by contrast, a local feature is an image feature that describes the detailed information of those objects. Both global and local features are important when representing an image.
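As a rough sketch of how such features might be obtained (an illustrative assumption only; this excerpt of the patent does not fix a particular network, and using the cells of the final feature map as a stand-in for local features is a hypothetical choice), a CNN backbone such as torchvision's ResNet-50 can supply a pooled global feature and per-region local features:

```python
# Illustrative sketch only: one plausible way to extract a global feature vector
# and a set of local feature vectors from a CNN backbone (not the patent's own
# implementation, which is not specified in this excerpt).
import torch
import torchvision.models as models

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.eval()

# Keep everything up to (and including) the last convolutional block.
feature_extractor = torch.nn.Sequential(*list(backbone.children())[:-2])

image = torch.randn(1, 3, 224, 224)            # placeholder input image
with torch.no_grad():
    fmap = feature_extractor(image)            # (1, 2048, 7, 7) feature map

# Global feature: average over all spatial positions (scene-level context).
global_feat = fmap.mean(dim=(2, 3))            # (1, 2048)

# Local features: one 2048-d vector per spatial cell (object-level detail).
local_feats = fmap.flatten(2).transpose(1, 2)  # (1, 49, 2048)
```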

[0062] For example, referring to Figure 1(i), "crowd", "snow" and "slope" correspond to global features, while "the skis worn on people's feet", "the hats on people's heads" and "the windows on the houses" correspond to local features. Similarly, referring to Figure 1(ii), "person" and "soccer field" are global features, while "the kite placed on the soccer field" ...


Abstract

The invention provides a method for training a model that generates natural language descriptions of image content, and a method for generating such descriptions by means of the trained model. The training method comprises the following steps: A1) taking the global features and the local features of the images in an image training set as inputs of an attention mechanism, so as to obtain a fusion result that contains both the global features and the local features; and A2) taking the fusion result and a word training set as inputs of a long short-term memory (LSTM) network, and training the attention mechanism and the LSTM network with a loss function so as to obtain the weights of the attention mechanism and the weights of the LSTM network, wherein the loss function is a function of the conditional probability of the i-th word of the natural sentence describing the image content, given the known image content and the preceding words, for i = 1, ..., i_max.
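Steps A1 and A2 can be pictured with the following sketch. It is an assumed, simplified reading (PyTorch; a linear-scoring attention over the concatenated global and local feature vectors; a single-layer LSTM; hypothetical feature dimension, vocabulary size and batch shapes), not the patent's actual implementation. The cross-entropy loss used here is the standard negative log of the conditional probability of word i given the image and the preceding words, summed over i = 1, ..., i_max.

```python
# Minimal sketch of A1 (attention fusion of global + local features) and
# A2 (LSTM decoder trained with a word-level conditional-probability loss).
# Module choices and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """A1: fuse the global feature with the local features via attention weights."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, global_feat, local_feats):
        # global_feat: (B, D), local_feats: (B, R, D)
        feats = torch.cat([global_feat.unsqueeze(1), local_feats], dim=1)  # (B, R+1, D)
        weights = torch.softmax(self.score(feats).squeeze(-1), dim=1)      # (B, R+1)
        return (weights.unsqueeze(-1) * feats).sum(dim=1)                  # (B, D)

class CaptionModel(nn.Module):
    """A2: feed the fusion result and the word sequence to an LSTM decoder."""
    def __init__(self, dim, vocab_size, embed_dim=512, hidden_dim=512):
        super().__init__()
        self.fusion = AttentionFusion(dim)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim + dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, global_feat, local_feats, words):
        fused = self.fusion(global_feat, local_feats)                # (B, D)
        emb = self.embed(words)                                      # (B, T, E)
        ctx = fused.unsqueeze(1).expand(-1, emb.size(1), -1)         # (B, T, D)
        hidden, _ = self.lstm(torch.cat([emb, ctx], dim=-1))         # (B, T, H)
        return self.out(hidden)                                      # (B, T, V) word logits

# Loss: negative log conditional probability of word i given the image and the
# preceding words (cross-entropy over next-word predictions), i = 1..i_max.
model = CaptionModel(dim=2048, vocab_size=10000)
criterion = nn.CrossEntropyLoss()
gf, lf = torch.randn(2, 2048), torch.randn(2, 49, 2048)  # fake global/local features
words = torch.randint(0, 10000, (2, 12))                 # fake training sentences
logits = model(gf, lf, words[:, :-1])                    # predict each next word
loss = criterion(logits.reshape(-1, 10000), words[:, 1:].reshape(-1))
loss.backward()  # gradients train both the attention weights and the LSTM weights
```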

Description

Technical field

[0001] The present invention relates to image processing, and in particular to the description of image content.

Background technique

[0002] Automatic image description refers to the automatic generation, by a computer, of natural language sentences that describe the content of a given image. Compared with basic tasks such as image classification and object detection, automatic image description generation is more complex and challenging, and it is of great significance for image content understanding. It requires a computer not only to identify the objects in an image, but also to identify the relationships between objects, behavioral activities, and so on, and to describe the identified semantic information in natural language. Automatic image description can be applied to many scenarios, such as image-text retrieval systems, early childhood education systems, and navigation aids for the blind.

[0003] So far, researchers ...


Application Information

IPC(8): G06K9/62; G06F17/27; G06N3/04; G06N3/08
CPC: G06N3/084; G06F40/284; G06N3/048; G06F18/24; G06F18/253
Inventors: 唐胜, 李灵慧, 张勇东, 李锦涛
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI