
A Generative Approach from Structured Text to Image Descriptions

An image description and structured-text technology, applied in still image data indexing, still image data retrieval, metadata-based still image retrieval, etc. It addresses problems such as ignored attributes, fixed sentence patterns, and information missing from sentences. It overcomes single-pattern sentences and achieves good sentence diversity, good image description quality, and good accuracy.

Active Publication Date: 2019-06-04
哈尔滨米兜科技有限公司
Cites: 1 · Cited by: 2

AI Technical Summary

Problems solved by technology

However, this method has certain limitations. Its single language template leads to a relatively fixed sentence structure, and considerable time is needed for pre-processing: each object and action category in the images must be annotated, and image features must be trained to recognize the objects and actions. Most importantly, the method ignores the inherent attributes of objects, so the generated sentences lose a great deal of information.

Embodiment Construction

[0030] The present invention is described in further detail below with reference to the accompanying drawings:

[0031] As shown in Figure 1, the activity in the text description represents the action of the object in the image. It takes a value for each element of the candidate class set Activity, where 0 means the activity is absent and 1 means it is present. The object in the text description is an object contained in the image description. It takes a value for each element of the candidate subclass set Object, where 0 means the object is absent and 1 means it is present. The attribute in the text description is an attribute of an object contained in the image description. It takes a value for each element of the candidate subclass set Attribute, where 0 means the object does not have the attribute and 1 means it does. The scene in the text des...
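The 0/1 values described above can be read as binary indicator vectors over each candidate set. The following is a minimal sketch under that reading, not the patent's implementation: the vocabularies in `CANDIDATES` and the function name `encode` are hypothetical, and the scene field is left out because its definition is truncated in this text.

```python
from typing import Dict, Iterable, List

# Hypothetical candidate sets; the patent's actual vocabularies are not given here.
CANDIDATES: Dict[str, List[str]] = {
    "Activity": ["drive", "sit", "run"],
    "Object": ["bus", "dog", "person"],
    "Attribute": ["yellow", "small", "red"],
}


def encode(structured: Dict[str, Iterable[str]]) -> Dict[str, List[int]]:
    """Mark each candidate element 1 if it occurs in the description, else 0."""
    encoded: Dict[str, List[int]] = {}
    for field, vocab in CANDIDATES.items():
        present = set(structured.get(field, ()))
        encoded[field] = [1 if term in present else 0 for term in vocab]
    return encoded


# Structured text for a description such as "a yellow school bus".
labels = encode({"Object": ["bus"], "Attribute": ["yellow"]})
print(labels)
```

Each field yields one vector whose length equals the size of its candidate set, so one image can carry several activities, objects, or attributes at once.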

Abstract

The invention discloses a method for generating an image description from structured text. The method comprises the steps of: downloading pictures from the internet to form a picture training set; performing morphological analysis on the descriptions corresponding to the pictures in the training set to form structured text; using an existing neural network model to extract convolutional neural network (CNN) features of the pictures in the training set, and using the pair <picture features, structured text> as input to build a multi-task recognition model; using the structured text extracted from the training set and its corresponding descriptions as input to a recurrent neural network, and training to obtain the parameters of the recurrent neural network model; inputting the CNN features of an image to be described and obtaining a predicted structured text through the multi-task recognition model; and inputting the predicted structured text and obtaining the image description through the recurrent neural network model. Compared with the prior art, the method produces better image descriptions, with higher accuracy and greater sentence variety, and it can be effectively applied to image retrieval.
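The last two steps of the abstract form a two-stage inference flow: CNN features go into the multi-task recognition model, and its predicted structured text goes into the recurrent neural network model. The sketch below shows only that data flow. All names (`describe_image`, `toy_recognizer`, `toy_generator`) are hypothetical, and the two toy functions are simple stand-ins so the example runs; they are not neural networks and do not reflect the patent's models.

```python
from typing import Callable, Dict, List, Sequence

Features = Sequence[float]             # CNN features of one image
StructuredText = Dict[str, List[str]]  # e.g. {"Object": ["bus"]}


def describe_image(
    features: Features,
    recognize: Callable[[Features], StructuredText],
    generate: Callable[[StructuredText], str],
) -> str:
    """Stage 1 predicts structured text; stage 2 turns it into a sentence."""
    structured = recognize(features)
    return generate(structured)


# Toy stand-in for the multi-task recognition model (assumption for illustration).
def toy_recognizer(features: Features) -> StructuredText:
    # Pretend the first two feature values score "yellow" and "bus".
    return {
        "Attribute": ["yellow"] if features[0] > 0.5 else [],
        "Object": ["bus"] if features[1] > 0.5 else [],
    }


# Toy stand-in for the recurrent neural network model (assumption for illustration).
def toy_generator(structured: StructuredText) -> str:
    words = structured.get("Attribute", []) + structured.get("Object", [])
    return " ".join(words)


caption = describe_image([0.9, 0.8], toy_recognizer, toy_generator)
print(caption)
```

Passing the two stages in as callables keeps them independently trainable and replaceable, which matches the abstract's description of the recognition model and the recurrent model being trained in separate steps.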

Description

Technical field

[0001] The invention relates to the technical field of automatic understanding of visual content in computer vision and of multimedia retrieval, and in particular to a method for generating image descriptions from structured text.

Background

[0002] In the fields of computer vision and multimedia, describing the semantic information of images by generating natural language is an important and challenging task. When people see a picture, especially one in which the objects have distinctive features or attributes, they gain some understanding of it and can say in words what is happening in it. For example, a sentence like "a yellow school bus" describes the image, and "yellow" and "school bus" in particular describe the attributes of the vehicle in detail. However, given a large number of images, describing them manually one by one takes a great deal of time, labor, and money....

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/58, G06F16/51
CPC: G06F16/51, G06F16/5866
Inventor: 马书博, 韩亚洪, 李广
Owner: 哈尔滨米兜科技有限公司