A Generative Approach from Structured Text to Image Descriptions

An image description and structuring technology, applied in still image data indexing, still image data retrieval, metadata-based still image retrieval, etc. It addresses problems such as ignored attributes, fixed sentence patterns, and information missing from sentences. It overcomes the single sentence pattern and achieves good sentence diversity, a good image description effect, and good accuracy.

Active Publication Date: 2019-06-04

哈尔滨米兜科技有限公司

1 Cites, 2 Cited by


AI Technical Summary

Problems solved by technology

However, this method has certain limitations. For example, relying on a single language template leads to a relatively fixed sentence structure, and a lot of time must be spent on preprocessing (annotating each object and action category in the image) and on training image features in order to recognize the objects and actions in the image.

Most importantly, this method ignores the inherent properties of objects, so the generated sentences lose a lot of information.

Embodiment Construction

[0030] The present invention is described in further detail below with reference to the accompanying drawings:

[0031] As shown in Figure 1, the structured text is made up of several kinds of elements. The activity in the text description represents the action of the object in the image; it can take a value at any element of the candidate class set Activity, where 0 means the activity is absent and 1 means it is present. The object in the text description represents an object contained in the image description; it can take a value at any element of the candidate subclass set Object, where 0 means the object is absent and 1 means it is present. The attribute in the text description indicates an attribute of an object contained in the image description; it can take a value at any element of the candidate subclass set Attribute, where 0 means the object does not have the attribute and 1 means it does. The scene in the text des...
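The 0/1 encoding described in paragraph [0031] can be sketched as a set of binary indicator vectors, one per candidate set. This is only an illustration: the contents of the candidate sets and the helper names `encode` and `encode_structured_text` are assumptions, not taken from the patent.

```python
# Hypothetical candidate sets; the patent extract does not list their contents.
ACTIVITY = ["park", "run", "sit"]
OBJECT = ["bus", "dog", "person"]
ATTRIBUTE = ["yellow", "small", "red"]


def encode(present, candidates):
    """Return a 0/1 list: 1 where the candidate is present, 0 otherwise."""
    present = set(present)
    unknown = present - set(candidates)
    if unknown:
        raise ValueError(f"not in candidate set: {sorted(unknown)}")
    return [1 if c in present else 0 for c in candidates]


def encode_structured_text(activities, objects, attributes):
    """Encode one structured text as indicator vectors over each candidate set."""
    return {
        "Activity": encode(activities, ACTIVITY),
        "Object": encode(objects, OBJECT),
        "Attribute": encode(attributes, ATTRIBUTE),
    }


# Structured text for a description such as "a yellow bus is parked".
example = encode_structured_text(["park"], ["bus"], ["yellow"])
```

Each vector position corresponds to one element of the candidate set, so every position can be treated as a separate yes/no label during recognition.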


Abstract

The invention discloses a method for generating an image description from structured text. The method comprises the steps of: downloading pictures from the internet to form a picture training set; performing morphological analysis on the descriptions corresponding to the pictures in the training set to form structured text; using an existing neural network model to extract convolutional neural network (CNN) features of the pictures in the training set, and using <picture features, structured text> pairs as inputs to build a multi-task recognition model; using the structured text extracted from the training set and the corresponding descriptions as inputs to a recurrent neural network, and training to obtain the parameters of the recurrent neural network model; inputting the CNN features of an image to be described and obtaining a predicted structured text through the multi-task recognition model; and inputting the predicted structured text and obtaining the image description through the recurrent neural network model. Compared with the prior art, the method achieves a better image description effect, accuracy, and sentence variety, and can be effectively applied to image retrieval.
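The two inference steps at the end of the abstract (features to predicted structured text, then structured text to description) can be sketched as follows. This is a minimal sketch under stated assumptions: the per-label logistic classifiers, their weights, the 0.5 threshold, and all function names are hypothetical, and `placeholder_decoder` only joins words where the patent uses a trained recurrent neural network.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_structured_text(features, heads, threshold=0.5):
    """Stage 1: multi-task recognition over shared image features.

    `heads` maps a task name (e.g. "Object") to a list of
    (label, weights, bias) tuples, one logistic classifier per label.
    """
    predicted = {}
    for task, classifiers in heads.items():
        labels = []
        for label, weights, bias in classifiers:
            score = sum(w * f for w, f in zip(weights, features)) + bias
            if sigmoid(score) >= threshold:
                labels.append(label)
        predicted[task] = labels
    return predicted


def placeholder_decoder(structured):
    """Stage 2 stand-in: NOT an RNN, just concatenates the predicted words."""
    words = structured["Attribute"] + structured["Object"] + structured["Activity"]
    return " ".join(words)


# Toy 2-dimensional "CNN features" and hand-picked, hypothetical weights.
features = [1.0, 0.0]
heads = {
    "Object": [("bus", [2.0, 0.0], -1.0), ("dog", [-2.0, 0.0], 0.0)],
    "Attribute": [("yellow", [3.0, 0.0], -1.0)],
    "Activity": [("park", [0.0, 1.0], -1.0)],
}

structured = predict_structured_text(features, heads)
description = placeholder_decoder(structured)
```

Sharing one feature vector across all heads is what makes the recognition step multi-task; a learned decoder in place of `placeholder_decoder` is what would supply the sentence variety the abstract claims.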

Description

Technical field

[0001] The invention relates to the technical field of automatic understanding of computer vision content and multimedia retrieval, and in particular to a method for generating image descriptions from structured text.

Background art

[0002] In computer vision and multimedia, describing the semantic information of images by generating natural language is an important and challenging task. For example, when people see a picture, especially one whose objects have distinctive features or attributes, they gain a certain understanding of it and can say in words what is happening in it. A sentence such as "a yellow school bus" describes the image, and the words "yellow" and "school bus" in particular describe the attributes of the vehicle in detail. However, given a large number of images, describing them manually one by one takes a great deal of time, manpower, and money....


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G06F16/58, G06F16/51

CPC: G06F16/51, G06F16/5866

Inventor: 马书博, 韩亚洪, 李广

Owner: 哈尔滨米兜科技有限公司
