Image understanding method based on deep residual network and LSTM

An image understanding technique based on residual networks, applied in the fields of deep learning and image semantic understanding, that addresses problems such as difficult implementation, low recognition rate, and poor generalization.

Active Publication Date: 2017-05-10

SOUTH CHINA UNIV OF TECH


Problems solved by technology

From the perspective of algorithm implementation, currently commonly used image understanding algorithms have shortcomings such as poor generalization, low robustness, strong local dependence, difficult implementation, and low recognition rate.

Embodiment

[0060] Figure 1 shows the flowchart of the method of the present invention, which comprises the following steps:

[0061] (1) Download the training data sets: download the public ImageNet and MS-COCO image data sets from http://www.image-net.org and http://mscoco.org respectively. The ImageNet dataset is divided into a training image set and a test image set: the training image set contains 1,000 categories of pictures with 1,300 pictures per category, and the test image set contains 50,000 pictures. The MS-COCO dataset is likewise divided into a training image set and a test image set: the training image set contains 82,783 pictures and the test image set contains 40,504 pictures. Each picture has 5 corresponding natural language sentences that describe its content.
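The dataset sizes stated in step (1) can be tallied with a short sketch. It takes the figures in the paragraph above at face value; the dictionary layout and the function names are hypothetical and are not part of the patent text.

```python
# Figures copied from step (1); the dictionary layout is a hypothetical convenience.
IMAGENET = {"train_classes": 1000, "train_images_per_class": 1300, "test_images": 50_000}
MSCOCO = {"train_images": 82_783, "test_images": 40_504, "captions_per_image": 5}


def imagenet_train_size(stats):
    """Number of ImageNet training pictures implied by the stated per-class count."""
    return stats["train_classes"] * stats["train_images_per_class"]


def mscoco_caption_count(stats, split):
    """Number of caption sentences in the given MS-COCO split ('train' or 'test')."""
    return stats[f"{split}_images"] * stats["captions_per_image"]


if __name__ == "__main__":
    print(imagenet_train_size(IMAGENET))
    print(mscoco_caption_count(MSCOCO, "train"))
    print(mscoco_caption_count(MSCOCO, "test"))
```

Under these figures the ImageNet training set holds 1,300,000 pictures, and the MS-COCO training and test sets carry 413,915 and 202,520 caption sentences respectively.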

[0062] (2) Preprocessing:

[0063] For the ImageNet dataset: each image is scaled to 256×256, and then 5 standard-size images of 224×224 are cropped from the top, middle, bottom, left,...
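A minimal sketch of the scale-then-crop step, written in plain Python on an image stored as a list of rows. Only the middle crop is shown, because the list of crop positions is cut off in the paragraph above. The nearest-neighbour interpolation and all function names are assumptions made for illustration; the patent text does not state which interpolation is used.

```python
SCALED_SIZE = 256  # side length after scaling, from paragraph [0063]
CROP_SIZE = 224    # side length of each standard-size crop, from paragraph [0063]


def resize_nearest(image, out_h, out_w):
    """Scale a 2-D image (list of rows) using nearest-neighbour sampling (assumed)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def crop(image, top, left, size):
    """Return the size x size window whose upper-left corner is (top, left)."""
    if top < 0 or left < 0 or top + size > len(image) or left + size > len(image[0]):
        raise ValueError("crop window falls outside the image")
    return [row[left:left + size] for row in image[top:top + size]]


def middle_crop(image, size):
    """Crop the window centred in the image."""
    top = (len(image) - size) // 2
    left = (len(image[0]) - size) // 2
    return crop(image, top, left, size)


def preprocess_middle(image):
    """Scale to 256x256, then take the 224x224 middle crop."""
    scaled = resize_nearest(image, SCALED_SIZE, SCALED_SIZE)
    return middle_crop(scaled, CROP_SIZE)


if __name__ == "__main__":
    # Synthetic 300x400 image whose pixel value encodes its own coordinates.
    demo = [[r * 1000 + c for c in range(400)] for r in range(300)]
    out = preprocess_middle(demo)
    print(len(out), len(out[0]))
```

The other crops would reuse `crop` with different `(top, left)` offsets once the full list of positions is known.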

Abstract

The invention discloses an image understanding method based on a deep residual network and an LSTM. The method comprises the following steps: first, a deep residual network model is built to extract abstract image features, which are stored as a feature matrix; next, a dynamic attention mechanism in the LSTM model dynamically forms a suitable feature vector from the feature matrix; finally, the LSTM model generates natural language (English) from the feature vector. The method combines the strengths of the deep residual network in image feature extraction with those of the LSTM in time-sequence modeling. Together, the deep residual network and the LSTM model form an encoder-decoder framework that converts the content of the image into natural language, thereby extracting deep information from the image.
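The attention step described in the abstract, forming one feature vector from the feature matrix at each decoding step, can be sketched as a soft attention over image regions. This is an illustrative sketch only: the abstract does not say how the attention weights are computed, so the dot-product scoring against the LSTM hidden state, the softmax normalisation, and the names `softmax` and `attend` are all assumptions.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(feature_matrix, hidden_state):
    """Form one feature vector from a feature matrix.

    feature_matrix: list of L region vectors, each of dimension D.
    hidden_state:   current decoder state, dimension D (assumed to match).
    Returns (weights, context), where context is the weighted sum of regions.
    """
    # Assumed scoring rule: dot product between each region and the hidden state.
    scores = [
        sum(f * h for f, h in zip(region, hidden_state))
        for region in feature_matrix
    ]
    weights = softmax(scores)
    dim = len(feature_matrix[0])
    context = [
        sum(w * region[d] for w, region in zip(weights, feature_matrix))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    features = [[1.0, 0.0], [0.0, 1.0]]  # two toy regions, D = 2
    weights, context = attend(features, [10.0, 0.0])
    print(weights, context)
```

Because the weights depend on the hidden state, the feature vector changes from one generated word to the next, which is what makes the attention dynamic.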

Description

Technical field

[0001] The invention relates to the fields of image semantic understanding and deep learning, and in particular to an image understanding method based on a deep residual network and LSTM (Long Short-Term Memory).

Background

[0002] Image understanding refers to the understanding of image semantics. It takes the image as its object and knowledge as its core, and studies the location of objects in the image, the relationships between the objects, and the scene of the image.

[0003] The input of image understanding is image data and the output is knowledge, which places it among the high-level topics of image processing research. Building on image target recognition, it focuses on studying the nature of each target in the image and the relationships among the targets, in order to understand the meaning of the image content, interpret the original objective scene, and then guide and plan behavior.

[0004] Currently commonly used image understandin...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06K9/62, G06K9/46

CPC: G06V10/40, G06F18/214

Inventor: 胡丹, 袁东芝, 余卫宇, 李楚怡

Owner: SOUTH CHINA UNIV OF TECH
