Remote sensing image text description generation method with multi-semantic-level attention capability

The remote sensing image and multi-semantic-level attention technology, applied in the field of machine learning, addresses several problems: remote sensing images cannot obtain sufficiently reasonable and fine-grained regional attention, they are difficult to understand comprehensively, and they differ markedly from one another in semantic distribution and density.

Pending Publication Date: 2021-06-11

NORTHWESTERN POLYTECHNICAL UNIV


AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology

Problems solved by technology

[0003] First, remote sensing images cover a large spatial scale, and the composition of ground elements is complex, making it difficult to fully understand;

[0004] Second, semantic distribution and density differ markedly between images, for example between deserts and cities;

[0005] Third, it is difficult to model the relationship between individual ground objects and the scene as a whole, as well as the complex organizational relationships among ground objects.

In addition, because traditional grid-based attention uses a low grid resolution and a fixed grid size, remote sensing images whose elements vary in size cannot obtain sufficiently reasonable and fine-grained regional attention, which makes it difficult to generate ideal description sentences.
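The fixed-grid limitation can be made concrete with a small sketch. It assumes a square image split into a fixed `grid` x `grid` partition and an object given as a pixel bounding box; the function name `cell_overlaps` and all numbers are hypothetical and chosen only for illustration, not taken from the patent.

```python
def cell_overlaps(box, image_size, grid):
    """Map each overlapped grid cell (row, col) to the fraction of that
    cell's area covered by the box. Assumes a square image of side
    image_size split into grid x grid equal cells (illustrative setup)."""
    x0, y0, x1, y1 = box
    cell = image_size / grid
    overlaps = {}
    for row in range(grid):
        for col in range(grid):
            cx0, cy0 = col * cell, row * cell
            # Width and height of the intersection between box and cell.
            w = min(x1, cx0 + cell) - max(x0, cx0)
            h = min(y1, cy0 + cell) - max(y0, cy0)
            if w > 0 and h > 0:
                overlaps[(row, col)] = (w * h) / (cell * cell)
    return overlaps


# Hypothetical 256 x 256 image with a 4 x 4 grid (64-pixel cells).
small_object = cell_overlaps((10, 10, 26, 26), image_size=256, grid=4)
large_object = cell_overlaps((0, 0, 192, 192), image_size=256, grid=4)
```

In this setup the small object covers only a small fraction of a single cell, so that cell's feature is dominated by background, while the large object is split across nine cells. Neither case gives attention a block that matches the object well.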


Examples


Specific Embodiment

[0037] 1. Experimental conditions

[0038] This embodiment runs on an Nvidia GTX 1070 GPU with 8 GB of video memory under the Windows operating system, and uses Python to carry out the simulation experiments.

[0039] The data used in the simulation come from a public dataset for the remote sensing description task; this experiment uses the UCM-Captions dataset. The dataset contains about 20,000 remote sensing images, and each image has 5 annotation sentences.

[0040] 2. Simulation content

[0041] First, three metrics that measure the closeness of sentences, BLEU, CIDEr and ROUGE-L, are introduced to evaluate the quality of the sentences generated by the present invention. To demonstrate the effectiveness of the present invention, the experimental results are compared with a method based on traditional spatial attention and a method that uses attention with input attribute information. The method based on traditional spatial attention is described in the literature "X.Lu, B.W...
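As a rough illustration of how a sentence-closeness metric of this kind works, the sketch below computes a ROUGE-L style F-score from the longest common subsequence (LCS) of tokens. The function names, the whitespace tokenization, the default `beta` value, and the example sentences are assumptions for illustration only; this is not the evaluation code used in the experiment.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.2):
    """ROUGE-L F-score between one generated sentence and one reference.
    beta weights recall against precision; 1.2 is an assumed default."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


def best_rouge_l(candidate, references, beta=1.2):
    """Simplification: score against each annotation and keep the best."""
    return max(rouge_l(candidate, ref, beta) for ref in references)


# Made-up sentences, not taken from the UCM-Captions dataset.
score = best_rouge_l(
    "many buildings are near a road",
    ["many buildings are beside a road", "a road with some cars"],
)
```

Because each image has several annotation sentences, a generated sentence is normally scored against all of them; taking the maximum over references here is one simple way to do that, and established evaluation toolkits may aggregate differently.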


Abstract

The invention discloses a method for generating text descriptions of remote sensing images with multi-semantic-level attention capability. The method uses the strong object localization capability of an object detection deep neural network to frame potential object regions in a remote sensing image, and combines them into a multi-layer visual element grid system according to the features of the regions. When description sentences are generated, the method automatically attends to the corresponding blocks of the multi-layer visual attention grid system according to context information, which yields more accurate visual attention focusing and more appropriate vocabulary estimation, so that remote sensing images are described more accurately and comprehensively. Compared with traditional single-scale sparse grid spatial attention block division, the method has more accurate focusing capability and joint multi-layer semantic expression capability.
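The attention step described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the dot-product scoring, the single softmax over all levels, the level names, and the function `attend` are all hypothetical choices, since the source does not specify how blocks are scored or combined.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(context, levels):
    """Attend over blocks from several semantic levels at once.

    context: decoder context vector (list of floats).
    levels:  dict mapping a level name to a list of block feature vectors.
    Returns the attended feature vector and a weight per (level, index).
    Dot-product scoring is an assumption made for this sketch.
    """
    labels, feats = [], []
    for name, blocks in levels.items():
        for index, feat in enumerate(blocks):
            labels.append((name, index))
            feats.append(feat)
    scores = [sum(c * x for c, x in zip(context, feat)) for feat in feats]
    weights = softmax(scores)
    attended = [
        sum(w * feat[d] for w, feat in zip(weights, feats))
        for d in range(len(context))
    ]
    return attended, dict(zip(labels, weights))


# Toy 2-dimensional features for one coarse block and two fine blocks.
context = [1.0, 0.0]
levels = {"coarse": [[0.2, 0.1]], "fine": [[2.0, 0.0], [0.0, 2.0]]}
attended, weights = attend(context, levels)
```

With these toy values the weight concentrates on the fine-level block that best matches the context, which is the behaviour the abstract describes as more accurate focusing; a trained model would learn the scoring function rather than use a raw dot product.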

Description

Technical field

[0001] The invention belongs to the technical field of machine learning, and in particular relates to a method for generating text descriptions of remote sensing images.

Background technique

[0002] Remote sensing image text description generation is an emerging remote sensing image visual understanding task. This task requires the description generator to have an in-depth understanding of the semantic features of remote sensing images, and on this basis, it can output descriptive sentences that conform to the laws of human language. Compared with general image description generation, remote sensing image text description generation has the following difficulties:

[0003] First, remote sensing images cover a large spatial scale, and the composition of ground elements is complex, making it difficult to fully understand;

[0004] Second, the semantic distribution and density differences between images are obvious, such as deserts and cities;

[0005] Third...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06F16/50, G06N3/04, G06N3/08, G06K9/32

CPC: G06F16/50, G06N3/08, G06V10/25, G06N3/044

Inventor: 袁媛, 王丞泽

Owner: NORTHWESTERN POLYTECHNICAL UNIV
