Remote sensing image text description generation method with multi-semantic-level attention capability

The method applies remote sensing image analysis and multi-semantic-level attention in the field of machine learning. It addresses three problems: remote sensing images do not receive sufficiently reasonable and fine-grained regional attention, scenes are difficult to understand comprehensively, and semantic distribution and density differ markedly between images.

Pending Publication Date: 2021-06-11
NORTHWESTERN POLYTECHNICAL UNIV

AI Technical Summary

Problems solved by technology

[0003] First, remote sensing images cover a large spatial scale and contain a complex mix of ground elements, which makes them difficult to understand fully.
[0004] Second, semantic distribution and density differ markedly between images, for example between deserts and cities.
[0005] Third, it is difficult to model both the relationship between ground objects and the overall scene and the complex organizational relationships among ground objects.
In addition, because of the low grid resolution and fixed grid size of grid-based attention, remote sensing images whose elements vary widely in size do not receive sufficiently reasonable and fine-grained regional attention, and ideal description sentences are difficult to generate.



Examples


Specific Embodiment

[0037] 1. Experimental conditions

[0038] This embodiment runs on an Nvidia GTX 1070 GPU with 8 GB of video memory under the Windows operating system, and the simulation experiments are carried out in Python.

[0039] The simulation uses a public dataset for the remote sensing image description task, namely the UCM-Captions dataset. The dataset contains about 20,000 remote sensing images, and each image has 5 annotation sentences.

[0040] 2. Simulation content

[0041] Three metrics that measure sentence similarity, BLEU, CIDEr and ROUGE-L, are first introduced to evaluate the quality of the sentences generated by the present invention. To demonstrate the effectiveness of the present invention, the experimental results are compared with a method based on traditional spatial attention and with an attention method that takes attribute information as input. The method based on traditional spatial attention is described in the literature "X.Lu, B.W...
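As a rough illustration of one of the metrics named in [0041], the sketch below computes a ROUGE-L F-score from the longest common subsequence (LCS) of a generated sentence and a reference sentence. It is not the evaluation code used in the experiments. The function names, the whitespace tokenization, the weighting `beta=1.2`, and the choice of taking the best score over the reference sentences of an image are all assumptions made for this example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.2):
    """ROUGE-L F-score for one candidate/reference pair (beta is an assumed weight)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


def rouge_l_multi(candidate, references, beta=1.2):
    """Simplification: keep the best score over all reference sentences of an image."""
    return max(rouge_l(candidate, ref, beta) for ref in references)


if __name__ == "__main__":
    # Hypothetical captions, not taken from any dataset.
    refs = ["many buildings are in a dense residential area",
            "a dense residential area with many houses"]
    print(rouge_l_multi("many houses are in a dense residential area", refs))
```

A score of 1.0 means the generated sentence matches a reference word for word, and 0.0 means the two share no words in order. BLEU and CIDEr rely on n-gram overlap instead and are not covered by this sketch.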


Abstract

The invention discloses a method for generating text descriptions of remote sensing images with multi-semantic-level attention capability. The method uses the strong object localization capability of an object detection deep neural network to select bounding boxes for potential object regions in a remote sensing image, and combines these regions into a multi-layer visual element grid system according to their features. When description sentences are generated, the method automatically attends to the corresponding blocks in the multi-layer visual attention grid system according to context information. This yields more accurate visual attention focusing and more appropriate word prediction, so remote sensing images are described more accurately and comprehensively. Compared with the traditional single-scale sparse grid division of spatial attention blocks, the method has more accurate focusing capability and joint multi-layer semantic expression capability.
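The idea of attending to blocks at several semantic levels according to context can be sketched with generic soft attention. This is only an illustration under stated assumptions, not the patented network: the level names, the dot-product scoring, the toy feature vectors, and the function names `softmax` and `attend` are all hypothetical. In the example, region features from every level are pooled, scored against a context vector (standing in for the decoder state), and normalized jointly so that blocks from different levels compete for attention.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(levels, context):
    """Soft attention over region features drawn from several levels.

    levels:  dict mapping a level name to a list of feature vectors
    context: feature vector standing in for the decoder state (assumption)
    Returns per-region weights keyed by (level, index) and the attended feature.
    """
    keys, feats = [], []
    for level, regions in levels.items():
        for index, feature in enumerate(regions):
            keys.append((level, index))
            feats.append(feature)
    # Dot-product relevance of each region to the current context.
    scores = [sum(c * x for c, x in zip(context, f)) for f in feats]
    weights = softmax(scores)
    # Weighted sum of region features, one value per feature dimension.
    attended = [sum(w * f[d] for w, f in zip(weights, feats))
                for d in range(len(context))]
    return dict(zip(keys, weights)), attended


if __name__ == "__main__":
    # Toy two-dimensional features for a coarse level and a fine level.
    levels = {"coarse": [[1.0, 0.0]], "fine": [[0.0, 1.0], [0.5, 0.5]]}
    weights, attended = attend(levels, context=[0.0, 2.0])
    print(weights)
    print(attended)
```

Because the softmax runs over all levels at once, a fine block can outweigh a coarse one when the context favors it, which is the behaviour a single fixed-size grid cannot provide.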

Description

Technical field

[0001] The invention belongs to the technical field of machine learning, and in particular relates to a method for generating text descriptions of remote sensing images.

Background technique

[0002] Remote sensing image text description generation is an emerging remote sensing image visual understanding task. This task requires the description generator to have an in-depth understanding of the semantic features of remote sensing images, and on this basis, it can output descriptive sentences that conform to the laws of human language. Compared with general image description generation, remote sensing image text description generation has the following difficulties:

[0003] First, remote sensing images cover a large spatial scale, and the composition of ground elements is complex, making it difficult to fully understand;

[0004] Second, the semantic distribution and density differences between images are obvious, such as deserts and cities;

[0005] Third...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/50, G06N3/04, G06N3/08, G06K9/32
CPC: G06F16/50, G06N3/08, G06V10/25, G06N3/044
Inventors: 袁媛, 王丞泽
Owner: NORTHWESTERN POLYTECHNICAL UNIV