Remote sensing image text description generation method with multi-semantic-level attention capability

This remote sensing image captioning technique with multi-semantic-level attention, applied in the field of machine learning, addresses the following problems: remote sensing images do not receive sufficiently reasonable and fine-grained region attention, they are difficult to understand comprehensively, and semantic distribution and density differ markedly between images.
CN112948604A · Pending · Publication Date: 2021-06-11 · NORTHWESTERN POLYTECHNICAL UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
NORTHWESTERN POLYTECHNICAL UNIV
Publication Date
2021-06-11


Abstract

The invention discloses a method for generating text descriptions of remote sensing images with multi-semantic-level attention capability. The method comprises the following steps: potential object regions in a remote sensing image are box-selected using the strong object localization capability of an object detection deep neural network, and a multi-layer visual element grid system is assembled from the features of these regions; when description sentences are generated, the method automatically attends to the corresponding blocks of the multi-layer visual attention grid system according to context information, which yields more precise visual attention focusing and more appropriate word prediction, so that remote sensing images are described more accurately and comprehensively. Compared with traditional attention over a single-scale sparse grid of spatial blocks, the method offers more precise focusing and joint multi-layer semantic expression.
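The attention step described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the excerpt does not specify the scoring function, so plain dot-product attention is assumed here, and the names `attend`, `levels`, and `hidden` are hypothetical. Region features, which in the method would come from an object detection network, are replaced by random vectors, and the three level sizes (1, 4, 9 regions) are illustrative only.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(levels, hidden):
    """Dot-product attention over region features pooled from every level.

    levels: list of semantic levels, each a list of region feature vectors
            (all of length d); assumed to come from a detector upstream.
    hidden: decoder context vector of length d (assumption: same size as
            the region features, so no projection is needed).
    Returns the attended context vector and the per-region weights.
    """
    # Flatten all levels so regions of every scale compete in one softmax.
    regions = [feat for level in levels for feat in level]
    scores = [sum(f * h for f, h in zip(feat, hidden)) for feat in regions]
    weights = softmax(scores)
    d = len(hidden)
    context = [
        sum(w * feat[i] for w, feat in zip(weights, regions)) for i in range(d)
    ]
    return context, weights


# Illustrative input: three levels with 1, 4, and 9 regions of dimension 8.
random.seed(0)
d = 8
levels = [[[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
          for n in (1, 4, 9)]
hidden = [random.gauss(0, 1) for _ in range(d)]
context, weights = attend(levels, hidden)
```

Pooling all levels into a single softmax lets coarse and fine regions compete for attention at each decoding step, which is one simple way to realize "joint multi-layer" attention; a per-level softmax followed by a gate would be an equally plausible reading of the abstract.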

Description

Technical field

[0001] The invention belongs to the technical field of machine learning, and in particular relates to a method for generating text descriptions of remote sensing images.

Background art

[0002] Remote sensing image text description generation is an emerging visual understanding task for remote sensing images. It requires the description generator to understand the semantic features of a remote sensing image in depth and, on that basis, to output descriptive sentences that conform to the rules of human language. Compared with general image description generation, remote sensing image text description generation faces the following difficulties:

[0003] First, remote sensing images cover a large spatial scale and the composition of ground elements is complex, which makes the images difficult to understand fully;

[0004] Second, semantic distribution and density differ markedly between images, for example between images of deserts and images of cities;

[0005] Third...
