Attention model-based image identification method and system

An attention model applied to image recognition, in the field of image processing. It addresses problems such as models ignoring the local information in the data and generalizing poorly across different data.

Status: Inactive
Publication date: 2018-08-03
BEIJING DAJIA INTERNET INFORMATION TECH CO LTD
Cites: 2, Cited by: 47

AI Technical Summary

Problems solved by technology

When handling specific image classification or speech recognition tasks, the diversity of the input data means that a model can only capture the global information of the data while ignoring its local information.
Taking image classification as an example, some traditional solutions manually divide the image into multiple regions and capture the local information of the data in the form of a spatial pyramid. Because these solutions depend on manually segmented regions, they generalize poorly to different data.

Examples

Embodiment 1

[0042] Figure 1 is a schematic flow chart of an attention-model-based image recognition method according to one embodiment. The method includes the following steps:

[0043] Step S10: Obtain an input feature map whose image matrix has shape [W, H, C], where W is the width of the image in pixels, H is the height of the image in pixels, and C is the number of channels (the number of color channels of the image). The image matrix is three-dimensional, and the shape [W, H, C] can also be written as W*H*C, that is, width*height*number of channels.

[0044] Step S20: Spatially map the input feature map using the preset spatial mapping weight matrix, apply the activation function to obtain the spatial weight matrix, and multiply the spatial weight matrix element-wise with the image matrix of the input feature map to obtain the output feature map, where the ...
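Steps S10 and S20 can be sketched in NumPy for the spatial attention case, where the abstract gives the mapping matrix shape as [C, 1] and the resulting weight shape as [W, H, 1]. This is only an illustrative sketch under stated assumptions: the source does not name the activation function, so a sigmoid is assumed here, and the function and variable names (`spatial_attention`, `mapping_weights`) and the random example data are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Logistic activation. Assumed here; the source only says 'activation function'."""
    return 1.0 / (1.0 + np.exp(-x))


def spatial_attention(feature_map, mapping_weights):
    """Apply spatial attention to a [W, H, C] feature map.

    feature_map:     array of shape [W, H, C]
    mapping_weights: array of shape [C, 1] (the spatial attention matrix)
    Returns the output feature map [W, H, C] and the spatial weights [W, H, 1].
    """
    # Map each pixel's C channel values to a single score: [W, H, C] @ [C, 1] -> [W, H, 1]
    scores = feature_map @ mapping_weights
    # Activate the scores to obtain the spatial weight matrix
    spatial_weights = sigmoid(scores)
    # Element-wise multiply; the trailing axis of size 1 broadcasts over the channels
    return feature_map * spatial_weights, spatial_weights


# Hypothetical example data: an 8x6 feature map with 3 channels
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 6, 3))
mapping_weights = rng.standard_normal((3, 1))

output, spatial_weights = spatial_attention(feature_map, mapping_weights)
print(output.shape, spatial_weights.shape)
```

In a trained network the mapping matrix would be learned rather than drawn at random; the random values here only exercise the shapes.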

Embodiment 2

[0055] Figure 4 is a schematic flow chart of an attention-model-based image recognition method according to another embodiment. The method includes the following steps:

[0056] Step S21: Obtain an input feature map whose image matrix has shape [W, H, C], where W is the width of the image in pixels, H is the height of the image in pixels, and C is the number of channels (the number of color channels of the image). The image matrix is three-dimensional, and the shape [W, H, C] can also be written as W*H*C, that is, width*height*number of channels.

[0057] Step S22: In the shallow layers of the convolutional neural network, spatially map the input feature map using the spatial attention matrix [C, 1], apply the activation function to obtain the first spatial weight matrix, and combine the first spatial weight matrix with the image matrix of the input feature...

Embodiment 3

[0067] The present invention also provides an attention-model-based image recognition system, comprising:

[0068] The image acquisition module is used to acquire an input feature map whose image matrix shape is [W, H, C], where W is the width, H is the height, and C is the number of channels.

[0069] The image processing module spatially maps the input feature map using the preset spatial mapping weight matrix, applies the activation function to obtain the spatial weight matrix, and multiplies the spatial weight matrix element-wise with the image matrix of the input feature map to obtain the output feature map. The preset spatial mapping weight matrix is either the spatial attention matrix [C, 1], whose attention is on the image width and height, in which case the spatial weight matrix has shape [W, H, 1], or the channel attention matrix [C, C], whose attention is on the number of image chan...
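The channel attention variant of the image processing module might look like the sketch below. The abstract gives the mapping matrix shape as [C, C] and the resulting weight shape as [1, 1, C], but the text available here does not say how the width and height axes are collapsed before the mapping. Global average pooling is assumed purely for illustration, as is the sigmoid activation; the names `channel_attention` and `mapping_weights` and the random example data are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Logistic activation. Assumed here; the source only says 'activation function'."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feature_map, mapping_weights):
    """Apply channel attention to a [W, H, C] feature map.

    feature_map:     array of shape [W, H, C]
    mapping_weights: array of shape [C, C] (the channel attention matrix)
    Returns the output feature map [W, H, C] and the channel weights [1, 1, C].
    """
    # ASSUMPTION: collapse width and height by global average pooling -> [1, 1, C]
    pooled = feature_map.mean(axis=(0, 1), keepdims=True)
    # Mix the pooled channel statistics: [1, 1, C] @ [C, C] -> [1, 1, C]
    scores = pooled @ mapping_weights
    # Activate the scores to obtain one weight per channel
    channel_weights = sigmoid(scores)
    # Element-wise multiply; the two leading axes of size 1 broadcast over W and H
    return feature_map * channel_weights, channel_weights


# Hypothetical example data: an 8x6 feature map with 3 channels
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 6, 3))
mapping_weights = rng.standard_normal((3, 3))

output, channel_weights = channel_attention(feature_map, mapping_weights)
print(output.shape, channel_weights.shape)
```

Any other reduction that yields a [1, 1, C] array (for example global max pooling) would fit the stated shapes equally well.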

Abstract

The invention provides an attention-model-based image recognition method and system. The method first obtains an input feature map whose image matrix has shape [W, H, C], where W is the width, H is the height, and C is the number of channels. It then spatially maps the input feature map using a preset spatial mapping weight matrix, applies an activation function to obtain a spatial weight matrix, and multiplies the spatial weight matrix element-wise with the image matrix of the input feature map to obtain an output feature map. The preset spatial mapping weight matrix is either a spatial attention matrix [C, 1], whose attention is on the image width and height, in which case the spatial weight matrix has shape [W, H, 1], or a channel attention matrix [C, C], whose attention is on the number of image channels, in which case the spatial weight matrix has shape [1, 1, C]. The method makes feature extraction more targeted and thereby strengthens the extraction of local image features.
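The shapes in the abstract can be summarized in notation. The symbols below (X for the input feature map, M_s and M_c for the two mapping matrices, A_s and A_c for the weight matrices, f for the activation function, Y for the output) are introduced here for illustration and do not appear in the source. The reduction g, which collapses the width and height axes in the channel case, is a hypothetical placeholder, since the abstract does not specify that step.

```latex
\begin{aligned}
X &\in \mathbb{R}^{W \times H \times C} \\[4pt]
\text{Spatial attention:}\quad
M_s &\in \mathbb{R}^{C \times 1}, &
A_s &= f(X M_s) \in \mathbb{R}^{W \times H \times 1}, &
Y_{w,h,c} &= (A_s)_{w,h,1}\, X_{w,h,c} \\[4pt]
\text{Channel attention:}\quad
M_c &\in \mathbb{R}^{C \times C}, &
A_c &= f\bigl(g(X)\, M_c\bigr) \in \mathbb{R}^{1 \times 1 \times C}, &
Y_{w,h,c} &= (A_c)_{1,1,c}\, X_{w,h,c}
\end{aligned}
```

In both cases the output keeps the input shape [W, H, C]; only the axis along which the weights vary differs.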

Description

Technical field

[0001] The present invention relates to the technical field of image processing, and in particular to an image recognition method and system based on an attention model.

Background

[0002] In recent years, deep learning has been widely used in video and image processing, speech recognition, natural language processing, and other related fields. However, when handling specific image classification or speech recognition tasks, the diversity of the input data means that a model can only capture the global information of the data while ignoring its local information. Taking image classification as an example, some traditional solutions manually divide the image into multiple regions and capture the local information of the data in the form of a spatial pyramid. Because these solutions depend on manually segmented regions, they generalize poorly to different data.

Summary of the invention

[0003] The purpose of the presen...

Application Information

IPC(8): G06K9/62, G06N3/04
CPC: G06N3/048, G06N3/045, G06F18/213, G06F18/24, G06N3/08, G06V10/454, G06V10/82, G06F18/24133
Inventor: 张志伟, 杨帆
Owner: BEIJING DAJIA INTERNET INFORMATION TECH CO LTD