
A convolutional neural network RGB-D saliency detection method based on multi-layer fusion

A convolutional neural network and RGB-D technology, applied in the field of convolutional neural network RGB-D saliency detection, achieving the effect of improved detection results

Inactive Publication Date: 2019-06-18
CIVIL AVIATION UNIV OF CHINA
Cites: 5 · Cited by: 18

AI Technical Summary

Problems solved by technology

[0005] However, these methods only combine the RGB image and the depth map and use a deep convolutional network to directly output the saliency map; they do not further optimize the saliency map using the depth map.

Method used



Examples


Embodiment 1

[0038] A convolutional neural network RGB-D saliency detection method based on multi-layer fusion (see Figure 1) includes the following steps:

[0039] 1. Iterative optimization of detected salient objects

[0040] The basic idea of the RGB-D saliency detection in the embodiment of the present invention is to use a recurrent convolutional neural network to iteratively optimize the detected salient objects, formalized as:

[0041] $S_t = \varphi(I, D, S_{t-1}; W)$  (1)

[0042] where φ is the network model function, I is the RGB image, D is the depth map, S_t is the saliency detection result at iteration t, and W is the set of network parameters.
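As a concrete illustration, here is a minimal PyTorch sketch of the loop in Eq. (1); the module `phi`, its call signature, and the iteration count are hypothetical, and the all-zero initialization follows the training description in Embodiment 3.

```python
import torch

def iterative_saliency(phi, rgb, depth, num_iters=3):
    """Run the refinement loop S_t = phi(I, D, S_{t-1}; W) from Eq. (1).

    `phi` is a hypothetical network module taking (rgb, depth, saliency);
    `num_iters` is illustrative -- the excerpt does not fix t.
    """
    # Embodiment 3 trains the first iteration with an all-zero saliency
    # map, so S_0 = 0 is a natural starting point.
    saliency = torch.zeros_like(depth)        # (B, 1, H, W)
    for _ in range(num_iters):
        saliency = phi(rgb, depth, saliency)  # one refinement pass S_{t-1} -> S_t
    return saliency
```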

[0043] 2. Basic Network Architecture

[0044] See Figure 1. The basic network architecture in the embodiment of the present invention is the same as the VGG16 network structure (the VGG16 network structure mainly includes five convolutional layer modules, CONV1-CONV5, and two fully connected layer modules, FC6 and FC7).
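A minimal sketch of this backbone after the fully convolutional conversion described in the abstract follows, assuming torchvision's VGG16; the kernel sizes used to recast FC6 and FC7 (7×7 and 1×1) are the standard FCN-style choice and are an assumption, since the excerpt does not specify them.

```python
import torch.nn as nn
from torchvision.models import vgg16

class FullyConvVGG16(nn.Module):
    """VGG16 backbone with FC6/FC7 recast as convolutions (illustrative sketch)."""
    def __init__(self):
        super().__init__()
        # CONV1-CONV5 (13 conv layers + 5 max-pools), as in standard VGG16.
        self.features = vgg16().features
        # FC6/FC7 recast as convolutions; the 7x7 and 1x1 kernels are the
        # usual FCN-style choice and are assumed here.
        self.fc6 = nn.Conv2d(512, 4096, kernel_size=7, padding=3)
        self.fc7 = nn.Conv2d(4096, 4096, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.features(x)           # (B, 512, H/32, W/32)
        x = self.relu(self.fc6(x))     # fully convolutional "FC6"
        return self.relu(self.fc7(x))  # fully convolutional "FC7"
```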

Embodiment 2

[0064] The scheme in Embodiment 1 is further introduced below with a specific example, in conjunction with Figure 1; see the following description for details:

[0065] In the embodiment of the present invention, when designing the network, it is necessary to consider how to effectively use the features of different scales of the convolutional neural network to capture salient objects of different scales in the image.

[0066] Specifically, the multi-layer fusion convolutional neural network designed in the embodiment of the present invention gradually fuses higher-level convolutional features into lower-level convolutional features, and finally generates a saliency map with the same resolution as the input image, namely:

[0067] 1) First, use a 3×3 convolution with 60 output channels to reduce the dimensionality of the FC7, pool4, pool3, and pool2 features (see the sketch after this step);

[0068] Through the above operations, the number of channels of the features of the corresponding...
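A minimal sketch of this top-down fusion follows, covering step 1) above; the input channel counts are the standard VGG16 values (plus 4096 for the converted FC7), and the merge by bilinear upsampling and element-wise addition is an assumption, since the excerpt truncates before describing the fusion operation itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiLayerFusion(nn.Module):
    """Reduce tapped features to 60 channels, then fuse from high to low."""
    # Channel counts follow standard VGG16 (pool2=128, pool3=256, pool4=512)
    # and the FC7 conversion sketched earlier (4096).
    def __init__(self, in_channels=(4096, 512, 256, 128)):
        super().__init__()
        # Step 1): a 3x3 convolution with 60 output channels per tapped layer.
        self.reduce = nn.ModuleList(
            nn.Conv2d(c, 60, kernel_size=3, padding=1) for c in in_channels)
        self.predict = nn.Conv2d(60, 1, kernel_size=3, padding=1)

    def forward(self, fc7, pool4, pool3, pool2, out_size):
        feats = [r(f) for r, f in zip(self.reduce, (fc7, pool4, pool3, pool2))]
        fused = feats[0]
        for low in feats[1:]:
            # Assumed merge: upsample the higher-level feature map and add
            # it to the lower-level one (the excerpt truncates here).
            fused = low + F.interpolate(fused, size=low.shape[-2:],
                                        mode="bilinear", align_corners=False)
        # Saliency map restored to the input resolution, as the text states.
        fused = F.interpolate(fused, size=out_size,
                              mode="bilinear", align_corners=False)
        return torch.sigmoid(self.predict(fused))
```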

Embodiment 3

[0085] The feasibility of the schemes in Embodiments 1 and 2 is verified below in conjunction with Figures 3 and 4; see the following description for details:

[0086] The network in the embodiment of the present invention is built according to the network structure shown in Figure 1. RGB and RGB-D image data are expanded to generate the corresponding training data sets, and network training is performed. The obtained saliency map is refined, and the network is fine-tuned after refinement.

[0087] From Figure 3 it can be found that, after the different training stages, the saliency results detected by the embodiment of the present invention are significantly improved. The result of the first iteration is the result of training with an all-zero saliency map and an all-zero depth map; it can be seen that the obtained results are incomplete, and the correct salient object cannot be obtained in the first and third rows of images. After fine-tuning the network using the depth m...



Abstract

The invention discloses a convolutional neural network RGB-D saliency detection method based on multi-layer fusion. The method comprises the following steps: converting the fully connected layer modules FC6 and FC7 of a VGG16 network into fully convolutional layers and combining them with the convolutional layers CONV1-CONV5 to form a new convolutional neural network; sequentially carrying out dimension reduction and fusion operations in the new convolutional neural network to obtain an initial saliency detection result; refining the initial saliency detection result by iterative optimization; adopting different training data to sequentially carry out initialization training and first and second fine-tuning training on the new convolutional neural network; and, for the trained result, using the minimum bounding box of the salient object to crop and mirror-flip the input image to obtain the final saliency detection result. The method designs an effective CNN model, fuses RGB and depth information, captures multi-scale features of the salient object, and fuses convolutional features from higher layers into lower layers, thereby addressing the scale problem of the salient object.
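As a rough illustration of the final refinement step in the abstract (cropping the input to the salient object's minimum bounding box and mirror-flipping it), a minimal NumPy sketch follows; the function name, the binarization threshold, and the empty-mask fallback are assumptions not given in the text.

```python
import numpy as np

def crop_and_flip(image, saliency, thresh=0.5):
    """Crop to the salient object's minimum bounding box and mirror-flip.

    image: H x W x C array; saliency: H x W array in [0, 1].
    `thresh` is an assumed binarization threshold; the abstract gives none.
    """
    ys, xs = np.where(saliency > thresh)
    if ys.size == 0:
        crop = image                 # no salient pixels: keep the full image
    else:
        crop = image[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    flipped = crop[:, ::-1]          # horizontal mirror of the crop
    return crop, flipped
```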

Description

Technical Field

[0001] The invention relates to the field of RGB-D saliency detection, in particular to a convolutional neural network RGB-D saliency detection method based on multi-layer fusion.

Background Technique

[0002] In recent years, due to the development of depth acquisition equipment and the close relationship between depth information and salient objects, depth information has attracted researchers' attention in image saliency detection. Depth information helps to distinguish foreground objects from backgrounds with similar colors, leading to better saliency detection results.

[0003] For some images, existing saliency detectors cannot obtain good saliency detection results, and the main reason for the failure of RGB saliency detectors is that RGB alone cannot provide effective foreground-background discrimination. However, some current saliency detection algorithms only use depth information as an additional feature to calculate the distan...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06T7/00, G06T7/50, G06N3/04
Inventors: 黄睿, 周末
Owner: CIVIL AVIATION UNIV OF CHINA