The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
 As shown in Figure 1, a visual saliency detection method based on self-learning features and matrix low-rank restoration is implemented in the following hardware environment: an Intel(R) Core(TM) i5 CPU at 3.2 GHz, 8 GB of memory, and a graphics card with 1 GB of video memory; the software environment is Matlab R2014b on Windows 7. The original image selected for the experiment is a color picture with a resolution of 681×511, shown in the upper left of Figure 1. The specific implementation steps of the present invention are as follows:
 1. Acquire self-learning features:
 1. Preprocessing:
 1) Image scaling: the original image of size k×g is scaled uniformly with scaling ratio a; the scaled image size is ak×ag. Here k and g are positive integers, 0 < a < 1, ak = round(k×a), ag = round(g×a), and round(·) denotes the rounding operation.
 To avoid an overly long computation time that would hurt operating efficiency, the image must first be scaled down before feature-template self-learning. As shown in the upper left of Figure 1, the RGB color image (reproduced here in black and white) has an original resolution of 681×511; the scaling ratio selected in the experiment is 0.14, giving a scaled image resolution of 95×72.
 2) Image blocking: from the upper left corner to the lower right corner of the scaled image, a b×b sliding window sequentially extracts b×b image blocks; each image block overlaps its horizontally and vertically adjacent blocks by 50% of its area, and overlaps its diagonally (±45°) adjacent blocks by 25%. All extracted image blocks are converted into column vectors and combined, in extraction order, into an image block vector matrix, denoted X = [x_1, x_2, ..., x_N] ∈ R^{m×N}. Here x_i, i ∈ [1, N], is the column vector corresponding to the i-th image block; N is the number of image blocks x_i; m is the dimension of each block vector x_i, m = b×b×c, where c is the number of image channels, b ≥ 4, and b must be even. For example, if the input image is an RGB color image, then c = 3; if it is a grayscale image, then c = 1.
 An 8×8 sliding window is used to extract overlapping blocks from the scaled image, sliding from the upper left corner to the lower right corner. Starting at the left edge of the first row, the window moves 4 pixels to the right at each step, extracting one 8×8 color image block per position. The scaled image is 95 pixels wide; after 22 blocks have been extracted, only 3 pixels remain in the first row, so the final step of the first row moves only 3 pixels, giving 23 8×8 color image blocks in the first row. The window is then moved down 4 pixels and the second row is scanned from the left edge of the image in the same way, and so on until the lower right corner is reached. The scaled image is 72 pixels high, so the sliding window produces 17 rows of blocks, for a total of 23×17 = 391 8×8 color image blocks. Since the window moves only 4 pixels at a time horizontally and vertically, any image block overlaps its horizontally and vertically adjacent blocks by 50%, and overlaps its ±45° diagonal neighbors by 25%.
 Each 8×8 color image block x_i is converted into a column vector. Each color image block comprises the three channels R, G, and B; each channel is reshaped into an 8×8 = 64-dimensional column vector, and the three 64-dimensional column vectors are concatenated in R, G, B order into a 64×3 = 192-dimensional column vector.
 The column vectors converted from all image blocks x_i are combined into an image block vector matrix, in order from the upper left corner to the lower right corner of the image, denoted X = [x_1, x_2, ..., x_391] ∈ R^{192×391}, where x_i ∈ R^192, i ∈ [1,391], is the column vector corresponding to the i-th image block (hereafter x_i directly denotes the i-th image block).
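The scaling, blocking, and vectorization steps above can be sketched as follows. This is an illustrative NumPy version (the function names are mine, not from the embodiment), not the Matlab code used in the experiment:

```python
import numpy as np

def window_starts(length, b=8, step=4):
    """Window start positions along one axis: 4-pixel steps, plus a final
    shorter move when pixels remain (e.g. 3 px in a 95-pixel-wide row)."""
    s = list(range(0, length - b + 1, step))
    if s[-1] != length - b:
        s.append(length - b)
    return s

def block_matrix(img, b=8, step=4):
    """img: (height, width, channels) array -> X of shape (b*b*c, N).
    Blocks are taken from the upper left to the lower right corner."""
    h, w, c = img.shape
    cols = []
    for y0 in window_starts(h, b, step):
        for x0 in window_starts(w, b, step):
            block = img[y0:y0 + b, x0:x0 + b, :]
            # concatenate the per-channel 64-vectors in R, G, B order
            cols.append(np.concatenate([block[:, :, k].ravel() for k in range(c)]))
    return np.stack(cols, axis=1)
```

For a 95×72 RGB input this yields the 23×17 = 391 blocks and the 192×391 matrix X described above.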
 2. Self-learning feature extraction:
 1) Self-learning of feature extraction template:
 Taking the image block vector matrix X obtained from the overlapping blocking step above as the training sample set, the adaptive feature extraction template W = [w_1, w_2, ..., w_n] ∈ R^{192×n} is obtained by solving the following objective function minimization problem:
 
 min_{W, α_i} Σ_{i=1}^{391} ( (1/2)||x_i − Wα_i||_2^2 + 0.1||α_i||_1 )    (1)
 Here n is the number of basis vectors in the feature extraction template W, set to 300; ||·||_1 and ||·||_2 denote the 1-norm and 2-norm operations respectively; α_i is an intermediate variable of the calculation whose initial value is set by a random number; and 0.1 is a compromise parameter balancing the reconstruction error (the first term of Equation (1)) against sparsity (the second term of Equation (1)). Equation (1) is solved with the mexTrainDL function of the SPArse Modeling Software (http://spams-devel.gforge.inria.fr/downloads.html).
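As a rough stand-in for the SPAMS mexTrainDL call, the objective of Equation (1) can be approximated by alternating sparse coding (ISTA) with least-squares dictionary updates. This NumPy sketch is illustrative only; mexTrainDL uses a different (online) algorithm, and the function names here are assumptions:

```python
import numpy as np

def soft_threshold(Z, t):
    # Proximal operator of t * ||.||_1 (elementwise soft-thresholding)
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def learn_template(X, n=300, lam=0.1, outer=5, inner=20, seed=0):
    """Approximate min over W, alpha_i of sum_i (1/2)||x_i - W a_i||_2^2 + lam*||a_i||_1."""
    m, N = X.shape
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((m, n))
    W /= np.linalg.norm(W, axis=0, keepdims=True)      # unit-norm basis vectors
    A = np.zeros((n, N))                               # sparse codes alpha_i
    for _ in range(outer):
        # Sparse coding step: ISTA with step 1/L, L = ||W||_2^2
        lip = max(np.linalg.norm(W, 2) ** 2, 1e-12)
        for _ in range(inner):
            A = soft_threshold(A - (W.T @ (W @ A - X)) / lip, lam / lip)
        # Dictionary update step: regularized least squares, then renormalize columns
        W = X @ A.T @ np.linalg.inv(A @ A.T + 1e-8 * np.eye(n))
        W /= np.maximum(np.linalg.norm(W, axis=0, keepdims=True), 1e-12)
    return W, A
```

With the 192×391 matrix X of the experiment, `learn_template(X)` would return a 192×300 template W, matching the n = 300 setting above.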
 2) Obtain the self-learning feature matrix:
 After the feature extraction template W is determined, the feature vector f_i of any image block x_i is obtained by convolving the image block x_i with the basis vectors of the feature extraction template W:
 
 f_i = x_i ** W    (2)
 
 Here ** denotes the convolution operation, f_i ∈ R^300, i ∈ [1,391]. The feature vectors of all image blocks together form the feature matrix F = [f_1, f_2, ..., f_391] ∈ R^{300×391}, which is the self-learning feature matrix of the input image in Figure 1.
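Since every basis vector w_j has the same 192-dimensional support as a block vector x_i, the block-wise "convolution" of Equation (2) amounts to one inner product per basis vector, i.e. F = Wᵀ X. This equivalence is my reading of the notation, not something the text states explicitly:

```python
import numpy as np

def feature_matrix(X, W):
    """X: (m, N) block vectors, W: (m, n) template -> F: (n, N).

    F[j, i] = <w_j, x_i>, the response of block i to basis vector j.
    """
    return W.T @ X
```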
 2. Matrix low-rank restoration based on self-learning features:
 Exploiting sparsity, the self-learning feature matrix F obtained in the previous step can be decomposed as:
 F=L+S (3)
 In the above formula, L represents a low-rank matrix and S represents a sparse matrix, which can be expressed as:
 L = [l_1, l_2, ..., l_391] ∈ R^{300×391}    (4)
 S = [s_1, s_2, ..., s_391] ∈ R^{300×391}    (5)
 Here l_1, l_2, ..., l_391 are the columns of the low-rank matrix L, each of length 300, and s_1, s_2, ..., s_391 are the columns of the sparse matrix S, each of length 300. For the input image, L represents the background region, whose features are strongly correlated, while S represents the salient region of the image. The low-rank matrix L and the sparse matrix S are solved by matrix low-rank restoration, that is, by the following objective function minimization problem:
 
 (L*, S*) = argmin_{L,S} ||L||_* + λ||S||_1,  subject to  F = L + S    (6)
 
 where λ > 0 is a trade-off parameter.
 Here L* ∈ R^{300×391} and S* ∈ R^{300×391} are the solutions for the low-rank matrix L and the sparse matrix S respectively; ||·||_* denotes the nuclear norm operation and ||·||_1 the 1-norm operation. Equation (6) can be solved with the ALM (Augmented Lagrange Multiplier) algorithm (Zhouchen Lin, Minming Chen, and Yi Ma. The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices. arXiv preprint arXiv:1009.5055, 2010).
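A compact inexact-ALM sketch of the recovery in Equation (6), in NumPy rather than the cited Matlab ALM code. The default λ = 1/√max(m, n) and the μ schedule are common choices from the RPCA literature, not values given in the text:

```python
import numpy as np

def shrink(M, t):
    # Elementwise soft-thresholding (proximal operator of t * ||.||_1)
    return np.sign(M) * np.maximum(np.abs(M) - t, 0.0)

def rpca_ialm(F, lam=None, tol=1e-7, max_iter=500):
    """Inexact ALM for min ||L||_* + lam*||S||_1  s.t.  F = L + S."""
    m, n = F.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    norm2 = np.linalg.norm(F, 2)
    mu = 1.25 / norm2
    Y = F / max(norm2, np.abs(F).max() / lam)      # dual variable init
    L = np.zeros_like(F)
    S = np.zeros_like(F)
    for _ in range(max_iter):
        # L-step: singular value thresholding of F - S + Y/mu
        U, sig, Vt = np.linalg.svd(F - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # S-step: elementwise soft-thresholding
        S = shrink(F - L + Y / mu, lam / mu)
        R = F - L - S                              # constraint residual
        Y = Y + mu * R                             # dual ascent
        mu *= 1.5                                  # penalty growth
        if np.linalg.norm(R) <= tol * np.linalg.norm(F):
            break
    return L, S
```

Applied to the 300×391 feature matrix F, this returns the pair (L*, S*) used in the next step.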
 3. Obtain visual saliency detection results:
 1. Calculate the visual saliency of any pixel:
 1) Obtain the visual saliency of any image block. After the solved sparse matrix S* has been obtained in the previous step, the 1-norm of each column of S* is computed as the visual saliency of the corresponding image block x_i:
 
 sr_i = ||s_i*||_1,  i ∈ [1,391]    (7)
 
 Here s_i* is the i-th column of the sparse matrix S*, sr_i is the saliency value of the image block x_i corresponding to that column, and ||·||_1 denotes the 1-norm operation.
 2) Obtain the visual saliency of any pixel. Since adjacent image blocks overlap by 50% during blocking, the same pixel is contained in multiple image blocks. The saliency value of any pixel is therefore obtained from the saliency values of all image blocks containing that pixel:
 
 sr_(x,y) = (1/l) Σ_{j=1}^{l} sr_j    (8)
 
 Here sr_(x,y) is the saliency value of the pixel at coordinates (x, y); l is the number of image blocks containing pixel (x, y): l = 3 at the four corners of the image, l = 5 on the image boundary excluding the corners, and l = 8 elsewhere; sr_j, j ∈ [1, l], is the saliency value of the j-th image block containing pixel (x, y).
 After the saliency values of all pixels in the image have been found, sr_(x,y) is used as the gray value of the pixel at (x, y), giving the preliminary visual saliency image SM' ∈ R^{95×72}.
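Equations (7) and (8) can be sketched as below. Instead of the fixed l = 3/5/8 counts listed above, this illustrative NumPy version averages over the blocks that actually cover each pixel, using the 23×17 block grid of the 95×72 image described earlier:

```python
import numpy as np

def block_starts(length, b=8, step=4):
    """Window start positions: 4-pixel steps plus a final shorter move."""
    s = list(range(0, length - b + 1, step))
    if s[-1] != length - b:
        s.append(length - b)
    return s

def saliency_map(S_star, width=95, height=72, b=8, step=4):
    sr = np.abs(S_star).sum(axis=0)            # Eq. (7): sr_i = ||s_i*||_1
    xs, ys = block_starts(width, b, step), block_starts(height, b, step)
    acc = np.zeros((height, width))
    cnt = np.zeros((height, width))
    k = 0
    for y0 in ys:                              # row-major, upper-left to lower-right
        for x0 in xs:
            acc[y0:y0 + b, x0:x0 + b] += sr[k]
            cnt[y0:y0 + b, x0:x0 + b] += 1
            k += 1
    return acc / cnt                           # Eq. (8): average over covering blocks
```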
 2. Post-processing:
 To obtain a better detection result, Gaussian blurring must be applied to the preliminary visual saliency image SM' obtained in the previous step:
 SM_gm = SM' ** gm    (9)
 Here gm denotes the Gaussian template and SM_gm ∈ R^{95×72} the blurred image. The standard deviation σ of the Gaussian kernel used by the template gm is 0.03 times the image width, i.e., σ = 0.03×95 = 2.85; the Gaussian template gm is square, with a side length d of about 4 times the standard deviation: d = 2.85×4 = 11.4 ≈ 11, where the approximation rule is to select the nearest odd number.
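The post-processing of Equation (9) can be sketched as below: build a square, normalized Gaussian template with σ = 0.03 × width and odd side length ≈ 4σ, then convolve. Border handling (edge replication here) is an assumption, as the text does not specify it:

```python
import numpy as np

def nearest_odd(x):
    # Approximation rule from the text: select the odd number closest to x
    lo = int(np.floor(x))
    if lo % 2 == 0:
        lo -= 1
    return lo if (x - lo) <= (lo + 2 - x) else lo + 2

def gaussian_blur(img, sigma):
    d = max(nearest_odd(4 * sigma), 1)           # template side length
    r = d // 2
    ax = np.arange(-r, r + 1)
    gm = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2.0 * sigma ** 2))
    gm /= gm.sum()                               # normalized Gaussian template
    padded = np.pad(img, r, mode="edge")
    out = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = (padded[i:i + d, j:j + d] * gm).sum()
    return out
```

For σ = 2.85 this produces the 11×11 template described above.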
 The blurred image SM_gm is scaled back to the original input image size 681×511, and the gray values of all pixels are rounded to obtain the final visual saliency image SM of size 681×511, shown in the lower right corner of Figure 1; this is the visual saliency detection result for the input image in the upper left corner of Figure 1.
 Experiments show that the method of the present invention achieves a saliency detection accuracy of 91.29%, better than other comparable saliency detection methods. Here, the saliency detection accuracy is defined as the ratio of the correctly detected salient area to the total salient area.
 The above descriptions are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.