A video desnowing method and apparatus based on tensor dimensionality reduction and Gaussian scale sparse coding

By employing tensor dimensionality reduction and Gaussian-scale sparse coding, combined with robust principal component analysis and sparse coding techniques, the problems of incomplete snowflake removal and artifacts in traditional methods are solved, achieving effective snowflake removal while preserving the moving foreground in snowfall videos.

CN118096572B · Active · Publication Date: 2026-06-30 · XINJIANG UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
XINJIANG UNIVERSITY
Filing Date
2024-03-05
Publication Date
2026-06-30


Abstract

This invention discloses a video desnowing method and apparatus based on tensor dimensionality reduction and Gaussian scale sparse coding. The method includes: using a vectorization operation to transform the two-dimensional data of one frame in a given channel into a column vector with x elements, thereby reducing the single-channel three-dimensional video stream to a set of two-dimensional data; linearly combining the dimension-reduced video data of the three channels into a two-dimensional matrix, and using alternating optimization to approximate the minimization of the Lagrangian function to obtain the processed video; forming a set from the areas of all continuous regions, and adaptively adjusting the threshold of each frame according to the area of each region; when the threshold and the area of the continuous regions satisfy the augmented Lagrangian function, retaining only the moving foreground in the mask image; and using sparse coding to extract image blocks and restore the image at the image-block level, where a set of m similar blocks is sparsely coded simultaneously based on the GSM model to restore the foreground. The apparatus includes a processor and a memory.

Description

Technical Field

[0001] This invention relates to the field of video image desnowing, and more particularly to a video desnowing method and apparatus based on tensor dimensionality reduction and Gaussian scale sparse coding.

Background Technology

[0002] With the rapid development of computer vision technology, numerous applications have emerged in fields such as military, transportation, and urban security, including target tracking, target recognition, pose estimation, person re-identification, and scene segmentation. All of these applications require high-resolution video footage. Videos captured by outdoor surveillance equipment are severely affected by inclement weather, resulting in decreased video quality. Natural weather phenomena can be broadly categorized into dynamic and static weather, each with different impacts. Firstly, dynamic weather phenomena such as snowflakes, raindrops, and hail occur under conditions of snowfall, rain, and hail, appearing randomly in space. Secondly, static weather phenomena, such as heavy fog, smog, and sandstorms, severely affect the overall visibility of videos, causing visual systems to be unable to detect moving targets. Therefore, restoring video quality under adverse weather conditions has become a crucial research topic.

[0003] Currently, work on restoring video quality under dynamic weather conditions focuses mainly on the removal of rain streaks [1,2,3], and research on snowflake removal is relatively limited. Rain streaks and snowflakes share similar characteristics: high brightness and fast falling speed. Some researchers have therefore applied video deraining algorithms to video desnowing [4,5], but they considered only the similarity between rain streaks and snowflakes, not their differences. The physical properties of rain and snow are very different. For example, reference [6] describes a photometric model of rain that treats a falling raindrop as a line and assumes that rain streaks at different distances from the camera have the same size and falling speed. However, in videos taken by outdoor imaging devices, the movement direction of snowflakes is random due to air resistance, and the movement speed of each snowflake is also different. Methods for removing rain streaks therefore cannot be used directly to remove snowflakes, and snowflake removal algorithms specific to snowfall videos need to be studied. Snowfall conditions are more complex than rainfall conditions, which makes the scenes in the video more complex and poses further challenges for video snow removal methods.

[0004] Previous studies show that some snow removal methods consider only the luminance characteristics of snowflakes when processing snowfall videos and neglect chrominance features. These methods convert the video from RGB channels to YCbCr channels for post-processing, assuming that snowflake information exists mainly in the luminance channel Y. However, the chrominance channels Cb and Cr still contain a significant amount of snowflake information. Ignoring the chrominance information of the video leads to artifacts in the snowflake area when processing videos of heavy snow scenes. Furthermore, previous methods have not adequately considered the presence of moving objects in the video, and have not proposed effective methods for removing snowflakes that obscure the moving foreground. Researching a method that can effectively remove snowflakes while preserving the moving foreground therefore has practical significance and application value.

[0005] References

[0006] [1] T.-X. Jiang, T.-Z. Huang,

[0007] [2] J. Liu, W. Yang, S. Yang, and Z. Guo, "D3R-Net: Dynamic Routing Residue Recurrent Network for Video Rain Removal," IEEE Trans. on Image Process., vol. 28, no. 2, pp. 699-712, Feb. 2019.

[0008] [3] J. Chen, C.-H. Tan, J. Hou, L.-P. Chau, and H. Li, "Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework," in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, Jun. 2018, pp. 6286-6295.

[0009] [4] J.-H. Kim, J.-Y. Sim, and C.-S. Kim, "Video Deraining and Desnowing Using Temporal Correlation and Low-Rank Matrix Completion," IEEE Trans. on Image Process., vol. 24, no. 9, pp. 2658-2670, Sep. 2015.

[0010] [5] W. Ren, J. Tian, Z. Han, A. Chan, and Y. Tang, "Video Desnowing and Deraining Based on Matrix Decomposition," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, Jul. 2017, pp. 2838-2847.

[0011] [6] K. Garg and S. K. Nayar, "Detection and removal of rain from videos," in Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004., Washington, DC, USA, 2004, vol. 1, pp. 528-535.

Summary of the Invention

[0012] This invention provides a video snow removal method and apparatus based on tensor dimensionality reduction and Gaussian scale sparse coding. This invention effectively removes snowflakes from videos using robust principal component analysis based on tensor dimensionality reduction, and effectively removes snowflakes that obscure moving foregrounds, thereby obtaining clear, detailed, and realistic snow-free videos. See the description below for details:

[0013] A first aspect is a video desnowing method based on tensor dimensionality reduction and Gaussian scale sparse coding, the method comprising:

[0014] Using the vectorization operation, the two-dimensional data of one frame in a given channel is transformed into a column vector with x elements, reducing the single-channel three-dimensional video stream to a set of two-dimensional data.

[0015] The dimension-reduced video data of the three channels are linearly combined into a two-dimensional matrix, and alternating optimization is used to approximate the minimization of the Lagrangian function to obtain the processed video.

[0016] The areas of all continuous regions form a set, and the threshold of each frame is adaptively adjusted based on the area of each region.

[0017] When the threshold and the area of the continuous region satisfy the augmented Lagrangian function, only the moving foreground is retained in the mask image;

[0018] Sparse coding is used to extract image patches and restore the image at the patch level. Considering a set of m similar patches, the set of similar patches is sparsely coded simultaneously based on the GSM model to restore the foreground.

[0019] The vectorization operation rule is as follows:

[0020]

[0021] The two-dimensional data is:

[0022]

[0023] where w is the image width and h is the image height; […] is the x-th data element of the t-th frame of the color video, and […] represents all the data of the t-th frame of the color video.

[0024] The threshold for each frame is adaptively adjusted based on the area of each region as follows:

[0025]

[0026] When the threshold $\psi$ and the area of the continuous region satisfy the augmented Lagrangian function $\mathrm{Area}(R_p)$, only the moving foreground is retained in the mask image:

[0027]

[0028] where $R_p$ represents a region of pixels in the binary mask, and $\mathrm{Area}(\cdot)$ is the operation that calculates the area of a contiguous region. The areas of all contiguous regions $R_p$ form a set $A = \{\mathrm{Area}(R_1), \ldots, \mathrm{Area}(R_p)\}$, with $\mathrm{Area}(R_p) \ll w \times h$.

[0029] The sparse encoding of the similar block set based on the GSM model is as follows:

[0030]

[0031] In the formula, $X = [x_1, \ldots, x_m]$ represents a set of m similar blocks, and $A = \Lambda G$ is the group representation of the sparse coefficients in the GSM model, whose first-order and second-order statistics are represented by […] and […]. $x_m$ represents the m-th similar block, $\Lambda$ is a diagonal matrix representing the variance field of the selected image, $\gamma_m$ represents the m-th observation matrix, $g_m$ represents the m-th Gaussian matrix, and B represents the reconstruction matrix. θ represents the scalar of noise variance, θ represents the scalar of sparsity, ε represents the sparsity bias, and Φ represents the overcomplete dictionary.

[0032] A second aspect is a video desnowing device based on tensor dimensionality reduction and Gaussian scale sparse coding, the device comprising:

[0033] The transformation and composition module is used to convert a two-dimensional frame data vector of one frame size in a certain channel into a column vector with x elements using vectorization algorithms, thereby reducing the dimensionality of a single channel's three-dimensional video stream into a set of two-dimensional data.

[0034] The processing module is used to linearly combine the three channels of video data after dimensionality reduction into a two-dimensional matrix, and then use alternating optimization to approximate the minimization of the Lagrange function to obtain the processed video.

[0035] The adjustment module is used to form a set from the areas of all continuous regions, and to adaptively adjust the threshold of each frame based on the area of each region.

[0036] A retention module is used to retain only the moving foreground in the mask image when the threshold and the area of the continuous region satisfy the augmented Lagrangian function.

[0037] The restoration module is used to extract image patches using sparse coding and restore the image at the image patch level. It considers a set of m similar patches and performs sparse coding on the set of similar patches simultaneously based on the GSM model, thereby restoring the foreground.

[0038] A third aspect is a video desnowing device based on tensor dimensionality reduction and Gaussian scale sparse coding, the device comprising a processor and a memory, the memory storing program instructions, and the processor calling the program instructions stored in the memory to cause the device to perform the method described in the first aspect.

[0039] A fourth aspect is a computer-readable storage medium storing a computer program, the computer program including program instructions that, when executed by a processor, cause the processor to perform the method described in the first aspect.

[0040] The beneficial effects of the technical solution provided by this invention are:

[0041] 1. This invention proposes a robust principal component analysis method based on tensor dimensionality reduction. Compared with the traditional robust principal component analysis method, it obtains cleaner background information and avoids snow artifacts without increasing time complexity.

[0042] 2. When processing snowflakes in front of a moving foreground, this method does not need to decompose the moving object region into a clean local background and sparse snowflakes. Instead, it treats snowflake removal in that region as a denoising problem and solves it with sparse coding. This preserves the detailed features of the moving foreground well and avoids motion blur of the object.

[0043] 3. This invention proposes an outlier detection method based on L0 regularization and an adaptive threshold segmentation strategy to accurately separate moving targets from large snowflakes, thereby obtaining a more accurate moving foreground.

Attached Figure Description

[0044] Figure 1 is a schematic diagram of a video desnowing method based on tensor dimensionality reduction and Gaussian scale sparse coding;

[0045] Based on the description of the snowflake removal problem in the snowfall video, the background is first restored to obtain a clean, snow-free background; then, snowflake components and moving objects are extracted from the original video, and large snowflakes are removed from these dynamic components in the process; finally, local small snowflakes that obscure the moving foreground are removed to obtain a clean moving foreground.

[0046] Figure 2 is the flowchart of the TDR-RPCA scheme;

[0047] Figure 3 is a comparative illustration of the snow removal effect on "pedestrians" in a synthetic snowfall video;

[0048] Among them, (a) is the original video; (b) is the synthesized snowfall video; and (c) is the video processed by this algorithm.

[0049] Figure 4 is a comparative illustration of the snow removal effect on a real snowfall video of a "forest" without a moving foreground;

[0050] Among them, (a) is the original video; (b) is the video processed by this algorithm.

[0051] Figure 5 is a comparative illustration of the snow removal effect on "pedestrians" in a real snowfall video with a moving foreground;

[0052] Among them, (a) is a real snowfall video with a moving foreground; (b) is the video processed by this algorithm.

Detailed Implementation

[0053] To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention will be described in further detail below.

[0054] Example 1

[0055] I. Background Estimation Based on Tensor Dimensionality Reduction

[0056] Color video is a four-dimensional dataset, but traditional robust principal component analysis (RPCA) can only process two-dimensional data. The traditional RPCA is therefore improved to make it applicable to four-dimensional video data.

[0057] Assume the snowfall video is a video sequence composed of t frames of color images, in which each frame of the color video is a three-dimensional data array. When the imaging device is in a stable state, the background in the video remains stable, while the moving components in each frame of the color video exhibit temporal correlation.

[0058] First, consider a single channel of the video. Using the vectorization operation vec(·), the two-dimensional data of one frame, of size w×h, in a given channel is transformed into a column vector with x elements (usually x >> t). Each frame is transformed according to this rule, which reduces the single-channel three-dimensional video stream to a set of two-dimensional data, as shown below:

[0059]

[0060] In formula (1), the superscript 'c' represents the color channels of the color video. Dimensionally reducing the entire color video yields the dimensionality reduction matrices for the three channels, which are expressed as follows:

[0061] and

[0062] The following two-dimensional matrix is obtained by linearly combining the video data from the three channels after dimensionality reduction:

[0063]
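As an illustration of the dimensionality reduction described above, the following minimal NumPy sketch flattens each w×h frame of a channel into a column of x = w·h elements and places the t columns side by side. The function names (`vec`, `reduce_channel`, `reduce_color_video`) are hypothetical, and stacking the three channel matrices vertically is an assumption: the text only states that the channels are linearly combined into one two-dimensional matrix.

```python
import numpy as np


def vec(frame: np.ndarray) -> np.ndarray:
    """Flatten one single-channel frame of shape (h, w) into a vector of x = h*w elements."""
    return frame.reshape(-1)


def reduce_channel(video_c: np.ndarray) -> np.ndarray:
    """Turn a single-channel stream of shape (t, h, w) into an (x, t) matrix, one column per frame."""
    t = video_c.shape[0]
    return np.stack([vec(video_c[k]) for k in range(t)], axis=1)


def reduce_color_video(video: np.ndarray) -> np.ndarray:
    """Turn a color video of shape (t, h, w, 3) into one 2-D matrix.

    ASSUMPTION: the three per-channel matrices are stacked vertically, giving shape (3x, t).
    """
    channels = [reduce_channel(video[..., c]) for c in range(video.shape[-1])]
    return np.concatenate(channels, axis=0)


# Tiny example: t = 4 frames, h = 2, w = 3, 3 channels, so x = 6.
video = np.arange(4 * 2 * 3 * 3, dtype=float).reshape(4, 2, 3, 3)
O = reduce_color_video(video)
print(O.shape)  # (18, 4)
```

Each column of the result holds one whole frame, so the temporal axis becomes the column axis and a static background shows up as nearly identical columns.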

[0064] The schematic diagram of tensor dimensionality reduction is shown in Figure 2. The clean background B in the observed video is regarded as the low-rank component, and the motion component D as the sparse component. The background B is constrained using the nuclear norm, and the motion component D is constrained using the L1 norm. The minimization problem is formulated as follows:

[0065]

[0066] s.t. $O = B + D$
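The objective of formula (3) is not reproduced in this text. Based only on the description (nuclear norm on B, L1 norm on D, constraint O = B + D), a standard robust PCA form would look as follows. The weight $\lambda$ is an assumption, and the augmented Lagrangian is written with a multiplier $Y$ and penalty $\rho$ so that its sign convention agrees with the multiplier update in formula (4); it is a sketch, not necessarily the exact expression of the patent.

```latex
\min_{B,\,D}\ \|B\|_{*} + \lambda \|D\|_{1}
\quad \text{s.t.}\quad O = B + D

\mathcal{L}_{\rho}(B, D, Y) = \|B\|_{*} + \lambda \|D\|_{1}
  + \langle Y,\ B + D - O \rangle
  + \frac{\rho}{2}\,\|B + D - O\|_{F}^{2}
```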

[0067] Since formula (3) is a convex optimization problem formed by the linear superposition of two terms, it is difficult to solve directly using the augmented Lagrange multiplier method (ALM). An alternating optimization method is therefore used to approximate the minimization of the Lagrangian function. In this embodiment of the invention, the alternating direction method of multipliers (ADMM) is used, which is applicable to convex optimization problems with two or more terms. It iterates alternately, minimizing the objective function over B with D fixed and then over D with B fixed. The iteration process is as follows:

[0068]

[0069]

[0070] $Y_{k+1} = Y_k + \rho_k \left(B_{k+1} + D_{k+1} - O\right) \quad (4)$

[0071] where […] is the augmented Lagrange matrix and […] is the reconstructed augmented Lagrangian function.

[0072] Using the soft-threshold shrinkage operator and singular value decomposition, each step of formula (4) is equivalent to:

[0073]

[0074]

[0075]
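The closed-form update steps are not reproduced in this text. The sketch below shows how such an iteration is commonly implemented: singular value thresholding for the low-rank part B, elementwise soft thresholding for the sparse part D, and the multiplier update of formula (4). The function names, the default weight `lam = 1/sqrt(max(m, n))`, the initial penalty, and the growth factor are conventional choices assumed here, not values taken from the patent.

```python
import numpy as np


def soft_threshold(M: np.ndarray, tau: float) -> np.ndarray:
    """Elementwise soft-threshold shrinkage: sign(M) * max(|M| - tau, 0)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)


def singular_value_threshold(M: np.ndarray, tau: float) -> np.ndarray:
    """Shrink the singular values of M by tau (proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * soft_threshold(s, tau)) @ Vt


def rpca_admm(O, lam=None, rho=None, rho_growth=1.5, tol=1e-7, max_iter=500):
    """Split O into low-rank B and sparse D with O ~= B + D (hypothetical helper)."""
    O = np.asarray(O, dtype=float)
    m, n = O.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam   # assumed default weight
    rho = 1.25 / np.linalg.norm(O, 2) if rho is None else rho  # assumed initial penalty
    B = np.zeros_like(O)
    D = np.zeros_like(O)
    Y = np.zeros_like(O)
    norm_O = np.linalg.norm(O, "fro")
    for _ in range(max_iter):
        # B-step: minimize the augmented Lagrangian over B with D and Y fixed.
        B = singular_value_threshold(O - D - Y / rho, 1.0 / rho)
        # D-step: minimize over D with the new B and Y fixed.
        D = soft_threshold(O - B - Y / rho, lam / rho)
        # Multiplier step, same sign convention as formula (4).
        residual = B + D - O
        Y = Y + rho * residual
        rho *= rho_growth
        if np.linalg.norm(residual, "fro") <= tol * norm_O:
            break
    return B, D


# Usage on synthetic data: a rank-2 "background" plus a few large sparse outliers.
rng = np.random.default_rng(0)
background = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 40))
outliers = np.zeros((60, 40))
outliers[rng.random((60, 40)) < 0.05] = 10.0
O_demo = background + outliers
B_demo, D_demo = rpca_admm(O_demo)
print(np.linalg.norm(B_demo + D_demo - O_demo))  # small constraint residual
```

Growing the penalty at every iteration is one common way to speed up convergence; a fixed penalty also works but usually needs more iterations.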

[0076] After obtaining a clean background B through TDR-RPCA, the stacked matrix of the form $B^c$, $c \in \{Y, U, V\}$, needs to be restored to a four-dimensional tensor video sequence through the inverse process of vec(·) to obtain a clean video.
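The inverse of the vectorization can be sketched as a reshape. The example below assumes that the channel matrices were stacked vertically and that each frame was flattened in row-major order; both are assumptions, and `restore_video` is a hypothetical helper name.

```python
import numpy as np


def restore_video(stacked: np.ndarray, h: int, w: int, channels: int = 3) -> np.ndarray:
    """Map a (channels*h*w, t) matrix back to a video tensor of shape (t, h, w, channels).

    ASSUMPTION: channel blocks are stacked vertically and frames were flattened row-major.
    """
    x = h * w
    t = stacked.shape[1]
    video = np.empty((t, h, w, channels), dtype=stacked.dtype)
    for c in range(channels):
        block = stacked[c * x:(c + 1) * x, :]      # (x, t) block of channel c
        video[..., c] = block.T.reshape(t, h, w)   # each column becomes one h-by-w frame
    return video


# Round trip: flatten a small random video, then restore it.
rng = np.random.default_rng(1)
original = rng.random((5, 4, 6, 3))                # t = 5, h = 4, w = 6
stacked = np.concatenate(
    [original[..., c].reshape(5, -1).T for c in range(3)], axis=0
)
restored = restore_video(stacked, h=4, w=6)
print(np.array_equal(restored, original))          # True
```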

[0077] II. Outlier Detection Based on Adaptive Thresholding

[0078] In a video sequence, the foreground is defined as all moving objects whose state differs from that of the background. The moving foreground exhibits non-rigid motion in the video sequence. Because of the pixel brightness changes that it causes, the moving foreground cannot be treated as local background and restored with a low-rank background model.

[0079] To remove snowflakes in the mask, an embodiment of the present invention proposes an adaptive thresholding method that determines whether a region consists of snowflake pixels according to the area of the region in the mask image. Generally speaking, the area occupied by the moving foreground in the video is much larger than the area of the largest snowflake. Based on this condition, an adaptive thresholding scheme is designed. Two definitions are needed: $R_p$ represents a region of pixels in the binary mask, and $\mathrm{Area}(\cdot)$ is the operation that calculates the area of a continuous region. The areas of all continuous regions $R_p$ constitute a set $A = \{\mathrm{Area}(R_1), \ldots, \mathrm{Area}(R_p)\}$, with $\mathrm{Area}(R_p) \ll w \times h$. The threshold for each frame is adaptively adjusted according to the area of each region, and the rule is as follows:

[0080]

[0081] When the threshold $\psi$ and the area of the continuous region satisfy the augmented Lagrangian function $\mathrm{Area}(R_p)$, only the moving foreground is retained in the mask image:

[0082]
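The threshold rule and the retention condition are not reproduced in this text, so the following pure-Python sketch only illustrates the general idea of area-based filtering of a binary mask. The rule `psi = ratio * max(A)` and the comparison `area >= psi` are hypothetical stand-ins chosen for illustration, as are the function names and the use of 4-connectivity.

```python
from collections import deque


def connected_regions(mask):
    """Return the 4-connected regions of a binary mask, each as a list of (row, col) pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                queue, region = deque([(r, c)]), []
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    region.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                regions.append(region)
    return regions


def keep_foreground(mask, ratio=0.5):
    """Keep only regions whose area reaches a per-frame threshold psi.

    ASSUMPTION: psi = ratio * (largest region area); the patent's actual rule is not shown here.
    """
    regions = connected_regions(mask)
    areas = [len(region) for region in regions]        # the set A of region areas
    cleaned = [[0] * len(mask[0]) for _ in mask]
    if not areas:
        return cleaned
    psi = ratio * max(areas)                           # hypothetical adaptive threshold
    for region, area in zip(regions, areas):
        if area >= psi:                                # hypothetical retention condition
            for y, x in region:
                cleaned[y][x] = 1
    return cleaned


# One 3x3 moving object and two isolated single-pixel "snowflakes".
mask = [
    [1, 1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
cleaned = keep_foreground(mask)
print(sum(map(sum, cleaned)))  # 9
```

Because the threshold is derived from the regions of the current frame, it adapts to frames in which the moving object is larger or smaller.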

[0083] III. Removing Foreground Snowflakes Based on Simultaneous Sparse Coding with Gaussian Scale

[0084] Existing local and non-local filtering algorithms usually process images at the pixel level, which blurs objects in the video, generates artifacts, and can even over-smooth details. An embodiment of the present invention observes that the image blocks extracted by the block matching method have non-local similarity in the spatio-temporal region. Therefore, sparse coding is used to extract image blocks and restore the image at the image block level.
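To make the block-matching idea concrete, the sketch below gathers the m patches most similar to a reference patch inside a search window and stacks them as columns of one matrix. It is a simplified, purely spatial illustration on a single frame (the text refers to a spatio-temporal region); the function name, patch size, window size, and squared-Euclidean distance are all assumptions.

```python
import numpy as np


def group_similar_patches(image, ref_top_left, patch=4, search=6, m=5):
    """Return a (patch*patch, m) matrix whose columns are the m patches closest to the reference."""
    h, w = image.shape
    r0, c0 = ref_top_left
    ref = image[r0:r0 + patch, c0:c0 + patch].reshape(-1)
    candidates = []
    for r in range(max(0, r0 - search), min(h - patch, r0 + search) + 1):
        for c in range(max(0, c0 - search), min(w - patch, c0 + search) + 1):
            block = image[r:r + patch, c:c + patch].reshape(-1)
            distance = float(np.sum((block - ref) ** 2))
            candidates.append((distance, r, c, block))
    # Sort by distance (then position) without comparing the arrays themselves.
    candidates.sort(key=lambda item: item[:3])
    return np.stack([item[3] for item in candidates[:m]], axis=1)


rng = np.random.default_rng(2)
image = rng.random((16, 16))
X = group_similar_patches(image, ref_top_left=(5, 5))
print(X.shape)  # (16, 5)
```

The reference patch always has distance zero to itself, so it appears as the first column of the group.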

[0085] The Gaussian scale mixture (GSM) model is used to model the sparse coefficient vector. After decomposition by the GSM model, the sparse coefficient vector can be expressed as the product of the Gaussian matrix g and the scalar multiplier θ, denoted as $\alpha_i = g_i \theta_i$. According to the GSM prior of the sparse coefficient α and the maximum a posteriori (MAP) estimate of $\alpha_i$, the sparse coefficients α corresponding to a set of similar blocks should have the same prior, meaning they can use the same probability density function θ. Therefore, considering a set of m similar blocks, the set of similar blocks can be sparsely coded simultaneously based on the GSM model.
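The relation between the elementwise form $\alpha_i = g_i \theta_i$ and a group form $A = \Lambda G$ can be checked numerically. In the sketch below, the sizes, the random values, and the helper name `gsm_group_coefficients` are illustrative assumptions; it only demonstrates the algebra of sharing one multiplier per coefficient row across the m similar blocks, not the full coding procedure.

```python
import numpy as np


def gsm_group_coefficients(theta: np.ndarray, G: np.ndarray) -> np.ndarray:
    """Build A = Lambda @ G, where Lambda = diag(theta) holds one scale multiplier per row."""
    Lam = np.diag(theta)
    return Lam @ G


rng = np.random.default_rng(0)
theta = rng.uniform(0.1, 2.0, size=6)   # positive scale multipliers shared by the whole group
G = rng.standard_normal((6, 4))         # Gaussian parts for m = 4 similar blocks
A = gsm_group_coefficients(theta, G)

# Row i of A equals theta[i] * G[i]: every block in the group uses the same multiplier.
print(np.allclose(A, theta[:, None] * G))  # True
```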

[0086] GSM-based group simultaneous sparse coding is represented by the formula:

[0087]

[0088] In the formula, $X = [x_1, \ldots, x_m]$ represents a set of m similar blocks, and $A = \Lambda G$ is the group representation of the sparse coefficients in the GSM model, whose first-order and second-order statistics are represented by […] and […].

[0089] To address the foreground snow removal problem, this embodiment of the invention employs a simultaneous sparse coding method combining structural sparsity and a Gaussian scale mixture model to restore the foreground and remove tiny, sparse snowflakes that obscure the moving foreground. This image restoration problem can be described by the optimization equation (9):

[0090]

[0091] Example 2

[0092] To verify the effectiveness of this method in snow removal, experiments were conducted on three types of datasets: synthetic snowfall videos, real static snowfall videos, and real dynamic snowfall videos. The synthetic snowfall videos were further divided into two categories: those with moving objects and those without.

[0093] As can be seen from Figure 3, this method can effectively remove snowflakes from videos without blurring moving objects. Here, Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Feature Similarity (FSIM) are used to measure the quality of the snow removal results. The evaluation metrics of each group of videos after snow removal in Table 1 show that this method performs better in terms of PSNR, SSIM, and FSIM.
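Of the three metrics, PSNR is the simplest to state. A minimal sketch of the usual definition, 10·log10(peak² / MSE), is given below; the function name and the default peak value of 255 (8-bit images) are assumptions, and the patent does not specify its exact evaluation code.

```python
import numpy as np


def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference frame and a restored frame."""
    reference = np.asarray(reference, dtype=np.float64)
    restored = np.asarray(restored, dtype=np.float64)
    mse = np.mean((reference - restored) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(peak ** 2 / mse)


clean = np.zeros((2, 2))
noisy = np.full((2, 2), 16.0)        # every pixel off by 16, so MSE = 256
print(psnr(clean, clean))            # inf
print(round(psnr(clean, noisy), 2))  # 24.05
```

A higher PSNR means the restored frame is closer to the snow-free reference, which is why it is only computable for the synthetic snowfall videos where a reference exists.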

[0094] Table 1 Comparison of Objective Indicators

[0095]

[0096] As can be seen from Figure 4 and Figure 5, this method exhibits excellent snow removal performance: it removes the sparse snowflakes and large snowflakes that obscure the background, and also removes, to a significant degree, the snowflakes that obscure moving objects.

[0097] Example 3

[0098] A video desnowing device based on tensor dimensionality reduction and Gaussian scale sparse coding, the device comprising:

[0099] The transformation and composition module is used to convert a two-dimensional frame data vector of one frame size in a certain channel into a column vector with x elements using vectorization algorithms, thereby reducing the dimensionality of a single channel's three-dimensional video stream into a set of two-dimensional data.

[0100] The processing module is used to linearly combine the three channels of video data after dimensionality reduction into a two-dimensional matrix, and then use alternating optimization to approximate the minimization of the Lagrange function to obtain the processed video.

[0101] The adjustment module is used to form a set from the areas of all continuous regions, and to adaptively adjust the threshold of each frame based on the area of each region.

[0102] A retention module is used to retain only the moving foreground in the mask image when the threshold and the area of the continuous region satisfy the augmented Lagrangian function.

[0103] The restoration module is used to extract image patches using sparse coding and restore the image at the image patch level. It considers a set of m similar patches and performs sparse coding on the set of similar patches simultaneously based on the GSM model, thereby restoring the foreground.

[0104] In summary, the embodiments of the present invention effectively remove snowflakes from videos through robust principal component analysis with tensor dimensionality reduction, and effectively remove snowflakes that obscure the moving foreground, thereby obtaining clear, detailed, and realistic snow-free videos.

[0105] Example 4

[0106] A video desnowing device based on tensor dimensionality reduction and Gaussian scale sparse coding, the device comprising:

[0107] The device includes a processor and a memory, the memory of which stores program instructions. The processor invokes the program instructions stored in the memory to cause the device to perform the following method steps in Embodiment 1:

[0108] Using the vectorization operation, the two-dimensional data of one frame in a given channel is transformed into a column vector with x elements, reducing the single-channel three-dimensional video stream to a set of two-dimensional data.

[0109] The dimension-reduced video data of the three channels are linearly combined into a two-dimensional matrix, and alternating optimization is used to approximate the minimization of the Lagrangian function to obtain the processed video.

[0110] The areas of all continuous regions form a set, and the threshold of each frame is adaptively adjusted based on the area of each region.

[0111] When the threshold and the area of the continuous region satisfy the augmented Lagrangian function, only the moving foreground is retained in the mask image;

[0112] Sparse coding is used to extract image patches and restore the image at the patch level. Considering a set of m similar patches, the set of similar patches is sparsely coded simultaneously based on the GSM model to restore the foreground.

[0113] The vectorization operation rules are as follows:

[0114]

[0115] The two-dimensional data is:

[0116]

[0117] where w is the image width and h is the image height; […] is the x-th data element of the t-th frame of the color video, and […] represents all the data of the t-th frame of the color video.

[0118] The threshold for each frame is adaptively adjusted based on the area of each region as follows:

[0119]

[0120] When the threshold $\psi$ and the area of the continuous region satisfy the augmented Lagrangian function $\mathrm{Area}(R_p)$, only the moving foreground is retained in the mask image:

[0121]

[0122] where $R_p$ represents a region of pixels in the binary mask, and $\mathrm{Area}(\cdot)$ is the operation that calculates the area of a contiguous region. The areas of all contiguous regions $R_p$ form a set $A = \{\mathrm{Area}(R_1), \ldots, \mathrm{Area}(R_p)\}$, with $\mathrm{Area}(R_p) \ll w \times h$.

[0123] Among them, the sparse encoding of the similar block set based on the GSM model is as follows:

[0124]

[0125] In the formula, $X = [x_1, \ldots, x_m]$ represents a set of m similar blocks, and $A = \Lambda G$ is the group representation of the sparse coefficients in the GSM model, whose first-order and second-order statistics are represented by […] and […]. $x_m$ represents the m-th similar block, $\Lambda$ is a diagonal matrix representing the variance field of the selected image, $\gamma_m$ represents the m-th observation matrix, $g_m$ represents the m-th Gaussian matrix, and B represents the reconstruction matrix. θ represents the scalar of noise variance, θ represents the scalar of sparsity, ε represents the sparsity bias, and Φ represents the overcomplete dictionary.

[0126] It should be noted that the device descriptions in the above embodiments correspond to the method descriptions in the embodiments, and the embodiments of the present invention will not be repeated here.

[0127] The execution entities of the aforementioned processor and memory can be devices with computing functions such as computers, microcontrollers, and single-chip microcomputers. In specific implementations, the embodiments of the present invention do not limit the execution entities and can select them according to the needs of actual applications.

[0128] Data signals are transmitted between the memory and the processor via a bus, which will not be elaborated upon in this embodiment of the invention.

[0129] Based on the same inventive concept, embodiments of the present invention also provide a computer-readable storage medium, the storage medium including a stored program, which, when the program is running, controls the device where the storage medium is located to execute the method steps in the above embodiments.

[0130] The computer-readable storage medium includes, but is not limited to, flash memory, hard disk, solid-state drive, etc.

[0131] It should be noted that the description of the readable storage medium in the above embodiments corresponds to the description of the method in the embodiments, and the embodiments of the present invention will not be repeated here.

[0132] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, as a computer program product. A computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the flow or function according to the embodiments of the present invention is generated.

[0133] A computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. Computer instructions can be stored in or transmitted through a computer-readable storage medium. A computer-readable storage medium can be any available medium accessible to a computer or a data storage device such as a server or data center that integrates one or more available media. The available medium can be magnetic or semiconductor, etc.

[0134] Unless otherwise specified, the model numbers of the various devices in this embodiment of the invention are not limited, and any device that can perform the above functions is acceptable.

[0135] Those skilled in the art will understand that the accompanying drawings are merely schematic diagrams of a preferred embodiment, and the sequence numbers of the above embodiments of the present invention are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.

[0136] The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. A video desnowing method based on tensor dimensionality reduction and Gaussian scale sparse coding, characterized in that the method includes: using vectorization algorithms, transforming a two-dimensional frame data vector of one frame size in a certain channel into a column vector of […] elements, thereby reducing the dimensionality of a single-channel three-dimensional video stream into a set of two-dimensional data; linearly combining the video data of the three channels after dimensionality reduction into a two-dimensional matrix, and using alternating optimization to approximate the minimization of the Lagrangian function to obtain the processed video; forming a set from the areas of all continuous regions, and adaptively adjusting the threshold of each frame based on the area of each region; when the threshold and the area of the continuous region satisfy the augmented Lagrangian function, retaining only the moving foreground in the mask image; using sparse coding to extract image patches and restore the image at the patch level, wherein a set of n similar patches is considered and the set of similar patches is sparsely coded simultaneously based on the GSM model to restore the foreground; wherein the threshold for each frame is adaptively adjusted based on the area of each region as follows: […]; when the threshold […] and the area of the continuous region satisfy the augmented Lagrangian function […], only the moving foreground is retained in the mask image: […]; where […] represents a region of pixels in the binary mask, […] is a mathematical method for calculating the area of a continuous region, the areas of all continuous regions […] form a set […], […] is the width of the image, and […] is the height of the image.

2. The video desnowing method based on tensor dimensionality reduction and Gaussian scale sparse coding according to claim 1, characterized in that the vectorization operation is defined as: [formula not reproduced]; and the two-dimensional data is: [formula not reproduced]; where the symbols denote, in order: the domain; the ...-th data point of the t-th frame of the color video; and all the data of the t-th frame of the color video.
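As a rough illustration of the vectorization described in claim 2, the sketch below flattens every frame of one color channel into a column and places the columns side by side, so that a three-dimensional single-channel stream becomes a two-dimensional matrix. The column-major ordering (the conventional vec operator), the `(T, H, W, 3)` array layout, and the function names are assumptions, since the claim's own formula is not reproduced in the text.

```python
import numpy as np


def vec(frame):
    """Stack the columns of a 2-D frame into one vector.

    ASSUMPTION: column-major order, as in the conventional vec operator.
    """
    return frame.reshape(-1, order="F")


def channel_to_matrix(video, channel):
    """Turn one channel of a (T, H, W, 3) video into an (H*W, T) matrix."""
    frames = video[..., channel]  # shape (T, H, W)
    return np.stack([vec(f) for f in frames], axis=1)


def matrix_to_channel(matrix, height, width):
    """Inverse of channel_to_matrix: rebuild the (T, H, W) frame stack."""
    columns = matrix.T  # one row per frame
    return np.stack(
        [col.reshape(height, width, order="F") for col in columns], axis=0
    )
```

For a video with T frames of H by W pixels, `channel_to_matrix(video, 0)` has H*W rows and T columns, and `matrix_to_channel` restores the original frames exactly.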

3. The video desnowing method based on tensor dimensionality reduction and Gaussian scale sparse coding according to claim 1, characterized in that the sparse coding of the set of similar patches based on the GSM model is as follows: [formula not reproduced]; where the symbols denote, in order: the set of similar patches; the group representation of the sparse coefficients in the GSM model, whose first-order and second-order statistics are represented by ...; the m-th similar patch; a diagonal matrix representing the variance field of the selected image; the m-th observation matrix; the m-th Gaussian matrix; the reconstructed matrix; a scalar representing the noise variance; a scalar representing sparsity; the bias representing sparsity; and a complete dictionary.

4. A video desnowing device based on tensor dimensionality reduction and Gaussian scale sparse coding, characterized in that the device includes: a transformation and composition module, used to transform the two-dimensional data of one frame in a given channel into a column vector of ... elements using a vectorization operation, thereby reducing a single-channel three-dimensional video stream to a set of two-dimensional data; a processing module, used to linearly combine the dimensionality-reduced video data of the three channels into a two-dimensional matrix and to approximately minimize the Lagrangian function by alternating optimization to obtain the processed video; an adjustment module, used to form a set from the areas of all continuous regions and to adaptively adjust the threshold of each frame according to the area of each region; a retention module, used to retain only the moving foreground in the mask image when the threshold and the areas of the continuous regions satisfy the augmented Lagrangian function; and a restoration module, used to extract image patches using sparse coding and restore the image at the patch level, wherein a set of n similar patches is considered and the set of similar patches is sparsely coded simultaneously based on the GSM model to restore the foreground; wherein the threshold of each frame is adaptively adjusted according to the area of each region as follows: [formula not reproduced]; when the threshold and the areas of the continuous regions satisfy the augmented Lagrangian function, only the moving foreground is retained in the mask image: [formula not reproduced]; where the symbols denote, in order: a region of pixels in the binary mask; the operation that computes the area of a continuous region; the set formed by the areas of all continuous regions; the width of the image; and the height of the image.
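The adjustment and retention modules can be pictured with a small sketch: label the continuous regions of a binary mask, collect their areas into a set, derive a per-frame threshold from those areas, and keep only the regions that exceed it. The claimed threshold formula is not reproduced in the text, so the mean of the region areas is used here purely as a stand-in; that rule, the 4-connectivity, and the function names are all assumptions.

```python
from collections import deque
from statistics import mean


def connected_regions(mask):
    """Return every 4-connected region of 1-pixels as a list of (row, col)."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited 1-pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def keep_large_regions(mask):
    """Keep only regions whose area exceeds a per-frame adaptive threshold.

    ASSUMPTION: the threshold is the mean region area of this frame; the
    patent's actual threshold formula is not reproduced in the claim text.
    """
    out = [[0] * len(mask[0]) for _ in mask]
    regions = connected_regions(mask)
    if not regions:
        return out
    threshold = mean(len(region) for region in regions)
    for region in regions:
        if len(region) > threshold:
            for y, x in region:
                out[y][x] = 1
    return out
```

Because snowflakes tend to produce many small regions while a moving object produces a few large ones, any threshold derived from the area set has this general effect of discarding the small regions; the specific rule above is only one possible choice.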

5. A video desnowing device based on tensor dimensionality reduction and Gaussian scale sparse coding, characterized in that the device includes a processor and a memory, the memory storing program instructions, and the processor invoking the program instructions stored in the memory to cause the device to perform the method according to any one of claims 1-3.

6. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, the computer program including program instructions that, when executed by a processor, cause the processor to perform the method according to any one of claims 1-3.