A training system and method for a video noise reduction model and related products

By employing a three-stage progressive training approach and a nonlinear preprocessing layer designed for quantization-aware training, the storage and computational bottlenecks of denoising models on edge devices are addressed, yielding a quantized model that denoises efficiently in the RAW image data domain and is suitable for deployment on edge devices.

CN122265073A · Pending · Publication Date: 2026-06-23 · BEIJING TSINGMICRO INTELLIGENT TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING TSINGMICRO INTELLIGENT TECH CO LTD
Filing Date
2026-02-11
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing neural network-based denoising models suffer from large storage volume and high computational overhead when deployed on edge devices. Direct quantization techniques lead to decreased denoising performance or introduce quantization artifacts. Furthermore, the distribution characteristics of RAW image data domain denoising models do not match those of general quantization techniques, resulting in performance loss.

Method used

A three-stage progressive training strategy is adopted, combining a nonlinear preprocessing layer with a denoising network backbone model. A quantized denoising model suitable for edge devices is then obtained through quantization-aware training. The approach includes the design of the nonlinear preprocessing layer and its inverse processing layer, and the optimization of model parameters to fit the characteristics of RAW image data.

Benefits of technology

While ensuring high noise reduction performance, the model deployment resource requirements and computational consumption are reduced. The quantized noise reduction model suitable for edge devices can effectively improve noise reduction efficiency and solve the performance loss problem caused by model quantization.

✦ Generated by Eureka AI based on patent content.


Abstract

The application provides a video noise reduction model training system, a training method and related products. The video noise reduction model training system comprises: a first training module configured to train a noise reduction network backbone model according to a first training data set and an original noise reduction network backbone model; a second training module configured to train a noise reduction model according to a second training data set and an original noise reduction model, wherein the original noise reduction model comprises a nonlinear preprocessing layer and the noise reduction network backbone model, and during training the model parameters of the nonlinear preprocessing layer are updated while the model parameters of the noise reduction network backbone model are kept unchanged; and a model quantization module configured to perform quantization-aware training on the noise reduction model to obtain a quantized noise reduction model. The video noise reduction model training system, the training method and the related products provided in the application can obtain a quantized noise reduction model suitable for deployment on an edge device.

Description

Technical Field

[0001] This invention relates to the field of image processing technology, specifically to a training system, training method, and related products for a video noise reduction model.

Background Technology

[0002] With the development of deep learning and artificial intelligence technologies, noise reduction technology based on neural networks in the raw (RAW) image data domain has been widely used due to its excellent performance.

[0003] Neural network-based denoising models are typically large and complex, with numerous parameters and high computational costs. They usually rely on high-performance computing hardware such as Graphics Processing Units (GPUs) and run in high-precision data formats such as 32-bit floating point (FP32). However, on edge devices such as mobile phones and cameras, the large storage size and high computational cost of denoising models constitute a bottleneck that severely limits deployment. Model quantization converts model values such as weights and activations from high-bit floating-point numbers (e.g., FP32) to low-bit fixed-point numbers (e.g., INT8), which can significantly compress model size, reduce memory usage, speed up inference, and reduce the power consumption of electronic devices. However, directly applying general quantization techniques to denoising models can lead to a significant decrease in denoising performance or introduce severe quantization artifacts. Therefore, how to obtain RAW image data domain denoising models that can be deployed on edge devices has become an important issue that urgently needs to be addressed in this field.

Summary of the Invention

[0004] To address the problems in the prior art, embodiments of the present invention provide a training system, training method, and related products for a video denoising model, which can at least partially solve the problems existing in the prior art.

[0005] In a first aspect, the present invention proposes a training system for a video denoising model, comprising: a first training module, used to train the original denoising network backbone model on a first training dataset to obtain the denoising network backbone model, wherein the first training dataset includes multiple clean-noisy image data pairs, the noisy image data being the original image data and the clean image data being obtained by removing noise from the noisy image data; a second training module, used to train the original denoising model on a second training dataset to obtain the denoising model, wherein the original denoising model includes a nonlinear preprocessing layer and the denoising network backbone model, the model parameters of the nonlinear preprocessing layer are updated during training while the model parameters of the denoising network backbone model remain unchanged, and the second training dataset includes multiple clean-noisy image data pairs; and a model quantization module, used to perform quantization-aware training on the denoising model to obtain a quantized denoising model.

[0006] Furthermore, the nonlinear preprocessing layer includes:

[0007] Where x represents the value of the original image data after brightness normalization, and α represents the model parameters of the nonlinear preprocessing layer.

[0008] Furthermore, the training system for the video denoising model provided in this embodiment of the invention further includes: a joint optimization module, used to optimize and train the denoising model based on a third training dataset, wherein a set learning rate is used in the optimization training, the set learning rate is less than the learning rate used by the second training module, and the third training dataset includes multiple clean-noisy image data pairs.

[0009] Furthermore, the training system for the video denoising model provided in this embodiment of the invention further includes: a model building module, used to build an edge quantized denoising model based on the quantized denoising model, wherein the edge quantized denoising model includes the quantized denoising model and an inverse processing layer, and the inverse processing layer performs the inverse of the processing of the nonlinear preprocessing layer.

[0010] Furthermore, the inverse processing layer includes:

[0011] Where y represents the image data output by the quantized denoising model, and the remaining symbol represents the model parameters of the inverse processing layer.

[0012] In a second aspect, the present invention provides a training method for a video denoising model, comprising: training the original denoising network backbone model on a first training dataset to obtain the denoising network backbone model, wherein the first training dataset includes multiple clean-noisy image data pairs, the noisy image data being the original image data and the clean image data being obtained by removing noise from the noisy image data; training the original denoising model on a second training dataset to obtain the denoising model, wherein the original denoising model includes a nonlinear preprocessing layer and the denoising network backbone model, the model parameters of the nonlinear preprocessing layer are updated during training while the model parameters of the denoising network backbone model remain unchanged, and the second training dataset includes multiple clean-noisy image data pairs; and performing quantization-aware training on the denoising model to obtain a quantized denoising model.

[0013] Furthermore, the nonlinear preprocessing layer includes:

[0014] Where x represents the brightness-normalized value of the original image data, and α represents the model parameters of the nonlinear preprocessing layer.

[0015] In a third aspect, the present invention provides a neural network processor that uses a quantized noise reduction model trained by the training system for the video noise reduction model described in any of the above embodiments, comprising: an acquisition module, used to receive raw image data; and a noise reduction module, used to perform noise reduction processing on the raw image data through an edge quantized noise reduction model, wherein the edge quantized noise reduction model includes the quantized noise reduction model.

[0016] In a fourth aspect, the present invention provides an artificial intelligence image signal processor, including at least one neural network processor as described in the above embodiments.

[0017] In a fifth aspect, the present invention provides a chip including at least one artificial intelligence image signal processor as described in the above embodiments.

[0018] In a sixth aspect, the present invention provides an edge device comprising at least one artificial intelligence image signal processor as described in the above embodiments or at least one chip as described in the above embodiments.

[0019] In a seventh aspect, the present invention provides a computer device comprising a training system for the video noise reduction model described in any of the above embodiments.

[0020] The video denoising model training system, training method, and related products provided in this invention include: a first training module, used to train the original denoising network backbone model on a first training dataset to obtain the denoising network backbone model, wherein the first training dataset includes multiple clean-noisy image data pairs, the noisy image data being the original image data and the clean image data being obtained by removing noise from the noisy image data; a second training module, used to train the original denoising model on a second training dataset to obtain the denoising model, wherein the original denoising model includes a nonlinear preprocessing layer and the denoising network backbone model, the model parameters of the nonlinear preprocessing layer are updated during training while the model parameters of the denoising network backbone model remain unchanged, and the second training dataset includes multiple clean-noisy image data pairs; and a model quantization module, which performs quantization-aware training on the denoising model to obtain a quantized denoising model that can be deployed on edge devices.

Description of the Drawings

[0021] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort. In the drawings: Figure 1 is a schematic diagram of the structure of a training system for a video noise reduction model provided in an embodiment of the present invention.

[0022] Figure 2 is a schematic diagram of the structure of a training system for a video noise reduction model provided in another embodiment of the present invention.

[0023] Figure 3 is a schematic diagram of the structure of a training system for a video noise reduction model provided in another embodiment of the present invention.

[0024] Figure 4 is a flowchart illustrating a training method for a video noise reduction model provided in an embodiment of the present invention.

[0025] Figure 5 is a schematic diagram of the structure of a neural network processor provided in an embodiment of the present invention.

Detailed Implementation

[0026] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments of the present invention will be further described in detail below with reference to the accompanying drawings. Here, the illustrative embodiments of the present invention and their descriptions are used to explain the present invention, but are not intended to limit the present invention. It should be noted that, unless otherwise specified, the embodiments and features in the embodiments of this application can be combined with each other. The acquisition, storage, use, and processing of data in the technical solutions of this application all comply with the relevant provisions of laws and regulations. The user information in the embodiments of this application is obtained through legal and compliant means, and the acquisition, storage, use, and processing of user information have been agreed upon by the customer.

[0027] To facilitate understanding of the technical solution provided in this application, the relevant content of the technical solution in this application will be explained below.

[0028] Image sensor: A sensor that uses photoelectric conversion to convert light signals on a light surface into quantifiable voltage signals.

[0029] Artificial Intelligence Image Signal Processor (AI-ISP): An innovative image processing engine that combines artificial intelligence technology with traditional image signal processing methods, which can significantly improve image quality in industries such as security and mobile phones.

[0030] Bayer Pattern: A specific image sensor arrangement. This arrangement divides the pixels in the image sensor data into three colors: red, green, and blue, and arranges them according to a specific pattern. The number of green pixels is typically twice that of red and blue pixels because the human eye is more sensitive to green.
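As an illustration of the Bayer arrangement described above, the following minimal sketch splits a mosaic into its color planes. The RGGB layout, the function name `split_bayer_rggb`, and the list-of-rows representation are assumptions made for illustration; real sensors may use other layouts (for example BGGR or GRBG).

```python
def split_bayer_rggb(mosaic):
    """Split a Bayer mosaic (list of equal-length rows) into four planes.

    Assumes an RGGB layout: even rows read R G R G ..., odd rows G B G B ...
    """
    return {
        "R": [row[0::2] for row in mosaic[0::2]],
        "G1": [row[1::2] for row in mosaic[0::2]],
        "G2": [row[0::2] for row in mosaic[1::2]],
        "B": [row[1::2] for row in mosaic[1::2]],
    }


def count_pixels(plane):
    """Number of photosites in one half-resolution plane."""
    return sum(len(row) for row in plane)


# A 4x4 mosaic whose entries name the color each photosite samples.
mosaic = [
    ["R", "G", "R", "G"],
    ["G", "B", "G", "B"],
    ["R", "G", "R", "G"],
    ["G", "B", "G", "B"],
]
planes = split_bayer_rggb(mosaic)
green = count_pixels(planes["G1"]) + count_pixels(planes["G2"])
red = count_pixels(planes["R"])
blue = count_pixels(planes["B"])
```

In this layout the two green planes together hold twice as many samples as the red or blue plane, which matches the ratio stated in the definition.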

[0031] Quantization-Aware Training (QAT): This method simulates the quantization process during model training (or fine-tuning), allowing the model to perceive the numerical errors caused by quantization in advance. This enables the model to adjust its parameters during training to achieve optimal performance after quantization.
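The idea of simulating quantization during training can be sketched with a "fake quantization" step and a straight-through gradient, a technique commonly used for QAT. The sketch below is a toy under stated assumptions, not the procedure of this application: the one-weight model, the scale of 0.05, the learning rate, and the function names `fake_quant` and `ste_grad` are all hypothetical.

```python
def fake_quant(v, scale, qmin=-128, qmax=127):
    """Quantize v to a signed 8-bit grid and map it straight back to float."""
    q = max(qmin, min(qmax, round(v / scale)))
    return q * scale


def ste_grad(v, scale, qmin=-128, qmax=127):
    """Straight-through estimator: treat rounding as identity inside the
    representable range and block the gradient outside it."""
    return 1.0 if qmin * scale <= v <= qmax * scale else 0.0


# Toy problem (assumed for illustration): fit y = 0.37 * x with one weight
# that the forward pass always sees in quantized form.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [0.37 * x for x in xs]
scale, lr, w = 0.05, 0.05, 0.0

for _ in range(300):
    wq = fake_quant(w, scale)                  # forward uses the quantized weight
    grad_wq = sum(2 * (wq * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad_wq * ste_grad(w, scale)     # backward updates the float weight

w_deployed = fake_quant(w, scale)              # value an INT8 deployment would use
```

Because the loss is always computed on the quantized weight, the float weight settles next to the grid points that bracket the ideal value, so the deployed weight is one of the two nearest representable values.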

[0032] In the existing technology, directly applying general quantization techniques to RAW image data domain denoising models has the following problems: (1) Difficulty in maintaining accuracy: RAW image data has a very large dynamic range (usually 10-12 bits) and contains substantial noise under low illumination. The denoising model needs to extract extremely weak detail signals from the noise. General quantization techniques erase these detail signals, resulting in a significant decrease in denoising performance or the introduction of serious quantization artifacts (such as blocking artifacts and color distortion). (2) Large distribution differences: The distribution of activation values of a RAW image data domain denoising model (especially in the shallow layers) differs greatly from that of models operating on clean RGB images, and its statistical characteristics are significantly affected by the denoising model. Calibration methods based on statistics of the entire dataset (such as KL divergence), which are commonly used in general quantization techniques, often fail for RAW image data domain denoising. (3) Uneven sensitivity: Different layers and different channels of the model differ greatly in their sensitivity to quantization. The output layer and the attention mechanism layers, which are sensitive to noise, require higher-precision quantization, while other layers can withstand more aggressive quantization. A single uniform quantization strategy therefore leads to performance loss.

[0033] Most existing neural processing unit (NPU) hardware only supports uniform quantization. However, RAW image data has high dynamic range and signal-dependent noise, and its distribution is markedly non-uniform: pixel values are concentrated in low-brightness areas, while a small number of pixels have extremely high brightness values; the noise follows a Poisson-Gaussian model and therefore depends on signal intensity; and the Bayer pattern causes different color channels to have different statistical properties. Directly applying uniform quantization leads to a significant loss of detail, especially in dark areas, where quantization intervals that are too large cause detail loss and artifacts.
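The loss of shadow detail under uniform quantization can be seen with simple arithmetic. The sketch below assumes 12-bit RAW values (the upper end of the 10-12 bit range mentioned above) mapped to 8-bit codes with evenly spaced steps; the function name `uniform_quantize` and the chosen sample ranges are illustrative assumptions.

```python
def uniform_quantize(value, in_max=4095, out_max=255):
    """Map a 12-bit RAW value to an 8-bit code with evenly spaced steps."""
    return round(value * out_max / in_max)


dark_levels = range(0, 16)          # 16 distinct 12-bit levels in deep shadow
bright_levels = range(4080, 4096)   # 16 distinct levels near saturation

dark_codes = {uniform_quantize(v) for v in dark_levels}
bright_codes = {uniform_quantize(v) for v in bright_levels}
step = 4095 / 255                   # input levels covered by one 8-bit code
```

Sixteen distinct shadow levels collapse onto two codes. The same step size applies in the highlights, but since most RAW pixel values sit in the low-brightness range and carry weak signal there, the shadows are where this coarse spacing discards the most useful information.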

[0034] Therefore, this application proposes a training system for video denoising models based on the data characteristics and task features of RAW image data domain denoising models. This system can train a quantized denoising model for RAW image data domain suitable for NPUs on edge devices, achieving model compression and acceleration while ensuring high denoising performance.

[0035] Figure 1 is a schematic diagram of the structure of a training system for a video denoising model provided in an embodiment of the present invention. As shown in Figure 1, the video denoising model training system provided in this embodiment of the invention includes a first training module 101, a second training module 102, and a model quantization module 103, wherein: the first training module 101 is used to train the original denoising network backbone model on the first training dataset to obtain the denoising network backbone model, wherein the first training dataset includes multiple clean-noisy image data pairs, the noisy image data being the original image data and the clean image data being the original image data after noise removal; the second training module 102 is used to train the original denoising model on the second training dataset to obtain the denoising model, wherein the original denoising model includes a nonlinear preprocessing layer and the denoising network backbone model, the model parameters of the nonlinear preprocessing layer are updated during training while the model parameters of the denoising network backbone model remain unchanged, and the second training dataset includes each original image data and its corresponding label image data; and the model quantization module 103 performs quantization-aware training on the denoising model to obtain a quantized denoising model.

[0036] Specifically, the first training module 101 trains the original denoising network backbone model using the first training dataset, thereby obtaining the denoising network backbone model. The first training dataset includes multiple clean-noisy image data pairs, and each clean-noisy image data pair includes noisy image data and the corresponding clean image data. The noisy image data is the original image data, which can be acquired through an image sensor. The clean image data is obtained by removing noise from the noisy image data and serves as the training label for the noisy image data. The backbone model of the denoising network can adopt, for example, a U-Net structure based on 2D convolution, selected according to actual needs; this embodiment of the invention does not impose limitations. The number of clean-noisy image data pairs in the first training dataset can be set according to actual needs, and this embodiment of the invention does not limit it.

[0037] After the backbone model of the denoising network is obtained, a nonlinear preprocessing layer is added at the input of the backbone model to form the original denoising model. The second training module 102 trains the original denoising model using a second training dataset. During training, the model parameters of the nonlinear preprocessing layer are updated, while the model parameters of the backbone model remain unchanged. This allows the nonlinear preprocessing layer to learn a data transformation suited to the denoising task without interference from changes in the backbone model, ultimately yielding the denoising model. The nonlinear preprocessing layer in the denoising model maps the original RAW data from a non-uniform distribution to an approximately uniform distribution. The second training dataset includes multiple clean-noisy image data pairs. It can reuse clean-noisy image data pairs from the first training dataset or use new clean-noisy image data pairs, set according to actual needs; the embodiments of the present invention do not impose limitations.
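The second-stage behavior, in which only the preprocessing parameter moves while the backbone stays fixed, can be sketched with a toy model. Everything concrete below is an assumption for illustration: the power curve `x ** alpha` stands in for the nonlinear preprocessing layer (the formula of this application is not reproduced in the text), the backbone is reduced to a single fixed gain, and the data, learning rate, and parameter names are made up.

```python
import math


def preprocess(x, alpha):
    """Hypothetical stand-in curve x**alpha (not the formula of this application)."""
    return x ** alpha


def backbone(z, weight):
    """Stand-in for the pretrained denoising backbone: a single fixed gain."""
    return weight * z


def loss_and_alpha_grad(params, xs, targets):
    """Mean squared error and its gradient with respect to alpha only."""
    loss, grad = 0.0, 0.0
    for x, t in zip(xs, targets):
        z = preprocess(x, params["pre.alpha"])
        err = backbone(z, params["backbone.weight"]) - t
        loss += err * err
        # d(output)/d(alpha) = weight * x**alpha * ln(x)
        grad += 2 * err * params["backbone.weight"] * z * math.log(x)
    return loss / len(xs), grad / len(xs)


params = {"pre.alpha": 1.0, "backbone.weight": 2.0}
trainable = {"pre.alpha"}                  # the backbone is deliberately absent
xs = [0.1, 0.3, 0.5, 0.7, 0.9]             # brightness-normalized inputs
targets = [2.0 * x ** 0.5 for x in xs]     # synthetic "clean" labels

initial_loss, _ = loss_and_alpha_grad(params, xs, targets)
for _ in range(200):
    _, grad_alpha = loss_and_alpha_grad(params, xs, targets)
    grads = {"pre.alpha": grad_alpha}
    for name in trainable:                 # only trainable entries are updated
        params[name] -= 0.1 * grads[name]
final_loss, _ = loss_and_alpha_grad(params, xs, targets)
```

Keeping the backbone out of the trainable set means the curve parameter has to adapt to the backbone as it is, rather than both moving at once, which is the stability argument made for this stage.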

[0038] In existing technologies, data preprocessing is typically separated from model training. Furthermore, linear scaling or simple normalization cannot effectively address the uneven distribution of RAW data, and linear compression compresses signal and noise together, reducing denoising effectiveness. RAW image data often has a high dynamic range, with a large number of pixels concentrated in low-brightness areas and a small number of pixels with extremely high brightness values. A nonlinear preprocessing layer can reshape the data distribution nonlinearly at the preprocessing stage, addressing the technical challenge that quantized denoising models for the low-level vision RAW image data domain are difficult to deploy. This application integrates nonlinear preprocessing into the denoising model, so that data-driven preprocessing and the denoising backbone network perform denoising jointly, thereby improving denoising performance.

[0039] After the denoising model is obtained, quantization-aware training is performed on it to obtain a quantized denoising model. The resulting quantized denoising model can be deployed to edge devices. It maintains high denoising performance while reducing the resources required for deployment and lowering computational resource consumption. The specific process of quantization-aware training is existing technology and is not elaborated here. For example, quantization-aware training can be applied to a 32-bit floating-point (FP32) denoising model to obtain an 8-bit integer (INT8) quantized denoising model, thereby compressing the model size and improving inference speed.

[0040] Understandably, in order to make the quantization denoising model more suitable for the NPU of edge devices, the model parameters can be manually quantized and calibrated according to the tensor and weight distribution characteristics of each layer of the quantization denoising model.
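One common way to calibrate per layer, shown here only as a sketch and not as the procedure of this application, is to derive a separate symmetric INT8 scale from the largest magnitude in each tensor. The layer names, the weight values, and the helper functions below are hypothetical.

```python
def calibrate_scale(values, qmax=127):
    """Symmetric per-tensor scale so the largest magnitude maps to qmax."""
    peak = max(abs(v) for v in values)
    return peak / qmax if peak > 0 else 1.0


def quantize(values, scale, qmax=127):
    """Round each value to the nearest code and clamp to the INT8 range."""
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def dequantize(codes, scale):
    """Map integer codes back to floating-point values."""
    return [q * scale for q in codes]


# Two made-up layers: one tightly distributed, one with a single outlier.
layers = {
    "conv_in": [0.02, -0.01, 0.015, -0.018, 0.005],
    "conv_out": [0.02, -0.01, 0.015, -0.018, 1.5],
}

report = {}
for name, weights in layers.items():
    scale = calibrate_scale(weights)
    restored = dequantize(quantize(weights, scale), scale)
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    report[name] = {"scale": scale, "max_err": max_err}
```

The layer with the outlier ends up with a much coarser step and a larger error on its small weights, which is the kind of distribution-dependent effect that motivates inspecting and calibrating layers individually.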

[0041] In this embodiment, the video denoising model adopts a three-stage progressive training strategy to keep training stable for a denoising network that combines nonlinear preprocessing with the backbone denoising network, which helps to obtain better convergence.

[0042] The video denoising model training system provided in this embodiment of the invention includes: a first training module, used to train the original denoising network backbone model on a first training dataset to obtain the denoising network backbone model, wherein the first training dataset includes multiple clean-noisy image data pairs, the noisy image data being the original image data and the clean image data being obtained by removing noise from the noisy image data; a second training module, used to train the original denoising model on a second training dataset to obtain the denoising model, wherein the original denoising model includes a nonlinear preprocessing layer and the denoising network backbone model, the model parameters of the nonlinear preprocessing layer are updated during training while the model parameters of the denoising network backbone model remain unchanged, and the second training dataset includes multiple clean-noisy image data pairs; and a model quantization module, which performs quantization-aware training on the denoising model to obtain a quantized denoising model that can be deployed on edge devices.

[0043] Based on the above embodiments, the nonlinear preprocessing layer further includes:

[0044] Where x represents the value of the original image data after brightness normalization, and α represents the model parameters of the nonlinear preprocessing layer.

[0045] Specifically, this application proposes a differentiable nonlinear function as the nonlinear preprocessing layer, whose model parameters are obtained through model training. The parameter controls the degree of nonlinearity, allowing the slope of the curve in dark areas to be adjusted so that those regions are enhanced while the representation of bright areas is preserved. The differentiable nonlinear function of the nonlinear preprocessing layer is jointly trained with the backbone model of the denoising network. Through an end-to-end training optimization strategy, the model parameters of the nonlinear preprocessing layer and the backbone model of the denoising network adapt to each other. This joint training not only improves the denoising performance of the model but also helps it learn distributions suited to uniform quantization, laying the foundation for deploying the denoising model on the NPU of edge devices.
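To make "differentiable" and "controls the slope in dark areas" concrete, the sketch below uses a power curve as a hypothetical stand-in, since the actual function of this application is not reproduced in the text. The curve `x ** alpha`, the value `alpha = 0.5`, and the function names are assumptions. The analytic derivative with respect to the parameter is checked against a finite-difference estimate, and the local slope is compared between a shadow value and a highlight value.

```python
import math


def curve(x, alpha):
    """Hypothetical stand-in for the nonlinear preprocessing layer."""
    return x ** alpha


def d_curve_d_alpha(x, alpha):
    """Analytic derivative with respect to the learnable parameter."""
    return (x ** alpha) * math.log(x)


def d_curve_d_x(x, alpha):
    """Local slope of the curve: how strongly input differences are stretched."""
    return alpha * x ** (alpha - 1.0)


alpha = 0.5
x = 0.04
eps = 1e-6
numeric = (curve(x, alpha + eps) - curve(x, alpha - eps)) / (2 * eps)
analytic = d_curve_d_alpha(x, alpha)

dark_slope = d_curve_d_x(0.01, alpha)     # shadows
bright_slope = d_curve_d_x(0.90, alpha)   # highlights
```

With this stand-in, a parameter below 1 gives a slope in the shadows that is several times the slope in the highlights, so shadow differences occupy more of the output range before any uniform quantizer is applied. The existence of a gradient with respect to the parameter is what lets an optimizer tune that trade-off.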

[0046] Figure 2 is a schematic diagram of the structure of a training system for a video denoising model provided in another embodiment of the present invention. As shown in Figure 2, based on the above embodiments, the training system for the video denoising model provided in this embodiment further includes a joint optimization module 104, wherein: the joint optimization module 104 is used to optimize and train the denoising model based on the third training dataset, wherein a set learning rate is used in the optimization training, the set learning rate is less than the learning rate used by the second training module, and the third training dataset includes multiple clean-noisy image data pairs.

[0047] Specifically, the joint optimization module 104 trains the denoising model using the third training dataset, fine-tuning the model parameters of the nonlinear preprocessing layer and the backbone model of the denoising network. The joint optimization module 104 uses a set learning rate to optimize the training of the denoising model, and this set learning rate is less than the learning rate used by the second training module 102.

[0048] In one embodiment, the second training module 102 uses a learning rate of 0.0001, and the joint optimization module 104 uses a learning rate of 0.000001.

[0049] Understandably, the model quantization module 103 performs quantization-aware training on the optimized and trained denoising model.
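The order of the stages and their learning rates can be summarized as a small configuration sketch. The two learning rates are the values given in the embodiment above; the stage names, the parameter-group labels, and the choice of which groups are trainable during quantization-aware training are assumptions, and `None` marks values the text does not state.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Stage:
    name: str
    trainable: Tuple[str, ...]        # parameter groups that receive updates
    learning_rate: Optional[float]    # None where the text gives no value


schedule = [
    Stage("backbone_pretraining", ("backbone",), None),
    Stage("preprocessing_layer_training", ("preprocess",), 1e-4),
    Stage("joint_optimization", ("preprocess", "backbone"), 1e-6),
    # Trainable groups during QAT are an assumption; the text does not say.
    Stage("quantization_aware_training", ("preprocess", "backbone"), None),
]

by_name = {stage.name: stage for stage in schedule}
```

The joint stage uses a rate two orders of magnitude below the preprocessing-layer stage, consistent with its role as fine-tuning of parameters that are already close to a good solution.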

[0050] Figure 3 is a schematic diagram of the structure of a training system for a video denoising model provided in another embodiment of the present invention. As shown in Figure 3, based on the above embodiments, the training system for the video denoising model provided in this embodiment further includes a model building module 105, wherein: the model building module 105 is used to build an edge quantized denoising model based on the quantized denoising model; the edge quantized denoising model includes the quantized denoising model and an inverse processing layer, and the processing of the inverse processing layer is the inverse of the processing of the nonlinear preprocessing layer.

[0051] Specifically, the model building module 105 adds an inverse processing layer at the output of the quantized denoising model. The inverse processing layer is the counterpart of the nonlinear preprocessing layer: the processing it applies to its input data is the inverse of the processing that the nonlinear preprocessing layer applies to its input data.

[0052] Based on the above embodiments, the inverse processing layer further includes:

[0053] Where y represents the image data output by the quantized denoising model, and the remaining symbol represents the model parameters of the inverse processing layer.

[0054] Specifically, for the image data y output by the quantized denoising model, the inverse-processed result can be calculated according to the formula. The model parameters of the inverse processing layer are the same as the model parameters of the nonlinear preprocessing layer.
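The role of an inverse layer that reuses the preprocessing parameter can be illustrated with a round trip. The sketch below again uses the power curve `x ** alpha` as a hypothetical stand-in for the formula of this application, and replaces the quantized denoising model with a bare 8-bit quantization step so that only the effect of the two layers is visible; the sample values and function names are assumptions.

```python
def preprocess(x, alpha):
    """Hypothetical forward curve (stand-in for the formula of this application)."""
    return x ** alpha


def inverse(y, alpha):
    """Inverse of the stand-in curve, reusing the same alpha."""
    return y ** (1.0 / alpha)


def to_8bit_and_back(v):
    """Uniform 8-bit quantization of a value in [0, 1]."""
    return round(v * 255) / 255


alpha = 0.5
shadows = [0.0005, 0.0015]            # two distinct, very dark pixel values

# Quantizing the raw values directly merges both onto the same code.
direct = [to_8bit_and_back(v) for v in shadows]

# Curve, then quantization, then inverse curve keeps them apart.
companded = [
    inverse(to_8bit_and_back(preprocess(v, alpha)), alpha) for v in shadows
]

# Without quantization in between, the inverse recovers the input.
round_trip_error = max(
    abs(inverse(preprocess(v, alpha), alpha) - v) for v in (0.01, 0.25, 0.9)
)
```

Sharing one parameter between the two layers means the inverse needs no separate training: once the forward curve is learned, the mapping back to the original brightness scale is fully determined.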

[0055] Figure 4 is a flowchart illustrating a training method for a video denoising model according to an embodiment of the present invention. As shown in Figure 4, the training method for the video denoising model provided in this embodiment of the invention includes:

S401. Train the original denoising network backbone model on the first training dataset to obtain the denoising network backbone model; the first training dataset includes multiple clean-noisy image data pairs, wherein the noisy image data is the original image data and the clean image data is obtained by removing noise from the noisy image data.

Specifically, a first training dataset is obtained, and the original denoising network backbone model is trained using the first training dataset to obtain the denoising network backbone model. The denoising network backbone model can adopt, for example, a U-Net structure based on 2D convolution, selected according to actual needs; this embodiment of the invention does not impose limitations. The number of clean-noisy image data pairs in the first training dataset can be set according to actual needs, and this embodiment of the invention does not limit it.

[0056] Each clean-noisy image data pair includes noisy image data and the corresponding clean image data. The noisy image data is the original image data, which can be acquired by an image sensor. The clean image data is the corresponding noisy image data after noise removal, and it serves as the training label for the noisy image data.

[0057] S402. Train the original denoising model on the second training dataset to obtain the denoising model; wherein the original denoising model includes a nonlinear preprocessing layer and the denoising network backbone model; during training, the model parameters of the nonlinear preprocessing layer are updated while the model parameters of the denoising network backbone model remain unchanged; and the second training dataset includes multiple clean-noisy image data pairs.

Specifically, a second training dataset is obtained, and the original denoising model is trained using this dataset to obtain the denoising model. The original denoising model includes a nonlinear preprocessing layer and a denoising network backbone model, with the output of the nonlinear preprocessing layer connected to the input of the denoising network backbone model. The noisy image data in the second training dataset is input into the nonlinear preprocessing layer, and after nonlinear processing it is passed to the denoising network backbone model. During training, the model parameters of the nonlinear preprocessing layer are updated, while the model parameters of the denoising network backbone model remain unchanged. This allows the nonlinear preprocessing layer to learn a data transformation suited to the denoising task without interference from changes in the denoising network backbone model.

[0058] S403. Perform quantization-aware training on the denoising model to obtain a quantized denoising model.

[0059] Specifically, after obtaining the denoising model, quantization-aware training is performed on the denoising model to obtain a quantized denoising model. The obtained quantized denoising model is used for deployment on edge devices. The quantized denoising model can reduce the resources required for deployment and reduce the consumption of computing resources while maintaining high denoising performance.
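As a rough illustration of what quantization-aware training does, the sketch below trains a single weight whose forward pass goes through a simulated uniform quantizer, with the gradient passed through the rounding step unchanged (the straight-through estimator commonly used in quantization-aware training). The source does not specify the bit width, the quantization granularity, or the gradient estimator, so the unsigned 8-bit range, the per-tensor `scale`, and the names `fake_quantize` and `qat_step` are all assumptions.

```python
def fake_quantize(v, scale, num_bits=8):
    """Uniform quantize-then-dequantize: snap v to the nearest representable level."""
    qmax = 2 ** num_bits - 1
    q = max(0, min(qmax, round(v / scale)))  # clamp to the unsigned integer range
    return q * scale


def qat_step(w, pairs, scale, lr, num_bits=8):
    """One gradient step on a float weight whose forward pass uses its quantized value."""
    wq = fake_quantize(w, scale, num_bits)
    # Straight-through estimator: treat d(wq)/d(w) as 1, so the gradient
    # computed with respect to wq is applied directly to the float weight w.
    grad = sum(2 * (wq * x - t) * x for x, t in pairs) / len(pairs)
    return w - lr * grad


scale = 1.0 / 255                                   # assumed 8-bit grid on [0, 1]
pairs = [(x, 0.31 * x) for x in (0.2, 0.5, 0.9)]    # toy targets: ideal weight is 0.31
w = 0.8
for _ in range(500):
    w = qat_step(w, pairs, scale, lr=0.1)
w_deployed = fake_quantize(w, scale)                # the value an integer-only device would use
```

Because the loss is always evaluated on the quantized weight, training settles on a value that is already on the integer grid, which is why the deployed model loses less accuracy than one quantized only after training.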

[0060] In the training method for the video denoising model provided in this embodiment of the invention, a denoising network backbone model is trained based on a first training dataset and the original denoising network backbone model, wherein the first training dataset includes multiple clean-noisy image data pairs, the noisy image data is the original image data, and the clean image data is obtained by removing noise from the noisy image data. A denoising model is then trained based on a second training dataset and the original denoising model, wherein the original denoising model includes a nonlinear preprocessing layer and the denoising network backbone model; during training, the model parameters of the nonlinear preprocessing layer are updated while the model parameters of the denoising network backbone model remain unchanged; and the second training dataset includes multiple clean-noisy image data pairs. Finally, quantization-aware training is performed on the denoising model to obtain a quantized denoising model that is suitable for deployment on edge devices.

[0061] Based on the above embodiments, the nonlinear preprocessing layer further includes:

[0062] [formula not reproduced in this text] where x represents the value of the original image data after brightness normalization, and α represents the model parameters of the nonlinear preprocessing layer.

[0063] Based on the above embodiments, the training method for the video denoising model provided in this embodiment of the invention further includes: optimizing and training the denoising model using a third training dataset, wherein a set learning rate is used in the optimization training, and the set learning rate is less than the learning rate used by the second training module; the third training dataset includes multiple clean-noisy image data pairs.

[0064] Based on the above embodiments, the training method for the video denoising model provided in this embodiment of the invention further includes: performing quantization-aware training on the optimized denoising model to obtain the quantized denoising model.
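The progression of training stages can be written down as a small schedule. This is a sketch under stated assumptions: the learning-rate values are placeholders, the `Stage` and `build_schedule` names are hypothetical, and the joint stage is assumed to update both the preprocessing layer and the backbone (inferred from the name of the joint optimization module, not stated explicitly). The only constraint taken from the text is that the joint-stage learning rate is smaller than the one used when training the preprocessing layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    trainable: tuple        # sub-modules whose parameters receive updates
    learning_rate: float


def build_schedule(lr_backbone=1e-3, lr_preprocess=1e-3, lr_joint=1e-4):
    """Return the stage list, enforcing lr_joint < lr_preprocess."""
    if not lr_joint < lr_preprocess:
        raise ValueError("joint-stage learning rate must be below the stage-two rate")
    return [
        Stage("backbone", ("backbone",), lr_backbone),
        Stage("preprocess", ("preprocess",), lr_preprocess),
        Stage("joint", ("preprocess", "backbone"), lr_joint),
    ]


schedule = build_schedule()
```

A smaller rate in the joint stage is a common choice for fine-tuning: it lets the two already-trained parts adapt to each other without discarding what each learned separately.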

[0065] Specific embodiments of the training method for the video denoising model provided in this invention can be found in the detailed description of the above-described embodiments of the training system for the video denoising model, and will not be repeated here.

[0066] The training method for the video denoising model provided in this embodiment of the invention can be applied to the training system for the video denoising model described in any of the above embodiments.

[0067] Figure 5 is a schematic diagram of the structure of a neural network processor provided in an embodiment of the present invention. As shown in Figure 5, the neural network processor provided in this embodiment of the invention uses the quantized denoising model trained by the training system for the video denoising model described in any of the above embodiments, and includes an acquisition module 501 and a denoising module 502, wherein: the acquisition module 501 is used to acquire the original image data; the denoising module 502 is used to perform denoising processing on the original image data through the edge quantized denoising model, wherein the edge quantized denoising model includes the quantized denoising model.

[0068] Specifically, an edge quantized denoising model is deployed in the neural network processor. This model includes the quantized denoising model and the inverse processing layer. The acquisition module 501 of the neural network processor acquires the original (RAW) image data, which may be obtained through an image sensor. The denoising module 502 inputs the original image data into the edge quantized denoising model to perform denoising processing, obtaining denoised original image data. The denoised original image data can then be provided to an image signal processor for further processing.
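The data flow through the deployed model can be sketched as follows. The formulas of the preprocessing and inverse processing layers are not reproduced in this text, so the power-law pair `x ** alpha` and `y ** (1 / alpha)` is purely an assumed example of a transform and its inverse; `make_quantized_model` and `edge_denoise` are hypothetical names, and the toy backbone is an identity function so that the round trip is easy to check.

```python
def preprocess(x, alpha):
    """Assumed nonlinear preprocessing curve (illustrative only)."""
    return x ** alpha


def inverse_process(y, alpha):
    """Assumed inverse layer: undoes the curve above, back to the linear RAW domain."""
    return y ** (1.0 / alpha)


def make_quantized_model(alpha, denoise):
    """Toy stand-in for the quantized denoising model: preprocessing, then backbone."""
    return lambda x: denoise(preprocess(x, alpha))


def edge_denoise(raw_pixels, quantized_model, alpha):
    """Edge model: the quantized denoising model followed by the inverse layer."""
    return [inverse_process(quantized_model(v), alpha) for v in raw_pixels]


alpha = 0.5
model = make_quantized_model(alpha, denoise=lambda z: z)   # identity backbone for the demo
raw = [0.0, 0.01, 0.25, 1.0]                               # brightness-normalized samples
restored = edge_denoise(raw, model, alpha)
```

With an identity backbone the output matches the input up to floating-point error, which shows that the inverse layer returns the data to the domain expected by the downstream image signal processor.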

[0069] Neural network processors can be found in edge devices such as smartphones and cameras.

[0070] The neural network processor provided in this invention reduces hardware requirements and expands its application scope by using a quantized denoising model trained by a video denoising model training system. It can be deployed on NPU hardware that only supports uniform quantization, significantly improving denoising efficiency while ensuring denoising quality, thus solving the problem of severe model accuracy loss caused by NPU hardware only supporting uniform quantization in existing technologies.
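A small numerical experiment suggests why a nonlinear curve helps when the hardware supports only uniform quantization. Dark RAW values fall below the first step of a uniform grid, so they collapse to zero or to a single level; a compressive curve spreads them over more levels first. The square-root curve and the unsigned 8-bit grid below are assumptions chosen for illustration, not the layer or bit width from the source.

```python
import math


def uniform_quantize(v, num_bits=8):
    """Round v in [0, 1] to the nearest level of a uniform grid, then map back."""
    levels = 2 ** num_bits - 1
    return round(v * levels) / levels


def error_direct(x):
    """Absolute error when the linear value is quantized as is."""
    return abs(uniform_quantize(x) - x)


def error_with_curve(x):
    """Absolute error when quantizing sqrt(x) and squaring the result back."""
    y = uniform_quantize(math.sqrt(x))
    return abs(y * y - x)


dark_pixels = [0.0005, 0.001, 0.002]   # low-light values after brightness normalization
comparison = [(x, error_direct(x), error_with_curve(x)) for x in dark_pixels]
```

For these dark values the error after the curve is smaller than the direct error in every case. The benefit is specific to the low end of the range; a learned curve, as in the embodiments above, lets training decide where resolution is most useful.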

[0071] This invention provides an artificial intelligence image signal processor, including at least one neural network processor as described in the above embodiments.

[0072] This invention provides a chip that includes at least one artificial intelligence image signal processor as described in the above embodiments.

[0073] This invention provides an edge device, including at least one artificial intelligence image signal processor or at least one chip as described in the above embodiments.

[0074] The edge devices include, but are not limited to, devices with AI application needs such as smartphones, security cameras, doorbell cameras, handheld devices, drones, and handheld or portable medical imaging devices.

[0075] This invention provides a computer device including the training system for the video denoising model described in any of the above embodiments.

[0076] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0077] This invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0078] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0079] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0080] In the description of this specification, the references to terms such as "an embodiment," "a specific embodiment," "some embodiments," "for example," "example," "specific example," or "some examples," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.

[0081] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above descriptions are merely specific embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

1. A training system for a video denoising model, characterized in that it includes: a first training module, used to train the original denoising network backbone model on a first training dataset to obtain a denoising network backbone model, wherein the first training dataset includes multiple clean-noisy image data pairs, the noisy image data is the original image data, and the clean image data is obtained by removing noise from the noisy image data; a second training module, used to train the original denoising model on a second training dataset to obtain a denoising model, wherein the original denoising model includes a nonlinear preprocessing layer and the denoising network backbone model, during training the model parameters of the nonlinear preprocessing layer are updated while the model parameters of the denoising network backbone model remain unchanged, and the second training dataset includes multiple clean-noisy image data pairs; and a model quantization module, used to perform quantization-aware training on the denoising model to obtain a quantized denoising model.

2. The training system for the video denoising model according to claim 1, characterized in that the nonlinear preprocessing layer includes: [formula not reproduced in this text] where x represents the value of the original image data after brightness normalization, and α represents the model parameters of the nonlinear preprocessing layer.

3. The training system for the video denoising model according to claim 1, characterized in that it also includes: a joint optimization module, used to optimize and train the denoising model based on a third training dataset, wherein a set learning rate is used in the optimization training, and the set learning rate is less than the learning rate used by the second training module; the third training dataset includes multiple clean-noisy image data pairs.

4. The training system for the video denoising model according to any one of claims 1 to 3, characterized in that it also includes: a model building module, used to build an edge quantized denoising model based on the quantized denoising model, wherein the edge quantized denoising model includes the quantized denoising model and an inverse processing layer, and the inverse processing layer is used to perform the inverse of the processing of the nonlinear preprocessing layer.

5. The training system for the video denoising model according to claim 4, characterized in that the inverse processing layer includes: [formula not reproduced in this text] where y represents the image data processed by the quantized denoising model, and the remaining symbol (not reproduced in this text) represents the model parameters of the inverse processing layer.

6. A training method for a video denoising model, characterized in that it includes: training the original denoising network backbone model on a first training dataset to obtain a denoising network backbone model, wherein the first training dataset includes multiple clean-noisy image data pairs, the noisy image data is the original image data, and the clean image data is obtained by removing noise from the noisy image data; training the original denoising model on a second training dataset to obtain a denoising model, wherein the original denoising model includes a nonlinear preprocessing layer and the denoising network backbone model, during training the model parameters of the nonlinear preprocessing layer are updated while the model parameters of the denoising network backbone model remain unchanged, and the second training dataset includes multiple clean-noisy image data pairs; and performing quantization-aware training on the denoising model to obtain a quantized denoising model.

7. The training method for the video denoising model according to claim 6, characterized in that the nonlinear preprocessing layer includes: [formula not reproduced in this text] where x represents the brightness-normalized value of the original image data, and α represents the model parameters of the nonlinear preprocessing layer.

8. A neural network processor, characterized in that it uses a quantized denoising model trained by the training system for the video denoising model according to any one of claims 1 to 5, and includes: an acquisition module, used to acquire original image data; and a denoising module, used to perform denoising processing on the original image data through an edge quantized denoising model, wherein the edge quantized denoising model includes the quantized denoising model.

9. An artificial intelligence image signal processor, characterized in that it includes at least one neural network processor as described in claim 8.

10. A chip, characterized in that it includes at least one artificial intelligence image signal processor as described in claim 9.

11. An edge device, characterized in that it includes at least one artificial intelligence image signal processor as described in claim 9 or at least one chip as described in claim 10.

12. A computer device, characterized in that it includes the training system for the video denoising model according to any one of claims 1 to 5.