Endoscopic image electronic staining method, device, system, and storage medium
By employing high-order nonlinear spectral deconstruction and spatiotemporal co-processing of image blocks, the problems of insufficient contrast and noise amplification in lesion detection in endoscopic technology are solved, achieving high-quality narrowband image reconstruction and improving the diagnostic effect of early lesions.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- MACROLUX MEDICAL TECH CO LTD
- Filing Date
- 2026-05-20
- Publication Date
- 2026-06-19
AI Technical Summary
Existing endoscopic techniques suffer from a high rate of missed diagnoses in early lesion detection. In particular, the improved resolution of traditional white light endoscopes has not completely solved the difficulty in identifying tiny lesions. Chemical staining is cumbersome and unsuitable for routine screening. Insufficient light source in hardware NBI results in dim images. Electronic staining techniques are inferior to hardware NBI in terms of lesion contrast and texture sharpness.
A high-order nonlinear spectral deconstruction algorithm is used to expand broadband RGB images into high-dimensional images. Through spatiotemporal co-processing of image patches and combined with a narrowband image calibration matrix, high-contrast reconstruction of microvessels and surface textures is achieved, followed by noise reduction to generate high-quality narrowband images.
Without adding hardware, the contrast and texture clarity of lesion observation were improved, noise interference was reduced, high-fidelity narrowband images were generated, and the diagnostic accuracy of early lesions was improved.
Figure CN122243850A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of medical image analysis and processing technology, and in particular to an electronic staining method, apparatus, system and storage medium for endoscopic images.
Background Technology
[0002] With the rapid development of miniaturized sensors and minimally invasive medical technologies, endoscopic minimally invasive or non-invasive medical examinations and treatments have become widely adopted. In modern minimally invasive medicine, endoscopy has become the absolute gold standard for screening, diagnosing, and treating digestive tract diseases. Endoscopic technology allows imaging components to enter the body through natural openings or small surgical incisions, acquiring images of the examined areas to provide more intuitive and accurate diagnostic information. Electronic endoscopes integrate imaging lenses and image sensors at the probe tip, transmitting the images to an image processing unit for processing and display on a screen for user observation. In clinical diagnosis, because malignant tumors inevitably undergo abnormal changes in the newly formed capillary network during their early proliferative stages, observing the microvascular morphology and surface texture of the mucosal surface and submucosa has become a core basis for early optical diagnosis of cancer.
[0003] With the rapid development of complementary metal-oxide-semiconductor (CMOS) image sensors and optical illumination technology, the imaging resolution of electronic endoscopes has leaped from standard definition to full high definition and even 4K levels. However, the improvement in resolution cannot completely solve the problem of difficulty in identifying early, small lesions. For example, early malignant tumors of the digestive tract are often relatively flat in shape, and their surface color is similar to the surrounding normal mucosal tissue. This makes traditional white light endoscopes based on broadband spectrum have a high rate of missed diagnoses in the detection of early lesions.
[0004] To overcome this diagnostic bottleneck, the medical community has introduced narrowband imaging technology to observe the characteristics of microvascular formation. Oxyhemoglobin and deoxyhemoglobin have two very significant absorption peaks in the visible light spectrum, located near 415 nm and 540 nm, respectively. With the introduction of narrowband imaging technology, abnormal changes in capillaries can be better observed by examining these two significant absorption peaks.
[0005] Based on this physical characteristic, there are currently three main endoscopic techniques in clinical practice to enhance microvascular contrast:
[0006] The first approach is traditional chemical chromoendoscopy. This technique involves spraying chemical dyes such as Lugol's iodine, indigo carmine, or methylene blue onto the mucosal surface during endoscopic examination. The selective absorption or physical deposition of the dye in different cells and tissues highlights the boundaries of lesions. However, chemical chromoendoscopy is cumbersome and time-consuming, and uneven dye distribution can easily obstruct the field of view. In addition, some dyes can irritate human tissue, which makes the technique unsuitable for large-scale routine screening.
[0007] The second approach is narrow-band imaging (NBI) technology based on hardware optical filters (developed by Olympus). Hardware NBI technology uses a mechanical filter wheel or a specific wavelength LED array at the light source to directly filter out the red light component of broadband white light, allowing only narrow-band light of 415 nm and 540 nm to illuminate the tissue. Because these two wavelengths penetrate the mucosa at different depths, the 415 nm blue light is strongly absorbed by superficial capillaries, while the 540 nm green light penetrates to the submucosa and is absorbed by deeper blood vessels. The reflected light signals, after being received by the sensor, can present a high-contrast brown and cyan vascular network. Although hardware NBI has achieved great clinical success and significantly improved the detection rate of adenomas, it has an inherent physical drawback: the narrow-band filter reduces the total luminous flux of the light source, resulting in a very dim overall endoscopic field of view. When performing long-range observation of large cavities such as the stomach or colon, the images are often filled with obvious shot noise due to insufficient light intake. Doctors must bring the lens close to the mucosa to obtain a clear image, which seriously affects the smoothness of endoscopic operation and screening efficiency.
[0008] The third approach is electronic staining based on software algorithms. Representative technologies include Fujifilm's Flexible Spectral Imaging Colour Enhancement (FICE) and Pentax's i-Scan intelligent staining optical enhancement endoscope. Electronic staining eliminates physical filters, using conventional broadband white light illumination to obtain RGB images. Then, backend image processing algorithms utilize "spectral estimation" techniques to infer and reconstruct tissue reflectance images at specific narrowband wavelengths from the broadband RGB signals. For example, the FICE system pre-acquires a dataset of tissue spectral reflectance and uses principal component analysis or Wiener estimation to calculate a simple 3×3 linear transformation matrix, linearly mapping the RGB channel signals to the inferred blue-green channel signals. Because there are no physical filters, electronic staining maintains the high brightness advantage of white light illumination, providing a bright field of view and ease of use.
Summary of the Invention
[0009] Although electronic staining techniques such as FICE and i-Scan have addressed the issue of insufficient light intake in hardware-based NBI, multiple multicenter randomized controlled trials, after years of clinical validation, have shown that traditional electronic staining techniques based on linear transformation often lag behind hardware-based NBI in improving adenoma detection and lesion diagnosis rates. The fundamental reason for this difference in clinical performance lies in the physical incompleteness of their underlying algorithmic mechanisms, specifically manifested in the following technical shortcomings:
[0010] The first drawback is that linear spectral estimation cannot accurately characterize complex nonlinear tissue optical processes, resulting in insufficient contrast of lesions.
[0011] Current mainstream electronic spectral estimation techniques generally assume a simple linear relationship between the RGB response of the sensor and the target narrowband wavelength, and perform the transformation with a single linear projection matrix applied uniformly to the entire image. However, endoscopic imaging is a complex nonlinear process. First, the propagation of light within gastrointestinal tissue is not a simple surface reflection; it involves multiple scattering coupled with nonlinear absorption. Second, to adapt to the characteristics of human vision and to compress data, the image signal processor of a medical camera performs nonlinear gamma correction, adaptive dark current compensation, and color interpolation on the raw photoelectric signal captured by the sensor. After these multiple nonlinear distortions, traditional linear matrix calculations produce severe metamerism errors: two tissues that appear identical under white light but differ under a certain narrowband spectrum cannot be distinguished even after linear mapping. As a result, the virtual blue-green image obtained by linear spectral estimation lacks sufficient color latitude and structural sharpness, and fails to approximate the contrast at the edges of microvessels under real narrowband illumination.
[0012] The second drawback is that introducing high-order nonlinear functions for fitting inevitably amplifies the image's background noise. To overcome the limitations of linear spectral estimation, computational photography has proposed nonlinear spectral reconstruction methods such as polynomial regression and root-polynomial regression. By expanding the original three-dimensional RGB signal into a high-dimensional feature vector containing square, cube, and cross-product terms, these methods can approximate the true narrowband reflectance spectrum with high accuracy. However, this high-order nonlinear approach encounters a fatal bottleneck in endoscopic engineering applications: noise amplification. Given the current limitations of endoscopic illumination, CMOS sensors often operate at high analog gain to maintain video frame rate and brightness. In this state, the image inevitably carries Poisson shot noise and Gaussian readout noise. Assume that one channel of the original signal is $x = s + n$, where $s$ is the noise-free signal and $n$ is zero-mean noise. When the system attempts to improve fitting accuracy with a second-order polynomial, the squared term $x^2 = s^2 + 2sn + n^2$ transforms the original additive noise into complex nonlinear noise containing the cross term $2sn$ and the heavy-tailed non-Gaussian term $n^2$. As the polynomial order increases, the variance of the noise is amplified exponentially. This noise, drastically amplified by the higher-order functions, not only produces a large number of high-frequency flickering spots in the dark areas of the image; its frequency distribution also overlaps heavily with the capillary texture of the lesion area, ultimately degrading the fine texture information.
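The noise behavior of a squared term can be checked numerically. The sketch below is an illustration only, not part of the patent: it assumes a constant signal `s` corrupted by additive zero-mean Gaussian noise of standard deviation `sigma` (both hypothetical values) and compares the empirical variance of the squared channel with the closed form $4s^2\sigma^2 + 2\sigma^4$, which holds under that Gaussian assumption.

```python
import numpy as np

def empirical_squared_variance(s, sigma, n_samples=200_000, seed=0):
    """Empirical variance of (s + n)**2 for zero-mean Gaussian noise n."""
    rng = np.random.default_rng(seed)
    n = rng.normal(0.0, sigma, size=n_samples)
    return float(np.var((s + n) ** 2))

def gaussian_squared_variance(s, sigma):
    """Closed-form variance of (s + n)**2 when n ~ N(0, sigma**2).

    (s + n)**2 = s**2 + 2*s*n + n**2, with Var(2*s*n) = 4*s**2*sigma**2 and
    Var(n**2) = 2*sigma**4; the two terms are uncorrelated for Gaussian n.
    """
    return 4.0 * s**2 * sigma**2 + 2.0 * sigma**4

if __name__ == "__main__":
    for s in (0.2, 0.8):
        print(s, empirical_squared_variance(s, 0.05), gaussian_squared_variance(s, 0.05))
```

Under these assumptions the noise variance of the squared channel depends on the signal level through the $4s^2\sigma^2$ term, so it is no longer constant across the image, and the $n^2$ term contributes a non-Gaussian component.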
[0013] The third drawback is that traditional spatial domain filtering algorithms can damage microvascular texture when dealing with nonlinearly amplified noise. To eliminate the noise amplified by higher-order operations, using traditional two-dimensional spatial domain filtering methods (such as Gaussian filtering, bilateral filtering, or non-local means (NLM) filtering) in image processing leads to a difficult trade-off between noise reduction and edge preservation. Because nonlinearly amplified noise has strong structural characteristics, traditional spatial filters cannot distinguish between genuine vascular texture and higher-order cross-noise. They often smooth out weaker capillary networks along with the noise, producing so-called over-smoothing artifacts, which is unacceptable for endoscopic examinations that require precise tissue structure analysis.
[0014] In view of the above problems, the present invention is proposed to provide an electronic staining method, apparatus, system and storage medium for endoscopic images that overcomes or at least partially solves the above problems.
[0015] This invention provides an electronic staining method for endoscopic images, comprising:
[0016] Acquire the raw broadband RGB image captured by the image acquisition component;
[0017] Based on a preset nonlinear feature expansion operator, the original broadband RGB image is expanded into a P-dimensional broadband image; where P is a positive integer greater than 3.
[0018] For each pixel in the current frame of the P-dimensional broadband image:
[0019] Based on the current image block centered on the pixel, search for reference image blocks belonging to the same position in the current frame and the K frames before and after to obtain a set of reference image blocks;
[0020] The image attenuation weight of each image block is determined based on the similarity distance between each reference image block and the current image block;
[0021] Based on the attenuation weight of each image block, the current image block and each reference image block are fused and weighted to obtain the RGB light reflectance prediction data of the pixel.
[0022] Based on the calibration matrix of the predetermined broadband image and the narrowband image of the specified spectral range, the narrowband image signal data of the specified spectral range corresponding to the RGB light reflectance inference data is calculated to obtain the narrowband light image signal data of each pixel.
[0023] Based on the narrowband image signal data of each pixel in the current frame, a narrowband image of the specified spectral range of the current frame is generated, and an electronic staining image is obtained based on the narrowband image signal data of the specified spectral range.
[0024] In some optional embodiments, the expansion of the original broadband RGB image into a P-dimensional image based on a preset nonlinear feature expansion operator includes:
[0025] Using a preset nonlinear feature expansion operator, the three-channel RGB data of the original broadband RGB image are expanded into 9-channel broadband image data;
[0026] where R, G, and B represent the signal strengths of the red, green, and blue channels of the image, respectively, while RG, BG, and RB represent the signal strengths of the red-green, blue-green, and red-blue combined channels of the image, respectively.
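A minimal sketch of such an expansion follows. The operator itself is not legible in this text, so the term set used here (three linear terms, three squared terms, and the cross terms RG, BG, RB), the term order, and the function name `expand_rgb_to_9` are assumptions chosen only to match the 9-channel count and the channel names given above.

```python
import numpy as np

def expand_rgb_to_9(rgb):
    """Expand an (H, W, 3) RGB image into an (H, W, 9) feature image.

    Assumed term order: R, G, B, R^2, G^2, B^2, RG, BG, RB.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    if rgb.ndim != 3 or rgb.shape[-1] != 3:
        raise ValueError("expected an array of shape (H, W, 3)")
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Linear terms, squared terms, then pairwise cross terms.
    return np.stack([r, g, b, r * r, g * g, b * b, r * g, b * g, r * b], axis=-1)
```

Working on linear sensor values in the range 0 to 1 keeps every expanded term in the same range, which is one reason to apply the expansion before any display gamma.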
[0027] In some optional embodiments, the step of searching for reference image blocks belonging to the same position within the current frame and the preceding and following K frames based on the current image block centered on the pixel to obtain a set of reference image blocks includes:
[0028] Obtain the pixel block P(x,y,t) corresponding to the pixel Q(x,y) in the current frame;
[0029] Within the current frame, several preceding adjacent frames, and several following adjacent frames, a preset spatial search window is used. Using the principle of brightness consistency, a reference pixel block that is at the same position as pixel block P(x,y,t) is searched, and the searched reference pixel block is added to the reference pixel block set.
[0030] In some optional embodiments, the step of searching for a reference pixel block at the same location as pixel block P(x, y, t) using the brightness consistency principle includes:
[0031] Based on the principle of brightness consistency, calculate the motion offset vector across frames. Based on the motion offset vector, multiple reference pixel blocks corresponding to the same local location are located in the current frame and adjacent frames; where:
[0032] ;
[0033] In the formula, the symbols denote, in order: the time index of the current frame; the current frame being processed; a neighboring frame of the current frame; the current image patch; the local coordinate offset of a pixel inside the patch relative to the current pixel; and the motion offset vector, whose two components are the pixel offset along an image row and the pixel offset along an image column.
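A minimal sketch of the cross-frame search, assuming that brightness consistency is enforced by exhaustive block matching: the motion offset is taken as the shift that minimizes the sum of squared differences between the current patch and a candidate patch in the neighboring frame. The function name `find_motion_offset`, the parameters `half_patch` and `search_radius`, and the cost function are assumptions for illustration; the patent's own formula may differ.

```python
import numpy as np

def find_motion_offset(cur, ref, x, y, half_patch=1, search_radius=2):
    """Return (dx, dy) minimizing the patch SSD between cur at (x, y)
    and ref at (x + dx, y + dy).

    cur, ref: (H, W, P) arrays; x is the column index, y the row index.
    """
    h, w = cur.shape[:2]
    r = half_patch
    if not (r <= x < w - r and r <= y < h - r):
        raise ValueError("patch around (x, y) must lie inside the frame")
    patch = cur[y - r:y + r + 1, x - r:x + r + 1]
    best, best_cost = (0, 0), np.inf
    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            yy, xx = y + dy, x + dx
            if not (r <= xx < w - r and r <= yy < h - r):
                continue  # candidate patch would leave the frame
            cand = ref[yy - r:yy + r + 1, xx - r:xx + r + 1]
            cost = float(np.sum((patch - cand) ** 2))
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best
```

Exhaustive search is quadratic in the search radius; a real-time implementation would likely restrict the window or use a coarse-to-fine scheme, which this sketch does not attempt.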
[0034] In some optional embodiments, the image attenuation weight of each image block is determined based on the similarity distance between each reference image block and the current image block, including:
[0035] Calculate the Euclidean distance between the reference image block and the current image block based on the coordinate information of the reference image block and the coordinate information of the current image block;
[0036] The similarity distance between the reference image patch and the current image patch is determined based on the Euclidean distance and the size of the spatial search window;
[0037] The attenuation weight of each image block is determined based on the similarity distance, the preset filtering parameters for controlling the smoothness of the spatial structure, and the preset attenuation constant in the time dimension.
[0038] In some optional embodiments, the similarity distance is calculated using the following formula:
[0039] $$d(P_i, P_c) = \frac{1}{N^2} \sum_{(a,b)} \left\| P_i(a,b) - P_c(a,b) \right\|_2^2$$;
[0040] In the formula, $P_i$ is the i-th reference image patch; $P_c$ is the current image patch; the sum accumulates the squared Euclidean distances between corresponding pixels of the two patches in the high-dimensional feature space; $N$ is the side length of the spatial search window; and a and b are the local coordinate offsets of pixels inside the patch relative to the current pixel.
[0041] The attenuation weight is calculated using the following formula:
[0042] ;
[0043] In the formula, the two parameters are, respectively, the filtering parameter used to control the smoothness of the spatial structure and the decay constant in the time dimension.
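A minimal sketch of the distance and weight computation. The distance follows the description above (squared Euclidean feature distance summed over the patch and normalized by the squared side length, taken here to be the patch side). The exponential form of the weight, the use of the frame gap as the temporal variable, and the parameter names `h` and `tau` are assumptions in the style of non-local means filtering, not the patent's confirmed formula.

```python
import numpy as np

def similarity_distance(ref_patch, cur_patch):
    """Squared Euclidean feature distance summed over the patch,
    divided by the squared side length of the patch."""
    ref_patch = np.asarray(ref_patch, dtype=np.float64)
    cur_patch = np.asarray(cur_patch, dtype=np.float64)
    if ref_patch.shape != cur_patch.shape:
        raise ValueError("patches must have the same shape")
    side = cur_patch.shape[0]
    return float(np.sum((ref_patch - cur_patch) ** 2) / side**2)

def attenuation_weight(distance, frame_gap, h=0.1, tau=2.0):
    """Weight that decays with patch dissimilarity and with temporal distance.

    h (spatial smoothing) and tau (temporal decay) are illustrative values.
    """
    return float(np.exp(-distance / h**2) * np.exp(-abs(frame_gap) / tau))
```

With this form, an identical patch in the current frame receives weight 1, and the weight falls toward 0 as the patch content diverges or the frame lies further away in time.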
[0044] In some optional embodiments, based on the attenuation weights of each image block, a fusion weighted calculation is performed on the current image block and each reference image block to obtain the RGB reflectance prediction data of the pixel, including:
[0045] Based on the center pixel of the current pixel block and its corresponding attenuation weight, and the center pixel of each reference image block and its corresponding attenuation weight, a fusion weighted calculation is performed on the current image block and each reference image block to obtain the RGB light reflectance prediction data of the pixel point;
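[0046] The RGB light reflectance prediction data $\hat{x}(x,y,t)$ is calculated using the following formula:
[0047] $$\hat{x}(x,y,t) = \frac{\sum_{i=1}^{K} w_i \, c_i}{\sum_{i=1}^{K} w_i}$$;
[0048] In the formula, K is the total number of pixel blocks; $w_i$ is the attenuation weight of the i-th image patch; and $c_i$ is the center pixel of the i-th reference image patch.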
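The fusion step can be sketched as a normalized weighted average of patch center pixels. Normalizing by the sum of the weights and the function name `fuse_center_pixels` are assumptions; the source states only that the center pixels and their attenuation weights are combined.

```python
import numpy as np

def fuse_center_pixels(center_pixels, weights):
    """Normalized weighted average of patch center pixels.

    center_pixels: (K, P) array, one P-dimensional center pixel per patch.
    weights: (K,) non-negative attenuation weights.
    """
    c = np.asarray(center_pixels, dtype=np.float64)
    w = np.asarray(weights, dtype=np.float64)
    if c.ndim != 2 or w.shape != (c.shape[0],):
        raise ValueError("expected center_pixels (K, P) and weights (K,)")
    total = w.sum()
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (w[:, None] * c).sum(axis=0) / total
```

For example, center pixels `[[1, 2], [3, 4]]` with weights `[1, 3]` fuse to `[2.5, 3.5]`, because the second patch counts three times as much as the first.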
[0049] In some optional embodiments, based on a predetermined calibration matrix M of a broadband image and a narrowband image within a specified spectral range, the RGB reflectance estimation data is calculated using the following formula to obtain the narrowband image signal data within the specified spectral range corresponding to the RGB reflectance estimation data, thus obtaining the narrowband light image signal data y(x, y, t) for each pixel:
[0050] $$y(x,y,t) = M \, \hat{x}(x,y,t)$$;
[0051] In the formula, $\hat{x}(x,y,t)$ is the RGB reflectance estimation data of the pixel, and M is a nonlinear mapping regression matrix obtained by optimizing a number of sets of measured narrowband light reflectance data and corresponding white light reflectance data based on a preset calibration objective function and using the relative error least squares method.
[0052] The specified spectral range includes one or more specified spectral ranges, each of which corresponds to a calibration matrix M.
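A minimal sketch of applying the calibration to a whole frame. It assumes the calibration for each target band is a single row of P coefficients and stacks the rows for several bands (for example 415 nm and 540 nm) into one array `calib`; this layout and the function name `project_to_narrowband` are illustrative choices, not taken from the source.

```python
import numpy as np

def project_to_narrowband(features, calib):
    """Apply per-band calibration rows to every pixel.

    features: (H, W, P) fused feature image.
    calib: (B, P) array, one calibration row per target band.
    Returns an (H, W, B) narrowband image.
    """
    features = np.asarray(features, dtype=np.float64)
    calib = np.asarray(calib, dtype=np.float64)
    if features.shape[-1] != calib.shape[-1]:
        raise ValueError("feature dimension P must match calibration columns")
    # Matrix product over the last axis: each pixel vector is multiplied by calib.
    return features @ calib.T
```

Because the mapping is linear in the expanded features, it costs one small matrix product per pixel; the nonlinearity comes entirely from the expansion step.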
[0053] In some optional embodiments, the narrowband image calibration matrix M is obtained by optimizing the objective function as follows:
[0054] ;
[0055] In the formula, the symbols denote, in order: the calibration value of the true absolute reflectance of narrowband light, obtained by actual measurement with a spectrometer; the suppression coefficient; and the P-dimensional image signal of the corresponding pixel under white light illumination.
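One way to realize a relative-error least-squares calibration is sketched below. It assumes the objective is the sum over calibration samples of `((r - features @ m) / (r + eps))**2`, where `r` is the measured narrowband reflectance and `eps` plays the role of the suppression coefficient by keeping the denominator away from zero. That reading of the suppression coefficient, and the names `fit_calibration_row` and `eps`, are assumptions; the coefficient could instead be a regularization weight, which this sketch does not cover.

```python
import numpy as np

def fit_calibration_row(features, reflectance, eps=1e-3):
    """Fit one calibration row m minimizing sum(((r - features @ m) / (r + eps))**2).

    features: (N, P) expanded white-light signals of N calibration samples.
    reflectance: (N,) measured narrowband reflectance of the same samples.
    eps: small positive constant that bounds the relative-error weighting.
    """
    phi = np.asarray(features, dtype=np.float64)
    r = np.asarray(reflectance, dtype=np.float64)
    if phi.ndim != 2 or r.shape != (phi.shape[0],):
        raise ValueError("expected features (N, P) and reflectance (N,)")
    scale = 1.0 / (r + eps)  # relative-error weighting per sample
    # Scaling rows turns the relative-error problem into ordinary least squares.
    m, *_ = np.linalg.lstsq(phi * scale[:, None], r * scale, rcond=None)
    return m
```

Dividing each residual by the measured reflectance gives dark samples, such as strongly absorbing vessels, as much influence on the fit as bright mucosa, which an absolute-error fit would not.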
[0056] In some optional embodiments, the above method further includes:
[0057] The original broadband RGB image is preprocessed, and the preprocessing includes at least one of dark current compensation, white balance correction and demosaic interpolation.
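Two of the listed preprocessing operations can be sketched as follows, assuming an already demosaiced image, a known dark level, and known per-channel white-balance gains. The function name `preprocess_rgb` and the example values are hypothetical; demosaic interpolation is omitted because it depends on the sensor's color filter layout.

```python
import numpy as np

def preprocess_rgb(raw, dark_level, wb_gains):
    """Subtract a dark level, then apply per-channel white-balance gains.

    raw: (H, W, 3) demosaiced sensor image; dark_level: scalar or (3,);
    wb_gains: (3,) multiplicative gains. Negative values after dark
    subtraction are clipped to zero.
    """
    raw = np.asarray(raw, dtype=np.float64)
    corrected = np.clip(raw - np.asarray(dark_level, dtype=np.float64), 0.0, None)
    return corrected * np.asarray(wb_gains, dtype=np.float64)
```

Removing the dark offset before any nonlinear expansion matters because a constant offset would otherwise leak into every squared and cross term.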
[0058] This invention provides an electronic staining device for endoscopic images, comprising:
[0059] The input module is used to acquire the raw broadband RGB image captured by the image acquisition component;
[0060] An extension module is used to extend the original broadband RGB image into a P-dimensional broadband image based on a preset nonlinear feature extension operator; where P is a positive integer greater than 3.
[0061] The matching module is used for each pixel in the current frame of a P-dimensional broadband image: based on the current image block centered on the pixel, it searches for reference image blocks belonging to the same position in the current frame and the K frames before and after, to obtain a set of reference image blocks;
[0062] The weight determination module is used to determine the image attenuation weight of each image block based on the similarity distance between each reference image block and the current image block;
[0063] The fusion module is used to perform fusion weighting calculation on the current image block and each reference image block according to the attenuation weight of each image block to obtain the RGB light reflectance prediction data of the pixel.
[0064] The calculation module is used to calculate the narrowband image signal data in the specified spectral range corresponding to the RGB light reflectance inference data based on the calibration matrix of the broadband image and the narrowband image in the specified spectral range, so as to obtain the narrowband light image signal data of each pixel.
[0065] The output module is used to generate a narrowband image of a specified spectral range for the current frame based on the narrowband light image signal data of each pixel in the current frame, and to obtain an electronic staining image based on the narrowband image signal data of the specified spectral range.
[0066] This invention provides an electronic staining system for endoscopic images, comprising: an image acquisition component, an image processing host, and an endoscope display;
[0067] The image acquisition component is used to acquire the original broadband RGB image of the target area;
[0068] The image processing host is equipped with the aforementioned electronic staining device for endoscopic images;
[0069] The endoscope display is used to display at least one of the original broadband RGB image acquired by the image acquisition component and the electronically stained image processed by the image processing host.
[0070] This invention provides a computer storage medium storing computer-executable instructions, which, when executed by a processor, implement the above-described electronic staining method for endoscopic images.
[0071] The beneficial effects of the above-described technical solutions provided in the embodiments of the present invention include at least the following:
[0072] Based on a preset nonlinear feature expansion operator, the original broadband RGB image is expanded into a P-dimensional broadband image. Expanding the image to a higher dimension allows the nonlinear physical properties of the specified bands to be reconstructed more accurately, improving the contrast of blood vessels and surface textures. A higher imaging level can thus be achieved at low cost without adding hardware such as filters or narrowband LEDs. Based on the structural similarity of spatial image patches, a weighted calculation is performed using the similarity distances of multiple image patches at the same spatial location. Through cross-frame matching, noise is suppressed and the image is denoised. This addresses the noise amplification that nonlinear calculations may cause, yields more accurate image data, and makes it possible to obtain high-fidelity narrowband images under low signal-to-noise ratio conditions. Using a preset calibration matrix between the broadband image and the narrowband image, the RGB light reflectance prediction data is converted into a narrowband image within a specified spectral range. Electronic staining images are synthesized from the narrowband image within the specified spectral range, so that imaging comparable to hardware solutions is achieved in software. This allows high-quality images to be acquired at low cost, leading to better observation of lesions and more accurate diagnostic results.
[0073] Other features and advantages of the invention will be set forth in the description which follows, and will be apparent in part from the description, or may be learned by practicing the invention. The objects and other advantages of the invention may be realized and obtained by means of the structures particularly pointed out in the written description, claims, and drawings.
[0074] The technical solution of the present invention will be further described in detail below with reference to the accompanying drawings and embodiments.
Description of the Drawings
[0075] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used in conjunction with embodiments of the invention to explain the invention and do not constitute a limitation thereof. In the drawings:
[0076] Figure 1 is a flowchart of the electronic staining method for endoscopic images in an embodiment of the present invention;
[0077] Figure 2 is a schematic diagram of the endoscopic image electronic staining system in an embodiment of the present invention;
[0078] Figure 3 is a schematic diagram of the structure of the image processing host in an embodiment of the present invention;
[0079] Figure 4 is a schematic diagram of the electronic staining device for endoscopic images in an embodiment of the present invention;
[0080] Figure 5 is a schematic diagram of the processing flow of the electronic staining system for endoscopic images in an embodiment of the present invention;
[0081] Figure 6 is a schematic diagram of the processing flow of the electronic staining unit in an embodiment of the present invention;
[0082] Figure 7 is an example of the original image in an embodiment of the present invention;
[0083] Figure 8 is an example diagram comparing the implementation process and results of electronic staining in an embodiment of the present invention;
[0084] Figure 9 is an example of an electronic staining image obtained using the method of the present invention in an embodiment of the present invention;
[0085] Figure 10 is an example of an electronic staining image obtained by a conventional method in an embodiment of the present invention.
Detailed Implementation
[0086] Exemplary embodiments of the present disclosure will now be described in more detail with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
[0087] To address the problems existing in current technologies, this invention, after objectively evaluating the contradictions of insufficient contrast in existing linear spectral interpretation and amplified noise in nonlinear spectral interpretation, proposes a novel approach to achieve electronic staining of endoscopic images. This approach is applicable to the fields of medical image processing, computational photography, nonlinear spectral reconstruction, and minimally invasive endoscopic imaging.
[0088] This invention proposes a virtual staining or electronic staining imaging system and method for electronic endoscopes. By employing a high-order nonlinear spectral deconstruction algorithm, it accurately converts the red, green, and blue (RGB) broadband image signals acquired under standard broadband white light illumination into reflectance images of specific narrowband wavelengths (e.g., 415 nm blue light and 540 nm green light). Simultaneously, addressing the problem of drastic image noise amplification inevitably caused during nonlinear high-order function fitting, this invention creatively introduces a three-dimensional collaborative processing mechanism based on joint filtering of image spatial patches and patches at the same anatomical location over time. This invention completely eliminates high-frequency structural noise caused by nonlinear calculations while perfectly reconstructing and sharpening the microvascular morphology and glandular opening structures of the digestive tract mucosa, providing high-contrast and low-noise narrowband blue-green light images for accurate optical diagnosis of early respiratory, urinary, and digestive tract lesions. This invention aims to overcome the physical and algorithmic limitations of existing hardware filtering and low-order linear electronic staining through purely computational imaging, thereby improving the image quality of medical endoscopes.
[0089] This invention provides an electronic staining method for endoscopic images. The purpose of this method is to perform nonlinear spectral analysis on RGB images obtained from broadband white light imaging to acquire specific narrowband imaging results, thereby achieving electronic staining effects. As shown in Figure 1, the method includes the following steps:
[0090] Step S101: Acquire the raw broadband RGB image acquired by the image acquisition component.
[0091] Step S102: Based on a preset nonlinear feature expansion operator, expand the original broadband RGB image into a P-dimensional broadband image, where P is a positive integer greater than 3. Each dimension is a power-law term of the broadband RGB image data.
[0092] Step S103: Perform the following steps for each pixel in the current frame of the P-dimensional broadband image:
[0093] Step S104: Based on the current image patch centered at a pixel, search for reference image patches belonging to the same position in the current frame and the K frames before and after it to obtain a set of reference image patches. K is a positive integer.
[0094] Step S105: Determine the image attenuation weight of each image block based on the similarity distance between each reference image block and the current image block.
[0095] Step S106: Based on the attenuation weight of each image block, perform a fusion weighted calculation on the current image block and each reference image block to obtain the RGB light reflectance prediction data of the pixel.
[0096] Step S107: Based on the calibration matrix of the predetermined broadband image and the narrowband image of the specified spectral range, calculate the narrowband image signal data of the specified spectral range corresponding to the RGB light reflectance prediction data, and obtain the narrowband light image signal data of each pixel.
[0097] Step S108: Based on the narrowband light image signal data of each pixel in the current frame, generate a narrowband image of the specified spectral range of the current frame, and obtain an electronic staining image based on the narrowband image signal data of the specified spectral range.
[0098] Optionally, the method may further include the following steps before step S101:
[0099] Step S100: Preprocess the original broadband RGB image, including at least one of dark current compensation, white balance correction and demosaic interpolation.
[0100] The above method expands the original broadband RGB image into a P-dimensional broadband image based on a preset nonlinear feature expansion operator. Expanding the image to a higher dimension allows the nonlinear physical characteristics of a specified band to be reconstructed more accurately, improving the contrast of blood vessels and surface textures; higher imaging quality is achieved at low cost, without adding hardware such as filters or narrowband LEDs. Based on the structural similarity of spatial image patches, a weighted calculation is performed using the similarity distances of multiple image patches at the same spatial location. Cross-frame matching suppresses noise and thus denoises the image, which addresses the noise amplification that nonlinear calculations may cause, yields more accurate image data, and makes it possible to obtain high-fidelity narrowband images under low signal-to-noise ratio conditions. Using the preset calibration matrix between the broadband image and the narrowband image, the RGB light reflectance prediction data is converted into a narrowband image within the specified spectral range. The electronic staining image is then synthesized from that narrowband image, so that imaging comparable to a hardware solution is achieved in software. High-quality images are thus acquired at low cost, enabling better observation of lesions and more accurate diagnostic results.
[0101] The above method can be used for virtual staining enhancement of medical images in endoscopy. After continuous multi-frame image signals acquired under broadband white light illumination are received, a high-order nonlinear spectral estimation model is established. A high-order function mapping containing cross-product or power terms is applied to the original broadband signal to generate a predicted virtual narrowband image in the bands where hemoglobin absorbs strongly. To address the high-spatial-frequency noise amplified by the high-order nonlinear operations, patches are extracted along the time axis of the video stream and combined with the image block sequences at the same anatomical location across frames for spatiotemporal joint filtering. Finally, a high-contrast narrowband image sequence is output, with random noise removed and microvascular texture preserved.
[0102] Based on the same inventive concept, embodiments of the present invention also provide an endoscopic image electronic staining system (not shown) for implementing the above method. As shown in Figure 2, the system includes an image acquisition unit 2, an image processing host 1, and an endoscope display 3.
[0103] The image acquisition unit 2 is used to acquire the original broadband RGB image of the target area.
[0104] The image processing host 1 contains an electronic staining device for endoscopic images.
[0105] The endoscope display 3 is used to display at least one of the original broadband RGB image acquired by the image acquisition unit 2 and the electronically stained image processed by the image processing host 1.
[0106] The structure of the image processing host 1 is shown in Figure 3; it includes an electronic staining device 10 for endoscopic images. The image processing host 1 contains hardware processing circuits or a microprocessor with pre-installed firmware, including a Field-Programmable Gate Array (FPGA), a digital signal processor (DSP), or an Application-Specific Integrated Circuit (ASIC) architecture. It executes all the computational logic modules of the above method, such as image data acquisition, higher-order function dimension-raising transformation, temporal patch buffer registration, and multi-frame joint noise-reduction filtering.
[0107] The structure of the endoscopic image electronic staining device 10 is shown in Figure 4. It includes:
[0108] Input module 11 is used to acquire the original broadband RGB image acquired by image acquisition component 2;
[0109] Extension module 12 is used to extend the original broadband RGB image into a P-dimensional broadband image based on a preset nonlinear feature extension operator; where P is a positive integer greater than 3.
[0110] Matching module 13 is used for each pixel in the current frame of a P-dimensional broadband image: based on the current image block centered on the pixel, to search for reference image blocks belonging to the same position in the current frame and the K frames before and after, to obtain a set of reference image blocks;
[0111] The weight determination module 14 is used to determine the image attenuation weight of each image block based on the similarity distance between each reference image block and the current image block;
[0112] The fusion module 15 is used to perform fusion weighting calculation on the current image block and each reference image block according to the attenuation weight of each image block to obtain the RGB light reflectance prediction data of the pixel.
[0113] The calculation module 16 is used to calculate, according to the calibration matrix between the broadband image and the narrowband image in the specified spectral range, the narrowband image signal data in the specified spectral range corresponding to the RGB light reflectance prediction data, so as to obtain the narrowband light image signal data of each pixel.
[0114] Output module 17 is used to generate a narrowband image of a specified spectral range of the current frame based on the narrowband light image signal data of each pixel in the current frame, and to obtain an electronic staining image based on the narrowband image signal data of the specified spectral range.
[0115] Optionally, the above-mentioned endoscopic image electronic staining device 10 further includes an image preprocessing module 18 for preprocessing the original broadband RGB image, wherein the preprocessing includes at least one of dark current compensation, white balance correction and demosaic interpolation.
[0116] The aforementioned endoscopic image electronic staining device 10 can also be understood as including an image preprocessing module 18 and an electronic staining processing unit 19. The electronic staining processing unit 19 includes an input module 11, an extension module 12, a matching module 13, a weight determination module 14, a fusion module 15, a calculation module 16, and an output module 17, which are used to perform electronic staining processing on the image.
[0117] In some optional embodiments, the flow by which the endoscopic image electronic staining system implements the endoscopic image electronic staining method is shown in Figure 5. The image acquisition unit 2 acquires images of the target area, obtains raw image data, and transmits it to the image processing host 1. The image preprocessing module 18 in the image processing host 1 preprocesses the image and provides the RGB data to the electronic staining processing unit 19. The electronic staining processing unit 19 stains the image using the method provided in this application and outputs the electronically stained image to the endoscope display 3. The flow of the single-image electronic staining process implemented by the electronic staining processing unit 19 in the endoscopic image electronic staining device 10 is shown in Figure 6. The steps for implementing electronic staining of images in this application are described in detail below with reference to Figure 1 and Figure 6.
[0118] The above step S100 performs image preprocessing. After receiving the raw broadband RGB image data transmitted by the image acquisition unit 2, the image processing host 1 preprocesses it. At least one image signal processing (ISP) operation, such as dark current compensation, white balance correction, or demosaic interpolation, is applied to generate a per-pixel RGB image, which is input into the electronic staining processing unit 19. The image acquisition unit 2 can be an image sensor mounted at the front end of the endoscope, which can be inserted into the living body to acquire images and transmit the photoelectric signals back. The RGB image passed to the electronic staining processing unit 19 after preprocessing by the image preprocessing module 18 is a broadband image.
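As an illustration of the preprocessing in step S100, the following is a minimal sketch that covers only dark current compensation and white balance correction on an already demosaiced image. The function name `preprocess`, the use of a stored dark frame, the per-channel gain vector, and the order of the two operations are assumptions made for this example; the description does not specify them.

```python
import numpy as np

def preprocess(raw_rgb, dark_frame, wb_gains):
    """Dark current compensation followed by white balance correction.

    raw_rgb    : (H, W, 3) raw sensor values, assumed already demosaiced
    dark_frame : (H, W, 3) values recorded with no light (assumed available)
    wb_gains   : length-3 per-channel gains (assumed known from calibration)
    """
    img = np.asarray(raw_rgb, dtype=np.float64) - np.asarray(dark_frame, dtype=np.float64)
    img = np.clip(img, 0.0, None)          # negative values after subtraction are clamped to 0
    return img * np.asarray(wb_gains, dtype=np.float64)

raw = np.full((2, 2, 3), 10.0)
dark = np.full((2, 2, 3), 2.0)
balanced = preprocess(raw, dark, wb_gains=[2.0, 1.0, 0.5])
print(balanced[0, 0])  # [16.  8.  4.]
```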
[0119] The above step S101 acquires the RGB image; see the RGB image input step in Figure 6. The original broadband RGB image acquired by the image acquisition unit 2 is obtained. The acquired RGB images can be a video captured by the image acquisition unit 2 or a series of still images.
[0120] Step S102 above performs the high-dimensional expansion mapping of nonlinear features; see the high-dimensional nonlinear feature expansion step in Figure 6. A higher-order function is applied to the received broadband RGB image data to expand the basic RGB vector into a high-dimensional form, for example by calculating quadratic terms, cross terms, and square roots of the RGB values. Expanding the original broadband RGB image into a P-dimensional broadband image based on the preset nonlinear feature expansion operator includes: using the preset nonlinear feature expansion operator to convert the three-channel RGB data of the original broadband RGB image into 9-channel broadband image data. Here R, G, and B represent the signal intensities of the red, green, and blue channels of the image, respectively, and RG, BG, and RB represent the signal intensities of the red-green, blue-green, and red-blue combined channels, respectively, each calculated from the signal intensities of the two channels concerned, for example by multiplying them.
[0121] In traditional linear electronic staining, taking the target narrowband wavelengths of 415 nm and 540 nm as an example, for any pixel in the image the broadband RGB output vector of the image sensor is denoted $z = [R, G, B]^T$. The linear electronic staining model attempts to find a 2×3 transformation matrix $A$ such that the inferred spectral vector $\hat{y}$ of the target narrowband bands satisfies $\hat{y} = A z$. As mentioned earlier, this linear mapping is prone to metamerism error.
[0122] This invention abandons simple linear operations and introduces a higher-order fitting function. Through feature expansion with higher-order nonlinear polynomials, it expands RGB images into multi-channel, high-dimensional image data. A nonlinear feature expansion operator $\varphi$ can be defined for this expansion, and its dimension can be chosen according to the dimensions of the original image and the expanded image. For example, when expanding a 3-channel RGB image into 9-channel image data, a 9-dimensional operator can be defined and the original image data expanded with it. Taking a quadratic polynomial as an example, the original three-dimensional RGB vector $z = [R, G, B]^T$ is expanded into a 9-dimensional higher-order feature vector, for example $\varphi(z) = [R, G, B, R^2, G^2, B^2, RG, BG, RB]^T$. Here, R, G, and B are the signal intensities of the red, green, and blue channels of the image, respectively, and RG, BG, and RB are the signal intensities of the red-green, blue-green, and red-blue combined channels of the image, respectively.
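The quadratic expansion described above can be sketched as follows. This is a minimal illustration, not the operator of the invention: the function name `expand_features` and the channel order (first-order terms, then squares, then the cross terms RG, BG, RB) are assumptions chosen for the example.

```python
import numpy as np

def expand_features(rgb):
    """Expand an (H, W, 3) RGB image into an (H, W, 9) feature image.

    Assumed channel order: R, G, B, R^2, G^2, B^2, RG, BG, RB.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.stack([r, g, b, r * r, g * g, b * b, r * g, b * g, r * b], axis=-1)

frame = np.random.default_rng(0).random((4, 5, 3))   # synthetic broadband frame
phi = expand_features(frame)
print(phi.shape)  # (4, 5, 9)
```

Because every added channel is a product of two intensities, sensor noise in the inputs is multiplied as well, which is why the description follows the expansion with spatiotemporal patch filtering.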
[0123] In this step, the construction of a high-order nonlinear spectral estimation model employs multivariate polynomial regression or a root polynomial algorithm with brightness scale invariance to approximate the nonlinear diffuse reflection response of specific wavelengths of the target, such as 415 nm and 540 nm.
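The brightness scale invariance mentioned for the root polynomial variant can be illustrated with a small sketch. It assumes a degree-2 root-polynomial expansion (R, G, B and the square roots of the pairwise products); the function name `root_polynomial_features` and the choice of terms are assumptions for this example and are not taken from the description.

```python
import numpy as np

def root_polynomial_features(rgb):
    """Degree-2 root-polynomial expansion: R, G, B, sqrt(RG), sqrt(GB), sqrt(RB).

    Every term is homogeneous of degree 1, so scaling the input brightness
    by k scales every feature by the same factor k.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.stack([r, g, b, np.sqrt(r * g), np.sqrt(g * b), np.sqrt(r * b)], axis=-1)

pixel = np.array([0.4, 0.1, 0.9])
k = 2.5  # hypothetical change in illumination brightness
scaled_then_expanded = root_polynomial_features(k * pixel)
expanded_then_scaled = k * root_polynomial_features(pixel)
print(np.allclose(scaled_then_expanded, expanded_then_scaled))  # True
```

With an ordinary quadratic expansion the squared and cross terms scale by k squared instead, so a linear regression matrix fitted at one exposure level would not transfer cleanly to another.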
[0124] Step S104 above extracts and matches local spatiotemporal features; see the spatiotemporal joint patch extraction and matching step in Figure 6. In the video stream buffer queue, taking the current image patch of the current pixel in the current frame as the reference, a feature-distance-based search algorithm looks for matching patches with high structural similarity in the spatial neighborhood and in the neighboring image patches of the preceding and following adjacent frames. That is, similar reference patches at the same anatomical location are searched for both forward and backward in the spatiotemporal buffer queue.
[0125] Optionally, to eliminate the noise increase caused by step S102 without damaging the original tissue texture, and considering that during endoscopic imaging the spectra of local tissue are approximately similar and that the spectral distribution of the same tissue in consecutive frames is also similar, this invention constructs a patch joint filtering method based on temporal multi-frame redundancy.
[0126] This step, based on the current image patch centered on a pixel, searches for reference image patches belonging to the same location within the current frame and the K frames before and after it. The process of obtaining the set of reference image patches includes:
[0127] First, extract spatiotemporally similar patches as data preparation for filtering.
[0128] Constructing a spatiotemporal patch set based on a 3D search window: obtain the pixel block P(x, y, t) corresponding to the pixel point Q(x, y) in the current frame $I_t$; then, within the current frame, several preceding adjacent frames, and several following adjacent frames, search a preset spatial search window, using the principle of brightness consistency, for reference pixel blocks belonging to the same position as the pixel block P(x, y, t), and add each reference pixel block found to the reference pixel block set. Here $I_t$ is the frame with high-frequency noise currently being processed in the image sequence, and the image patch P(x, y, t) is the image patch centered at pixel coordinates Q(x, y). The patch window size can be set as needed, for example 7×7 or 9×9 pixels. The numbers of preceding and following adjacent frames can be the same or different.
[0129] When the reference pixel block set is constructed, similar image patches are searched for not only within a spatial search window (e.g., 31×31) in the current frame $I_t$ but also within the spatiotemporal volume formed by the adjacent frames $I_{t-k}$ to $I_{t+k}$, where $t$ denotes the current frame, $t-k$ the k-th frame before it, $t+k$ the k-th frame after it, and k is a positive integer.
[0130] Considering that the lens of the image acquisition unit 2 may translate rapidly, this invention introduces local block motion compensation (block matching) between frames: using brightness consistency, the cross-frame motion offset vector $\mathbf{v}$ is calculated. The motion offset vector $\mathbf{v}$ describes the inter-frame tissue displacement caused by lens movement and is used to locate the set of image patches at the same local position in adjacent frames.
[0131] Using the principle of brightness consistency, a reference pixel block belonging to the same position as pixel block P(x, y, t) is searched, including: calculating the motion offset vector across frames based on the principle of brightness consistency. Based on the motion offset vector, multiple reference pixel blocks corresponding to the same local location are located in the current frame and adjacent frames; where:
[0132] $$\mathbf{v}^{*} = \arg\min_{\mathbf{v}} \sum_{(a,b) \in P} \left| I_t(x+a,\, y+b) - I_{t-1}(x+a+v_x,\, y+b+v_y) \right|$$;
[0133] In the formula, $t$ is the time index of the current frame in the image sequence; $I_t$ is the current frame being processed in the image sequence; $I_{t-1}$ is the previous adjacent frame of the current frame; $P$ is the current image patch; $(a, b)$ is the local coordinate offset of a pixel inside the patch relative to the current pixel point Q(x, y); $\mathbf{v} = (v_x, v_y)$ is the motion offset vector, where $v_x$ is the offset within a row of pixels of the image and $v_y$ is the offset within a column of pixels of the image.
[0134] The above formula tracks the physical motion trajectory of the same location, such as the same piece of digestive tract mucosa, between adjacent video frames; that is, it finds the optimal spatial displacement between the image block in the current frame and the image block in the previous (or a historical) frame. The formula employs block-matching motion estimation: it iterates through all possible displacement vectors $\mathbf{v}$ within the search window and calculates the sum of absolute differences between the image patch in the current frame $I_t$ and the displaced image patch in the reference frame $I_{t-1}$. When the sum of the pixel brightness differences between the two image blocks is minimized (i.e., $\arg\min$), the textures of the two image patches are the best match, and the corresponding displacement vector $\mathbf{v}^{*}$ is the actual motion offset of the mucosal tissue between the two frames.
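The block-matching search described above can be sketched as an exhaustive sum-of-absolute-differences (SAD) search on single-channel brightness frames. The function name `estimate_motion`, the patch half-width, and the search radius are assumptions made for this example; the test frames are synthetic.

```python
import numpy as np

def estimate_motion(cur, prev, x, y, half_patch=3, search_radius=4):
    """Return (vx, vy) minimising the SAD between the patch centred at (x, y)
    in `cur` and the patch centred at (x + vx, y + vy) in `prev`.

    The caller must ensure the patch around (x, y) lies inside `cur`.
    """
    h, w = prev.shape
    p = half_patch
    ref = cur[y - p:y + p + 1, x - p:x + p + 1]
    best_sad, best_v = np.inf, (0, 0)
    for vy in range(-search_radius, search_radius + 1):
        for vx in range(-search_radius, search_radius + 1):
            yy, xx = y + vy, x + vx
            # Skip displacements whose candidate patch would leave the frame.
            if yy - p < 0 or xx - p < 0 or yy + p + 1 > h or xx + p + 1 > w:
                continue
            cand = prev[yy - p:yy + p + 1, xx - p:xx + p + 1]
            sad = np.abs(ref - cand).sum()
            if sad < best_sad:
                best_sad, best_v = sad, (vx, vy)
    return best_v

rng = np.random.default_rng(0)
prev_frame = rng.random((32, 32))
# Synthetic motion: content moves 2 rows down and 1 column left between frames.
cur_frame = np.roll(prev_frame, shift=(2, -1), axis=(0, 1))
print(estimate_motion(cur_frame, prev_frame, x=16, y=16))  # (1, -2)
```

An exhaustive search costs one SAD evaluation per candidate displacement; a hardware implementation would typically restrict the search radius or use a hierarchical search, which this sketch does not attempt.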
[0135] Steps S105 and S106 above implement time-based collaborative denoising, performing joint distance-weighted calculation and temporal aggregation on the extracted set of 3D spatiotemporal reference image patches at the same location to reduce background noise amplified by higher-order functions. Wherein:
[0136] Step S105 above calculates the distance metric and the spatiotemporal fusion weights: the patch-weighted distance is calculated in the high-dimensional feature space, and weights are assigned using the time decay constant. After a set of reference patches has been obtained by matching within the spatiotemporal volume, a fusion weight is assigned to each reference patch. For any reference image patch $P_i$ found within the spatiotemporal search volume, the similarity distance $d_i$ between $P_i$ and the current image patch, which serves as the benchmark, is calculated. To reduce interference from high-frequency noise, the distance metric uses not only the Euclidean distance but also a weighted distance in the feature space. A filter weight $w_i$ is then assigned according to the distance $d_i$.
[0137] In this step, the image attenuation weight of each image block is determined based on the similarity distance between each reference image block and the current image block, including:
[0138] 1) Calculate the Euclidean distance between the reference image block and the current image block based on the coordinate information of the reference image block and the coordinate information of the current image block.
[0139] 2) Determine the similarity distance between the reference image patch and the current image patch based on the Euclidean distance and the size of the spatial search window.
[0140] Optionally, the similarity distance $d_i$ can be calculated using the following formula:
[0141] $$d_i^2 = \frac{1}{N^2} \sum_{(a,b)} \left\| \varphi(P_i)(a,b) - \varphi(P_0)(a,b) \right\|_2^2$$;
[0142] In the formula, $P_i$ is the i-th reference image patch, where i is a positive integer; $P_0$ is the current image patch; the sum accumulates the squared Euclidean distances between corresponding pixels of the two image patches in the high-dimensional feature space; $N$ is the side length of the window (each image patch compared is $N \times N$ pixels); a and b are the local coordinate offsets of a pixel inside the patch relative to the current pixel.
[0143] The above formula quantifies the similarity in microscopic anatomy between two image patches (the benchmark patch and a candidate reference patch). The smaller the distance $d_i$, the more similar the two patches are. This is not a simple comparison of the RGB colors of two pixels: for two image patches of size $N \times N$, the squared Euclidean distances between the high-dimensional feature vectors of all corresponding pixels are summed, and dividing by $N^2$ averages over the patch area. Since the calculation is performed in the high-dimensional feature space after nonlinear expansion, this distance measure can effectively resist the interference of noise amplified by high-order polynomials and accurately reflect the structural similarity of high-frequency textures.
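The patch distance described above can be sketched as follows, assuming that both patches have already been expanded into P-dimensional features and are stored as arrays of shape (N, N, P). The function name `patch_distance_sq` is an assumption for this example, and it returns the area-averaged squared distance.

```python
import numpy as np

def patch_distance_sq(ref_patch, cur_patch):
    """Area-averaged squared Euclidean distance between two (N, N, P) patches.

    For each pixel position the squared distance between the two P-dimensional
    feature vectors is taken; the results are summed and divided by N * N.
    """
    ref_patch = np.asarray(ref_patch, dtype=np.float64)
    cur_patch = np.asarray(cur_patch, dtype=np.float64)
    n = cur_patch.shape[0]
    per_pixel = np.sum((ref_patch - cur_patch) ** 2, axis=-1)   # shape (N, N)
    return per_pixel.sum() / (n * n)

current = np.zeros((7, 7, 9))
reference = np.ones((7, 7, 9))
print(patch_distance_sq(reference, current))  # 9.0
print(patch_distance_sq(current, current))    # 0.0
```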
[0144] 3) Determine the attenuation weight of each image block based on the similarity distance, the preset filtering parameters for controlling the smoothness of the spatial structure, and the preset attenuation constant in the time dimension.
[0145] Optionally, the attenuation weight, which combines spatial structural similarity and temporal correlation, can be calculated using the following formula:
[0146] $$w_i = \exp\left(-\frac{d_i^2}{h^2}\right) \cdot \exp\left(-\frac{|t_i - t|}{\tau}\right)$$;
[0147] In the formula, $h$ is the filtering parameter used to control the smoothness of the spatial structure; $\tau$ is the decay constant in the time dimension; $t_i$ is the time of the frame containing the i-th reference block, and $t$ is the time of the current frame.
[0148] When intense motion reduces the inter-frame registration confidence, the weights $w_i$ of the cross-frame reference blocks shrink rapidly, and the algorithm of this invention adaptively degenerates into spatial patch filtering to avoid motion blur; when the lens is stationary (e.g., during magnified observation), the weights $w_i$ remain large, and the algorithm makes full use of multi-frame temporal redundancy to smooth out nonlinear noise.
[0149] The above formula determines the weight of each reference block in the final result when multiple reference image blocks are fused and denoised. The weight is the product of exponential decay functions in the spatial and temporal dimensions.
[0150] Spatial term: uses the similarity distance $d_i$ calculated above. If the structures of the reference block and the benchmark block differ significantly, the distance $d_i$ is much larger than the smoothing control parameter $h$, and the exp function outputs a weight close to 0, thus excluding dissimilar noise blocks.
[0151] Time term: measures the time span between the time $t_i$ of the reference block and the current frame (time $t$). The larger the time interval $|t_i - t|$, the more the weight decays exponentially, with the decay rate controlled by the time constant $\tau$. This time-weighted allocation mechanism ensures that only the image patches that are most similar in structure and closest in time contribute significantly to the final fusion, thereby preserving sharp edges while denoising and preventing motion blur.
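The product of a spatial and a temporal exponential decay can be sketched as below. The exact functional form is an assumption here (a squared patch distance divided by the square of the smoothing parameter, and an absolute frame gap divided by the time constant), as are the function name `attenuation_weight` and the default parameter values.

```python
import numpy as np

def attenuation_weight(d_sq, t_ref, t_cur, h=0.1, tau=2.0):
    """Weight of one reference patch: spatial decay times temporal decay.

    d_sq  : area-averaged squared feature distance to the current patch
    t_ref : frame index of the reference patch
    t_cur : frame index of the current frame
    h     : smoothing parameter (assumed value)
    tau   : temporal decay constant in frames (assumed value)
    """
    spatial = np.exp(-d_sq / h ** 2)
    temporal = np.exp(-abs(t_ref - t_cur) / tau)
    return spatial * temporal

print(attenuation_weight(0.0, t_ref=5, t_cur=5))     # 1.0 for an identical patch in the same frame
print(attenuation_weight(0.0, t_ref=3, t_cur=5))     # exp(-1): same structure, two frames away
print(attenuation_weight(1.0, t_ref=5, t_cur=5))     # ~0: dissimilar structure is rejected
```

With this form a poorly registered block from a neighboring frame has a large distance and therefore almost no influence, which matches the described fallback to purely spatial filtering under strong motion.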
[0152] The above step S106 achieves temporal collaborative aggregation noise reduction, performing weighted averaging or recursive Kalman filtering on the matched multi-frame 3D spatiotemporal reference patch set to reduce non-Gaussian tail noise that is randomly amplified by higher-order functions, and accurately preserves continuous blood vessel and texture structures.
[0153] In this step, based on the attenuation weights of each image block, a fusion weighted calculation is performed on the current image block and each reference image block to obtain the RGB reflectance prediction data of the pixel, including:
[0154] Based on the center pixel and attenuation weight of the current pixel block and the center pixel and attenuation weight of each reference image block, a weighted fusion calculation is performed on the current image block and each reference image block to obtain the estimated light reflectance data of the pixel. The estimated RGB light reflectance data $\hat{\varphi}(x, y, t)$ is calculated using the following formula:
[0155] $$\hat{\varphi}(x, y, t) = \frac{\sum_{i=0}^{K-1} w_i\, \varphi_i^{c}}{\sum_{i=0}^{K-1} w_i}$$;
[0156] In the formula, K is the total number of pixel blocks; $\varphi_i^{c}$ is the feature vector at the center pixel of the i-th image patch $P_i$; $\hat{\varphi}(x, y, t)$ is the estimated light reflectance data. The pixel blocks include the current pixel block ($i = 0$) and all reference pixel blocks; the weight $w_0$ of the current pixel block defaults to 1, and the weight $w_i$ of each reference pixel block is calculated as above.
[0157] The above formula outputs the clean, high-dimensional feature vector of the current center pixel after random noise has been suppressed. The calculation is a non-local weighted averaging process: the algorithm takes the K matched blocks, multiplies the feature vector of each block's center pixel by the corresponding weight $w_i$, sums the results, and divides by the sum of all weights for normalization. Physically, the actual edge structure of the image is highly correlated in space and time (high weight), whereas the thermal noise and shot noise generated by the image sensor are randomly distributed (low weight); the random noise therefore largely cancels out in the weighted average of multiple independent samples, while the true structural information is preserved and enhanced.
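The normalized weighted average of center-pixel feature vectors can be sketched as follows. The function name `fuse_center_pixels` and the array layout (one row per matched block, current block first with weight 1) are assumptions for this example; the noisy samples are synthetic.

```python
import numpy as np

def fuse_center_pixels(centers, weights):
    """Weighted average of K center-pixel feature vectors.

    centers : (K, P) feature vectors at the center pixel of each block
    weights : (K,) attenuation weights; the current block is assumed to have weight 1
    """
    centers = np.asarray(centers, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)
    return (weights[:, None] * centers).sum(axis=0) / weights.sum()

rng = np.random.default_rng(1)
true_feature = np.full(9, 0.5)
# 25 matched blocks observing the same structure with independent noise.
noisy_centers = true_feature + 0.05 * rng.standard_normal((25, 9))
fused = fuse_center_pixels(noisy_centers, np.ones(25))
print(np.abs(noisy_centers[0] - true_feature).mean())  # error of a single sample
print(np.abs(fused - true_feature).mean())             # smaller error after fusion
```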
[0158] Temporal collaborative aggregation ultimately yields the inferred RGB reflectance value of the current center pixel, calculated as the weighted average of the center pixels of all matched reference patches, where the number of patches is the total number of matched reference patches that meet the similarity requirement. The irrelevant high-frequency noise amplified by higher-order operations is thus reduced by averaging multiple temporally independent samples, while highly correlated texture structures are preserved and reconstructed because of their high patch-matching weights, so that low-noise, high-contrast spectral information can be extracted. The RGB reflectance inference data obtained in this step corresponds to the broadband image; the subsequent calculation with the calibration matrix yields the specified narrowband image data.
[0159] In the spatiotemporal joint filtering process, the feature metrics used to measure the similarity of image blocks in steps S105 and S106 are not based on low-order RGB colors, but directly on the Euclidean or Mahalanobis distance of the multidimensional feature space vector generated after high-order fitting expansion; and sub-pixel level motion compensation alignment is performed by calculating inter-frame block motion estimation, thereby obtaining a higher quality image.
[0160] Steps S107 and S108 above achieve narrowband target spectrum output and color synthesis. The denoised, stable high-dimensional feature vector is multiplied by the offline pre-calibrated regression matrix to obtain the image data of the specified channels, such as high-precision narrowband blue (415 nm) and green (540 nm) channel image data. The image data of these two channels are assigned to the R, G, and B channels of the display and transmitted to the medical monitor, which outputs a high-contrast, low-noise electronic chromoendoscopy image presenting enhanced vascular contrast.
[0161] In this step, based on the predetermined calibration matrix M between the broadband image and the narrowband image within the specified spectral range, the narrowband image signal data within the specified spectral range corresponding to the RGB light reflectance estimation data is calculated using the following formula, giving the narrowband light image signal data y(x, y, t) for each pixel:
[0162] $$y(x, y, t) = M\, \hat{\varphi}(x, y, t)$$;
[0163] In the formula, $\hat{\varphi}(x, y, t)$ is the calculated light reflectance prediction data, which in endoscopic scenarios can represent the narrowband light reflectance of the tissue surface after filtering. M is a nonlinear mapping regression matrix obtained by optimizing over several sets of measured narrowband light reflectance data and the corresponding white light reflectance data, using a preset calibration objective function and the relative error least squares method. There may be one or more specified spectral ranges, each corresponding to a calibration matrix M. Applying the calibration matrix M to the light reflectance prediction data yields the narrowband light image signal data; taking the blue (415 nm) and green (540 nm) channels as an example, dual-channel vector data is obtained. Then y(x, y, t) is mapped to the digital signal of the display, presenting an endoscopic electronic staining image with both high vascular contrast and low noise.
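Applying the calibration matrix to every pixel can be sketched as a single matrix product over the feature axis. The function name `apply_calibration` is an assumption, as is stacking the per-band calibration rows into one matrix of shape (C, P), where C is the number of target bands; the matrix values below are placeholders, not calibrated data.

```python
import numpy as np

def apply_calibration(phi_hat, M):
    """Map denoised features to narrowband signals.

    phi_hat : (H, W, P) denoised feature image
    M       : (C, P) calibration matrix, one row per target band (assumed layout)
    returns : (H, W, C) narrowband image data
    """
    return np.einsum('cp,hwp->hwc', np.asarray(M, dtype=np.float64),
                     np.asarray(phi_hat, dtype=np.float64))

phi_hat = np.ones((2, 2, 3))
M = np.array([[1.0, 0.0, 0.0],    # placeholder row for the 415 nm band
              [0.0, 2.0, 0.0]])   # placeholder row for the 540 nm band
narrowband = apply_calibration(phi_hat, M)
print(narrowband.shape)   # (2, 2, 2)
print(narrowband[0, 0])   # [1. 2.]
```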
[0164] The denoised low-noise texture features (the estimated reflectance values obtained above) are then mapped to the final target narrowband spectrum using the calibration matrix. The calibration matrix M can be determined in advance by machine learning, using measured real narrowband light data as an offline training set; it is a nonlinear mapping regression matrix, for example a pre-calibrated high-dimensional nonlinear mapping regression matrix $M$ (its dimension is determined by P and the number of target channels). To suppress parameter anomalies caused by overfitting, the relative error least squares method is used to optimize the regression matrix with a defined objective function. For example, the narrowband image calibration matrix M is obtained by optimizing the following objective function:
[0165] $$M = \arg\min_{M} \sum_{j} \left\| \frac{y_j - M\, \varphi(z_j)}{y_j} \right\|_2^2 + \lambda \left\| M \right\|_F^2$$;
[0166] In the formula, $y_j$ is the calibration value of the true absolute reflectance of narrowband light for the j-th sample, measured with a spectrometer (the division is element-wise); $\lambda$ is the suppression coefficient; $\varphi(z_j)$ is the P-dimensional (high-dimensional) image signal of the corresponding pixels under white light illumination. Solving this optimization problem yields an accurate high-order nonlinear mapping regression matrix $M$. The calibration data can be obtained by illuminating the object with white light and with narrowband light respectively and acquiring images under each illumination, so that the images acquired under narrowband light serve as the targets for the images acquired under white light.
[0167] The above formula is used to solve for the accurate nonlinear mapping matrix M during the laboratory calibration stage, enabling ordinary RGB signals to be accurately converted into the target narrowband spectral image. The objective function employs the Relative Error Least Squares (RELS) method combined with Tikhonov regularization. The formula includes a relative error term and a regularization term.
[0168] Relative error term: in the first part of the formula, the fitting residual is divided by the true spectral reflectance, so the model optimizes a "percentage error" rather than an "absolute error". During endoscopic imaging, the light reflection from structures such as diseased microvessels is typically low (small absolute value); if absolute error were used, the features of dark vessels would be masked by errors in the bright mucosa. Introducing relative error lets the algorithm assign higher penalty weights to fitting the details of dark vessels, thereby improving the vessel contrast in the final image.
[0169] Regularization term: the second part of the formula is the sum of squares of the elements of the matrix M (i.e., the squared Frobenius norm), scaled by λ. Higher-order polynomials can easily cause the model to overfit, making matrix parameters abnormally large and thus amplifying noise abnormally. Introducing the suppression coefficient λ to penalize excessively large matrix elements limits abnormal oscillations in the solution and ensures that the resulting matrix is robust.
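One way to solve a relative-error least-squares problem with Tikhonov regularization is in closed form, one output band at a time: dividing each feature row by the measured reflectance turns the relative residual into an ordinary ridge regression whose target is a vector of ones. The sketch below makes that assumption explicit; the function name `fit_calibration_matrix`, the array layout, and the synthetic data are illustrative only and do not reproduce the calibration procedure of the invention.

```python
import numpy as np

def fit_calibration_matrix(phi, y, lam=1e-3):
    """Relative-error ridge regression, solved per output band.

    phi : (n, P) white-light feature vectors of n calibration samples
    y   : (n, C) measured narrowband reflectances (must be nonzero)
    lam : suppression (regularization) coefficient
    Minimises sum_j sum_c ((y_jc - m_c . phi_j) / y_jc)^2 + lam * ||M||_F^2
    and returns M with shape (C, P).
    """
    phi = np.asarray(phi, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    n, p = phi.shape
    rows = []
    for c in range(y.shape[1]):
        a = phi / y[:, c:c + 1]              # each sample scaled by 1 / measured reflectance
        lhs = a.T @ a + lam * np.eye(p)      # normal equations with Tikhonov term
        rhs = a.T @ np.ones(n)               # relative target is 1 for every sample
        rows.append(np.linalg.solve(lhs, rhs))
    return np.vstack(rows)

rng = np.random.default_rng(0)
phi = rng.uniform(0.1, 1.0, size=(50, 4))                  # synthetic features
m_true = np.array([[0.5, 0.2, 0.1, 0.3], [0.1, 0.6, 0.2, 0.4]])
y = phi @ m_true.T                                         # synthetic noise-free targets
m_est = fit_calibration_matrix(phi, y, lam=1e-10)
print(np.allclose(m_est, m_true, atol=1e-5))  # True
```

Because the Frobenius penalty is a sum over rows, the rows of M decouple and each band can be fitted independently; a larger `lam` shrinks the coefficients at the cost of a larger relative fitting error.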
[0170] The calibration is performed using data obtained from white light illumination and corresponding narrowband light illumination. When obtaining the calibration matrix, each narrowband light range can correspond to a calibration matrix M. For example, if we need to obtain the corresponding data for two narrowband light ranges, namely 415 nm blue light and 540 nm green light, we can obtain the calibration matrices corresponding to the two narrowband light ranges respectively. Of course, the narrowband light range is not limited to one or two, but can be multiple.
[0171] Step S108 above: Based on the narrowband light image signal data of each pixel in the current frame, generate a narrowband image of the specified spectral range of the current frame, and obtain an electronic staining image based on the narrowband image signal data of the specified spectral range.
[0172] After obtaining narrowband image data of a specified spectral range for each pixel, the narrowband image data of the entire current frame can be obtained. Based on the obtained narrowband image data of one or more spectral ranges, an electronically stained image can be synthesized. In other words, an electronically stained image is generated by placing the narrowband image data corresponding to multiple narrowband ranges into the RGB display channels.
[0173] For example, placing the narrowband image data of 415 nm blue light and the narrowband image data of 540 nm green light into the RGB display channels produces the corresponding electronically stained image. Figure 7 shows an original white light image, and Figure 8 shows the process of obtaining an electronically stained image by applying the electronic staining method provided by this invention to the original white light image. As shown in Figure 8, the white light R, G, and B signals undergo nonlinear spectral resolution to obtain narrowband blue and green light images, respectively. The narrowband blue and green images are then combined to output an electronically stained image, i.e., the nonlinear spectral electronic staining image obtained by the present invention shown in Figure 8 (Figure 9 is a magnified view of this image). Figure 8 also shows a linear spectral electronic staining image obtained using a traditional method (Figure 10 is a magnified view of this image). As can be seen from Figure 8, compared with the linear spectral electronic staining image obtained by the traditional method, the nonlinear spectral electronic staining image obtained by the method of this invention has clearer textures and more pronounced contrast in the textures of blood vessels and mucosal surfaces, which makes lesions easier to observe and supports accurate diagnosis.
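The channel placement described above can be sketched as follows. The text does not state which narrowband plane goes to which display channel, so the mapping used here (540 nm to R, 415 nm to both G and B, as is common in narrowband imaging displays) is an assumption, and the function name `compose_stained_image` is hypothetical.

```python
import numpy as np


def compose_stained_image(nb415, nb540):
    """Stack two narrowband planes of shape (H, W) into an (H, W, 3) RGB image.

    Assumed display mapping: 540 nm -> R, 415 nm -> G and B.
    Inputs are expected in [0, 1]; out-of-range values are clipped.
    """
    nb415 = np.clip(np.asarray(nb415, dtype=float), 0.0, 1.0)
    nb540 = np.clip(np.asarray(nb540, dtype=float), 0.0, 1.0)
    if nb415.shape != nb540.shape:
        raise ValueError("narrowband planes must have the same shape")
    return np.stack([nb540, nb415, nb415], axis=-1)
```

A different mapping or per-channel gain can be substituted without changing the rest of the pipeline, since the display step only consumes the reconstructed narrowband planes.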
[0174] This invention also provides a computer storage medium storing computer-executable instructions, which, when executed by a processor, implement the above-described electronic staining method for endoscopic images.
[0175] This invention also provides an image processing host 1, including: a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, it implements the above-described electronic staining method for endoscopic images.
[0176] The core inventive concept of this invention is to overcome the limitation of traditional spectral mapping, which computes each spatial pixel in isolation, by establishing an electronic staining calculation method based on high-order nonlinear spectral solution and multidimensional joint optimization over spatiotemporal blocks. To obtain high-contrast, high-fidelity narrowband blue and green light images of the mucosal surface, higher-order function fitting must be introduced, in contrast to the simple linear operation matrix commonly used by other manufacturers. Higher-order function fitting, however, amplifies the background noise of the image sensor. To address this, this invention assumes that spectral information is relatively stable within a local spatiotemporal neighborhood and exploits the redundancy of spectral information across spatial image patches and temporally consecutive frames to reduce noise during the spectral solution process. During dynamic endoscopic imaging, although the pixels change from frame to frame as the lens moves, the anatomical structure of the digestive tract mucosa remains continuous and highly correlated across adjacent frames. Conversely, the electronic noise generated by the sensor's analog gain and amplified by the higher-order polynomial is random and uncorrelated over time. Therefore, this invention uses spatial image patches as the basic analysis unit, employs a motion matching algorithm to find the trajectory over time of patches at the same anatomical location, and then performs weighted filtering within this three-dimensional (3D) spatiotemporal joint domain spanning multiple frames. This joint spatiotemporal patch filtering mechanism uses both the temporal and spatial dimensions to suppress random noise to a low level, while leveraging patch structural similarity to preserve, or even sharpen, the spatial high-frequency details of real microvessels. Based on the above inventive concept, this invention specifically aims to solve the following three key technical problems:
[0177] 1. Overcoming the technical bottleneck of insufficient contrast in electronic staining: Existing electronic staining techniques such as FICE or i-Scan use low-order linear mapping matrices and therefore cannot accurately simulate the nonlinear absorption characteristics of tissue. This invention overcomes that problem, eliminating metamerism and achieving high-precision nonlinear virtual fitting of the characteristic absorption wavelengths of hemoglobin (415 nm blue light and 540 nm green light), thereby significantly improving the capillary contrast of flat and small lesions.
[0178] 2. Solving the problem of nonlinear bursts of sensor noise caused by high-order functions: This addresses the challenge that the introduction of high-dimensional features such as square terms and cross-product terms into the algorithm amplifies the weak dark current noise and shot noise in the original broadband RGB signal, thereby obscuring the details of clinical diagnosis.
[0179] 3. Addressing the issue of traditional denoising algorithms obscuring microvascular details: Simple two-dimensional spatial-domain smoothing filters inevitably blur or erase the microvascular and microglandular structures of lesions when removing high-intensity nonlinear noise. By constructing a three-dimensional spatiotemporal patch collaborative model, this invention can still output high-resolution narrowband spectral images with no motion blur and sharp vascular textures, even under low signal-to-noise ratio and high-order computation.
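The interplay of the three problems above can be made concrete with a minimal Python sketch of the expansion and the spatiotemporal patch fusion for a single pixel. Everything specific in it is an assumption made for illustration: the 9-channel order in `expand9` (R, G, B, their squares, and the cross products RG, BG, RB), the weight form exp(-d / h² - |Δt| / τ), the parameter names `half`, `search`, `h`, and `tau`, and the use of an exhaustive sum-of-squared-differences search in place of the brightness-consistency motion estimation described in this invention.

```python
import numpy as np


def expand9(rgb):
    """(..., 3) -> (..., 9): R, G, B, R^2, G^2, B^2, RG, BG, RB (assumed order)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.stack([r, g, b, r * r, g * g, b * b, r * g, b * g, r * b], axis=-1)


def fuse_pixel(frames, t, y, x, k=1, half=1, search=2, h=0.1, tau=2.0):
    """Denoise the feature vector at (y, x) of frame t.

    frames: (T, H, W, C) expanded video. In every frame within k of t, the
    best-matching patch inside a (2*search+1)^2 window is found by exhaustive
    search; its centre pixel is averaged in with weight
    exp(-d / h**2 - |dt| / tau), where d is the mean squared patch difference.
    """
    T, H, W, C = frames.shape
    if not (half <= y < H - half and half <= x < W - half):
        raise ValueError("pixel too close to the border for this patch size")

    def patch(f, cy, cx):
        return frames[f, cy - half:cy + half + 1, cx - half:cx + half + 1]

    ref = patch(t, y, x)
    num = np.zeros(C)
    den = 0.0
    for f in range(max(0, t - k), min(T, t + k + 1)):
        best_d, best_c = None, None
        for cy in range(max(half, y - search), min(H - half, y + search + 1)):
            for cx in range(max(half, x - search), min(W - half, x + search + 1)):
                d = np.mean((patch(f, cy, cx) - ref) ** 2)
                if best_d is None or d < best_d:
                    best_d, best_c = d, frames[f, cy, cx]
        # Similar patches and temporally close frames contribute more.
        w = np.exp(-best_d / h ** 2 - abs(f - t) / tau)
        num += w * best_c
        den += w
    return num / den
```

In this sketch, a patch that has moved between frames is still found as long as its displacement stays inside the search window, so structure is averaged along its motion trajectory while uncorrelated noise is averaged down. The fused vector would then be multiplied by the calibration matrix to obtain the narrowband value.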
[0180] The electronic staining method, apparatus, and system for endoscopic images provided by this invention have the following specific technical advantages:
[0181] 1. Improved the detection contrast of blood vessel and mucosal surface texture.
[0182] The nonlinear high-order spectral interpretation method of this invention solves the uncertainty in spectral estimation caused by the assumption of a linear combination of reflectance spectra in FICE technology. The algorithm of this invention can accurately fit and reconstruct the nonlinear physical characteristic of hemoglobin's exponential absorption in the 415 nm and 540 nm bands, thereby improving the contrast of mucosal microvessels and surface textures, visually approaching the imaging level of hardware NBI without increasing hardware costs such as filters or narrowband LEDs.
[0183] 2. The problems of nonlinear calculation and high-frequency noise amplification have been solved.
[0184] In this invention, the accuracy of spectral reconstruction is improved through high-order polynomial regression, and the image problems caused by noise amplification are resolved. This invention avoids the problem of "single-frame spatial domain noise reduction" in traditional endoscopic systems, creatively utilizing the "temporal redundancy" and "self-similarity of spatial patch structures" of the video stream. Microstructure patches at the same location exhibit a deterministic and coherent structure in time, while noise is an independently distributed random variable. Through cross-frame block matching, the system accurately suppresses noise amplified by high-order functions, making high-fidelity narrowband imaging under low signal-to-noise ratio conditions a reality.
[0185] 3. It avoids the "overly smoothed" blurring of tiny lesions.
[0186] Traditional denoising algorithms, such as Gaussian and bilateral filtering, blur fine capillaries along with high-frequency noise, erasing the most important diagnostic features of early cancerous areas and degrading image accuracy. The spatiotemporal joint filtering of this invention relies on "patch distance weights" and incorporates a motion-adaptive time decay constant. As a result, the fusion process preserves the microstructure of the mucosal surface and actually sharpens the vessel wall edges, providing high-quality images for clinical endoscopists to perform real-time, precise optical observation.
[0187] 4. Low hardware dependency, making it easy to implement on various platforms.
[0188] This invention eliminates the need for complex light source mechanical rotors and multi-wavelength LED control matrix hardware, enabling even low-cost basic white light gastrointestinal endoscopes, or ultra-miniature capsule endoscopes that cannot accommodate complex optical components, to instantly acquire virtual narrowband staining observation capabilities simply by applying this algorithm in the backend. This cost advantage gives the product competitiveness and promising market commercialization prospects.
[0189] While the above description focuses on calculating virtual NBI staining with absorption peaks at 415 nm and 540 nm, this framework of high-order nonlinear spectral interpretation combined with spatiotemporal patch denoising is not limited to these bands and can be extended to other medical observation wavelengths. By replacing the spectral calibration data of the offline training set, virtual images for bands that penetrate deeper into tissue, or virtual images exhibiting autofluorescence imaging (AFI) characteristics, can be calculated instead.
[0190] In some alternative embodiments, when performing high-dimensional expansion on image data, in addition to the methods described above, the nonlinear mapping mechanism can be replaced with a deep neural network (DNN) architecture.
[0191] Instead of performing the nonlinear fitting with a polynomial regression matrix that has a clearly defined mathematical form and is calculated by relative-error least squares, the high-dimensional image expansion process can be replaced entirely by a data-driven deep convolutional neural network (DNN/CNN) or a local multilayer perceptron (MLP). For example, a local pixel-level nonlinear network can perform the nonlinear spectral mapping from white light RGB to narrowband light such as 415 nm and 540 nm. In that case, the subsequent steps of this invention, namely patch-based video-level spatiotemporal joint filtering with pixel block extraction, fusion weight calculation, aggregation, and noise reduction, remain indispensable as the overall strategy for smoothing out the resulting nonlinear artifacts, so this alternative embodiment is equally applicable in this invention.
[0192] In some alternative embodiments, the spatial search dimension of the spatiotemporal noise reduction filtering algorithm can be changed when implementing electronic staining of images.
[0193] If the endoscope host, i.e., the image processing host 1, has limited computing power, full-size three-dimensional block matching may lead to high chip heat generation or power consumption. In this case, the present invention can fall back to a simplified substitute: the traversal search for patches in the spatial plane is omitted or simplified, only the one-dimensional evolution of a single pixel is tracked along the time axis based on optical-flow motion compensation, and scalar Kalman filtering with variance estimation is applied in the time dimension. This simplification sacrifices the recovery capability provided by the self-similarity of spatial block structures, but it can still effectively suppress, in the time domain, much of the random noise amplified by the high-order operations, achieving a compromise between cost, power consumption, and imaging quality.
[0194] In some alternative embodiments, in addition to the processing flow shown in Figure 1, the execution order of the electronic staining processing steps can also be rearranged and adjusted.
[0195] In the preferred embodiment shown in Figure 1, the technical path is to first perform high-order fitting on the broadband image to expand it into a high-dimensional image, and then perform spatiotemporal patch denoising on the amplified target band. As an alternative, this processing order can be varied, for example, but not limited to, as follows: in the broadband white light RGB domain, optical flow registration and three-dimensional spatiotemporal block similarity aggregation are first used to obtain a low-noise reference RGB video stream reconstructed with full use of temporal information; this data stream is then input into the high-order nonlinear reconstruction matrix containing square and cross terms. Since the original sensor noise within the reference stream has already been suppressed to near zero, the "noise amplification effect" is largely avoided during high-order fitting. This reversed order also realizes the core concept of this invention: "blocking the correlation between nonlinear spectral fitting and noise amplification."
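The simplified temporal-only variant can be sketched as a per-pixel scalar Kalman filter. This is a minimal illustration under stated assumptions: the frames are taken to be already motion-compensated (the optical-flow step is not shown), the state model is a random walk, and the function name `temporal_kalman` and the parameters `process_var` and `meas_var` are hypothetical.

```python
import numpy as np


def temporal_kalman(frames, process_var=1e-4, meas_var=1e-2):
    """Filter an aligned sequence along time with one scalar Kalman filter per pixel.

    frames: (T, ...) array of motion-compensated frames.
    Returns an array of the same shape holding the filtered estimates.
    """
    frames = np.asarray(frames, dtype=float)
    est = frames[0].copy()
    var = np.full(frames[0].shape, meas_var)  # variance of the first estimate
    out = np.empty_like(frames)
    out[0] = est
    for t in range(1, len(frames)):
        var = var + process_var               # predict: random-walk state model
        gain = var / (var + meas_var)         # Kalman gain
        est = est + gain * (frames[t] - est)  # update with the new measurement
        var = (1.0 - gain) * var              # posterior variance
        out[t] = est
    return out
```

With `process_var` set to zero this reduces to a running mean over time; a larger `process_var` lets the estimate follow real changes in the scene more quickly at the cost of weaker noise suppression.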
[0196] Unless otherwise specifically stated, terms such as processing, calculation, operation, determination, display, etc., may refer to the actions and / or processes of one or more processing or computing systems or similar devices that represent the manipulation and conversion of data representing physical (e.g., electronic) quantities within the registers or memory of the processing system into other data similarly representing physical quantities within the memory, registers, or other such information storage, transmission, or display devices of the processing system. Information and signals can be represented using any of a variety of different techniques and methods. For example, data, instructions, commands, information, signals, bits, symbols, and chips mentioned throughout the above description can be represented by voltage, current, electromagnetic waves, magnetic fields or particles, light fields or particles, or any combination thereof.
[0197] It should be understood that the specific order or hierarchy of steps in the disclosed process is an example of an exemplary method. Based on design preferences, it should be understood that the specific order or hierarchy of steps in the process may be rearranged without departing from the scope of this disclosure. The appended method claims provide elements of various steps in an exemplary order and are not intended to limit the scope to the specific order or hierarchy described.
[0198] In the detailed description above, various features are combined together in a single embodiment to simplify this disclosure. This approach to disclosure should not be construed as reflecting an intention that embodiments of the claimed subject matter require more features than are explicitly stated in each claim. Rather, as reflected in the appended claims, the inventive subject matter lies in fewer than all of the features of a single disclosed embodiment. Therefore, the appended claims are hereby explicitly incorporated into the detailed description, with each claim representing a separate preferred embodiment of the invention.
[0199] Those skilled in the art will also understand that the various illustrative logic blocks, modules, circuits, and algorithm steps described in conjunction with the embodiments herein can be implemented as electronic hardware, computer software, or a combination thereof. To clearly illustrate the interchangeability between hardware and software, the various illustrative components, blocks, modules, circuits, and steps described above are generally described in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the specific application and the design constraints imposed on the overall system. Those skilled in the art can implement the described functionality in alternative ways for each specific application; however, such implementation decisions should not be construed as departing from the scope of this disclosure.
[0200] The steps of the methods or algorithms described in conjunction with the embodiments herein can be directly embodied in hardware, software modules executed by a processor, or a combination thereof. The software modules can reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disks, removable disks, CD-ROMs, or any other form of storage medium well known in the art. An exemplary storage medium is connected to the processor, enabling the processor to read information from and write information to the storage medium. Of course, the storage medium can also be a component of the processor. The processor and storage medium can reside in an ASIC. The ASIC can reside in a user terminal. Alternatively, the processor and storage medium can exist as discrete components in the user terminal.
[0201] For software implementation, the techniques described in this application can be implemented using modules (e.g., procedures, functions, etc.) that perform the functions described in this application. This software code can be stored in memory units and executed by a processor. The memory units can be implemented within the processor or outside the processor; in the latter case, they are communicatively coupled to the processor via various means, as is well known in the art.
[0202] The foregoing description includes examples of one or more embodiments. It is of course impossible to describe every conceivable combination of components or methods for the purpose of describing the above embodiments, but those skilled in the art will recognize that further combinations and arrangements of the various embodiments are possible. Therefore, the embodiments described herein are intended to cover all such changes, modifications, and variations that fall within the scope of the appended claims. Furthermore, the term "comprising" as used in the specification or claims is interpreted in a manner similar to the term "including," as interpreted when used as a transitional word in the claims. Additionally, any use of the term "or" in the specification or the claims is intended to mean a "non-exclusive or."
Claims
1. A method for electronic staining of endoscopic images, characterized in that the method comprises: acquiring the original broadband RGB image captured by the image acquisition component; expanding the original broadband RGB image into a P-dimensional broadband image based on a preset nonlinear feature expansion operator, where P is a positive integer greater than 3; for each pixel in the current frame of the P-dimensional broadband image: based on the current image block centered on the pixel, searching the current frame and the K frames before and after it for reference image blocks belonging to the same position, to obtain a set of reference image blocks; determining the attenuation weight of each image block based on the similarity distance between each reference image block and the current image block; performing a fusion weighted calculation on the current image block and each reference image block according to the attenuation weight of each image block, to obtain the RGB light reflectance prediction data of the pixel; calculating, based on a predetermined calibration matrix between the broadband image and the narrowband image of a specified spectral range, the narrowband image signal data of the specified spectral range corresponding to the RGB light reflectance prediction data, to obtain the narrowband light image signal data of each pixel; and generating a narrowband image of the specified spectral range of the current frame based on the narrowband light image signal data of each pixel in the current frame, and obtaining an electronic staining image based on the narrowband image signal data of the specified spectral range.
2. The method as described in claim 1, characterized in that expanding the original broadband RGB image into a P-dimensional broadband image based on the preset nonlinear feature expansion operator comprises: expanding the three-channel RGB data of the original broadband RGB image, using the preset nonlinear feature expansion operator, into 9-channel broadband image data: [formula omitted in source]; where R, G, and B represent the signal strengths of the red, green, and blue channels of the image, respectively, and RG, BG, and RB represent the signal strengths of the red-green, blue-green, and red-blue combined channels of the image, respectively.
3. The method as described in claim 1, characterized in that searching the current frame and the K frames before and after it for reference image blocks belonging to the same position, based on the current image block centered on the pixel, to obtain a set of reference image blocks, comprises: obtaining, in the current frame, the pixel block P(x,y,t) corresponding to the pixel point Q(x,y); and, within the current frame, several preceding adjacent frames, and several following adjacent frames, using a preset spatial search window and the principle of brightness consistency to search for reference pixel blocks at the same position as pixel block P(x,y,t), and adding the reference pixel blocks found to the reference pixel block set.
4. The method as described in claim 3, characterized in that searching for a reference pixel block at the same position as pixel block P(x,y,t) using the principle of brightness consistency comprises: calculating the motion offset vector across frames based on the principle of brightness consistency; and locating, based on the motion offset vector, multiple reference pixel blocks corresponding to the same local location in the current frame and adjacent frames; where: [formula omitted in source]; in the formula, the symbols (omitted in source) denote, in order: the time index of the current frame; the current frame being processed; a neighboring frame of the current frame; the current image patch; the local coordinate offset of a pixel inside the image patch relative to the current pixel point; the motion offset vector; the offset of a pixel within a row of pixels in the image; and the offset of a pixel within a column of pixels in the image.
5. The method as described in claim 1, characterized in that determining the attenuation weight of each image block based on the similarity distance between each reference image block and the current image block comprises: calculating the Euclidean distance between the reference image block and the current image block based on the coordinate information of the reference image block and the coordinate information of the current image block; determining the similarity distance between the reference image block and the current image block based on the Euclidean distance and the size of the spatial search window; and determining the attenuation weight of each image block based on the similarity distance, the preset filtering parameter for controlling the smoothness of the spatial structure, and the preset attenuation constant in the time dimension.
6. The method as described in claim 5, characterized in that the similarity distance is calculated using the following formula: [formula omitted in source]; in the formula, the symbols (omitted in source) denote, in order: the i-th reference image patch; the current image patch; the sum of the squared Euclidean distances between corresponding pixels within the image patches in the high-dimensional feature space; and the side length of the spatial search window; a and b are the local coordinate offsets of a pixel inside the image patch relative to the current pixel; the attenuation weight is calculated using the following formula: [formula omitted in source]; in the formula, the symbols (omitted in source) denote, in order: the filtering parameter used to control the smoothness of the spatial structure; and the decay constant in the time dimension.
7. The method as described in claim 1, characterized in that performing a fusion weighted calculation on the current image block and each reference image block according to the attenuation weight of each image block, to obtain the RGB light reflectance prediction data of the pixel, comprises: performing the fusion weighted calculation on the current image block and each reference image block based on the center pixel of the current image block and its corresponding attenuation weight, and the center pixel of each reference image block and its corresponding attenuation weight, to obtain the RGB light reflectance prediction data of the pixel; the RGB light reflectance prediction data is calculated using the following formula: [formula omitted in source]; in the formula, K is the total number of pixel blocks, and the remaining symbol (omitted in source) denotes the center pixel of the i-th reference image patch.
8. The method as described in claim 1, characterized in that, based on the predetermined calibration matrix M between the broadband image and the narrowband image of a specified spectral range, the narrowband image signal data of the specified spectral range corresponding to the RGB light reflectance prediction data is calculated from the RGB light reflectance prediction data using the following formula, to obtain the narrowband light image signal data y(x, y, t) of each pixel: [formula omitted in source]; in the formula, M is a nonlinear mapping regression matrix obtained by optimization, using the relative-error least squares method and a preset calibration objective function, over a number of sets of measured narrowband light reflectance data and corresponding white light reflectance data; the specified spectral range includes one or more specified spectral ranges, each of which corresponds to one calibration matrix M.
9. The method as described in claim 8, characterized in that the narrowband image calibration matrix M is obtained by optimizing the following objective function: [formula omitted in source]; in the formula, the symbols (omitted in source) denote, in order: the calibration value of the true absolute reflectance of narrowband light obtained by actual measurement with a spectrometer; the suppression coefficient; and the P-dimensional image signal of the corresponding pixel under white light illumination.
10. The method according to any one of claims 1-9, characterized in that the method further comprises: preprocessing the original broadband RGB image, wherein the preprocessing includes at least one of dark current compensation, white balance correction, and demosaic interpolation.
11. An electronic staining device for endoscopic images, characterized in that the device comprises: an input module, used to acquire the original broadband RGB image captured by the image acquisition component; an expansion module, used to expand the original broadband RGB image into a P-dimensional broadband image based on a preset nonlinear feature expansion operator, where P is a positive integer greater than 3; a matching module, used, for each pixel in the current frame of the P-dimensional broadband image, to search the current frame and the K frames before and after it for reference image blocks belonging to the same position, based on the current image block centered on the pixel, to obtain a set of reference image blocks; a weight determination module, used to determine the attenuation weight of each image block based on the similarity distance between each reference image block and the current image block; a fusion module, used to perform a fusion weighted calculation on the current image block and each reference image block according to the attenuation weight of each image block, to obtain the RGB light reflectance prediction data of the pixel; a calculation module, used to calculate, based on a predetermined calibration matrix between the broadband image and the narrowband image of a specified spectral range, the narrowband image signal data of the specified spectral range corresponding to the RGB light reflectance prediction data, to obtain the narrowband light image signal data of each pixel; and an output module, used to generate a narrowband image of the specified spectral range of the current frame based on the narrowband light image signal data of each pixel in the current frame, and to obtain an electronic staining image based on the narrowband image signal data of the specified spectral range.
12. An electronic staining system for endoscopic images, characterized in that the system comprises: an image acquisition component, an image processing host, and an endoscope display; the image acquisition component is used to acquire the original broadband RGB image of the target area; the image processing host is equipped with the electronic staining device for endoscopic images as described in claim 11; and the endoscope display is used to display at least one of the original broadband RGB image acquired by the image acquisition component and the electronically stained image processed by the image processing host.
13. A computer storage medium, characterized in that the computer storage medium stores computer-executable instructions which, when executed by a processor, implement the electronic staining method for endoscopic images according to any one of claims 1-10.