Three-dimensional reconstruction method and system based on single-channel medical endoscope

By using a single-channel medical endoscope for 3D reconstruction, high-resolution 3D images are generated using deep learning networks and camera calibration technology. This solves the problem of insufficient image clarity in existing 3D endoscopes and achieves high-pixel and high-stereoscopic 3D imaging effects.

WO2026123511A1 | PCT designated stage | Publication Date: 2026-06-18 | EAGLESCOPE MEDICAL TECH CO LTD


Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: EAGLESCOPE MEDICAL TECH CO LTD
Filing Date: 2025-04-08
Publication Date: 2026-06-18

IPC: G06T17/00

AI Tagging

Application Domain

3D modelling

AI Technical Summary

⚠Technical Problem

Existing 3D endoscope camera systems use a dual-path imaging scheme, resulting in small endoscope mirror and sensor sections. The sensors transmit images with low pixel counts, leading to poor 3D imaging clarity.

⚗Method used

A 3D reconstruction method using a single-channel medical endoscope is adopted. By acquiring and preprocessing single-channel image information, deep learning networks and camera calibration technology are used to generate planar projections from different perspectives. These projections are then fused to generate point cloud maps and render depth images, thereby improving the pixel count and stereoscopic effect of the 3D image.

🎯Benefits of technology

By combining hardware computing power with software algorithms, the method raises the pixel count of the 3D images, which improves their definition (up to ultra-high definition) and enhances their stereoscopic effect.

✦ Generated by Eureka AI based on patent content.

Smart Images

Figure CN2025087635_18062026_PF_FP_ABST

Patent Text Reader

Abstract

The present disclosure relates to a three-dimensional reconstruction method and system based on a single-channel medical endoscope. The method comprises: acquiring single-channel image information and pre-processing the single-channel image information; performing three-dimensional reconstruction on the pre-processed single-channel image information, and acquiring a reconstructed fused image; and matching the reconstructed fused image with a corresponding output format, and outputting same. The three-dimensional reconstruction method based on a single-channel medical endoscope as provided in the present invention enables the three-dimensional reconstruction of a single-channel image and the generation of a corresponding fused image by means of the cooperation of hardware computing power and a software algorithm, thereby enhancing pixels of a three-dimensional picture, and further improving the stereoscopic effect and ultra-high definition of the three-dimensional picture.

Description

A three-dimensional reconstruction method and system based on a single-channel medical endoscope

Technical Field

[0001] This disclosure relates to the technical field of medical image processing, and more specifically, to a three-dimensional reconstruction method and system based on a single-channel medical endoscope.

Background Technology

[0002] A 3D endoscopic camera system is a medical device used in surgery that combines endoscopic technology and 3D imaging technology. Through a special camera and display technology, it converts traditional two-dimensional images into stereoscopic three-dimensional images, helping surgeons obtain a clearer and more detailed view during surgery.

[0003] Most existing 3D endoscope camera systems use a dual-channel imaging scheme, employing dual sensors with a certain spacing to capture images and present a 3D effect.

[0004] In existing technologies, the mirror and sensor parts at the front end of the endoscope are relatively small, resulting in low pixel counts in the images transmitted by the sensors, which ultimately leads to poor clarity in the 3D imaging.

Summary of the Invention

[0005] One objective of this disclosure is to provide a new technical solution for a three-dimensional reconstruction method and system based on a single-channel medical endoscope.

[0006] According to a first aspect of this disclosure, a three-dimensional reconstruction method based on a single-channel medical endoscope is provided, the method comprising:

[0007] Acquire single-channel image information and perform preprocessing;

[0008] The preprocessed single-channel image information is reconstructed in three dimensions to obtain the reconstructed and fused image;

[0009] The reconstructed and fused image is matched to the corresponding output format and output.

[0010] Optionally, single-channel image information is acquired and preprocessed, including:

[0011] Acquire single-channel image information and the corresponding single-channel image format;

[0012] The single-channel image format is matched with the standard image formats in the preset standard image library, which stores different standard image formats used for 3D reconstruction.

[0013] If the single-channel image format does not match the standard image format, the single-channel image format is converted until it matches the standard image format.

[0014] The single-channel image information whose format matches the standard image format is set as the preprocessed single-channel image information.

[0015] Optionally, the preprocessed single-channel image information is reconstructed in 3D to obtain a reconstructed fused image, including:

[0016] Based on the preprocessed single-channel image information, a pre-defined algorithm is used to generate planar projections from different perspectives.

[0017] The planar projections from the different perspectives are fused to generate a point cloud map;

[0018] Render a planar projection from a preset viewpoint based on the point cloud map and generate a corresponding depth image;

[0019] The depth image is set as the reconstructed fused image.

[0020] Optionally, planar projections from different perspectives can be generated based on preprocessed single-channel image information using a preset algorithm, including:

[0021] The preprocessed single-channel image information is input into a preset deep learning network to obtain planar projections from different perspectives;

[0022] The error between the plane projection predicted by the deep learning network and the actual plane projection is calculated using a loss function.

[0023] The weights of the deep learning network are adjusted using an optimization algorithm until the error between the plane projection predicted by the deep learning network and the actual plane projection is within a preset error range.

[0024] Optionally, the planar projections from the different viewpoints are fused to generate a point cloud map, including:

[0025] Obtain planar projections from different perspectives and a preset 3D point cloud;

[0026] Camera intrinsic and extrinsic parameters are obtained through camera calibration.

[0027] The planar projections from the different viewpoints are mapped to a 3D point cloud based on the corresponding depth information;

[0028] The planar projections from the different perspectives are fused to generate a point cloud map.

[0029] Optionally, rendering a planar projection of a preset viewpoint based on the point cloud map and generating a corresponding depth image includes:

[0030] Projecting the point cloud map onto a planar projection based on camera intrinsic and extrinsic parameters;

[0031] Calculate depth information and populate the depth map;

[0032] The pixel depth is optimized based on the nearest point strategy, and the corresponding depth image is generated.

[0033] According to a second aspect of this disclosure, a three-dimensional reconstruction system based on a single-channel medical endoscope is also provided, the system comprising:

[0034] The image receiving module is used to acquire single-channel image information and perform preprocessing.

[0035] The 3D reconstruction module is used to reconstruct the preprocessed single-channel image information into a 3D image and obtain the reconstructed fused image.

[0036] The image output module is used to match the reconstructed and fused image with the corresponding output format and output it.

[0037] According to a third aspect of this disclosure, an electronic device is also provided, including a memory and a processor, the memory being used to store a computer program; the processor being used to execute the computer program to implement the method according to a first aspect of this disclosure.

[0038] According to a fourth aspect of this disclosure, a computer-readable storage medium is also provided, on which a computer program is stored, which, when executed by a processor, implements the method described according to a first aspect of this disclosure.

[0039] According to a fifth aspect of this disclosure, a computer program product is also provided, including a computer program that, when executed by a processor, implements the method described according to a first aspect of this disclosure.

[0040] One beneficial effect of this disclosure is that the three-dimensional reconstruction method based on a single-channel medical endoscope provided by the present invention can reconstruct a single-channel image in three dimensions and generate a corresponding fused image through the combination of hardware computing power and software algorithms, thereby improving the pixel count of the three-dimensional image and thus enhancing the stereoscopic effect and ultra-high definition of the three-dimensional image.

[0041] Other features and advantages of the embodiments of this disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.

[0042] Brief Description of the Drawings

[0043] The accompanying drawings, which are incorporated in and form a part of this specification, illustrate embodiments of the present disclosure and, together with their description, serve to explain the principles of the embodiments of the present disclosure.

[0044] Figure 1 is a schematic diagram of the composition structure of a system that can be applied according to one embodiment;

[0045] Figure 2 is a flowchart illustrating a three-dimensional reconstruction method based on a single-channel medical endoscope according to one embodiment;

[0046] Figure 3 is a flowchart illustrating a three-dimensional reconstruction method based on a single-channel medical endoscope according to another embodiment;

[0047] Figure 4 is a flowchart illustrating the three-dimensional reconstruction of an image according to one embodiment;

[0048] Figure 5 is a block diagram of a three-dimensional reconstruction device based on a single-channel medical endoscope according to one embodiment;

[0049] Figure 6 is a schematic diagram of the hardware structure of an electronic device according to one embodiment.

[0050] Detailed Implementation

[0051] Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that, unless otherwise specifically stated, the relative arrangement, numerical expressions, and values of the components and steps set forth in these embodiments do not limit the scope of the invention.

[0052] The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the invention or its application or use.

[0053] Techniques, methods, and equipment known to those skilled in the art may not be discussed in detail, but where appropriate, such techniques, methods, and equipment should be considered part of the specification.

[0054] In all the examples shown and discussed herein, any specific values should be interpreted as merely exemplary and not as limitations. Therefore, other examples of exemplary embodiments may have different values.

[0055] It should be noted that similar labels and letters in the following figures indicate similar items; therefore, once an item is defined in one figure, it does not need to be discussed further in subsequent figures.

[0056] <System Implementation>

[0057] Figure 1 is a schematic diagram of the composition of a 3D reconstruction system based on a single-channel medical endoscope, which can be applied according to an embodiment of the 3D reconstruction method based on a single-channel medical endoscope. As shown in Figure 1, the 3D reconstruction system based on a single-channel medical endoscope may include an image receiving module, a computing module, and an image output module. Endoscope application scenarios include: general surgery, orthopedics, gynecology, otolaryngology, urology, gastroenterology, neurosurgery, etc.

[0058] In the embodiments of this disclosure, the memory is used to store a computer program that controls the processor to operate according to a three-dimensional reconstruction method based on a single-channel medical endoscope according to any embodiment. Those skilled in the art can design the computer program based on the schemes of the embodiments of this disclosure. How the computer program controls the processor to operate is well known in the art and will not be described in detail here.

[0059] <Method Implementation>

[0060] Figure 2 is a flowchart illustrating a three-dimensional reconstruction method based on a single-channel medical endoscope according to one embodiment.

[0061] As shown in Figures 2 and 3, the three-dimensional reconstruction method based on a single-channel medical endoscope in this embodiment may include the following steps S210 to S230:

[0062] S210: Acquire single-channel image information and perform preprocessing.

[0063] Perspective transformation is a method of converting one two-dimensional image into another two-dimensional image according to a specific viewpoint, camera position, and projection plane. The basic principle of this process is based on a perspective projection model, using a homography matrix for transformation, specifically including:

[0064] First, select corresponding points: four corner points or other feature points are chosen on the original image, manually or automatically, and each is paired with its position in the new viewpoint. Second, calculate the homography matrix from these known correspondences using the least squares method or the direct linear transformation (DLT) method; this matrix describes the transformation from the original viewpoint to the target viewpoint. Third, apply the perspective transformation: the original image is transformed with the obtained homography matrix to obtain the new image.
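The homography step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of this disclosure: it uses NumPy, solves the DLT system by singular value decomposition, and the function names `estimate_homography` and `apply_homography` are hypothetical.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate the 3x3 homography H that maps src points to dst points (DLT).

    src, dst: arrays of shape (N, 2) holding N >= 4 corresponding points.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows)
    # The least-squares solution is the right singular vector
    # belonging to the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    # Fix the scale; assumes H[2, 2] is not zero for the chosen points.
    return H / H[2, 2]

def apply_homography(H, pts):
    """Map (N, 2) points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

With exactly four non-degenerate correspondences the estimated matrix reproduces the target points; with more than four it gives a least-squares fit.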

[0065] S220: Perform 3D reconstruction on the preprocessed single-channel image information and obtain the reconstructed fused image.

[0066] A single two-dimensional image cannot provide enough spatial information. Depth estimation techniques can be used to infer the depth information corresponding to each pixel in the image. Through depth estimation, the three-dimensional structure of the scene can be inferred from the two-dimensional image, and then the projection from different viewpoints can be calculated.

[0067] S230: Match the reconstructed and fused image to the corresponding output format and output it.

[0068] Different 3D display devices have different interfaces, so the fused image needs to be converted into an image format that matches the corresponding interface for each 3D display device in order to output the fused image to the display device.

[0069] In the above embodiments, the three-dimensional reconstruction method based on a single-channel medical endoscope provided by the present invention can reconstruct a single-channel image in three dimensions and generate a corresponding fused image through the combination of hardware computing power and software algorithms, thereby improving the pixel count of the three-dimensional image and thus enhancing the stereoscopic effect and ultra-high definition of the three-dimensional image.

[0070] In one embodiment, the preprocessing step specifically includes:

[0071] Acquire single-channel image information and the corresponding single-channel image format; match the single-channel image format with the standard image format in the preset standard image library; if the single-channel image format does not match the standard image format, convert the single-channel image format until it matches the standard image format; set the single-channel image information whose single-channel image format matches the standard image format as the preprocessed single-channel image information.

[0072] The standard image library stores different standard image formats used for 3D reconstruction. If a single-channel image format matches a standard image format, the single-channel image information that matches the standard image format is set as the preprocessed single-channel image information.
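The match-or-convert loop of the preprocessing step might look like the sketch below. Everything concrete in it is an assumption made for illustration: the source does not list the formats in the standard image library, so the format names `RGB8`, `BGR8`, and `GRAY8`, the `Frame` type, and the converter table are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical contents of the preset standard image library.
STANDARD_FORMATS = {"RGB8"}

# Hypothetical one-step converters: source format -> (target format, function).
CONVERTERS = {
    "BGR8": ("RGB8", lambda pixels: [p[::-1] for p in pixels]),
    "GRAY8": ("RGB8", lambda pixels: [(g, g, g) for g in pixels]),
}

@dataclass
class Frame:
    fmt: str
    pixels: list

def preprocess(frame, max_steps=8):
    """Convert the frame until its format matches a standard format."""
    steps = 0
    while frame.fmt not in STANDARD_FORMATS:
        if frame.fmt not in CONVERTERS or steps >= max_steps:
            raise ValueError(f"no conversion path from {frame.fmt}")
        target, convert = CONVERTERS[frame.fmt]
        frame = Frame(target, convert(frame.pixels))
        steps += 1
    # The frame now matches a standard format and counts as preprocessed.
    return frame
```

A frame that already matches is returned unchanged; a frame with no conversion path raises an error instead of looping indefinitely.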

[0073] In one embodiment, as shown in Figure 4, the 3D reconstruction of single-channel image information further includes:

[0074] Based on preprocessed single-channel image information, a pre-defined algorithm is used to generate planar projections from different perspectives; the planar projections from different perspectives are fused to generate a point cloud map; the planar projections from the pre-defined perspective are rendered based on the point cloud map and a corresponding depth image is generated; the depth image is set as the reconstructed fused image.

[0075] The goal of deep learning models is to infer the depth of each pixel from a 2D image. Convolutional neural networks (CNNs) are typically used for this task. Several common depth estimation models include:

[0076] Fully Convolutional Networks (FCNs): FCNs replace the fully connected layers in Convolutional Neural Networks (CNNs) with convolutional layers, making them suitable for pixel-level image prediction tasks. FCNs classify or regress each pixel in an image to predict its depth information.

[0077] U-Net: U-Net is a network based on an encoder-decoder structure. It extracts features through the encoder part and then upsamples the feature maps back to the original image size in the decoder part. In depth estimation, U-Net's decoder helps recover spatial information, making it very effective for fine-grained depth estimation.

[0078] DepthNet: This is a network specifically designed for depth estimation tasks. It is usually based on an encoder-decoder structure and incorporates skip connections to improve accuracy, especially in low-level details.

[0079] Self-supervised learning methods: These methods do not rely on real depth labels, but train the network through self-supervision (e.g., by computing reconstruction errors) to infer depth without depth labels.

[0080] In one embodiment, generating a 2D projection of a single 2D image from a predetermined viewpoint specifically includes:

[0081] The preprocessed single-channel image information is input into a preset deep learning network to obtain planar projections from different perspectives; the error between the planar projection predicted by the deep learning network and the real planar projection is calculated using a loss function; the weights of the deep learning network are adjusted using an optimization algorithm until the error between the planar projection predicted by the deep learning network and the real planar projection is within a preset error range.
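The training loop in this embodiment (predict, measure the error with a loss function, adjust the weights with an optimization algorithm, stop once the error is within a preset range) can be illustrated with a toy example. The stand-ins are assumptions, since the source names neither the network, the loss, nor the optimizer: a linear model replaces the deep learning network, mean squared error is the loss, plain gradient descent is the optimizer, and the `tol` and `lr` values are hypothetical.

```python
import numpy as np

def train_until_within_tolerance(x, y_true, tol=1e-4, lr=0.1, max_iters=10_000):
    """Fit y = x @ w by gradient descent; stop when the MSE loss is <= tol.

    x: (N, D) inputs, y_true: (N, M) targets (the "actual" projections).
    Returns the weights and the last computed loss.
    """
    rng = np.random.default_rng(0)
    w = rng.normal(size=(x.shape[1], y_true.shape[1]))
    loss = np.inf
    for _ in range(max_iters):
        y_pred = x @ w                      # prediction of the stand-in model
        err = y_pred - y_true
        loss = float(np.mean(err ** 2))     # loss function: mean squared error
        if loss <= tol:                     # error within the preset range
            break
        grad = 2.0 * x.T @ err / err.size   # gradient of the MSE with respect to w
        w -= lr * grad                      # optimizer step: gradient descent
    return w, loss
```

The `max_iters` cap is a design choice added here so that the loop terminates even if the preset error range is never reached.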

[0082] Generating a 2D projection from a single 2D image to a predetermined viewpoint typically relies on several core algorithms in computer vision and graphics, primarily including perspective transformation, camera calibration, and image transformation techniques. This process usually involves inferring the geometric information of the 3D scene from the original image and using this information to generate projections from different viewpoints.

[0083] It is worth mentioning that, in order to make the image transformation more accurate, the camera can be calibrated first to obtain the camera's intrinsic and extrinsic parameters. These parameters describe the camera's geometric characteristics (such as focal length, principal point position, distortion, etc.) and the relationship between the camera and the world coordinate system. The camera's intrinsic parameters describe the properties of the camera lens, such as focal length, principal point coordinates, pixel ratio, etc. The camera's extrinsic parameters describe the camera's position and orientation relative to the world coordinate system.

[0084] Camera calibration allows for a more precise connection between points in a 2D image and points in 3D space, enabling better viewpoint transformation.
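As a hedged illustration of how calibration links points in the image to points in space, the standard pinhole camera model can be written with an intrinsic matrix K and the extrinsic rotation R and translation T. This is an assumption: the source does not state which camera model is used, and lens distortion and skew are ignored here.

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= K \, [\, R \mid T \,]
\begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix},
\qquad
K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
```

Here (X_w, Y_w, Z_w) is a point in the world coordinate system, (u, v) is its pixel position, f_x and f_y are the focal lengths in pixels, (c_x, c_y) is the principal point, and s is the depth of the point in the camera frame.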

[0085] If a single 2D image cannot provide sufficient spatial information, depth estimation techniques can be used to infer the depth information corresponding to each pixel in the image. Through depth estimation, algorithms can attempt to infer the 3D structure of a scene from a 2D image, and then calculate the projection from different viewpoints. Common depth estimation methods include:

[0086] Monocular depth estimation: estimates the depth information of each pixel from a single 2D image using a deep learning model. Stereo vision: reconstructs depth using a binocular camera or images from multiple viewpoints.

[0087] Structured light scanning: captures depth information by projecting specific light patterns. Once the 3D structure has been obtained, it can be rotated, translated, and otherwise transformed, and then projected back onto the two-dimensional plane to obtain a new viewpoint image.

[0088] In one embodiment, fusing planar projections from different perspectives to generate a point cloud map specifically includes:

[0089] Acquire planar projections from different perspectives and preset 3D point clouds; obtain camera intrinsic and extrinsic parameters through camera calibration; map planar projections from different perspectives to 3D point clouds based on corresponding depth information; fuse planar projections from different perspectives to generate point cloud maps.

[0090] The process of fusing 2D projected images into 3D point cloud data to generate point cloud maps involves mapping the depth information or other attributes (such as color) of the two-dimensional image onto each point of the three-dimensional point cloud.

[0091] If a 2D image contains depth information (such as an RGBD image), the 2D image can be projected onto a 3D point cloud using the following steps:

[0092] Depth map generation: If the input image is an RGBD image, each pixel value in the depth map represents the depth of that pixel in the scene (distance from the camera). Pixel-to-3D point conversion: Based on the camera's intrinsic parameters, each pixel in the 2D image is projected into 3D space. For each pixel (u, v) with depth value d, the coordinates (X, Y, Z) of the corresponding 3D point can be obtained using the following formulas: $X = (u - c_x)\, d / f_x$; $Y = (v - c_y)\, d / f_y$; $Z = d$;

[0093] where $c_x$ and $c_y$ are the principal point coordinates of the camera (part of the camera's intrinsic parameters), $f_x$ and $f_y$ are the camera's focal lengths, and d is the depth value of the corresponding pixel in the depth map.
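The pixel-to-3D conversion can be sketched in NumPy as below, assuming the standard pinhole back-projection X = (u - c_x) d / f_x, Y = (v - c_y) d / f_y, Z = d. The function name `backproject_depth` and the convention that u indexes columns and v indexes rows are assumptions made for this example.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Convert an (H, W) depth map into an (H*W, 3) array of camera-frame points."""
    h, w = depth.shape
    # u is the column index and v is the row index of each pixel.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(float)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    # Points are listed row by row, matching the pixel order of the depth map.
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```

Because the output keeps the pixel order, the color of pixel (u, v) can later be attached to row `v * W + u` of the result.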

[0094] When converting between 3D point clouds and 2D images, it may be necessary to transform point cloud data from one coordinate system to another. For example, if the 2D image and 3D point cloud data come from different sensors (such as an RGB camera and a LiDAR), they need to be aligned to the same coordinate system using known camera extrinsic parameters (rotation and translation matrices).

[0095] Point cloud coordinate transformation: If a 2D image and a 3D point cloud are in different coordinate systems, a transformation matrix can be used to transform the point cloud from its original coordinate system to the same coordinate system as the 2D image. The transformation matrix typically includes a rotation matrix R and a translation vector T, and the transformation formula is as follows: $P' = R\,P + T$;

[0096] where P is a point in the original coordinate system, P' is the same point in the target coordinate system, R is the rotation matrix, and T is the translation vector.
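Applied to a whole point cloud, the rigid transformation with rotation R and translation T (assumed here to take the usual form P' = R P + T) reduces to one matrix product. A minimal sketch, with the hypothetical function name `transform_points`:

```python
import numpy as np

def transform_points(points, R, T):
    """Apply P' = R @ P + T to every row of an (N, 3) point array."""
    points = np.asarray(points, dtype=float)
    R = np.asarray(R, dtype=float)
    T = np.asarray(T, dtype=float)
    # Points are stored as rows, so multiplying by R.T rotates each of them by R.
    return points @ R.T + T
```

For example, a 90 degree rotation about the z-axis followed by a shift of one unit along x moves the point (1, 0, 0) to (1, 1, 0).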

[0097] It's worth noting that by aligning camera intrinsic and extrinsic parameters, performing depth projection, and color mapping, 2D images can be combined with 3D point cloud data to generate enhanced point cloud maps. The typical approach is as follows:

[0098] Assigning color to each point cloud point: If the 2D image contains RGB information, the RGB value of each pixel in the image can be mapped to the corresponding point in the 3D point cloud using the camera's intrinsic and extrinsic parameters; Fusion of depth information: If the depth map and point cloud data complement each other, the accuracy of the point cloud can be further enhanced or missing parts can be filled.
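The color assignment described above can be sketched as follows. The assumptions are that the points are already expressed in the camera frame (extrinsic parameters applied beforehand), that a distortion-free pinhole projection is used, and that the nearest pixel is sampled; the function name `colorize_points` is hypothetical.

```python
import numpy as np

def colorize_points(points_cam, image, fx, fy, cx, cy):
    """Look up an RGB value for each camera-frame point.

    Returns (colors, visible): colors is (N, 3); visible marks the points
    that lie in front of the camera and project inside the image.
    """
    pts = np.asarray(points_cam, dtype=float)
    h, w = image.shape[:2]
    colors = np.zeros((len(pts), 3), dtype=image.dtype)
    in_front = pts[:, 2] > 0
    z = np.where(in_front, pts[:, 2], 1.0)  # placeholder depth avoids division by zero
    u = np.round(fx * pts[:, 0] / z + cx).astype(int)
    v = np.round(fy * pts[:, 1] / z + cy).astype(int)
    visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    # Image rows are indexed by v and columns by u.
    colors[visible] = image[v[visible], u[visible]]
    return colors, visible
```

Points that fall outside the image or behind the camera keep a zero color and are flagged as not visible, so a caller can fill them from another view.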

[0099] After the fusion of 2D projection onto 3D point cloud data is completed, the point cloud may need to undergo some post-processing steps, such as:

[0100] Noise reduction: Point cloud data may be affected by noise. Filtering algorithms (such as statistical filtering, voxel grid filtering, etc.) can be applied to smooth and clean the point cloud. Point cloud simplification: If the point cloud data is too dense, algorithms such as the voxel grid method can be used to simplify the point cloud and reduce the number of points. Point cloud reconstruction: If sparse point clouds are generated during the fusion process, reconstruction algorithms (such as surface reconstruction algorithms, interpolation algorithms, etc.) can be used to generate a more complete 3D model.
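Of the post-processing steps listed above, voxel grid simplification is the easiest to show compactly. The sketch below is one common variant, assumed here for illustration: every occupied voxel is replaced by the centroid of the points inside it. The function name `voxel_downsample` is hypothetical.

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Replace all points that share a voxel with their centroid."""
    points = np.asarray(points, dtype=float)
    # Integer voxel index of each point along x, y, and z.
    keys = np.floor(points / voxel_size).astype(np.int64)
    _, inverse, counts = np.unique(
        keys, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)  # flatten for consistency across NumPy versions
    sums = np.zeros((counts.size, 3))
    np.add.at(sums, inverse, points)  # accumulate the points of each voxel
    return sums / counts[:, None]
```

A larger `voxel_size` gives a sparser cloud; the averaging also dampens small amounts of noise, although it is not a substitute for statistical outlier filtering.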

[0101] In one embodiment, rendering a planar projection from a preset viewpoint based on the point cloud map and generating a corresponding depth image specifically includes:

[0102] The point cloud image is projected onto a plane projection based on camera intrinsic and extrinsic parameters; depth information is calculated and the depth map is filled; pixel depth is optimized based on the nearest point strategy and the corresponding depth image is generated.

[0103] During the rendering process, multiple 3D points may be projected onto the same 2D pixel. In this case, occlusion needs to be handled, that is, points far away from the camera will not affect the depth value of nearby points. The commonly used strategy is the nearest point strategy, which selects the point closest to the camera as the depth of the pixel.

[0104] Closest point handling: For each pixel, if multiple 3D points project to that pixel location, select the closest point (the one with the smallest $z_c$ value) and use its $z_c$ value as the depth of the pixel.
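The nearest point strategy amounts to a z-buffer. The sketch below assumes points already expressed in the camera frame, a distortion-free pinhole projection, and rounding to the nearest pixel; the function name `render_depth` is hypothetical, and pixels that no point reaches are left at infinity.

```python
import numpy as np

def render_depth(points_cam, fx, fy, cx, cy, height, width):
    """Project camera-frame points to pixels, keeping the nearest depth per pixel."""
    pts = np.asarray(points_cam, dtype=float)
    pts = pts[pts[:, 2] > 0]  # discard points behind the camera
    u = np.round(fx * pts[:, 0] / pts[:, 2] + cx).astype(int)
    v = np.round(fy * pts[:, 1] / pts[:, 2] + cy).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    depth = np.full((height, width), np.inf)
    # np.minimum.at keeps the smallest z when several points hit the same pixel,
    # so farther points never overwrite nearer ones.
    np.minimum.at(depth, (v[inside], u[inside]), pts[inside, 2])
    return depth
```

Using the unbuffered `np.minimum.at` instead of plain fancy-index assignment matters here: with repeated pixel indices, ordinary assignment would keep whichever point happened to be written last.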

[0105] Mapping the depth information of each 3D point onto a 2D image yields a depth map. The depth map is a grayscale image whose grayscale values are proportional to the depth of each pixel. Depth map normalization: The grayscale values of the depth map usually need to be normalized to a specific range (e.g., 0 to 255) for easier display or subsequent processing. For the range 0 to 255, the normalization formula can be: $d' = 255 \times (d - \mathrm{mind}) / (\mathrm{maxd} - \mathrm{mind})$;

[0106] where d is the depth value, maxd is the maximum depth value, mind is the minimum depth value, and d' is the normalized depth value.
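A sketch of the normalization for the 0 to 255 case, assuming min-max scaling of the form d' = 255 (d - mind) / (maxd - mind). Treating pixels without a finite depth as 0 and returning an 8-bit image are choices made for this example, not details from the source; the function name `normalize_depth` is hypothetical.

```python
import numpy as np

def normalize_depth(depth, out_max=255.0):
    """Scale finite depth values linearly to the range [0, out_max] as uint8."""
    depth = np.asarray(depth, dtype=float)
    valid = np.isfinite(depth)
    out = np.zeros_like(depth)
    if valid.any():
        d_min = depth[valid].min()
        d_max = depth[valid].max()
        if d_max > d_min:  # a constant depth map would otherwise divide by zero
            out[valid] = (depth[valid] - d_min) / (d_max - d_min) * out_max
    return np.round(out).astype(np.uint8)
```

For instance, depths of 0, 2, and 10 map to gray levels 0, 51, and 255.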

[0107] <Equipment Example 1>

[0108] Figure 5 is a schematic block diagram of a three-dimensional reconstruction system based on a single-channel medical endoscope according to an embodiment. As shown in Figure 5, the three-dimensional reconstruction system 500 based on a single-channel medical endoscope may include: an image receiving module 510, a three-dimensional reconstruction module 520, and an image output module 530, specifically including:

[0109] The image receiving module is used to acquire single-channel image information and perform preprocessing.

[0110] The 3D reconstruction module is used to reconstruct the preprocessed single-channel image information into a 3D image and obtain the reconstructed fused image.

[0111] The image output module is used to match the reconstructed and fused image with the corresponding output format and output it.

[0112] In one embodiment, the image receiving module is further configured to: acquire single-channel image information and the corresponding single-channel image format; match the single-channel image format with a preset standard image library containing different standard image formats for 3D reconstruction; if the single-channel image format does not match the standard image format, convert the single-channel image format until it matches the standard image format; and set the single-channel image information whose single-channel image format matches the standard image format as preprocessed single-channel image information.

[0113] In one embodiment, the 3D reconstruction module is further configured to: generate planar projections from different perspectives based on preprocessed single-channel image information using a preset algorithm; fuse the planar projections from different perspectives to generate a point cloud map; render the planar projections from a preset perspective based on the point cloud map and generate a corresponding depth image; and set the depth image as the reconstructed fused image.

[0114] In one embodiment, the 3D reconstruction module is further configured to: input the preprocessed single-channel image information into a preset deep learning network and obtain planar projections from different viewpoints; calculate the error between the planar projection predicted by the deep learning network and the real planar projection using a loss function; and adjust the weights of the deep learning network using an optimization algorithm until the error between the planar projection predicted by the deep learning network and the real planar projection is within a preset error range.

[0115] In one embodiment, the 3D reconstruction module is further configured to: acquire planar projections from different perspectives and preset 3D point clouds; acquire camera intrinsic and extrinsic parameters through camera calibration; map the planar projections from different perspectives to the 3D point clouds based on corresponding depth information; and fuse the planar projections from different perspectives to generate a point cloud map.

[0116] In one embodiment, the 3D reconstruction module is further configured to: project the point cloud image onto a planar projection based on camera intrinsic and extrinsic parameters; calculate depth information and fill the depth map; optimize pixel depth based on the nearest point strategy and generate the corresponding depth image.

[0117] <Equipment Example 2>

[0118] Figure 6 is a schematic diagram of the hardware structure of an electronic device according to another embodiment.

[0119] As shown in Figure 6, the electronic device 600 includes a processor 610 and a memory 620. The memory 620 is used to store an executable computer program, and the processor 610 is used to execute the method as described in any of the above method embodiments under the control of the computer program.

[0120] The modules of the three-dimensional reconstruction system 500 based on a single-channel medical endoscope described above can be implemented by the processor 610 in this embodiment executing the computer program stored in the memory 620, or they can be implemented by other structures, which are not limited here.

[0121] This invention can be a system, method, and/or computer program product. A computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for causing a processor to implement various aspects of the invention.

[0122] Computer-readable storage media can be tangible devices capable of holding and storing instructions for use by an instruction execution device. Computer-readable storage media can be, for example, but are not limited to, electrical storage devices, magnetic storage devices, optical storage devices, electromagnetic storage devices, semiconductor storage devices, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. The computer-readable storage media used herein are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber optic cables), or electrical signals transmitted through wires.

[0123] The computer-readable program instructions described herein can be downloaded from computer-readable storage media to respective computing/processing devices, or downloaded via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network, to an external computer or external storage device. The network may include copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.

[0124] The computer program instructions used to perform the operations of this invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, etc., and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer via any type of network—including a local area network (LAN) or a wide area network (WAN)—or may be connected to an external computer (e.g., via the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized by utilizing state information from the computer-readable program instructions. This electronic circuitry can execute the computer-readable program instructions to implement various aspects of the invention.

[0125] Various aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It should be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

[0126] These computer-readable program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/actions specified in one or more blocks of the flowchart and/or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium and cause a computer, programmable data processing apparatus, and/or other device to operate in a particular manner; thus, the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing aspects of the functions/actions specified in one or more blocks of the flowchart and/or block diagram.

[0127] Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device implement the functions/actions specified in one or more blocks of the flowchart and/or block diagram.

[0128] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of instructions, which contains one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions marked in the blocks may occur in a different order than that marked in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions. It is well known to those skilled in the art that implementation in hardware, implementation in software, and implementation using a combination of software and hardware are equivalent.

[0129] The various embodiments of the present invention have been described above. These descriptions are exemplary and not exhaustive, and the invention is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies found in the market, or to enable others skilled in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.

Claims

1. A three-dimensional reconstruction method based on a single-channel medical endoscope, characterized in that the method includes: acquiring single-channel image information and performing preprocessing; performing three-dimensional reconstruction on the preprocessed single-channel image information to obtain a reconstructed fused image; and matching the reconstructed fused image to a corresponding output format and outputting it.

2. The method according to claim 1, characterized in that acquiring the single-channel image information and performing preprocessing includes: acquiring the single-channel image information and the corresponding single-channel image format; matching the single-channel image format against the standard image formats in a preset standard image library, the library storing different standard image formats used for 3D reconstruction; if the single-channel image format does not match a standard image format, converting the single-channel image format until it matches a standard image format; and setting the single-channel image information whose format matches a standard image format as the preprocessed single-channel image information.

3. The method according to claim 1, characterized in that performing three-dimensional reconstruction on the preprocessed single-channel image information to obtain the reconstructed fused image includes: generating planar projections from different perspectives based on the preprocessed single-channel image information using a preset algorithm; fusing the planar projections from the different perspectives to generate a point cloud map; rendering a planar projection from a preset perspective based on the point cloud map and generating a corresponding depth image; and setting the depth image as the reconstructed fused image.

4. The method according to claim 3, characterized in that generating planar projections from different perspectives based on the preprocessed single-channel image information using a preset algorithm includes: inputting the preprocessed single-channel image information into a preset deep learning network to obtain planar projections from different perspectives; calculating the error between the planar projection predicted by the deep learning network and the real planar projection using a loss function; and adjusting the weights of the deep learning network using an optimization algorithm until the error between the planar projection predicted by the deep learning network and the real planar projection is within a preset error range.

5. The method according to claim 3, characterized in that fusing the planar projections from the different perspectives to generate a point cloud map includes: acquiring the planar projections from the different perspectives and preset 3D point clouds; acquiring camera intrinsic and extrinsic parameters through camera calibration; mapping the planar projections from the different perspectives to the 3D point clouds based on the corresponding depth information; and fusing the planar projections from the different perspectives to generate the point cloud map.

6. The method according to claim 3, characterized in that rendering a planar projection from a preset perspective based on the point cloud map and generating a corresponding depth image includes: projecting the point cloud map onto a planar projection based on camera intrinsic and extrinsic parameters; calculating depth information and filling the depth map; and optimizing the pixel depth based on the nearest point strategy and generating the corresponding depth image.

7. A three-dimensional reconstruction system based on a single-channel medical endoscope, characterized in that the system includes: an image receiving module, used to acquire single-channel image information and perform preprocessing; a 3D reconstruction module, used to perform three-dimensional reconstruction on the preprocessed single-channel image information to obtain a reconstructed fused image; and an image output module, used to match the reconstructed fused image to a corresponding output format and output it.

8. An electronic device, characterized in that the electronic device includes a memory and a processor, the memory being used to store a computer program, and the processor being used to execute the computer program to implement the method according to any one of claims 1 to 6.

9. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the method according to any one of claims 1 to 6.

10. A computer program product, characterized in that it includes a computer program that, when executed by a processor, implements the method according to any one of claims 1 to 6.