Method for generating a forbidden zone type three-dimensional virtual fixture for surgical safety assurance
By updating the three-dimensional forbidden area virtual fixture in real time, and using three-dimensional affine transformation and optical flow information, combined with repulsive force and viscous resistance to constrain surgical instruments, the problem of surgical instruments entering the forbidden area is solved, thus improving surgical safety and efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHANGHAI CHANGZHENG HOSPITAL
- Filing Date
- 2022-07-04
- Publication Date
- 2026-06-19
AI Technical Summary
In existing technologies, the prohibited areas planned before surgery are difficult to update in real time during surgery, which increases the risk of surgical instruments entering the prohibited areas of the work area and affects the safety of the surgery.
By reading endoscopic images, two-dimensional key points of the prohibited area are obtained. Combined with three-dimensional affine transformation and optical flow information, the three-dimensional prohibited area is updated in real time, and a virtual force field containing repulsive force and viscous resistance is established to keep surgical instruments from entering the protected tissue.
It enables precise dynamic positioning and tracking of surgical instruments within the body environment, ensuring that surgical instruments avoid prohibited areas and improving surgical safety and efficiency.
Figure CN115330678B_ABST
Description
Technical Field
[0001] This invention relates to the field of virtual fixture technology, and specifically to a method, system, storage medium, and electronic device for generating prohibited area-type three-dimensional virtual fixtures for surgical safety assurance.
Background Technology
[0002] The emergence of autonomous surgical robots for remote surgery has helped address the uneven distribution of medical resources. Tracking three-dimensional key points makes it possible to guide the robot around protected prohibited areas in the work area, thereby improving the quality of remote surgical guidance and the autonomous operation capability of surgical robots.
[0003] Virtual fixtures are often used intraoperatively, in the form of force interaction, to control surgical instruments under non-periodic motion, for example for organ protection, target guidance, and obstacle avoidance constraints during surgery, thereby ensuring the safety and precision of the surgical process.
[0004] Because existing technologies cannot update pre-operatively planned prohibited areas in real time during surgery, surgeons often struggle to locate these areas during the procedure and cannot visually assess deviations in the surgical path, which significantly increases the difficulty of controlling the robot. This also directly affects virtual fixtures, particularly prohibited area-type virtual fixtures, which are then unable to prevent surgical instruments from entering prohibited areas of the work area.
Summary of the Invention
[0005] (I) Technical problems to be solved
[0006] To address the shortcomings of existing technologies, this invention provides a method, system, storage medium, and electronic device for generating prohibited area-type three-dimensional virtual fixtures for surgical safety assurance, solving the technical problem of being unable to prevent surgical instruments from entering prohibited areas of the work area.
[0007] (II) Technical Solution
[0008] To achieve the above objectives, the present invention provides the following technical solution:
[0009] A method for generating a prohibited area-type 3D virtual fixture for surgical safety assurance includes:
[0010] S1. Read the endoscope image, obtain the prohibited area on the current frame image according to the doctor's selection, and obtain all the first two-dimensional key points within the prohibited area;
[0011] S2. Track the first local region containing the first two-dimensional key point on the current frame image, obtain the second local region on the next frame image, and determine the initial two-dimensional key point of the first two-dimensional key point on the next frame image.
[0012] S3. Based on the mapping relationship between the endoscopic image and the point cloud, the first local region is mapped onto the first local point cloud and the second local region is mapped onto the second local point cloud respectively; and the first two-dimensional key point is determined as the first three-dimensional key point on the first local point cloud, and the second three-dimensional key point on the second local point cloud is obtained through coordinate transformation.
[0013] S4. Map the second 3D key points back to the second local region to obtain the second 2D key points on the next frame image; and combine them with the initial 2D key points to obtain the 2D coordinates of the tracked key points by minimizing the preset optimization function, and finally obtain the corresponding 3D coordinates; and summarize the 3D coordinates of each key point to obtain the 3D forbidden region.
[0014] S5. Establish a prohibited area type virtual fixture on the three-dimensional prohibited area. Through a force feedback mechanism, the prohibited area type virtual fixture keeps the surgical instruments from entering the protected tissue.
[0015] Preferably, S5 specifically includes:
[0016] S51. Convert the real-time three-dimensional position of the surgical instrument and the point cloud obtained in the previous steps into the same coordinate system, and obtain the point of the three-dimensional prohibited area nearest to the end of the surgical instrument, together with the relative distance between the two points;
[0017] S52. Establish an artificial obstacle avoidance vector field with respect to the end of the surgical instrument:
[0018] Define all points on the three-dimensional forbidden region as contained in a point set, and find within it a subset defined using an additional investigation radius;
[0019]
[0020] where the left-hand side is the artificial obstacle avoidance vector field; the artificial vector length function scales the vector between a point of the subset and the end of the surgical instrument; different artificial vector length functions are established for different tissues, subject to the condition given with the formula; and the count is the number of elements in the subset;
[0021] S53. Establish a virtual force field that includes repulsive force and viscous resistance:
[0022]
[0023] where the left-hand side is the virtual force field; the proportionality coefficient relates the repulsive force at that location to the artificial obstacle avoidance vector; the damping coefficient varies with the relative distance; and the velocity is that of the end of the surgical instrument.
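The structure of steps S51 to S53 can be illustrated with a minimal Python sketch. The formulas themselves are not reproduced in this text, so every specific below is an assumption made for illustration: the subset is taken as the region points within `d_min + radius` of the instrument tip, the vector length function is an inverse-distance falloff that vanishes beyond `d_safe`, the damping coefficient grows linearly as the tip approaches, and the names `virtual_force`, `k_rep`, `c_max`, and `d_safe` are hypothetical.

```python
import numpy as np

def virtual_force(tip, velocity, region_pts, radius=2.0,
                  k_rep=1.0, c_max=0.5, d_safe=10.0):
    """Repulsive force plus viscous resistance on the instrument tip (sketch)."""
    tip = np.asarray(tip, float)
    velocity = np.asarray(velocity, float)
    pts = np.asarray(region_pts, float)
    diffs = tip - pts                          # vectors from region points to the tip
    dists = np.linalg.norm(diffs, axis=1)
    d_min = dists.min()                        # S51: distance to the nearest region point
    mask = dists <= d_min + radius             # S52: subset inside the extra radius (assumed)
    sub_d = np.maximum(dists[mask], 1e-9)
    # Assumed vector length function: stronger when closer, zero beyond d_safe.
    length = np.clip(1.0 / sub_d - 1.0 / d_safe, 0.0, None)
    field = (diffs[mask] / sub_d[:, None] * length[:, None]).mean(axis=0)
    # S53: damping rises linearly from 0 at d_safe to c_max at contact (assumed).
    damping = c_max * np.clip(1.0 - d_min / d_safe, 0.0, 1.0)
    return k_rep * field - damping * velocity, d_min
```

With a flat patch of region points below the tip, the returned force points away from the patch, and moving the tip toward the patch adds a further opposing component from the viscous term.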
[0024] Preferably, S2 includes:
[0025] S21. First define the current frame image, together with the width and height of the endoscopic image, and define the first two-dimensional key point on that frame image and its coordinates;
[0026] Taking the first two-dimensional key point as the center of the first local region, the first local region is determined from the preset region shape and side length;
[0027] S22. Using an optical flow method, feature points are matched between the current frame image and the next frame image to obtain the center of the corresponding second local region on the next frame image:
[0028]
[0029] where the symbols denote the feature points on each image and the number of feature points;
[0030] S23. From this center, the second local region is determined by the preset region shape and side length;
[0031] S24. The optical flow method is applied again to directly determine the initial two-dimensional key point, on the next frame image, of the first two-dimensional key point.
[0032] Preferably, S3 includes:
[0033] S31. Perform depth estimation on the endoscopic image to obtain the corresponding depth image; reading row by row, obtain the spatial and color information of each pixel from the depth image and the endoscopic image respectively, to build the first local point cloud and the second local point cloud;
[0034] S32. Determine the first three-dimensional key point corresponding to the first two-dimensional key point on the first local point cloud:
[0035]
[0036] which expresses the mapping relationship from the endoscopic image to the point cloud;
[0037] S33. Use the optical flow method to obtain feature point pairs in the local regions; the paired points are related by a coordinate transformation:
[0038]
[0039] where the coefficients are the parameters of the fitted function; using least squares, they can be obtained from the following formula:
[0040]
[0041] The transformation matrix of the affine transformation between the paired points is then:
[0042]
[0043] where
[0044] S34. Apply the three-dimensional affine transformation to the first three-dimensional key point, in matrix form:
[0045] .
[0046] Then search the second local point cloud for the nearest point to obtain the initial position of the second three-dimensional key point.
[0047] Preferably, in step S31, depth estimation is performed on the endoscopic image to obtain a depth image corresponding to the endoscopic image. The binocular depth estimation network used is capable of fast overfitting and can continuously adapt to new scenes using self-supervised information. Specifically, this includes:
[0048] S311. Acquire binocular endoscope images and extract multi-scale features of the current frame image using the encoder network of the current binocular depth estimation network.
[0049] S312. Using the decoder network of the current binocular depth estimation network, multi-scale features are fused to obtain the disparity of each pixel in the current frame image.
[0050] S313. Based on the camera's intrinsic and extrinsic parameters, convert the disparity into depth and output it as the result of the current frame image;
[0051] S314. Without introducing external ground truth, update the parameters of the current binocular depth estimation network using self-supervised loss, for depth estimation of the next frame image.
[0052] Preferably, the optimization function in S4 refers to:
[0053]
[0054] where the left-hand side represents the optimization function;
[0055] one component is the cosine similarity of SIFT feature vectors:
[0056]
[0057] where the descriptor of the first two-dimensional key point is a feature vector, and the norm denotes the magnitude of a vector; the neighborhood points of the second two-dimensional key point are obtained by adding a coordinate offset to that key point's coordinates, and each neighborhood point has its own feature descriptor vector;
[0058] the other component represents the influence of optical flow information:
[0059]
[0060] where
[0061] .
[0062] Preferably, in step S4, the two-dimensional coordinates of the tracked key points are obtained by minimizing a preset optimization function, and finally the corresponding three-dimensional coordinates are obtained, including:
[0063] By traversing the candidate coordinate offsets, the offset values that satisfy the following expression are found:
[0064]
[0065] This yields the ideal offset. The two-dimensional coordinates of the tracked key point are then obtained, and the corresponding three-dimensional coordinates are obtained from the mapping relationship between the endoscopic image and the point cloud.
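The traversal search can be sketched as a brute-force loop over integer offsets around the second two-dimensional key point. The cost below is only an assumed stand-in for the optimization function, which is not reproduced in this text: it adds one minus the descriptor cosine similarity to a weighted distance from the optical-flow estimate (the initial two-dimensional key point). The callback `desc_at`, the weight `lam`, and the `window` size are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two descriptor vectors."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def search_offset(ref_desc, desc_at, second_kp, initial_kp, window=2, lam=0.1):
    """Exhaustively search offsets (du, dv) around the second 2D key point.

    desc_at(u, v) is a hypothetical callback returning the descriptor at a pixel.
    Assumed cost: (1 - cosine similarity) + lam * distance to the optical-flow estimate.
    """
    best, best_cost = (0, 0), np.inf
    for du in range(-window, window + 1):
        for dv in range(-window, window + 1):
            u, v = second_kp[0] + du, second_kp[1] + dv
            texture = 1.0 - cosine_similarity(ref_desc, desc_at(u, v))
            flow = np.hypot(u - initial_kp[0], v - initial_kp[1])
            cost = texture + lam * flow
            if cost < best_cost:
                best, best_cost = (du, dv), cost
    return (second_kp[0] + best[0], second_kp[1] + best[1]), best
```

The returned pixel would then be mapped to three-dimensional coordinates through the image-to-point-cloud mapping.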
[0066] A prohibited area-type three-dimensional virtual fixture generation system for surgical safety assurance includes:
[0067] The selection module is used to read endoscopic images, obtain the prohibited area on the current frame image according to the doctor's selection, and obtain all the first two-dimensional key points within the prohibited area;
[0068] The tracking module is used to track a first local region containing a first two-dimensional key point on the current frame image, obtain a second local region on the next frame image, and determine the initial two-dimensional key point of the first two-dimensional key point on the next frame image.
[0069] The mapping module is used to map a first local region onto a first local point cloud and a second local region onto a second local point cloud according to the mapping relationship between the endoscopic image and the point cloud; and to determine the first three-dimensional key points of the first two-dimensional key points on the first local point cloud, and to obtain the second three-dimensional key points on the second local point cloud through coordinate transformation.
[0070] The optimization module is used to map the second 3D key points back to the second local region, obtain the second 2D key points on the next frame image; and combine the initial 2D key points to obtain the 2D coordinates of the tracked key points by minimizing the preset optimization function, and finally obtain the corresponding 3D coordinates; and summarize the 3D coordinates of each key point to obtain the 3D forbidden region.
[0071] The constraint module is used to establish a prohibited area type virtual fixture on the three-dimensional prohibited area. Through a force feedback mechanism, the prohibited area type virtual fixture keeps surgical instruments from entering the protected tissue.
[0072] A storage medium storing a computer program for generating prohibited area-type three-dimensional virtual fixtures for surgical safety assurance, wherein the computer program causes a computer to execute the prohibited area-type three-dimensional virtual fixture generation method as described above.
[0073] An electronic device, comprising:
[0074] One or more processors;
[0075] Memory; and
[0076] One or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs including instructions for performing the prohibited area type three-dimensional virtual fixture generation method as described above.
[0077] (III) Beneficial Effects
[0078] This invention provides a method, system, storage medium, and electronic device for generating prohibited area-type three-dimensional virtual fixtures to ensure surgical safety. Compared with existing technologies, it has the following advantages:
[0079] In this invention, the selected key points are initially tracked through three-dimensional affine transformation, and precise dynamic positioning and tracking of the three-dimensional key points in the in vivo environment is achieved by combining texture and optical flow information. In addition, the three-dimensional coordinates of the key points are aggregated into a three-dimensional forbidden region, on which a virtual force field containing repulsive force and viscous resistance, namely a forbidden region type virtual fixture, is established; through a force feedback mechanism it precisely keeps the surgical instruments from entering the protected tissue.
Attached Figure Description
[0080] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0081] Figure 1 is a flowchart of a method for generating a prohibited area type three-dimensional virtual fixture for surgical safety assurance, provided by an embodiment of the present invention;
[0082] Figure 2 is a schematic diagram of a network training architecture provided in an embodiment of the present invention;
[0083] Figure 3 is a schematic diagram of left and right disparity acquisition provided in an embodiment of the present invention;
[0084] Figure 4 is a schematic diagram of a network application provided in an embodiment of the present invention;
[0085] Figure 5 is a diagram of the relationship between an initial two-dimensional key point, a second two-dimensional key point, and their neighboring points, provided in an embodiment of the present invention.
Detailed Implementation
[0086] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0087] This application provides a method, system, storage medium, and electronic device for generating prohibited area-type three-dimensional virtual fixtures for surgical safety assurance, which solves the technical problem of being unable to prevent surgical instruments from entering prohibited areas of the work area.
[0088] To solve the above-mentioned technical problems, the general idea of the technical solution in this application is as follows:
[0089] The prohibited area type three-dimensional virtual fixture generation method, system, storage medium, and electronic device provided in the embodiments of the present invention are used for dynamic tracking of manually selected positions on three-dimensional point clouds in robotic remote surgery, and are mainly applied to, but not limited to, minimally invasive endoscopic surgery scenarios.
[0090] This invention proposes to apply a three-dimensional reconstruction algorithm to the flexible in vivo environment: doctors draw the protected areas and assign obstacle avoidance priorities to different tissues according to their importance, thereby constructing a safety-constraint control mechanism for the robot and forming a method for generating dynamic prohibited area type three-dimensional virtual fixtures for the flexible in vivo environment.
[0091] To address the flexible and dynamic nature of the in vivo environment, this invention uses intraoperative images for manual selection of key points, which are then updated in real time on a three-dimensional point cloud, ensuring the accuracy of surgical path updates in a complex and changing environment. Furthermore, to address the indistinct features of the in vivo environment, the selected key points are initially tracked through three-dimensional affine transformation, and texture and optical flow information are combined to achieve precise dynamic positioning and tracking of the three-dimensional key points. Additionally, a virtual force field containing repulsive force and viscous resistance, i.e., a prohibited area type virtual fixture, is established on the three-dimensional model of the protected tissue area, using a force feedback mechanism to assist the surgeon in safely operating the surgical robot system.
[0092] To better understand the above technical solutions, the following will provide a detailed explanation of the technical solutions in conjunction with the accompanying drawings and specific implementation methods.
[0093] Example:
[0094] As shown in Figure 1, a method for generating a prohibited area type 3D virtual fixture for surgical safety assurance includes:
[0095] S1. Read the endoscope image, obtain the prohibited area on the current frame image according to the doctor's selection, and obtain all the first two-dimensional key points within the prohibited area;
[0096] S2. Track the first local region containing the first two-dimensional key point on the current frame image, obtain the second local region on the next frame image, and determine the initial two-dimensional key point of the first two-dimensional key point on the next frame image.
[0097] S3. Based on the mapping relationship between the endoscopic image and the point cloud, the first local region is mapped onto the first local point cloud and the second local region is mapped onto the second local point cloud respectively; and the first two-dimensional key point is determined as the first three-dimensional key point on the first local point cloud, and the second three-dimensional key point on the second local point cloud is obtained through coordinate transformation.
[0098] S4. Map the second 3D key points back to the second local region to obtain the second 2D key points on the next frame image; and combine them with the initial 2D key points to obtain the 2D coordinates of the tracked key points by minimizing the preset optimization function, and finally obtain the corresponding 3D coordinates; and summarize the 3D coordinates of each key point to obtain the 3D forbidden region.
[0099] S5. Establish a prohibited area type virtual fixture on the three-dimensional prohibited area. Through a force feedback mechanism, the prohibited area type virtual fixture keeps the surgical instruments from entering the protected tissue.
[0100] In this invention, the selected key points are initially tracked through three-dimensional affine transformation, and precise dynamic positioning and tracking of the three-dimensional key points in the in vivo environment is achieved by combining texture and optical flow information. In addition, the three-dimensional coordinates of the key points are aggregated into a three-dimensional forbidden region, on which a virtual force field containing repulsive force and viscous resistance, namely a forbidden region type virtual fixture, is established; through a force feedback mechanism it precisely keeps the surgical instruments from entering the protected tissue.
[0101] The following section will detail each step of the above technical solution:
[0102] In step S1, the endoscope image is read, and the prohibited area on the current frame image is obtained according to the doctor's selection, and all the first two-dimensional key points within the prohibited area are obtained;
[0103] This invention implements initial tracking of key points within a prohibited area through three-dimensional affine transformation; and combines texture and optical flow information to achieve precise dynamic positioning and tracking of three-dimensional key points in the in vivo environment.
[0104] In this step, the surgeon marks prohibited areas on intraoperative images to identify key points within those areas. These key points are then used for subsequent display and updating on the 3D point cloud, ensuring the intuitiveness and accuracy of information transmission and improving surgical efficiency.
[0105] In step S2, a first local region containing the first two-dimensional keypoint is tracked in the current frame image, a second local region in the next frame image is obtained, and the initial two-dimensional keypoint of the first two-dimensional keypoint in the next frame image is determined; specifically including:
[0106] S21. First define the current frame image, together with the width and height of the endoscopic image, and define the first two-dimensional key point on that frame image and its coordinates;
[0107] Taking the first two-dimensional key point as the center of the first local region, the first local region is determined from the preset region shape and side length;
[0108] S22. Using an optical flow method, feature points are matched between the current frame image and the next frame image, and the average difference in pixel coordinates between the matched feature point pairs gives the direction and distance of movement between the two frames. The center of the corresponding second local region on the next frame image is then expressed as:
[0109]
[0110] where the left-hand side is the center of the second local region, given by its coordinates;
[0111] the remaining symbols denote the feature points on each image and the number of feature points;
[0112] S23. From this center, the second local region is uniquely determined by the preset region shape and side length;
[0113] S24. The optical flow method is applied again to directly determine the initial two-dimensional key point, on the next frame image, of the first two-dimensional key point.
[0114] It is readily understood that the preset region shape and side length can be selected according to actual needs and are not strictly limited here. Taking a rectangular region of a given size as an example, the region is described by its center point on the endoscopic image, where:
[0115]
[0116]
[0117] Then the rectangular region on that frame image is:
[0118]
[0119]
[0120] In the embodiment of the present invention, when tracking three-dimensional key points, a local region is first determined from the position of the key point, and that local region is tracked, which reduces tracking errors of the three-dimensional key points.
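The local-region tracking of steps S21 to S23 can be sketched in Python as follows. The sketch assumes the matched feature point pairs are already available (for instance from a pyramidal Lucas-Kanade tracker) and assumes a square region clipped to the image bounds; the function names and the clipping rule are illustrative choices, not taken from the source.

```python
import numpy as np

def region_bounds(center, side, width, height):
    """Square region of the given side around center, clipped to the image (assumed rule)."""
    half = side // 2
    u, v = int(round(center[0])), int(round(center[1]))
    return (max(u - half, 0), max(v - half, 0),
            min(u + half, width - 1), min(v + half, height - 1))

def track_center(center, pts_prev, pts_next):
    """S22: shift the region center by the mean displacement of matched feature points."""
    pts_prev = np.asarray(pts_prev, float)
    pts_next = np.asarray(pts_next, float)
    return np.asarray(center, float) + (pts_next - pts_prev).mean(axis=0)

# Usage: two matched pairs move by (3, 1) and (3, 5), so the center moves by (3, 3).
new_center = track_center((100, 80), [[90, 70], [110, 90]], [[93, 71], [113, 95]])
second_region = region_bounds(new_center, side=40, width=640, height=480)
```

Averaging over all matched pairs keeps the region estimate stable even when individual feature matches are noisy, which is the stated reason for tracking a region instead of a single point.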
[0121] In step S3, based on the mapping relationship between the endoscopic image and the point cloud, the first local region is mapped onto the first local point cloud and the second local region is mapped onto the second local point cloud, respectively; and the first two-dimensional key point is determined as the first three-dimensional key point on the first local point cloud, and the second three-dimensional key point on the second local point cloud is obtained through coordinate transformation.
[0122] This step is actually the initial localization of 3D key points, which can be achieved through the following two steps.
[0123] First, based on the mapping relationship between the endoscopic image and the point cloud, the three-dimensional key points corresponding to the two-dimensional key points are determined on the point cloud by using the position of the two-dimensional key points. Second, the tissue in the local area can be approximated as a rigid body, and the transformation matrix between the point clouds can be solved by using three-dimensional affine transformation. The three-dimensional key points on the target point cloud are obtained through coordinate transformation.
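The first of these two steps, the mapping between image pixels and point cloud points, can be sketched as below. The sketch assumes a pinhole camera model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, and assumes the point cloud is stored in row-major pixel order so that a pixel index doubles as a point index; neither detail is stated in the source.

```python
import numpy as np

def build_point_cloud(depth, color, fx, fy, cx, cy):
    """Back-project every pixel, row by row, into an (H*W, 6) array of XYZRGB."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]                 # v: row index, u: column index
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    xyz = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    return np.hstack([xyz, color.reshape(-1, 3).astype(float)])

def keypoint_to_3d(cloud, u, v, width):
    """Row-major pixel-to-point mapping: pixel (u, v) is point v * width + u."""
    return cloud[v * width + u, :3]
```

Because the ordering is fixed, the same index also maps a three-dimensional point back to its pixel, which is what step S4 relies on when projecting the second three-dimensional key point into the image.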
[0124] Accordingly, S3 specifically includes:
[0125] S31. Use a neural network to perform depth estimation on the endoscopic image to obtain the corresponding depth image; reading row by row, obtain the spatial and color information of each pixel from the depth image and the endoscopic image respectively, to build the first local point cloud and the second local point cloud;
[0126] S32. Determine the first three-dimensional key point corresponding to the first two-dimensional key point on the first local point cloud:
[0127]
[0128] which expresses the mapping relationship from the endoscopic image to the point cloud;
[0129] S33. To obtain observations for the least squares fit, the optical flow method is used to obtain feature point pairs in the local regions; the paired points are related by a coordinate transformation:
[0130]
[0131] where the parameters of the fitted function form a matrix in R^(4×3); using least squares, they can be obtained from the following formula:
[0132]
[0133] The transformation matrix of the affine transformation between the paired points is then:
[0134]
[0135] where
[0136] S34. Apply the three-dimensional affine transformation to the first three-dimensional key point, in matrix form:
[0137] .
[0138] Then search the second local point cloud for the nearest point to obtain the initial position of the second three-dimensional key point.
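Steps S33 and S34 can be sketched with NumPy as a least-squares fit followed by a nearest-point search. The 4×3 parameter shape follows the R^(4×3) stated above; the homogeneous-coordinate convention, the function names, and the brute-force nearest-neighbor search are assumptions made for illustration.

```python
import numpy as np

def fit_affine_3d(src, dst):
    """Least-squares fit of a 4x3 parameter matrix A such that [src, 1] @ A ≈ dst."""
    src_h = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(src_h, dst, rcond=None)
    return params                              # shape (4, 3)

def to_homogeneous(params):
    """Arrange the fitted parameters as a 4x4 affine transformation matrix."""
    T = np.eye(4)
    T[:3, :] = params.T                        # top three rows: [linear part | translation]
    return T

def transform_and_snap(keypoint, T, cloud_next):
    """S34: apply the affine transform, then snap to the nearest point of the next cloud."""
    moved = (T @ np.append(keypoint, 1.0))[:3]
    idx = np.argmin(np.linalg.norm(cloud_next - moved, axis=1))
    return cloud_next[idx]
```

Snapping to the nearest point keeps the key point on the reconstructed tissue surface even when the locally fitted affine model is only approximate.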
[0139] Specifically, in step S31, depth estimation is performed on the endoscopic image to obtain a depth image corresponding to the endoscopic image. The binocular depth estimation network used is capable of fast overfitting and can continuously adapt to new scenes using self-supervised information. Specifically, this includes:
[0140] S311. Acquire binocular endoscope images and extract multi-scale features of the current frame image using the encoder network of the current binocular depth estimation network.
[0141] S312. Using the decoder network of the current binocular depth estimation network, multi-scale features are fused to obtain the disparity of each pixel in the current frame image.
[0142] S313. Based on the camera's intrinsic and extrinsic parameters, convert the disparity into depth and output it as the result of the current frame image;
[0143] S314. Without introducing external ground truth, update the parameters of the current binocular depth estimation network using self-supervised loss, for depth estimation of the next frame image.
[0144] The aforementioned method for acquiring depth images utilizes the similarity of consecutive frames to extend the overfitting concept from a pair of binocular images to overfitting over time series. By continuously updating the model parameters through online learning, it can obtain high-precision tissue depth in various binocular endoscopic surgical environments.
[0145] Specifically, the pre-training stage of the network model abandons the traditional training mode and adopts the idea of meta-learning: the network learns the depth of one image in order to predict the depth of another image, from which the loss is calculated and the network is updated. This promotes the network's generalization to new scenes and improves its robustness to low-texture and complex lighting conditions, while significantly reducing the time required for subsequent overfitting.
[0146] As shown in Figure 2, the initial model parameters of the binocular depth estimation network are obtained through meta-learning training, specifically including:
[0147] S100. Randomly select an even number of stereo image pairs and divide them equally into a support set and a query set; images from the support set and the query set are randomly paired to form K tasks;
[0148] S200. Inner loop training: the loss is computed on the support set image of each task to perform one parameter update:
[0149]
[0150] where the left-hand side is the network parameters after the inner loop update; the gradient operator denotes differentiation; the inner loop has its own learning rate; and the loss is computed on the support set image of each task from the initial parameters of the model;
[0151] S300. Outer loop training: the meta-learning loss is computed on the query set image of each task using the updated model, and the initial parameters of the model are updated directly:
[0152]
[0153] where the outer loop has its own learning rate, and the meta-learning loss is computed on the query set image of each task.
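The inner and outer loops of S200 and S300 follow the general shape of gradient-based meta-learning. The toy sketch below substitutes a linear regression model for the depth network and uses a first-order approximation of the outer gradient (it ignores the dependence of the inner update on the initial parameters); both substitutions, and the names `alpha`, `beta`, and `meta_step`, are assumptions made for illustration only.

```python
import numpy as np

def loss_and_grad(theta, x, y):
    """Mean squared error of a linear model y ≈ x @ theta, with its gradient."""
    resid = x @ theta - y
    return float(resid @ resid) / len(y), 2.0 * x.T @ resid / len(y)

def meta_step(theta, tasks, alpha=0.05, beta=0.05):
    """One outer-loop update over K tasks, each a (support, query) pair."""
    meta_grad = np.zeros_like(theta)
    for (xs, ys), (xq, yq) in tasks:
        _, g_support = loss_and_grad(theta, xs, ys)
        theta_inner = theta - alpha * g_support      # S200: inner-loop update
        _, g_query = loss_and_grad(theta_inner, xq, yq)
        meta_grad += g_query                         # first-order approximation
    return theta - beta * meta_grad                  # S300: outer-loop update
```

Repeating `meta_step` moves the initial parameters toward a point from which a single inner-loop update already performs well on the query data, which is the property the pre-training stage is after.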
[0154] In step S311, as shown in Figure 3, binocular endoscopic images are acquired, and the encoder network of the current binocular depth estimation network is used to extract multi-scale features of the current frame image.
[0155] For example, in this step, the encoder of the binocular depth estimation network is selected to be a ResNet18 network, which is used to extract feature maps at five scales from the endoscopic image.
[0156] In step S312, as shown in Figure 3, the decoder network of the current binocular depth estimation network is used to fuse multi-scale features and obtain the disparity of each pixel in the current frame image; specifically, it includes:
[0157] The decoder is used to pass the coarse-scale feature map through a convolutional block and upsampling, and then concatenate it with the fine-scale feature map. The feature map is then fused through another convolutional block, wherein the convolutional block is constructed by combining reflection padding, convolutional layers, and non-linear activation units (ELUs).
[0158] The disparity is calculated directly from the network's highest-resolution output:
[0159]
[0160] where the left-hand side is the disparity estimate of a pixel; the preset maximum disparity range sets the scale; the input is the highest-resolution output of the network; a convolutional layer is applied to it; and a range normalization is then performed.
[0161] In step S313, the disparity is converted into depth based on the camera's intrinsic and extrinsic parameters and output as the result of the current frame image.
[0162] In this step, converting disparity to depth means:
[0163]
[0164] where the left-hand side is the depth estimate of a pixel; the intrinsic parameter is that of the binocular camera; and the baseline length is the extrinsic parameter of the binocular camera.
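A minimal sketch of the two conversions above, under stated assumptions: the range normalization is taken to be a sigmoid, the intrinsic parameter is taken to be the focal length in pixels, and the standard rectified-stereo relation depth = focal length × baseline / disparity is used. The function names and the `eps` guard against zero disparity are illustrative.

```python
import numpy as np

def disparity_from_logits(logits, max_disp):
    """Scale a sigmoid-normalized network output to the range (0, max_disp)."""
    return max_disp / (1.0 + np.exp(-np.asarray(logits, float)))

def depth_from_disparity(disp, focal_px, baseline, eps=1e-6):
    """Rectified-stereo relation: depth = focal length * baseline / disparity."""
    return focal_px * baseline / np.maximum(disp, eps)
```

Depth comes out in the units of the baseline, so a baseline given in meters yields depth in meters.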
[0165] In step S314, as shown in Figure 4, without introducing external ground truth, the parameters of the current binocular depth estimation network are updated using a self-supervised loss, for depth estimation of the next frame image.
[0166] The self-supervised loss includes:
[0167] (1) Geometric consistency loss:
[0168]
[0169] where the two terms denote, respectively, the right-view depth obtained by transforming the left-view depth map and the right-view depth obtained by sampling the right-view depth map;
[0170] By incorporating geometric consistency constraints into the training loss, the network's general applicability to hardware is ensured, enabling it to autonomously adapt to unconventional binocular images such as surgical endoscopes.
[0171] (2) Photometric loss:
[0172]
[0173] where the terms denote: the number of valid pixels; the set of valid pixels; the original image and the reconstructed image, respectively; the two balancing parameters; and the image structural similarity;
[0174] (3) Smoothing loss:
[0175]
[0176] where the terms denote a normalized depth map and the first derivatives along the horizontal and vertical directions of the image.
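As a non-limiting illustration, one common form of smoothing loss can be sketched as follows. The exact expression used in the embodiment is not reproduced here; this sketch assumes that the depth map is normalized by its mean and that the loss is the mean absolute first difference in each image direction, without edge-aware weighting. The name `smoothness_loss` is hypothetical.

```python
def smoothness_loss(depth):
    """Mean absolute horizontal and vertical first differences of a mean-normalized depth map.

    `depth` is a list of rows (at least 2 x 2) holding positive depth values.
    """
    rows, cols = len(depth), len(depth[0])
    mean = sum(sum(row) for row in depth) / (rows * cols)
    norm = [[value / mean for value in row] for row in depth]
    # First differences along the horizontal direction.
    dx = [abs(norm[i][j + 1] - norm[i][j]) for i in range(rows) for j in range(cols - 1)]
    # First differences along the vertical direction.
    dy = [abs(norm[i + 1][j] - norm[i][j]) for i in range(rows - 1) for j in range(cols)]
    return sum(dx) / len(dx) + sum(dy) / len(dy)
```

A constant depth map yields a loss of zero, while any depth variation between neighboring pixels increases it; normalizing by the mean keeps the penalty independent of the overall depth scale.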
[0177] In summary, the above-described method for acquiring depth images treats depth estimation for each frame of the binocular image as an independent task, and uses real-time overfitting to obtain a high-precision model suitable for the current frame; moreover, it can quickly learn new scenes through online learning to obtain high-precision depth estimation results.
[0178] In step S4, the second three-dimensional key point is mapped back to the second local region to obtain the second two-dimensional key point on the next frame image; in combination with the initial two-dimensional key point, the two-dimensional coordinates of the tracked key point are obtained by minimizing a preset optimization function, and the corresponding three-dimensional coordinates are then obtained; the three-dimensional coordinates of all key points are aggregated to obtain the three-dimensional prohibited area.
[0179] This step is essentially the precise localization of 3D keypoints. In the in vivo environment, tissues are dynamic, flexible, and highly similar. Therefore, this embodiment of the invention utilizes texture information from the neighborhood of keypoints to construct an optimization function, and precisely localizes the 3D keypoints by minimizing this optimization function.
[0180] Specifically, first, based on the mapping relationship between the endoscopic image and the point cloud, the second three-dimensional key point is mapped back to the second local region to obtain the second two-dimensional key point on the next frame image.
[0181] Then, combining the initial two-dimensional key points, the two-dimensional coordinates of the tracked key points are obtained by minimizing a preset optimization function, and finally, the corresponding three-dimensional coordinates are obtained, including:
[0182] The optimization function mentioned above refers to:
[0183]
[0184] where the expression above defines the optimization function;
[0185] The cosine similarity of the SIFT feature vectors is:
[0186]
[0187] where the terms denote: the feature descriptor vector of the first two-dimensional key point; the magnitude of a vector; the coordinates of the second two-dimensional key point; a neighborhood point of the second two-dimensional key point, reached by applying the two coordinate offsets to the second two-dimensional key point; and the feature descriptor vector of that neighborhood point;
[0188] The influence of the optical flow information is expressed by:
[0189]
[0190] where the vectors are defined as shown in Figure 5:
[0191] .
[0192] By exhaustively searching over the two coordinate offsets, the offset values that satisfy the following expression can be found:
[0193]
[0194] This yields the ideal offset, from which the two-dimensional coordinates of the tracked key point are obtained; the corresponding three-dimensional coordinates are then obtained based on the mapping relationship between the endoscopic image and the point cloud.
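As a non-limiting illustration, the traversal search over neighborhood offsets can be sketched as follows. The cosine similarity is computed as usual; the optical-flow term of the embodiment is defined with reference to Figure 5 and is not reproduced here, so this sketch substitutes a simple distance penalty to the initial two-dimensional key point, which is an assumption. The names `search_offset`, `desc_at`, `flow_point`, `radius` and `weight` are hypothetical, and `desc_at` stands in for any routine that returns a descriptor at a pixel.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two descriptor vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_offset(ref_desc, desc_at, center, flow_point, radius=2, weight=0.1):
    """Try every integer offset within `radius` of `center` and keep the cheapest one."""
    best_cost, best_offset = None, (0, 0)
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            x, y = center[0] + dx, center[1] + dy
            desc = desc_at(x, y)
            if desc is None:
                continue  # no descriptor available at this pixel
            texture_cost = 1.0 - cosine_similarity(ref_desc, desc)
            # Assumed stand-in for the optical-flow term: distance to the flow estimate.
            flow_cost = math.hypot(x - flow_point[0], y - flow_point[1])
            cost = texture_cost + weight * flow_cost
            if best_cost is None or cost < best_cost:
                best_cost, best_offset = cost, (dx, dy)
    return best_offset
```

The returned offset is added to the coordinates of the second two-dimensional key point to give the tracked position; an exhaustive search is affordable here because the neighborhood is small.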
[0195] In this embodiment of the invention, after obtaining the transformation matrix through three-dimensional affine transformation, the texture information and optical flow information of the endoscopic image are combined to construct an optimization function to accurately locate the three-dimensional key points, thereby avoiding the influence of the indistinct features of the internal environment on the tracking results to a certain extent.
[0196] In step S5, a prohibited area type virtual fixture is established on the three-dimensional prohibited area. Through a force feedback mechanism, the prohibited area type virtual fixture constrains the surgical instrument from entering the protected tissue; specifically:
[0197] S51. Convert the real-time three-dimensional position information of the surgical instrument and the point cloud obtained in the preceding steps to the same coordinate system, and use the K-nearest-neighbor algorithm to find the point in the three-dimensional prohibited area that is nearest to the end of the surgical instrument, together with the relative distance between the two points;
[0198] Specifically, during surgery, the surgeon monitors the real-time three-dimensional position information of the surgical instrument. The measured three-dimensional position of the surgical instrument and the position of the protected tissue obtained by depth estimation are brought into the same coordinate system by a coordinate transformation based on their relative position.
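As a non-limiting illustration, the nearest-point query of S51 can be sketched with a brute-force search, which is equivalent to a K-nearest-neighbor query with K = 1. The name `nearest_point` is hypothetical, and both inputs are assumed to be already expressed in the same coordinate system; a spatial index such as a k-d tree would normally replace the linear scan for large point clouds.

```python
import math


def nearest_point(tip, points):
    """Return the point of `points` closest to `tip` and the distance between them."""
    best = min(points, key=lambda p: math.dist(tip, p))
    return best, math.dist(tip, best)
```

For example, with the instrument end at the origin and two candidate points, the call returns the closer point and its Euclidean distance.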
[0199] S52. Establish the artificial obstacle-avoidance vector field:
[0200] Define all points of the three-dimensional prohibited area as belonging to a point set, and find a subset of this point set, one parameter of the subset condition being the additional investigation radius;
[0201]
[0202] where the terms denote: the artificial obstacle-avoidance vector field; the artificial vector length function; the vector pointing from one point to the other; and the number of elements in the subset. Different artificial vector length functions are established for different tissues, each satisfying a stated condition;
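As a non-limiting illustration, an artificial obstacle-avoidance vector can be sketched as follows. The exact subset condition and field expression of the embodiment are not reproduced here; this sketch assumes that the subset contains the points whose distance to the instrument end does not exceed the nearest distance plus the additional investigation radius, and that the field is the average of unit vectors pointing from those points to the instrument end, each scaled by the length function. The names `avoidance_vector`, `extra_radius` and `length_fn` are hypothetical.

```python
import math


def avoidance_vector(tip, points, extra_radius, length_fn):
    """Average of scaled vectors pointing from nearby prohibited-area points to the tip.

    `points` must be non-empty; `length_fn(distance)` is the per-tissue length function.
    """
    dists = [math.dist(tip, p) for p in points]
    d_min = min(dists)
    # Assumed subset: points within (nearest distance + additional investigation radius).
    subset = [(p, d) for p, d in zip(points, dists) if 0.0 < d <= d_min + extra_radius]
    if not subset:
        return (0.0, 0.0, 0.0)
    total = [0.0, 0.0, 0.0]
    for p, d in subset:
        scale = length_fn(d) / d  # length function applied to the unit direction
        for k in range(3):
            total[k] += scale * (tip[k] - p[k])
    return tuple(component / len(subset) for component in total)
```

A different `length_fn` can be passed for each tissue type, so that more important tissues produce a longer avoidance vector at the same distance.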
[0203] S53. Establish a virtual force field that includes repulsive force and viscous resistance:
[0204]
[0205] where the terms denote: the virtual force field; the proportionality coefficient between the repulsive force at that location and the artificial obstacle-avoidance vector; the damping coefficient, which varies with the relative distance; and the velocity.
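As a non-limiting illustration, a virtual force combining a repulsive term and a viscous-resistance term can be sketched as follows. The sign convention and the way the two terms are combined are assumptions: the repulsive force is taken as proportional to the obstacle-avoidance vector, and the viscous resistance opposes the velocity with a distance-dependent damping coefficient. The names `virtual_force`, `gain` and `damping_fn` are hypothetical.

```python
def virtual_force(avoid_vec, velocity, distance, gain, damping_fn):
    """Repulsive force along the avoidance vector minus viscous resistance to motion."""
    damping = damping_fn(distance)  # damping coefficient varies with relative distance
    return tuple(gain * a - damping * v for a, v in zip(avoid_vec, velocity))
```

For example, when the instrument moves toward the prohibited area, the velocity is opposite to the avoidance vector, so both terms push the instrument away and the resistance grows as the damping coefficient increases near the tissue.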
[0206] In this embodiment of the invention, the guiding physician outlines the protected area, and intraoperative images are acquired with a binocular endoscope. A convolutional neural network (CNN) performs depth estimation, yielding a dynamic three-dimensional point cloud model of the in vivo environment. Different tissues within the body are assigned obstacle-avoidance priorities according to their importance, and a spatial artificial vector field for collision avoidance is established to determine the avoidance direction and speed of the surgical instrument with respect to different tissues.
[0207] This invention provides a system for generating a prohibited area type three-dimensional virtual fixture for surgical safety assurance, comprising:
[0208] The selection module, which is used to read endoscopic images, obtain the prohibited area on the current frame image according to the doctor's selection, and obtain all the first two-dimensional key points within the prohibited area;
[0209] The tracking module, which is used to track a first local region containing a first two-dimensional key point on the current frame image, obtain a second local region on the next frame image, and determine the initial two-dimensional key point of the first two-dimensional key point on the next frame image;
[0210] The mapping module, which is used to map the first local region onto a first local point cloud and the second local region onto a second local point cloud according to the mapping relationship between the endoscopic image and the point cloud; to determine the first three-dimensional key point of the first two-dimensional key point on the first local point cloud; and to obtain the second three-dimensional key point on the second local point cloud through coordinate transformation;
[0211] The optimization module, which is used to map the second three-dimensional key point back to the second local region to obtain the second two-dimensional key point on the next frame image; to obtain, in combination with the initial two-dimensional key point, the two-dimensional coordinates of the tracked key point by minimizing a preset optimization function, and then the corresponding three-dimensional coordinates; and to aggregate the three-dimensional coordinates of all key points to obtain the three-dimensional prohibited area;
[0212] The constraint module, which is used to establish a prohibited area type virtual fixture on the three-dimensional prohibited area, wherein the prohibited area type virtual fixture constrains the surgical instrument from entering the protected tissue through a force feedback mechanism.
[0213] A storage medium storing a computer program for generating prohibited area-type three-dimensional virtual fixtures for surgical safety assurance, wherein the computer program causes a computer to execute the prohibited area-type three-dimensional virtual fixture generation method as described above.
[0214] An electronic device, comprising:
[0215] One or more processors;
[0216] Memory; and
[0217] One or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs including instructions for performing the prohibited area type three-dimensional virtual fixture generation method as described above.
[0218] It is understood that the prohibited area type three-dimensional virtual fixture generation system, storage medium and electronic device for surgical safety assurance provided in the embodiments of the present invention correspond to the prohibited area type three-dimensional virtual fixture generation method for surgical safety assurance provided in the embodiments of the present invention. For the explanation, examples and beneficial effects of the relevant contents, reference may be made to the corresponding parts of the method, which are not repeated here.
[0219] In summary, compared with existing technologies, the present invention has the following beneficial effects:
[0220] 1. In this invention, the selected key points are initially tracked through three-dimensional affine transformation, and precise dynamic positioning and tracking of the three-dimensional key points in the in vivo environment is achieved by combining texture and optical flow information. In addition, the three-dimensional coordinates of all key points are aggregated to obtain a three-dimensional prohibited area, and a virtual force field containing repulsive force and viscous resistance, namely a prohibited area type virtual fixture, is established on the three-dimensional prohibited area; through a force feedback mechanism, it precisely constrains the surgical instrument from entering the protected tissue.
[0221] 2. In this embodiment of the invention, the doctor marks two-dimensional key points in the intraoperative images for subsequent display and updating of key points on the three-dimensional point cloud, which ensures the intuitiveness and accuracy of information transmission and improves surgical efficiency.
[0222] 3. In the embodiments of the present invention, when tracking three-dimensional key points, a local region is first determined from the position of the key point, and the local region is tracked, which reduces tracking errors for the three-dimensional key points.
[0223] 4. In this embodiment of the invention, the guiding physician outlines the protected area, and intraoperative images are acquired with a binocular endoscope. A convolutional neural network (CNN) performs depth estimation, yielding a dynamic three-dimensional point cloud model of the in vivo environment. Different tissues within the body are assigned obstacle-avoidance priorities according to their importance, and a spatial artificial vector field for collision avoidance is established to determine the avoidance direction and speed of the surgical instrument with respect to different tissues.
[0224] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0225] The above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims
1. A method for generating a prohibited area type three-dimensional virtual fixture for surgical safety assurance, characterized in that it comprises:
S1. Reading an endoscopic image, obtaining the prohibited area on the current frame image according to the doctor's selection, and obtaining all first two-dimensional key points within the prohibited area;
S2. Tracking a first local region containing a first two-dimensional key point on the current frame image, obtaining a second local region on the next frame image, and determining the initial two-dimensional key point of the first two-dimensional key point on the next frame image;
S3. Based on the mapping relationship between the endoscopic image and the point cloud, mapping the first local region onto a first local point cloud and the second local region onto a second local point cloud; determining the first three-dimensional key point of the first two-dimensional key point on the first local point cloud; and obtaining the second three-dimensional key point on the second local point cloud through coordinate transformation;
S4. Mapping the second three-dimensional key point back to the second local region to obtain the second two-dimensional key point on the next frame image; in combination with the initial two-dimensional key point, obtaining the two-dimensional coordinates of the tracked key point by minimizing a preset optimization function, and then obtaining the corresponding three-dimensional coordinates; and aggregating the three-dimensional coordinates of all key points to obtain the three-dimensional prohibited area;
S5. Establishing a prohibited area type virtual fixture on the three-dimensional prohibited area, wherein the prohibited area type virtual fixture constrains the surgical instrument from entering the protected tissue through a force feedback mechanism.
2. The method for generating a prohibited area type three-dimensional virtual fixture as described in claim 1, characterized in that S5 specifically includes:
S51. Converting the real-time three-dimensional position information of the surgical instrument and the point cloud obtained in the preceding steps to the same coordinate system, and obtaining the point in the three-dimensional prohibited area that is nearest to the end of the surgical instrument, together with the relative distance between the two points;
S52. Establishing the artificial obstacle-avoidance vector field: all points of the three-dimensional prohibited area are defined as belonging to a point set, and a subset of this point set is found, one parameter of the subset condition being the additional investigation radius; the terms of the vector field expression denote the artificial obstacle-avoidance vector field, the artificial vector length function, the vector pointing from one point to the other, and the number of elements in the subset; different artificial vector length functions are established for different tissues, each satisfying a stated condition;
S53. Establishing a virtual force field that includes repulsive force and viscous resistance, the terms of which denote the virtual force field, the proportionality coefficient between the repulsive force at that location and the artificial obstacle-avoidance vector, the damping coefficient, which varies with the relative distance, and the velocity.
3. The method for generating a prohibited area type three-dimensional virtual fixture as described in claim 1 or 2, characterized in that S2 includes:
S21. First defining the frame image of a given index, whose dimensions are the width and the height of the endoscopic image; the first two-dimensional key point on that frame image has given coordinates; taking the first two-dimensional key point as the center of the first local region, the first local region is determined according to a preset region shape and side length;
S22. Using an optical flow method to perform feature matching on the feature points of the image, so as to obtain, on the next frame image, the center of the corresponding second local region, which is computed from the feature points of the image and the number of those feature points;
S23. Determining the second local region according to that center and the preset region shape and side length;
S24. Using the optical flow method again to directly determine the initial two-dimensional key point of the first two-dimensional key point on the next frame image.
4. The method for generating a prohibited area type three-dimensional virtual fixture as described in claim 3, characterized in that S3 includes:
S31. Performing depth estimation on the endoscopic image to obtain a depth image corresponding to the endoscopic image; reading the depth image and the endoscopic image row by row to obtain the spatial information and the color information of each pixel, respectively, so as to obtain the first local point cloud and the second local point cloud;
S32. Determining the first three-dimensional key point corresponding to the first two-dimensional key point on the first local point cloud, the mapping relationship between the two being denoted by a mapping function;
S33. Using the optical flow method to obtain feature point pairs in the local regions; a coordinate transformation relationship holds between the two points of each pair, whose coefficients are the parameters of the fitted function; these parameters are obtained by least squares, and the transformation matrix of the affine transformation between the two sets of points is formed from them;
S34. Applying the three-dimensional affine transformation, expressed in matrix form, to the first three-dimensional key point, and searching the second local point cloud for the nearest point to obtain the initial position of the second three-dimensional key point.
5. The method for generating a prohibited area type three-dimensional virtual fixture as described in claim 4, characterized in that, in step S31, depth estimation is performed on the endoscopic image to obtain a depth image corresponding to the endoscopic image, wherein the binocular depth estimation network used is capable of rapid overfitting and can continuously adapt to new scenes using self-supervised information; specifically:
S311. Acquiring binocular endoscopic images and extracting multi-scale features of the current frame image using the encoder network of the current binocular depth estimation network;
S312. Using the decoder network of the current binocular depth estimation network to fuse the multi-scale features and obtain the disparity of each pixel in the current frame image;
S313. Based on the camera's intrinsic and extrinsic parameters, converting the disparity into depth and outputting it as the result for the current frame image;
S314. Without introducing external ground truth, updating the parameters of the current binocular depth estimation network using a self-supervised loss, for depth estimation of the next frame image.
6. The method for generating a prohibited area type three-dimensional virtual fixture as described in claim 3, characterized in that the optimization function in S4 combines the cosine similarity of SIFT feature vectors with a term expressing the influence of optical flow information, wherein the terms of the cosine similarity denote: the feature descriptor vector of the first two-dimensional key point; the magnitude of a vector; the coordinates of the second two-dimensional key point; a neighborhood point of the second two-dimensional key point, reached by applying the two coordinate offsets to the second two-dimensional key point; and the feature descriptor vector of that neighborhood point.
7. The method for generating a prohibited area type three-dimensional virtual fixture as described in claim 6, characterized in that, in step S4, obtaining the two-dimensional coordinates of the tracked key point by minimizing a preset optimization function, and then obtaining the corresponding three-dimensional coordinates, includes: exhaustively searching over the two coordinate offsets to find the offset values that satisfy the minimization expression, thereby obtaining the ideal offset; obtaining from it the two-dimensional coordinates of the tracked key point; and then obtaining the corresponding three-dimensional coordinates based on the mapping relationship between the endoscopic image and the point cloud.
8. A system for generating a prohibited area type three-dimensional virtual fixture for surgical safety assurance, characterized in that it comprises:
a selection module, used to read endoscopic images, obtain the prohibited area on the current frame image according to the doctor's selection, and obtain all first two-dimensional key points within the prohibited area;
a tracking module, used to track a first local region containing a first two-dimensional key point on the current frame image, obtain a second local region on the next frame image, and determine the initial two-dimensional key point of the first two-dimensional key point on the next frame image;
a mapping module, used to map the first local region onto a first local point cloud and the second local region onto a second local point cloud according to the mapping relationship between the endoscopic image and the point cloud; to determine the first three-dimensional key point of the first two-dimensional key point on the first local point cloud; and to obtain the second three-dimensional key point on the second local point cloud through coordinate transformation;
an optimization module, used to map the second three-dimensional key point back to the second local region to obtain the second two-dimensional key point on the next frame image; to obtain, in combination with the initial two-dimensional key point, the two-dimensional coordinates of the tracked key point by minimizing a preset optimization function, and then the corresponding three-dimensional coordinates; and to aggregate the three-dimensional coordinates of all key points to obtain the three-dimensional prohibited area;
a constraint module, used to establish a prohibited area type virtual fixture on the three-dimensional prohibited area, wherein the prohibited area type virtual fixture constrains the surgical instrument from entering the protected tissue through a force feedback mechanism.
9. A storage medium, characterized in that it stores a computer program for generating a prohibited area type three-dimensional virtual fixture for surgical safety assurance, wherein the computer program causes a computer to execute the prohibited area type three-dimensional virtual fixture generation method as described in any one of claims 1 to 7.
10. An electronic device, characterized in that it comprises: one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the programs including instructions for performing the prohibited area type three-dimensional virtual fixture generation method as described in any one of claims 1 to 7.