Systems and methods for robotic endoscope system utilizing tomosynthesis and augmented fluoroscopy

The system addresses CT2BD and breath-hold inconsistencies in endoscope procedures by using tomosynthesis-based methods for accurate tool positioning and augmented fluoroscopy alignment, improving diagnostic accuracy and procedural efficiency in bronchoscopy.

WO2026142961A1 · PCT designated stage · Publication Date: 2026-07-02 · NOAH MEDICAL CORP


Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: NOAH MEDICAL CORP
Filing Date: 2025-12-19
Publication Date: 2026-07-02


IPC: A61B34/20; A61B90/00; A61B6/02; A61B6/12; A61B34/10

AI Tagging

Technology Topics

Tomosynthesis; 3D image


AI Technical Summary

Technical Problem

Current endoscope systems are prone to CT-to-body divergence (CT2BD), a discrepancy between the electronically generated virtual target and the actual anatomical location. CT2BD increases procedure length and leads to diagnostic inefficiencies, particularly in bronchoscopy. In addition, maintaining a consistent breath-hold state across different imaging modalities is difficult, which affects lesion overlay accuracy.

Method used

The system employs tomosynthesis-based methods to determine the spatial relationship between medical tools and lesions with improved accuracy by analyzing tomosynthesis slices in the antero-posterior direction using segmentation or feature-based techniques, and employs augmented fluoroscopy to ensure accurate overlay alignment by detecting consistent physical conditions between imaging modalities.

Benefits of technology

This approach provides quantitative positional information, enhancing procedural guidance and reducing uncertainty in tool placement, thereby improving the accuracy and efficiency of medical interventions such as biopsies by ensuring correct tool positioning within target regions.

✦ Generated by Eureka AI based on patent content.


Abstract

A method and system for real‑time fluoroscopy guidance in robotic interventions. A tomosynthesis sweep of 2D fluoroscopic projections is used to reconstruct a 3D image of a target. The 3D target is projected as an overlay into live fluoroscopy frames. A best‑matching tomosynthesis projection is selected and a reference feature common to the reference projection and live frames is localized. Displacement of the reference feature is computed and compared to thresholds to determine whether the subject's physical state has changed and whether the overlay remains correctly positioned. The system provides qualitative indicators, recommends retake of tomosynthesis when misalignment exceeds tolerance, and can pause or limit robotic tool advancement until alignment is restored. Adaptive thresholding using a trained model is supported.


Description

Attorney Docket No. 55441-738601

SYSTEMS AND METHODS FOR ROBOTIC ENDOSCOPE SYSTEM UTILIZING TOMOSYNTHESIS AND AUGMENTED FLUOROSCOPY

CROSS-REFERENCE

[0001] This application claims priority to U.S. Provisional Patent Application No. 63/739,170, filed on December 27, 2024, which is entirely incorporated herein by reference.

BACKGROUND

[0002] Endoscopy (e.g., bronchoscopy) may involve accessing and visualizing the inside of a patient's lumen (e.g., airways) for diagnostic or therapeutic purposes. During a procedure, a flexible tubular tool such as, for example, an endoscope, may be inserted into the patient's body and an instrument can be passed through the endoscope to a tissue site identified for diagnosis or treatment.

[0003] The navigation of the endoscope and/or localization may involve utilizing a pre-planning CT scan to create an electronically generated virtual target. However, current endoscope systems can be prone to CT-to-body divergence (CT2BD). CT2BD is the discrepancy between the electronic virtual target and the actual anatomic location of the peripheral lung lesion. CT2BD can occur for a variety of reasons including atelectasis, neuromuscular weakness due to anesthesia, tissue distortion from the catheter system, bleeding, ferromagnetic interference, and perturbations in anatomy such as pleural effusions. In particular, CT2BD can increase the length of the procedure, frustrate the operator, and ultimately result in a nondiagnostic procedure.

[0004] Digital tomosynthesis algorithms have been recently introduced for the correction of CT2BD. Tomosynthesis (may also be referred to as “tomo”) is limited-angle tomography, in contrast to full-angle tomography (e.g., 180-degree tomography). In some cases, features identified from tomosynthesis or cone beam computed tomography (CBCT) images that are acquired following patient intubation but before commencement of endoscopy (e.g., bronchoscopy) may be utilized to generate augmented fluoroscopy images. Augmented reality has previously been associated in biopsy with improvements in diagnostic accuracy, procedure time, and radiation dose. Specifically, augmented fluoroscopy may be utilized for reducing radiation exposure without compromising diagnostic accuracy. Augmented fluoroscopy may display an augmented layer of information on top of a live or real-time fluoroscopy view.

[0005] During procedures relying on real-time imaging for precision, maintaining consistent breath-hold states is necessary for accurate lesion overlay alignment. During medical imaging (either tomosynthesis or fluoroscopy), a patient may be instructed to hold their breath during the scan to minimize movement from breathing, resulting in clearer and more accurate images, particularly for organs that move with respiration like the lungs and liver. In some cases, tomosynthesis imaging data are acquired at a breath-hold physical state or physical condition. The patient may be instructed to hold the breath again during the later augmented fluoroscopy. In the augmented fluoroscopy, because the lesion overlay is based on the lesion location in the most recent tomosynthesis image, the same breath-hold state is required for acquiring the tomosynthesis image and the augmented fluoroscopy to ensure the correct location for the lesion overlay in the augmented fluoroscopy. Thus, the accuracy of the lesion location depends on the patient maintaining the same breath-hold state between the different imaging modalities.

[0006] Maintaining consistent breath-hold states can be difficult for several reasons. Patients may fail to hold their breath properly or may hold their breath at different conditions (depths). Physicians may time the breath hold at different inspiration phases of the breathing cycle. In some cases, variations in other parameters (e.g., pressure, air flow parameters, etc. on the anesthesia machine) can also cause inconsistencies. Without proper synchronization, discrepancies reduce the effectiveness of real-time guidance and compromise procedural accuracy. If the patient’s breath-hold state changes between the tomosynthesis and the live fluoroscopy, the lesion overlay in the live fluoroscopy may not be at the correct location.

SUMMARY

[0007] The present disclosure addresses the above issues by providing systems and methods capable of detecting whether a physical condition/state of acquiring a first medical image data using a first imaging modality is consistent with a physical condition/state of acquiring a second medical image data using a second imaging modality. In particular, the first and second imaging modalities may be a three-dimensional (3D) tomosynthesis imaging and a two-dimensional (2D) live fluoroscopy with augmented overlay. In some cases, the augmented overlay (e.g., lesion or other target object) may be displayed at a location determined based at least in part on information acquired from the 3D tomosynthesis image. The methods herein may be capable of automatically determining whether the location of the augmented overlay is correct in the live fluoroscopy by determining whether the physical condition/state of acquiring the 3D tomosynthesis image and the live fluoroscopy is consistent.

[0008] In another aspect, the present disclosure provides methods and systems for determining whether a medical tool, such as a biopsy needle, is accurately positioned within a target region (e.g., a lesion) using tomosynthesis imaging data. Tomosynthesis, as a limited-angle tomography technique, may generate three-dimensional (3D) reconstructions from multiple two-dimensional (2D) fluoroscopic images acquired at various angles. Although tomosynthesis offers the advantage of reduced scan time and radiation exposure compared to full 360-degree computed tomography (CT) scans, its resolution can vary, particularly in the depth (anterior-posterior, AP) direction. This variation in resolution may complicate the task of confirming the positional relationship between a thin medical instrument and a lesion in the patient’s anatomy.

[0009] The methods and systems of the present disclosure may employ a tomosynthesis-based tool-in-lesion decision method that provides improved accuracy and efficiency. By identifying and evaluating the spatial relationship of the tool and the lesion in the depth direction, the methods and systems may quantitatively determine whether the tool resides inside the lesion. Such an approach may involve analyzing tomosynthesis slices in the antero-posterior (AP) direction, using segmentation or feature-based techniques to isolate the tool and the lesion, and then comparing their relative depths. If the tool’s position aligns with the lesion’s location in a slice with sufficient resolution, the system may confirm that the tool is indeed located within the target region.
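
As a minimal illustration of the depth comparison described above, the following Python sketch assumes that segmentation has already produced, for each coronal slice along the AP direction, a flag indicating whether the tool or the lesion appears in that slice. The function names, the slice spacing, the optional margin, and the decision rule (the tool's depth extent lies within the lesion's depth extent) are hypothetical and are not taken from the disclosure. In-plane overlap would have to be checked separately.

```python
from typing import Optional, Sequence, Tuple


def depth_extent(present: Sequence[bool], spacing_mm: float) -> Optional[Tuple[float, float]]:
    """Return (min_depth, max_depth) in mm over slices where the object is present."""
    indices = [i for i, flag in enumerate(present) if flag]
    if not indices:
        return None  # object was not segmented in any slice
    return indices[0] * spacing_mm, indices[-1] * spacing_mm


def tool_in_lesion(tool_slices: Sequence[bool], lesion_slices: Sequence[bool],
                   spacing_mm: float = 1.0, margin_mm: float = 0.0) -> bool:
    """Hypothetical rule: the tool's AP extent must lie inside the lesion's AP extent."""
    tool = depth_extent(tool_slices, spacing_mm)
    lesion = depth_extent(lesion_slices, spacing_mm)
    if tool is None or lesion is None:
        return False
    return lesion[0] - margin_mm <= tool[0] and tool[1] <= lesion[1] + margin_mm


# Example: the lesion spans slices 2 to 5; one tool sits in slice 3, another in slice 0.
lesion_flags = [False, False, True, True, True, True, False]
tool_inside_flags = [False, False, False, True, False, False, False]
tool_outside_flags = [True, False, False, False, False, False, False]
print(tool_in_lesion(tool_inside_flags, lesion_flags))
print(tool_in_lesion(tool_outside_flags, lesion_flags))
```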

[0010] Unlike conventional systems and methods that rely on multiple orthogonal planes or manual inspection, the systems and methods disclosed herein provide quantitative, depth-based positional information, enhancing procedural guidance, reducing uncertainty in tool placement, and ultimately supporting better clinical outcomes during medical interventions such as biopsies.

[0011] Digital tomosynthesis algorithms have been recently introduced for the correction of CT2BD. Tomosynthesis (may also be referred to as “tomo”) is limited-angle tomography, in contrast to full-angle tomography (e.g., 180-degree tomography). However, tomosynthesis reconstruction does not have uniform resolution. For instance, resolution is often the poorest in the depth direction. The standard way to show a 3D volume dataset by three orthogonal planes (e.g., axial, sagittal and coronal) may be ineffective since two of the planes have poorer resolution. A common way to view a tomosynthesis volume is to scroll in the depth direction, where each slice has good resolution. In the case of pulmonology, the volume is viewed in the coronal plane and traversed in the anterior-posterior (AP) direction by scrolling. Yet this has caused difficulty in determining the spatial relationship of the structures in the depth direction. It can be challenging to determine whether a tool (e.g., biopsy needle) is inside a lesion in the AP direction of a chest tomosynthesis reconstruction.

[0012] A need exists for methods and systems capable of determining whether a tool is within a target (e.g., lesion) with improved accuracy or efficiency. The present disclosure addresses the above need by providing a tomosynthesis-based tool-in-lesion decision method with improved accuracy and efficiency. In particular, the provided method may provide a user with quantitative information of the spatial relationship of a thin tool and a target region (e.g., lesion) in the depth direction. The methods, systems, computer-readable media, and techniques herein may identify the positional relationship of the tool and the lesion (in the depth direction) by identifying their depths separately and determining whether the (thin) tool is within the lesion in a quantitative manner.

[0013] The method herein may be applied after a robotic platform is set up, target lesions are identified and / or segmented, an airway registration is performed, and an individual target lesion is selected. The method herein may be applied during or after a navigation process to identify a position of a portion of the tool relative to a target. An endoscopy navigation system may use different sensing modalities (e.g., camera imaging data, electromagnetic (EM) position data, robotic position data, etc.). In some cases, the navigation approach may depend on an initial estimate of where the tip of the endoscope is with respect to the airway to begin tracking the tip of the endoscope. Some endoscopy techniques may involve a three-dimensional (3D) model of a patient's anatomy (e.g., CT image), and guide navigation using an EM field and position sensors.

[0014] In some cases, a 3D image of a patient’s anatomy may be taken one or more times for various purposes. For instance, prior to a medical procedure, a 3D model of a patient anatomy may be created to identify the target location. In some cases, the precise alignment (e.g., registration) between the virtual space of the 3D model, the physical space of the patient's anatomy represented by the 3D model, and the EM field may be unknown. As such, prior to generating a registration, endoscope positions within the patient's anatomy cannot be mapped with precision to corresponding locations within the 3D model. In another instance, during surgical operation, 3D imaging may be performed to update or confirm the location of the target (e.g., lesion) in the case of movement of the target tissue or lesion.

[0015] In some cases, fluoroscopic imaging systems may be used to determine the location and orientation of medical instruments and patient anatomy within the coordinate system of the surgical environment via fluoroscopy (may also be referred to as “fluoro”). Fluoroscopy is a method providing real-time X-ray imaging. In order for the imaging data to assist in correctly localizing the medical instrument, the coordinate system of the imaging system may be needed for reconstructing the 3D model. For example, multiple 2D fluoroscopy images may be used to create a tomosynthesis or Cone Beam CT (CBCT) reconstruction to better visualize and provide 3D coordinates of the anatomical structures. During a CBCT scan, a CBCT scanner may acquire projections along a rotation angle of 180° to 360° (i.e., a full rotation of the x-ray source and detector) over the region of interest to obtain a volumetric data set. The scanning software collects the data and reconstructs it, producing a digital volume composed of three-dimensional voxels of anatomical data that can then be manipulated and visualized. Tomosynthesis is similar to a CBCT scan but uses a limited rotation angle (e.g., 15-60 degrees); thus it has a reduced scanning time as compared to CBCT. Tomosynthesis has an additional benefit over CBCT in that the limited range of motion required for tomosynthesis allows it to be used in more constrained patient settings where full 360° access around the patient is challenging to achieve during a procedure. Tomosynthesis may be performed to determine the location and orientation of medical instruments and patient anatomy. However, conventional tomosynthesis has poor depth resolution (AP direction), causing difficulty in determining whether a tool is within a target region (e.g., lesion) or the position of a thin tool relative to a target region. Systems, methods, and techniques herein beneficially provide tool-in-lesion confirmation in a quantitative manner, thereby improving the accuracy and correctness of localizing the tool (e.g., needle) with respect to the target region. As utilized herein, the terms CBCT and tomosynthesis are utilized interchangeably throughout the specification unless the context indicates otherwise.

[0016] As mentioned above, tomosynthesis or CBCT reconstruction of anatomical structures involves combining data from images of 2D projections taken at a plurality of angles with respect to an anatomical structure, and combining the plurality of 2D images to reconstruct a 3D view of the anatomical structure. The mathematical process of combining the 2D projections to create a 3D view requires as an input the relative poses (angles and positions) of the camera at which each of the 2D projections is recorded. In some cases, the methods herein may employ pose estimation methods to obtain the relative pose of the camera. For instance, the relative poses of the camera may be obtained by using features within the images themselves. In some examples, when markers (e.g., an array of artificial markers with known positions, or natural features such as bone) are captured within the images, the relative positions of the markers to one another within the 2D projection may be processed using computer vision methods to estimate the pose of the camera in the 3D world reference frame. In other cases, the pose of the camera at which each of the 2D projections is recorded may be obtained from independent measurements of the camera location and orientation (e.g., accelerometer, IMU, separate imaging device, or other orientation sensors). The present disclosure may utilize the abovementioned methods to generate the construction of 3D views from a combination of 2D projections.

[0017] In some cases, features identified from tomosynthesis or CBCT images that are acquired following patient intubation but before commencement of bronchoscopy may be utilized to generate augmented fluoroscopy images. Augmented reality has previously been associated in biopsy with improvements in diagnostic accuracy, procedure time, and radiation dose. Specifically, augmented fluoroscopy may be utilized for reducing radiation exposure without compromising diagnostic accuracy. Augmented fluoroscopy may display an augmented layer of information on top of a live or real-time fluoroscopy view.

[0018] As described above, the overlay of the target (e.g., lesion) displayed in the augmented fluoroscopy may be based on the information obtained from the tomosynthesis or CBCT images. However, a change in the physical condition or state of acquiring the image data between the tomosynthesis and the fluoroscopy may result in an incorrect location of the overlay.

[0019] In an aspect of the present disclosure, a method is provided for determining whether a location of an overlay in a real-time fluoroscopy image is valid. An invalid or incorrect location of the overlay may be caused by a change of physical condition or state of acquiring the fluoroscopy image compared to that of acquiring a tomosynthesis image data. The method may comprise: (a) acquiring a sequence of two-dimensional (2D) fluoroscopy images containing a target feature, where the sequence of 2D fluoroscopy images is acquired at various angles; (b) reconstructing a three-dimensional (3D) image based on the sequence of 2D fluoroscopy images and the various angles; (c) acquiring live fluoroscope image frames at a specific angle, where the live fluoroscope image frames contain the target feature; (d) displaying an overlay of a projection of the target feature onto the live fluoroscope image frames, where the projection of the target feature is based at least in part on the target feature in the reconstructed 3D image; and (e) determining whether the overlay is displayed at a correct location within the live fluoroscope image frames based at least in part on a displacement of a reference feature identified in a 2D fluoroscopy image selected from the sequence of 2D fluoroscopy images and identified in the live fluoroscope image frames.
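
Step (d) above relies on projecting a 3D target location into the live 2D frame. A minimal sketch of such a projection is given below, assuming an idealized pinhole model in which the live viewpoint is described by a rotation matrix, a translation vector, a focal length in pixels, and a principal point. The model and all parameter names are illustrative assumptions and are not specified by the disclosure.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def project_point(point: Vec3, rotation: Sequence[Sequence[float]], translation: Vec3,
                  focal_px: float, center_px: Tuple[float, float]) -> Tuple[float, float]:
    """Project a 3D point in the reconstruction frame to pixel coordinates (pinhole model)."""
    # Transform the point into the camera (X-ray source) frame: cam = R * point + t.
    cam = [sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
           for r in range(3)]
    if cam[2] <= 0:
        raise ValueError("point is not in front of the source")
    # Perspective division followed by a shift to the principal point.
    u = focal_px * cam[0] / cam[2] + center_px[0]
    v = focal_px * cam[1] / cam[2] + center_px[1]
    return u, v


# Example with an identity rotation and a source placed 1000 units from the target plane.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
overlay_uv = project_point((10.0, -20.0, 0.0), identity, (0.0, 0.0, 1000.0),
                           focal_px=1000.0, center_px=(512.0, 512.0))
print(overlay_uv)
```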

[0020] In some embodiments, the method comprises comparing the displacement against a pre-determined threshold and upon determining the displacement is greater than the threshold, displaying an indicator indicating the overlay is not at the correct location. In some cases, the correct location is indicative of a physical condition of a subject during (a) and (c) being substantially the same. For example, the indicator is indicative of an alignment level of the physical condition of the subject between (a) and (c).

[0021] In some embodiments, the indicator represents an alignment level of the subject’s physical condition between the tomosynthesis acquisition and the live fluoroscopy acquisition. The indicator may present qualitative categories such as GREAT, GOOD, or POOR that map to multi-level threshold ranges and that are displayed on the graphical user interface to communicate overlay validity at a glance. In some cases the physical condition is a breath holding state. Matching of breath hold states ensures that anatomy positions used to generate the augmented overlay correspond to the patient’s anatomy during live guidance and thereby reduces the risk of inaccurate target localization.
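
The mapping from a measured displacement to a qualitative category can be sketched as follows. The category names come from the paragraph above; the numeric cut-offs (2 mm and 5 mm) and the function name are placeholder assumptions chosen only for illustration.

```python
def alignment_indicator(displacement_mm: float,
                        great_max_mm: float = 2.0,
                        good_max_mm: float = 5.0) -> str:
    """Map a reference-feature displacement to a qualitative alignment category.

    The threshold values are hypothetical; the disclosure does not state them.
    """
    if displacement_mm < 0:
        raise ValueError("displacement magnitude must be non-negative")
    if displacement_mm <= great_max_mm:
        return "GREAT"
    if displacement_mm <= good_max_mm:
        return "GOOD"
    return "POOR"


for value_mm in (1.0, 3.5, 8.0):
    print(value_mm, alignment_indicator(value_mm))
```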

[0022] In some embodiments the 2D fluoroscopy image selection is performed by matching viewpoint information between tomosynthesis projections and the live fluoroscopy viewpoint. In certain embodiments the selected projection is the tomosynthesis projection having the minimal angular difference to the live viewpoint. In further embodiments the system determines whether the angular difference is within a predetermined acceptance threshold and, when the angular difference exceeds the acceptance threshold, the system performs a 2D-2D registration between the selected tomosynthesis projection and the live fluoroscopy frames prior to displacement computation. Pose metadata, bead pattern analysis or image-based pose estimation may be used to support selection and registration.
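
The selection logic above can be sketched in a few lines, assuming each tomosynthesis projection is tagged with a single scalar sweep angle in degrees. The scalar-angle simplification, the function name, and the 1-degree acceptance value are assumptions for illustration only.

```python
from typing import Sequence, Tuple


def select_reference_projection(sweep_angles_deg: Sequence[float],
                                live_angle_deg: float,
                                acceptance_deg: float = 1.0) -> Tuple[int, float, bool]:
    """Return (index, angular_difference, needs_registration) for the closest projection."""
    if not sweep_angles_deg:
        raise ValueError("the tomosynthesis sweep contains no projections")
    # Pick the projection whose angle is closest to the live viewpoint angle.
    index = min(range(len(sweep_angles_deg)),
                key=lambda i: abs(sweep_angles_deg[i] - live_angle_deg))
    difference = abs(sweep_angles_deg[index] - live_angle_deg)
    # A 2D-2D registration is flagged only when the difference exceeds the acceptance threshold.
    return index, difference, difference > acceptance_deg


sweep = [-15.0, -10.0, -5.0, 0.0, 5.0, 10.0, 15.0]
print(select_reference_projection(sweep, live_angle_deg=4.0))
print(select_reference_projection(sweep, live_angle_deg=7.0))
```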

[0023] In some embodiments the reference feature comprises the diaphragm. In such embodiments the diaphragm serves as a physiologic landmark whose cranio-caudal displacement between acquisitions reflects a change in lung inflation or breath hold. In certain embodiments the diaphragm displacement is measured using contour-based object tracking algorithms where the diaphragm boundary is segmented and contour descriptors are used to compute displacement metrics. In some embodiments the reference feature comprises one or more anatomical or image features. Examples include rib or vertebral landmarks, vessel bifurcations, implanted fiducials, or automatically selected salient keypoints near the target. In some instances displacement of these features is measured using feature-based object tracking algorithms, where keypoints are detected, matched, and motion vectors are computed to derive displacement statistics.
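
For the feature-based case, one possible way to turn matched keypoints into displacement statistics is shown below. The sketch assumes that keypoint detection and matching have already been performed elsewhere, so that the two input lists are in corresponding order. The use of the median as the summary statistic is an assumption made here for robustness to outlier matches, not a choice stated in the disclosure.

```python
import math
from statistics import median
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]


def keypoint_displacement(ref_points: Sequence[Point],
                          live_points: Sequence[Point]) -> Dict[str, float]:
    """Summarize motion vectors between matched keypoints (pixel units)."""
    if not ref_points or len(ref_points) != len(live_points):
        raise ValueError("matched keypoint lists must be non-empty and equal in length")
    # One motion vector per matched pair: live position minus reference position.
    vectors = [(lx - rx, ly - ry)
               for (rx, ry), (lx, ly) in zip(ref_points, live_points)]
    return {
        "median_dx": median(dx for dx, _ in vectors),
        "median_dy": median(dy for _, dy in vectors),
        "median_magnitude": median(math.hypot(dx, dy) for dx, dy in vectors),
    }


reference = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
live = [(0.0, 3.0), (10.0, 4.0), (0.0, 15.0)]
print(keypoint_displacement(reference, live))
```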

[0024] In some embodiments the method displays an indicator to indicate whether the overlay is at the correct location. In certain embodiments the indicator represents an alignment level of the subject’s physical condition between acquisitions. In some cases when the indicator signals that the overlay is not correctly located the system additionally displays guidance or procedural information related to retaking the tomosynthesis sweep, for example a textual recommendation to request a breath hold and re-acquire the tomosynthesis images. In some embodiments the target feature is target tissue to be operated upon by a robotic tool. In certain embodiments the live fluoroscopy frames show a distal portion of the robotic tool and the overlay of the target feature to guide navigation of the tool. In some embodiments the tool is a biopsy needle delivered through a robotic endoscope, and the overlay validity assessment supports safe needle advancement and tool-in-lesion confirmation.

[0025] The method described above is implemented in software and stored on non-transitory computer readable media. The stored instructions, when executed by a processor, cause the processor to perform the acquisition, reconstruction, overlay rendering, reference projection selection, displacement computation, and overlay validity determination operations described herein.

[0026] In another aspect of the present disclosure, a method for real-time fluoroscopy image motion detection is provided, comprising (a) acquiring a sequence of two-dimensional (2D) fluoroscopy images containing a target feature. In some cases, the sequence of 2D fluoroscopy images is acquired at various angles. In some embodiments, the method comprises (b) reconstructing a three-dimensional (3D) image based on the sequence of 2D fluoroscopy images and the various angles. In some embodiments, the method comprises (c) selecting a 2D fluoroscopy image from the sequence of 2D fluoroscopy images based at least in part on the angle associated with the sequence of 2D fluoroscopy images and a viewpoint angle for acquiring 2D live fluoroscope imaging; (d) determining a displacement of a reference feature captured in the selected 2D fluoroscopy image and captured in the 2D live fluoroscope imaging; and (e) determining a difference in a physical state of a subject captured in the sequence of 2D fluoroscopy images and the 2D live fluoroscope imaging.

[0027] In some embodiments, the physical state is a breath holding state. In some embodiments, the selected 2D fluoroscopy image has the associated angle closest to the viewpoint angle. In some cases, the method further comprises selecting the 2D fluoroscopy image as the projection from the sequence of 2D fluoroscopy images having a minimal angular difference with the viewpoint angle. For example, when an angular difference between the selected 2D fluoroscopy image and the viewpoint angle exceeds a predetermined angle threshold, the method may comprise performing a 2D-2D registration between the selected 2D fluoroscopy image and the 2D live fluoroscope image prior to determining the displacement.

[0028] In some embodiments, the reference feature comprises at least one anatomical landmark selected from the group consisting of a diaphragm contour, a rib landmark, a vertebral landmark, a vessel bifurcation, and an implanted fiducial. In some embodiments, determining the displacement comprises using a contour-based object tracking algorithm to segment and extract a contour of the reference feature in both the selected 2D fluoroscopy image and the 2D live fluoroscope image and computing one or more contour displacement metrics. In some embodiments, determining the displacement comprises using a feature-based method including detection of keypoints in the selected 2D fluoroscopy image and the 2D live fluoroscope image, matching corresponding keypoints, and computing displacement as a function of matched keypoint motion vectors.
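
A contour displacement metric of the kind mentioned above could be computed as in the following sketch. It assumes that segmentation has already reduced the reference feature (for example, a diaphragm boundary) to one (column, row) sample per image column in each image. The chosen metrics, mean and peak vertical shift over shared columns, and all names are hypothetical.

```python
from typing import Sequence, Tuple

ContourPoint = Tuple[int, float]  # (column index, row position of the boundary)


def contour_vertical_shift(ref_contour: Sequence[ContourPoint],
                           live_contour: Sequence[ContourPoint]) -> Tuple[float, float]:
    """Return (mean_shift, peak_shift) in pixels over columns present in both contours."""
    ref_by_column = {column: row for column, row in ref_contour}
    # Positive values mean the boundary moved toward larger row indices in the live frame.
    shifts = [row - ref_by_column[column]
              for column, row in live_contour if column in ref_by_column]
    if not shifts:
        raise ValueError("the contours share no columns")
    mean_shift = sum(shifts) / len(shifts)
    peak_shift = max(shifts, key=abs)
    return mean_shift, peak_shift


ref_boundary = [(0, 100.0), (1, 101.0), (2, 102.0)]
live_boundary = [(0, 104.0), (1, 105.0), (2, 109.0)]
print(contour_vertical_shift(ref_boundary, live_boundary))
```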

[0029] In some embodiments, the method further comprises comparing the displacement to a predetermined displacement threshold and, upon determining the displacement is greater than the predetermined threshold, generating an indicator that the physical state difference exceeds an acceptable tolerance. For example, upon determining that the physical state difference exceeds a threshold, the method may comprise automatically signaling a robotic control module for the robotic system to pause or limit advancement of a tool of the robotic system until alignment is restored or operator confirmation is received. In some embodiments, the method comprises adaptively determining one or more displacement thresholds using a trained machine-learning model that receives inputs comprising imaging metadata, target size, reconstruction confidence, and historical alignment outcomes, and updating the model based on logged operator overrides and subsequent alignment corrections.
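
The gating behavior described above can be illustrated with a small decision function. The command names, the function signature, and the treatment of operator confirmation as a simple boolean are assumptions made for this sketch; the disclosure does not define a control interface.

```python
from enum import Enum


class ToolCommand(Enum):
    ALLOW = "allow"
    PAUSE = "pause"


def gate_advancement(displacement_mm: float, threshold_mm: float,
                     operator_confirmed: bool = False) -> ToolCommand:
    """Pause tool advancement when displacement exceeds the threshold without confirmation."""
    if displacement_mm > threshold_mm and not operator_confirmed:
        return ToolCommand.PAUSE
    # Either alignment is within tolerance or the operator has explicitly confirmed.
    return ToolCommand.ALLOW


print(gate_advancement(6.0, threshold_mm=5.0))
print(gate_advancement(6.0, threshold_mm=5.0, operator_confirmed=True))
print(gate_advancement(2.0, threshold_mm=5.0))
```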

[0030] Additional aspects and advantages of the present disclosure will become readily apparent to those skilled in this art from the following detailed description, wherein only illustrative embodiments of the present disclosure are shown and described. As will be realized, the present disclosure is capable of other and different embodiments, and its several details are capable of modifications in various obvious respects, all without departing from the disclosure. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.

INCORPORATION BY REFERENCE

[0031] All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference. To the extent publications and patents or patent applications incorporated by reference contradict the disclosure contained in the specification, the specification is intended to supersede or take precedence over any such contradictory material.

BRIEF DESCRIPTION OF THE DRAWINGS

[0032] The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings (also “Figure” and “FIG.” herein), of which:

[0033] FIG. 1 shows an example process of tomosynthesis image reconstruction.

[0034] FIG. 2 shows an example process of augmented fluoroscopy overlay generation.

[0035] FIG. 3 shows an example system of various state machines.

[0036] FIG. 4 shows an example of a configuration state machine.

[0037] FIG. 5 shows example state machine logic.

[0038] FIGs. 6A-C show an example tomosynthesis board marker design.

[0039] FIG. 7A shows an example of blob detection of markers on an image of a tomosynthesis board.

[0040] FIG. 7B shows an example of candidate points on an image of a tomosynthesis board.

[0041] FIG. 7C shows an example of marker extraction on an image of a tomosynthesis board.

[0042] FIG. 8 shows an example process for robust tomosynthesis marker matching.

[0043] FIG. 9 shows an example result for marker tracking across a tomosynthesis frame sequence on an image of a tomosynthesis board.

[0044] FIG. 10 shows an example of a camera pose estimation.

[0045] FIG. 11 shows an example of augmented fluoroscopy projection.

[0046] FIG. 12 shows examples of robotic bronchoscopy systems, in accordance with some embodiments of the disclosure.

[0047] FIG. 13 shows an example of a fluoroscopy (tomosynthesis) imaging system.

[0048] FIG. 14 and FIG. 15 show examples of a flexible endoscope.

[0049] FIG. 16 shows an example of an instrument driving mechanism providing mechanical interface to the handle portion of a robotic bronchoscope.

[0050] FIG. 17 shows an example of a distal tip of an endoscope.

[0051] FIG. 18 shows an example distal portion of the catheter with integrated imaging device and the illumination device.

[0052] FIG. 19 shows an example of a user interface comprising a tomosynthesis dashboard.

[0053] FIG. 20 shows an example of a user interface comprising a C-arm settings dashboard.

[0054] FIG. 21 shows an example of a user interface comprising a scope selection dashboard.

[0055] FIG. 22 shows an example of a user interface comprising a selection crosshair panel.

[0056] FIG. 23 shows an example of a user interface comprising a lesion selection dashboard.

[0057] FIG. 24 shows an example of a user interface comprising an augmented fluoroscopy panel.

[0058] FIG. 25 shows an example of a user interface for driving or navigating the endoscope.

[0059] FIG. 26 shows an example of the virtual endoluminal view displaying a target.

[0060] FIG. 27 shows a computer system that is programmed or otherwise configured to implement methods provided herein.

[0061] FIG. 28A illustrates an example method for real-time fluoroscopy imaging for a robotic system with real-time quality assessment; FIG. 28B illustrates another example method for real-time fluoroscopy imaging.

[0062] FIGs. 29A-29C show an example of a user interface for displaying fluoroscopic overlay alignment conditions.

[0063] FIG. 30 shows an example of a user interface presenting a live fluoroscopy view integrated with a CT cross-section and a 3D airway reconstruction.

[0064] FIG. 31 shows an example user interface (UI) illustrating how consistent or inconsistent breath-hold states affect lesion localization and guidance match.

[0065] FIG. 32 shows an example of a workflow integrating tomosynthesis reconstructions with real-time augmented fluoroscopy imaging.

[0066] FIG. 33 shows an example of methods for detecting differences using contour-based and feature-based approaches.

[0067] FIG. 34 shows an example of a graphical user interface (GUI) for contour-based analysis indicating pixels enclosed by a contour and difference views.

[0068] FIG. 35 shows examples of a GUI displaying the location difference information using the contour-based method.

[0069] FIG. 36 shows an example of a feature-based method analyzing motion fields and lesion displacement.

[0070] FIG. 37 shows an example of a feature-based motion analysis method illustrating computed differences and threshold discrepancies over time.

[0071] FIG. 38 shows an example of how the augmented fluoroscopy system employs respiratory gating to assess and maintain alignment accuracy.

DETAILED DESCRIPTION

[0072] While exemplary embodiments will be primarily directed at tomosynthesis, augmented fluoroscopy, a bronchoscope, etc., one of skill in the art will appreciate that this is not intended to be limiting, and the systems, methods, and techniques described herein may be used for other therapeutic or diagnostic procedures and in other anatomical regions of a patient’s body such as a digestive system, including but not limited to the esophagus, liver, stomach, colon, urinary tract, or a respiratory system, including but not limited to the bronchus, the lung, and various others.

[0073] The embodiments disclosed herein can be combined in one or more of many ways to provide improved diagnosis and therapy to a patient. The disclosed embodiments can be combined with existing methods and apparatus to provide improved treatment, such as combination with known methods of pulmonary diagnosis, surgery and surgery of other tissues and organs, for example. It is to be understood that any one or more of the structures and steps as described herein can be combined with any one or more additional structures and steps of the methods and apparatus as described herein, the drawings and supporting text provide descriptions in accordance with embodiments.

[0074] Although the treatment planning and definition of diagnosis or surgical procedures as described herein are presented in the context of pulmonary diagnosis or surgery, the methods and apparatus as described herein can be used to treat any tissue of the body and any organ and vessel of the body such as brain, heart, lungs, intestines, eyes, skin, kidney, liver, pancreas, stomach, uterus, ovaries, testicles, bladder, ear, nose, mouth, soft tissues such as bone marrow, adipose tissue, muscle, glandular and mucosal tissue, spinal and nerve tissue, cartilage, hard biological tissues such as teeth, bone and the like, as well as body lumens and passages such as the sinuses, ureter, colon, esophagus, lung passages, blood vessels and throat.

[0075] As used herein, a processor encompasses one or more processors, for example a single processor, or a plurality of processors of a distributed processing system. A controller or processor as described herein generally comprises a tangible medium to store instructions to implement steps of a process, and the processor may comprise one or more of a central processing unit, programmable array logic, gate array logic, or a field programmable gate array, for example. In some cases, the one or more processors may be a programmable processor (e.g., a central processing unit (CPU) or a microcontroller), digital signal processors (DSPs), a field programmable gate array (FPGA) or one or more Advanced RISC Machine (ARM) processors. In some cases, the one or more processors may be operatively coupled to a non-transitory computer-readable medium. The non-transitory computer-readable medium can store logic, code, or program instructions executable by the one or more processors for performing one or more steps. The non-transitory computer-readable medium can include one or more memory units (e.g., removable media or external storage such as an SD card or random access memory (RAM)). One or more methods or operations disclosed herein can be implemented in hardware components or combinations of hardware and software such as, for example, ASICs, special purpose computers, or general purpose computers.

[0076] As used herein, the terms distal and proximal may generally refer to locations referenced from the apparatus and can be opposite of anatomical references. For example, a distal location of a bronchoscope or catheter may correspond to a proximal location of an elongate member of the patient, and a proximal location of the bronchoscope or catheter may correspond to a distal location of the elongate member of the patient.

[0077] As used herein, the terms “physical state” or “physical condition,” in the context of acquiring image data, may generally refer to any state or condition that can result in a misalignment of the location of the same feature appearing in different images acquired using different modalities. For example, the “physical state” or “physical condition” may be related to a subject’s physical state such as breath-holding state (e.g., different inspiration phases of a breathing cycle, breath holding depth, etc.) that can cause a change in the location of an anatomical feature (e.g., lesion, diaphragm, etc.).

[0078] A system as described herein includes an elongate portion or elongate member such as a catheter. The terms “elongate member”, “catheter”, and “bronchoscope” are used interchangeably throughout the specification unless the context indicates otherwise. The elongate member can be placed directly into the body lumen or a body cavity. In some embodiments, the system may further include a support apparatus such as a robotic manipulator (e.g., robotic arm) to drive, support, position or control the movements or operation of the elongate member. Alternatively or in addition, the support apparatus may be a hand-held device or other control devices that may or may not include a robotic system. In some embodiments, the system may further include peripheral devices and subsystems such as imaging systems that would assist or facilitate the navigation of the elongate member to the target site in the body of a subject. Such navigation may require a registration process which will be described later herein. The term “breath-hold state” describes a patient’s act of holding their breath during image acquisition. Being able to detect an inconsistency of breath-hold state (due to internal organ movement) beneficially allows a user to be informed about the validity or accuracy of the overlay in the live fluoroscopy frames, thereby enhancing the reliability of lesion localization.

[0079] As used herein, “tomosynthesis” refers to a limited-angle imaging technique in which multiple two-dimensional (2D) X-ray images are acquired at various angles within a restricted angular range, and subsequently reconstructed into a three-dimensional (3D) image. Unlike a full 360° computed tomography (CT) scan, tomosynthesis typically uses a narrower angular sweep, resulting in reduced radiation dose and shorter acquisition times. Although the resulting 3D reconstruction may have lower fidelity compared to a CT scan, it provides sufficient anatomical detail to guide interventional procedures.

[0080] In some embodiments of the present disclosure, a robotic bronchoscopy system is provided for performing surgical operations or diagnosis with improved performance at low cost. For example, the robotic bronchoscopy system may comprise a steerable catheter that can be entirely disposable. This may beneficially reduce the need for sterilization, which can be costly or difficult to perform and may still be ineffective. Moreover, one challenge in bronchoscopy is reaching the upper lobe of the lung while navigating through the airways. In some cases, the provided robotic bronchoscopy system may have the capability to navigate through the airway having a small bending curvature in an autonomous or semi-autonomous manner. The autonomous or semi-autonomous navigation may require a registration process. Alternatively, the robotic bronchoscopy system may be navigated by an operator through a control system with vision guidance.

[0081] A typical lung cancer diagnosis and surgical treatment process can vary drastically, depending on the techniques used by healthcare providers, the clinical protocols, and the clinical sites. The inconsistent processes may delay the diagnosis of lung cancers at an early stage, lead to high healthcare costs for patients to diagnose and treat lung cancers, and cause a high risk of clinical and procedural complications. The robotic bronchoscopy system herein may utilize integrated tomosynthesis to improve lesion visibility and tool-in-lesion confirmation, and may utilize augmented fluoroscopy to allow real-time navigation updates and guidance in all areas of the lung, thus allowing for standardized early lung cancer diagnosis and treatment.

[0082] FIG. 1 shows an example process 100 of tomosynthesis image reconstruction. In some cases, the tomosynthesis image reconstruction of the process 100 may comprise generating a 3D volume with a combination of X-ray projection images acquired at different angles (acquired by any type of C-arm systems). FIG. 2 shows an example process 200 of providing augmented fluoroscopy. The augmented fluoroscopy process 200 may comprise projecting a 3D lesion onto the 2D X-ray image as an overlay. The augmented fluoroscopy may display any number of overlays corresponding to multiple lesions or targets. The augmented fluoroscopy may display an overlay for any desired features in addition to a lesion or target. The tomosynthesis imaging mode and the augmented fluoroscopy mode can be accessed from any state (e.g., during navigation from the driving mode, during performance of operations at the target site, etc.) during an operation session.

[0083] Both the process 100 and the process 200 may begin, in some cases, with obtaining C-arm or O-arm video or imaging data using an imaging apparatus such as C-arm imaging system 105, 205, respectively. The C-arm or O-arm imaging system may comprise a source (e.g., an X-ray source) and a detector (e.g., an X-ray detector or X-ray imager). A C-arm imaging system has one or more X-ray sources opposite one or more X-ray detectors, arranged on an arm 1340 that has a “C” shape, where the C-arm may be rotated through some range of angles around a patient. An O-arm is similar to a C-arm but consists of a complete unbroken ring (an “O”) and may be rotated through 360° around a patient. As utilized herein, the term O-arm may be utilized interchangeably throughout the specification with the term C-arm unless the context indicates otherwise.

[0084] In some cases, a single C-arm source may provide video or imaging data for the two processes 100 and 200. In some cases, different C-arm sources may provide video or imaging data for the two processes 100 and 200. In some embodiments, the raw video frames may be used for both tomosynthesis and fluoroscopy. However, tomosynthesis may require unique frames from the C-arm, while the fluoroscopic view or augmented fluoroscopy may operate using duplicate frames from the C-arm because it is live video. The methods herein may therefore provide a unique frame checking algorithm such that the video frames for tomosynthesis are processed to ensure uniqueness. For example, as illustrated in the process 160, upon receiving a new image frame, if the current mode is tomosynthesis, the image frame may be processed to determine whether it is a unique frame or a duplicate. The uniqueness check may be based on an image intensity comparison threshold. For example, a duplicate frame may be identified by comparing the overall average intensity between two frames, or summing over all pixels the absolute difference in intensity between the same pixel in two frames, or summing over the square or other power of the difference in intensity between the same pixel in two frames. For example, when the intensity difference against a previous frame is below a predetermined threshold, the frame may be identified as a duplicate frame and may be excluded from tomosynthesis reconstruction. In some cases, a unique or duplicate frame may be identified based on other factors. For instance, the uniqueness check may be based on changes in stochastic noise within the image, even with identical average image intensity. As an example, a frame may be identified as a duplicate based on identical average image intensity, but the frame may still be determined to be unique if a per-pixel comparison shows differences between the images. If the current mode is fluoroscopy, the image frame may not be processed for checking uniqueness.
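The duplicate check described above can be illustrated with a minimal sketch. It uses the sum of absolute per-pixel differences as the comparison metric; the function names, the threshold value, and the choice to compare each frame against the last kept frame are assumptions made for illustration and are not taken from the disclosure.

```python
def sum_abs_diff(frame_a, frame_b):
    """Sum over all pixels of the absolute intensity difference."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
    )


def keep_unique_frames(frames, threshold):
    """Drop frames whose difference from the last kept frame is below threshold."""
    unique = []
    for frame in frames:
        if unique and sum_abs_diff(unique[-1], frame) < threshold:
            continue  # treated as a duplicate and excluded from reconstruction
        unique.append(frame)
    return unique


# Tiny 2x2 "frames" with hypothetical intensity values.
frames = [
    [[10, 10], [10, 10]],
    [[10, 10], [10, 10]],  # exact repeat
    [[10, 11], [10, 10]],  # change below the threshold
    [[40, 42], [38, 41]],  # new projection
]
kept = keep_unique_frames(frames, threshold=5)
print(len(kept))
```

A squared-difference or mean-intensity metric could be substituted for `sum_abs_diff` without changing the surrounding logic.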

[0085] As illustrated, the two processes 100 and 200 may detect the video or imaging frames from the C-arm source at 110 and 210, respectively. In some cases, the video or imaging frames may be normalized. In some cases, normalization may be applied to the image frame to change the range of pixel intensity values in the video or imaging frames. In general, normalization may transform an n-dimensional grayscale image $I : \{X \subseteq \mathbb{R}^n\} \to \{\mathrm{Min}, \ldots, \mathrm{Max}\}$ with intensity values in the range $(\mathrm{Min}, \mathrm{Max})$ into a new image $I_{NEW} : \{X \subseteq \mathbb{R}^n\} \to \{\mathrm{Min}_{NEW}, \ldots, \mathrm{Max}_{NEW}\}$ with intensity values in the range $(\mathrm{Min}_{NEW}, \mathrm{Max}_{NEW})$. Examples of possible normalization techniques that may be applied to the C-arm video or image frames in the two processes 100 and 200 (e.g., at 110 or 210) may comprise one or more of linear scaling, clipping, log scaling, z-score, or any other suitable types of normalization.
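As an illustration of the listed normalization techniques, the following sketch applies each one to a flat list of pixel intensities. The function names and sample values are hypothetical; the linear scaling uses the standard mapping of the input range onto a new range.

```python
import math


def linear_scale(pixels, new_min=0.0, new_max=1.0):
    """Map the range (Min, Max) of the input onto (new_min, new_max)."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [new_min for _ in pixels]
    return [new_min + (p - lo) * (new_max - new_min) / (hi - lo) for p in pixels]


def clip(pixels, lo, hi):
    """Limit every intensity to the interval [lo, hi]."""
    return [min(max(p, lo), hi) for p in pixels]


def log_scale(pixels):
    """Compress the dynamic range; log1p keeps zero intensities at zero."""
    return [math.log1p(p) for p in pixels]


def z_score(pixels):
    """Shift to zero mean and scale to unit standard deviation."""
    mean = sum(pixels) / len(pixels)
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))
    if std == 0:
        return [0.0 for _ in pixels]
    return [(p - mean) / std for p in pixels]


pixels = [0, 64, 128, 255]
scaled = linear_scale(pixels)
print(scaled)
print(clip(pixels, 50, 200))
```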

[0086] Accurate camera poses and camera parameters are important for both tomosynthesis image reconstruction and augmented fluoroscopy overlay. The accuracy of marker tracking can affect the pose estimation accuracy or performance. The present disclosure provides an improved method for tracking markers in a sequence of video frames. The method may allow for tomosynthesis reconstruction with improved success rate, allow for larger sweeping angles for tomosynthesis imaging, remove ghosting (due to wrong pose estimation from frame marker mistracking) in the 3D reconstructed tomosynthesis image, improve reconstruction quality by using all images and using more uniform angle sampling, and speed up the tomosynthesis reconstruction process.

[0087] The present disclosure may provide an improved and robust marker tracking method with improved success rate and higher speed. As shown in the two processes 100 and 200, the same marker detection at 115 and 215, respectively, may be shared in both processes. As will be discussed in further detail with respect to FIGs. 6A-6C, which depict one example of a tomosynthesis board, X-ray projections of markers on a tomosynthesis board may appear as markers in the X-ray image (obtained via the C-arm, for example). The markers may be detected at 115 and 215 using any suitable image processing techniques. For example, OpenCV’s blob detection algorithm may be used to detect markers that are blob-shaped. In some cases, the detected markers (e.g., blobs) may be detected to have certain properties, such as position, shape, size, color, darkness / lightness, opacity, or other suitable properties of markers.

[0088] As illustrated, the two processes 100 and 200, may match markers to a board pattern at 120 and 220, respectively. The markers detected at operations 115 and 215 may be matched to the tomosynthesis board (e.g., the tomosynthesis board described with respect to FIG. 6). As described above, the markers may exhibit any number of various physical properties (e.g., position, shape, size, color, darkness / lightness, opacity, etc.) that may be detected at 115 and 215 and may be used for matching the markers to the board pattern at 120 and 220. For example, the tomosynthesis board may have different types of markers such as large blobs and small blobs. In some cases, the large blobs and small blobs may create a pattern which may be used to match the marker pattern in the video or image frames to the pattern on the tomosynthesis board. In some cases, after operations 120 and 220, the processes 100 and 200 may diverge.

[0089] As illustrated, after the operation of matching markers to the board pattern 120, the process 100 may find the best marker matching across all video or image frames at 125. The initial marker matching may be the match between markers in the frames and the tomosynthesis board. In some cases, the pattern of the matched markers may be compared over the tomosynthesis board to find the best matching using the Hamming distance. For each frame, the matching with a pattern matching score (e.g., number of matched markers divided by total number of detected markers) may be obtained. The best match may be determined as the match with the highest pattern matching score among all the frames at 125. In some cases, one or more image frames with top pattern matching scores may be identified.
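The scoring described above can be sketched as follows, assuming each frame yields a small grid of size-coded markers (1 for a large blob, 0 for a small blob). The board layout, patch contents, and function names are hypothetical; the sketch slides each detected patch over the board, keeps the offset with the lowest Hamming distance, and scores the frame as matched markers divided by detected markers.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def best_board_offset(board, patch):
    """Slide the patch over the board; return (distance, row, col) of the best fit."""
    ph, pw = len(patch), len(patch[0])
    flat_patch = [bit for row in patch for bit in row]
    best = None
    for r in range(len(board) - ph + 1):
        for c in range(len(board[0]) - pw + 1):
            window = [board[r + i][c + j] for i in range(ph) for j in range(pw)]
            d = hamming(window, flat_patch)
            if best is None or d < best[0]:
                best = (d, r, c)
    return best


def pattern_matching_score(board, patch):
    """Return (matched markers / detected markers, best board offset)."""
    d, r, c = best_board_offset(board, patch)
    total = len(patch) * len(patch[0])
    return (total - d) / total, (r, c)


# Hypothetical 4x4 board code and two frames' detected 2x2 patches.
board = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
frames = {
    0: [[1, 1], [1, 1]],  # one marker size misclassified
    1: [[1, 1], [1, 0]],  # clean detection
}
scores = {idx: pattern_matching_score(board, p) for idx, p in frames.items()}
best_frame = max(scores, key=lambda idx: scores[idx][0])
print(best_frame, scores[best_frame])
```

The frame with the highest score would then seed the frame-to-frame propagation.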

[0090] The process 100 may perform frame-to-frame tracking 130. At a high level, the frame-to-frame tracking 130 may comprise propagating the marker matching from the best match determined at 125 to the rest of the image frames by a robust tomosynthesis marker tracking. In some cases, (i) the markers in a pair of consecutive frames may be initially matched; (ii) each marker in the first frame may then be matched to the k-nearest markers in a second frame; (iii) for each matched pair of markers, a motion displacement between two frames may be computed; (iv) all the markers in the first frame may be transferred to the second frame with the motion displacement; (v) if the motion displacement between a given transferred point from the first frame and a given point location in the second frame is smaller than a threshold, and the two given marker types are the same, then this match may be an inlier; (vi) the best matching may be the motion with the most inliers. From the computed tomosynthesis marker tracking 130, the existing marker matches in the current frame are transferred to the marker matches in the next frame. This process may be repeated for all frames at 135, finding the marker matches for all frames, where the markers in all frames are matched to the tomosynthesis board.
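Steps (ii) through (vi) can be sketched as a simple hypothesis-and-vote loop. The marker representation `(x, y, kind)`, the function names, and the values of `k` and `tol` are assumptions for illustration; the motion is modeled as a pure 2D translation between consecutive frames.

```python
import math


def k_nearest(point, candidates, k):
    """Return the k candidates closest to point in image coordinates."""
    return sorted(candidates, key=lambda c: math.dist(point[:2], c[:2]))[:k]


def track_markers(prev, curr, k=3, tol=2.0):
    """Markers are (x, y, kind). Returns (displacement, {prev_index: curr_index})."""
    best_shift, best_matches = None, {}
    for p in prev:
        for q in k_nearest(p, curr, k):
            dx, dy = q[0] - p[0], q[1] - p[1]  # candidate motion displacement
            matches = {}
            for i, m in enumerate(prev):
                moved = (m[0] + dx, m[1] + dy)  # transfer marker to next frame
                for j, n in enumerate(curr):
                    # Inlier: same marker type and close to the transferred point.
                    if n[2] == m[2] and math.dist(moved, n[:2]) < tol:
                        matches[i] = j
                        break
            if len(matches) > len(best_matches):
                best_shift, best_matches = (dx, dy), matches
    return best_shift, best_matches


prev = [(0, 0, "L"), (10, 0, "S"), (0, 10, "S"), (10, 10, "L")]
curr = [(3, 1, "L"), (13, 1, "S"), (3, 11, "S"), (13, 11, "L"), (50, 50, "S")]
shift, matches = track_markers(prev, curr)
print(shift, matches)
```

Because board labels are already known for the markers in `prev`, the returned index map carries those labels into the next frame.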

[0091] In the augmented fluoroscopy process 200, after matching markers in the video or image frames to the tomosynthesis board at 220, the process may determine whether the pattern matching is unique at 225. The camera pose estimation using markers for augmented fluoroscopy may be more challenging than that for tomosynthesis reconstruction, because (i) only a single video or image frame may be available for augmented fluoroscopy and (ii) the motion information may not be available for removing the ambiguity of the pose estimation. The augmented fluoroscopy algorithm may provide criteria to measure the uniqueness of the matching to the entire tomosynthesis board. In some cases, the marker pattern on the tomosynthesis board may be configured to ensure that the pattern in each sub-area is unique. In some cases, the pattern of the tomosynthesis board may be improved to maximize the Hamming distances between patches (e.g., any 5x5 patches). In some cases, an in-plane 180-degree rotation may be considered when optimizing the best pattern so that the coincidental alignment is minimized if the board is rotated by 180 degrees either physically or by C-arm setting. Details about the patch / marker matching algorithm and the unique marker design are described later herein.

[0092] If the matching is unique, according to the criteria for measuring uniqueness 225, the camera pose may be correctly estimated and the process 200 may advance to pose estimation operation 230. Otherwise, at 225, the augmented fluoroscopy overlay is not displayed and the process 200 advances to operation 250, which may indicate that the augmented fluoroscopy overlay is unavailable.

[0093] Turning to imaging device pose estimation, the processes 100, 200, respectively, may recover rotation and translation by minimizing the reprojection error from 3D-2D point correspondences to perform the pose estimation 140, 230. In some cases, Perspective-n-Point (PnP) pose computation may be used to recover the camera poses from n pairs of point correspondences. The minimal form of the PnP problem may be P3P and may be solved with three point correspondences. For each tomosynthesis frame, there may be multiple marker matches, and an estimation method such as a RANSAC (Random Sample Consensus) variant of a PnP solver may be used for pose estimation. In some cases, the pose estimation 140, 230 may be further refined by minimizing the reprojection error using a non-linear minimization method, starting from the initial pose estimate from the PnP solver.
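The quantity minimized during pose estimation can be made concrete with a short sketch of the reprojection error under a pinhole model. This is not a PnP solver; it only evaluates the cost for candidate poses. The intrinsics, board coordinates, and poses below are hypothetical values chosen for illustration.

```python
import math


def project(point, rotation, translation, fx, fy, cx, cy):
    """Pinhole projection of a 3D board point into pixel coordinates."""
    xc, yc, zc = (
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    return (fx * xc / zc + cx, fy * yc / zc + cy)


def rms_reprojection_error(points_3d, points_2d, rotation, translation, intrinsics):
    """Root-mean-square pixel distance between observed and projected markers."""
    sq = 0.0
    for p3, p2 in zip(points_3d, points_2d):
        u, v = project(p3, rotation, translation, *intrinsics)
        sq += (u - p2[0]) ** 2 + (v - p2[1]) ** 2
    return math.sqrt(sq / len(points_3d))


def rot_z(deg):
    """Rotation matrix for an in-plane rotation about the z-axis."""
    a = math.radians(deg)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


# Hypothetical marker positions (mm) on two layers and hypothetical intrinsics.
board_points = [(0, 0, 0), (50, 0, 0), (0, 50, 0), (50, 50, 30)]
intrinsics = (1000.0, 1000.0, 512.0, 512.0)  # fx, fy, cx, cy
true_r, true_t = rot_z(10), (5.0, -3.0, 800.0)
observed = [project(p, true_r, true_t, *intrinsics) for p in board_points]

err_true = rms_reprojection_error(board_points, observed, true_r, true_t, intrinsics)
err_off = rms_reprojection_error(board_points, observed, rot_z(12), true_t, intrinsics)
print(err_true, err_off)
```

In practice a library routine such as OpenCV's `solvePnPRansac` could supply the initial pose, with a non-linear refinement step driving this error down further.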

[0094] At the tomosynthesis reconstruction 145, the process 100 may perform the tomosynthesis reconstruction based on the pose estimation 140. In some cases, the tomosynthesis reconstruction operation 145 may be implemented as a model in Python (or other suitable programming languages) using the open-source ASTRA (a MATLAB and Python toolbox of high-performance GPU primitives for 2D and 3D tomography) toolbox (or other suitable toolboxes or packages). In the tomosynthesis reconstruction, input to the model may be as follows: (i) undistorted and inpainted (inpainting: a process to restore a damaged image) projection images; (ii) estimated projection matrices, such as poses of each projection; and (iii) size, resolution, and estimated position of the targeted tomosynthesis reconstruction volume. The output of the model is the tomosynthesis reconstruction (e.g., volume in NIfTI format) 145. As such, at operation 150, the process 100 may, in some cases, finish with outputting the tomosynthesis reconstruction for the C-arm systems, where the tomosynthesis reconstruction may comprise one or more of a 3D volume with a combination of X-ray projection images acquired by the C-arm at various angles.

[0095] The operation 235 may comprise using the estimated pose from operation 230 and precalibrated camera parameters from operation 245 to project the lesion onto the video frame. As an example, the lesions may be modeled as ellipsoids that are projected on the 2D fluoroscopic image from the video or image frames as ellipses. It should be noted that the lesion may be modeled using a graphical indicator of any suitable shape, color, transparency, or the like. The augmented fluoroscopy overlay may be displayed on top of the live or near real-time fluoroscopic view corresponding to the lesion projected onto the X-ray image 240. The lesion may be a 3D lesion, and the 3D lesion may be projected to the 2D fluoroscopic image based at least in part on the camera matrix or the pose estimation associated with each 2D fluoroscopic image. Information about the lesion may comprise one or more of 3D location information obtained from the tomosynthesis process. In some cases, shape and size of the lesion may be based on a 3D model of the lesion (created from pre-operation CT or any predetermined parameters). Details about obtaining the lesion information are described elsewhere herein.

State Machines

[0096] The abovementioned tomosynthesis and augmented fluoroscopy overlay methods may be utilized by a tracking system providing a user with real-time location of the lesions, as well as the relative position of the scope or needle and the lesion to correct navigation. FIG. 3 shows an example system 300 of various state machines for implementing a tracking system based at least in part on tomosynthesis and live or near real-time fluoroscopy with real-time location of the lesions. At a high level, the state machines included in the system 300 may read a set of inputs and change to a different state based on those inputs. The system 300 may comprise one or more of a TrackingSubsystem 310, a Vision subsystem 320, a LocalizationSubsystem 330, a SystemControlSubsystem 340, a MediaControlSubsystem 350, and a UserInputSubsystem 360.

[0097] In some cases, information for each state machine may comprise functional description of key functionality, system configuration parameters that are owned by the state machine, a state transition diagram, a table that contains details of state transitions, or a table that presents all input and output data of the state machine.

[0098] The TrackingSubsystem 310 may comprise two state machines, a smTomoConfigManager 312 and a smTomo 314, as well as helper classes that support the interface between the TrackingSubsystem 310 and other subsystems, software, and hardware components. The TrackingSubsystem 310 may employ RTI data contracts and implement the functionality described with respect to the smTomoConfigManager 312 and the smTomo 314. The smTomoConfigManager 312 may be responsible for loading tomosynthesis related configuration parameters from configuration files and sending parameters to other state machines through data contracts. In some cases, the configuration parameters have default values (e.g., previous values, recommended values, improved values, etc.) which can be overwritten by values specified in the configuration files. The smTomo 314 may receive configuration parameters from the smTomoConfigManager 312. The smTomo 314 may retrieve and process fluoroscopy images from smFluoroFrameGrabber 322 of the Vision subsystem 320. The smTomo 314 may receive user commands and may call tomosynthesis dynamic link library (DLL) modules to process and generate intermediate files before tomosynthesis reconstruction. The smTomo 314 may also provide captured unique fluoroscopy images to a treatment interface UI (e.g., as described with respect to FIGs. 19-24) for tip location selection for triangulation calculation to obtain 3D coordinates of a tip. Upon finishing reconstruction, the reconstruction volume may be provided to the treatment interface UI for display so that a user can identify and select lesion location coordinates. Tip-to-lesion offset can be obtained and broadcast to a navigation unit for target driving updates.
The smTomo 314 may be responsible for receiving normalized fluoroscopy images, passing to an algorithm, estimating the pose for fluoroscopy images, generating intermediate files, calling a reconstruction module (e.g., a toolbox of 2D and 3D tomography with high-performance GPU speedup) to generate the reconstruction result. The smTomo 314 may perform triangulation calculations to obtain tip coordinates and tip-to-lesion vector calculations based on EM sensor positions and lesion locations. Resulting reconstructions may be displayed in a Treatment UI for the user lesion selection, and lesion information may be broadcasted for augmented fluoroscopy overlay through data contracts.

[0099] FIG. 4 shows an example of a configuration state machine, a smTomoConfigManager 400 that may be a more detailed view of the smTomoConfigManager 312 of FIG. 3. In some cases, the smTomoConfigManager 400 may read tomosynthesis related configuration parameters. If no entry is found in configuration file for the tomosynthesis related configuration parameters, the smTomoConfigManager 400 may obtain default values (e.g., previous values, recommended values, improved values, etc.) instead. In some cases, the smTomoConfigManager 400 may broadcast tomosynthesis related configuration parameters through RTI data contracts.
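The default-and-override behavior described for the configuration manager can be sketched as a dictionary merge. The parameter names, default values, and the use of JSON as the configuration file format are all assumptions for illustration; the disclosure does not specify them.

```python
import json

# Hypothetical tomosynthesis parameters and defaults.
DEFAULTS = {
    "sweep_angle_deg": 30.0,
    "duplicate_threshold": 5.0,
    "volume_size_mm": 100.0,
}


def load_tomo_config(config_text, defaults=DEFAULTS):
    """Start from defaults and overwrite only the entries found in the file."""
    overrides = json.loads(config_text) if config_text.strip() else {}
    config = dict(defaults)
    config.update({k: v for k, v in overrides.items() if k in defaults})
    return config


config = load_tomo_config('{"sweep_angle_deg": 45.0}')
print(config)
```

Entries missing from the file keep their default values, and unknown keys are ignored rather than broadcast.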

[0100] FIG. 5 shows an example of a state machine, a smTomo 500 that may be a more detailed view of the smTomo 314 of FIG. 3. The smTomo 500 may receive configuration parameters from the smTomoConfigManager (e.g., the smTomoConfigManager 312 or the smTomoConfigManager 400) at, for example, UpdateConfig module 510. In some cases, the smTomo 500 may receive normalized fluoroscopy image frames from smFluoroFrameGrabber (e.g., the smFluoroFrameGrabber 322). In some cases, the smTomo 500 may generate intermediate files for reconstruction (e.g., tomosynthesis reconstruction) via algorithm modules at, for example, GenerateReconstruction module 525. In some cases, the smTomo 500 may calculate tip coordinates (e.g., via CalculatingTipLesionOffset module 545). In some cases, the smTomo 500 may receive EM sensor data (e.g., from smRegistration 322). Using the EM sensor data, the smTomo 500 may calculate average EM coordinates and obtain a maximum deviation from the average EM coordinates. In some cases, the smTomo 500 may be responsible for pose estimation and generating intermediate images for tomosynthesis reconstruction. If no configuration parameters are found in the configuration file, default values (e.g., general, average, typical, etc.) may be used.

Marker Board (Tomosynthesis Board)

[0101] In some embodiments, the systems herein may provide a marker board (tomosynthesis board) with unique marker design to assist pose estimation with improved efficiency and accuracy. The unique marker design may beneficially allow for a large sweeping angle. Large sweeping angle can beneficially improve reconstruction quality (e.g., improved axial view). FIGs. 6A-6C show an example of a tomosynthesis board 600A with a marker design layout 600B and layering shown in layout 600C. The marker boards described with respect to FIGs. 6A-6C may be applied to one or more of the tomosynthesis or the augmented fluoroscopy techniques also described herein.

[0102] The tomosynthesis board 600A may comprise a physical pattern that is unique to transformation or rotation. The physical pattern may be formed of markers in various sizes in a predetermined code pattern. For example, as illustrated, the tomosynthesis board 600A may comprise dots in different sizes forming a code pattern. In some embodiments, the code pattern may be 3D. In some cases, the dots may be large and small blobs (e.g., beads) that are placed on two layers (with offset in the z-direction of the board as shown in the layout 600C) in a grid pattern according to the marker design layout 600B. In some cases, the offset of the two planes may be sufficient (e.g., offset is at least 20mm, 30mm, 40mm, 50mm, etc.) such that the 3D pattern of the markers may allow for calibration of the imaging device or pose estimation utilizing a single 2D image of the markers. In some cases, the 3D pattern of the markers may allow for calibration or pose estimation with improved accuracy by utilizing a plurality of 2D images from different projections. In such cases, the offset of the two planes may be small (e.g., no greater than 10mm, 20mm, 30mm, etc.). In some embodiments, the marker board may have a 2D pattern. For example, dots of various sizes may be placed on the same plane.

[0103] The blobs may be made of a material visible on an X-ray image, such as metal. The two-layer marker design shown in the side view in the layout 600C of the marker design layout 600B may improve accuracy of pose estimation using the tomosynthesis board 600A.

[0104] The marker design layout may have a predetermined size code pattern. In some cases, the marker design layout 600B may be a size-coded pattern such that the pattern in each sub-area 610 is unique (“1” represents a large bead, “0” represents a small bead). A sub-area 610 may be in any shape or size and the pattern within a sub-area is unique. The marker design layout 600B may be improved to maximize edit distance (e.g., a metric for determining dissimilarity between patterns, strings, etc.) between patches of the tomosynthesis board 600A. In some cases, the edit distance may be measured using the Hamming distances between patches. The patches may be square or rectangular, or some other shape. The patches may be small (e.g., 3x2 patches, 4x6 patches, 5x5 patches, etc.). The patches may be large (e.g., 5x7 patches, 2x9 patches, 9x9 patches, etc.). In some cases, the unique pattern in each sub-area may be configured such that the distance between patches with particular size(s) (e.g., 3x2 patches, 4x6 patches, 5x5 patches, etc.) may be maximized. Details about the marker matching algorithms are described later herein.
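The patch-distance criterion of paragraph [0104] can be sketched as follows, assuming the layout is stored as a binary grid (1 for a large bead, 0 for a small bead). The function names and the random-search strategy are illustrative assumptions; the disclosure does not state how the layout is actually optimized.

```python
from itertools import combinations

import numpy as np


def extract_patches(layout, ph, pw):
    """Return {(row, col): flattened patch} for every ph x pw window."""
    rows, cols = layout.shape
    return {
        (r, c): layout[r:r + ph, c:c + pw].ravel()
        for r in range(rows - ph + 1)
        for c in range(cols - pw + 1)
    }


def min_pairwise_hamming(layout, ph, pw):
    """Smallest Hamming distance between any two distinct patch positions."""
    patches = list(extract_patches(layout, ph, pw).values())
    return min(int(np.count_nonzero(a != b)) for a, b in combinations(patches, 2))


def best_random_layout(shape, ph, pw, trials=200, seed=0):
    """Hypothetical random search: keep the candidate whose closest pair of
    patches is farthest apart (largest minimum Hamming distance)."""
    rng = np.random.default_rng(seed)
    best_layout, best_score = None, -1
    for _ in range(trials):
        candidate = rng.integers(0, 2, size=shape)
        score = min_pairwise_hamming(candidate, ph, pw)
        if score > best_score:
            best_layout, best_score = candidate, score
    return best_layout, best_score
```

A minimum distance of zero means two sub-areas carry the same code and cannot be told apart. The same check could be extended to rotated or flipped copies of the layout (for example with `np.rot90`) to cover the in-plane rotations the design may need to tolerate.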

[0105] An in-plane rotation (e.g., a 90-degree, 180-degree, or 270-degree rotation, etc.) may be considered when designing the marker design layout 600B so that coincidental alignment is minimized if the tomosynthesis board 600A is rotated, either physically or by C-arm setting. In some cases, a vertical or a horizontal flip may be considered in the marker design layout 600B. A plurality of rows of marker blobs as shown in the side view of the layout 600C may be interlaced in layers (e.g., two layers, three layers, five layers, ten layers, etc.) on the tomosynthesis board 600A.

Pattern Matching Techniques

[0106] FIGs. 7A-7C illustrate example images used in pattern matching for blob detection. The images and techniques described with respect to FIGs. 7A-7C, as well as FIGs. 8 and 9, may be applied to one or more of the tomosynthesis or the augmented fluoroscopy techniques also described herein.

[0107] FIG. 7A shows an example of blob detection of markers on an image 700A of a tomosynthesis board. The image 700A includes X-ray projections of blobs (e.g., as discussed with respect to FIGs. 6A-6C) on a tomosynthesis board. The blobs are illustrated as markers in the image 700A. The blobs may be detected using any number of image processing techniques, machine learning (e.g., computer vision) techniques, masking techniques, or statistical techniques. For example, the blobs may be detected using any suitable blob detection algorithm. Each detected blob may be marked with various properties, such as center location and radius, as shown in the image 700A. The blobs may be classified into large markers or small markers according to their sizes (e.g., threshold by the median size of all markers). The large and small markers may create a pattern which is used to match the blob pattern on the tomosynthesis board. While the image 700A illustrates the markers as blobs, many different patterns, shapes, nonpatterns, shading, coloring, etc. may be used (e.g., arrays of various polygons, lines, grid pattern, writings, symbols, etc.). In general, in some cases, the markers may be implemented in various different manners, provided the markers may be useful in matching tomosynthesis images to a spatial position (e.g., with relation to machinery, with relation to the patient, etc.).
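The size classification mentioned in paragraph [0107] can be sketched as a median threshold on the detected blob radii. The function name is hypothetical, and the sketch assumes the image contains roughly as many large markers as small ones, so that the median falls between the two size groups.

```python
import numpy as np


def classify_blobs(radii):
    """Label each detected blob 1 (large) or 0 (small) by the median radius."""
    radii = np.asarray(radii, dtype=float)
    threshold = float(np.median(radii))
    labels = (radii > threshold).astype(int)
    return labels, threshold
```

The resulting 1/0 labels use the same convention as the size-coded board layout, so they can be compared directly against it during pattern matching.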

[0108] FIG. 7B shows an example of candidate points on an image 700B of a tomosynthesis board. The candidate points on the tomosynthesis board grid may be chosen for the initial marker on grid matching. In some cases, a homography model may be used to remove outliers in the initial marker on grid matching. The homography may be computed based on the candidate points between points in an X-ray image (e.g., the images 700A or 700B, etc.) and the tomosynthesis board. For example, estimation techniques, such as RANSAC, may compute the homography based on the candidate points between points in the X-ray image and the tomosynthesis board. The estimation techniques, such as RANSAC, PROSAC (Progressive Sample Consensus), NAPSAC (N Adjacent Points Sample Consensus), etc., may estimate parameters of a mathematical model from a set of observations polluted by outliers. The estimation techniques may repeatedly sample the observations and may reject the outlier samples that do not fit the model and keep the inlier samples that fit the model.
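The sample, fit, and reject loop of paragraph [0108] can be sketched with a plain NumPy RANSAC around a direct-linear-transform homography fit. This is an illustrative sketch, not the disclosed implementation: the function names, the iteration count, and the 2-pixel tolerance are assumptions, and the final refit on the inliers is a simple stand-in for a more careful refinement step.

```python
import numpy as np


def fit_homography(src, dst):
    """Direct linear transform: H such that dst ~ H @ src in homogeneous form."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)  # null-space vector = smallest singular value
    return h / h[2, 2]


def apply_homography(h, pts):
    """Map Nx2 points through H and de-homogenize."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ h.T
    return mapped[:, :2] / mapped[:, 2:3]


def ransac_homography(src, dst, n_iters=500, tol=2.0, seed=0):
    """Return (H, inlier_mask) for board points src and image points dst."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    with np.errstate(all="ignore"):  # degenerate samples may divide by zero
        for _ in range(n_iters):
            sample = rng.choice(len(src), size=4, replace=False)
            h = fit_homography(src[sample], dst[sample])
            err = np.linalg.norm(apply_homography(h, src) - dst, axis=1)
            inliers = err < tol  # NaN errors compare False and are rejected
            if inliers.sum() > best_inliers.sum():
                best_inliers = inliers
    if best_inliers.sum() < 4:
        raise ValueError("not enough inliers to fit a homography")
    # Refit on all inliers of the best hypothesis.
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

Candidate matches flagged False in the returned mask would be discarded before the remaining markers of that layer are extracted.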

[0109] The estimation techniques may implement a model that may be refined with the obtained inlier data via various optimization methods. In some cases, once the homography of one layer of the tomosynthesis board is computed, the rest of the markers on that layer may be extracted provided projections of the blobs are close enough to the markers. The markers left on the image may be fit to the other layer (e.g., the second layer) of the tomosynthesis board.

[0110] In some cases, initial marker matching is the match between markers in the image 700B and the tomosynthesis board grid. The initial marker matching may be computed over one or more frames of the image 700B. In some cases, the initial match may be a best matched frame (e.g., the frame with the highest matching score among all the frames tested, which may be, in some cases, all the frames in the image 700B). The initial match, with the best matched frame, may serve as a starting point to propagate the marker matching to the rest of the frames of the image 700B. As such, once the initial marker matching is established, in some cases, the pattern of the matched markers may “slide” over the image 700B of the tomosynthesis board to find the remaining best matches (e.g., using the Hamming distance). For each frame, a pattern matching score (e.g., number of matched markers divided by total number of detected markers) may be obtained; for example, FIG. 7C shows an example of marker extraction on an image 700C depicting pattern matching and computing the pattern matching score. The best matching (e.g., the highest matching score among all the frames) may be chosen as the starting point of pattern matching for all frames of the tomosynthesis sweep.
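The sliding comparison and the pattern matching score of paragraph [0110] can be sketched as follows, assuming the detected markers of one frame have already been arranged into a small binary grid (1 for large, 0 for small). The function name and the grid representation are illustrative assumptions.

```python
import numpy as np


def slide_pattern(board, observed):
    """Slide the observed size pattern over the board layout.

    Returns (best_offset, score), where score is the number of matched
    markers divided by the number of detected markers, i.e. one minus the
    normalized Hamming distance at the best offset.
    """
    bh, bw = board.shape
    oh, ow = observed.shape
    best_offset, best_score = None, -1.0
    for r in range(bh - oh + 1):
        for c in range(bw - ow + 1):
            window = board[r:r + oh, c:c + ow]
            matched = int(np.count_nonzero(window == observed))
            score = matched / observed.size
            if score > best_score:
                best_offset, best_score = (r, c), score
    return best_offset, best_score
```

Running this for every frame and keeping the frame with the highest score gives one way to select the best matched frame that seeds the propagation to the other frames.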

[0111] As illustrated in FIG. 8, which illustrates a process 800 for robust tomosynthesis marker matching, once the best matched frame is obtained (e.g., via techniques described with respect to FIGs. 7A-7C), the matching of the best frame may be propagated to all the other frames in tomosynthesis images. In some cases, the process 800 may begin with obtaining a pair of consecutive frames, at 805 and 815, with first markers and second markers, respectively. The process 800 may further include detecting (e.g., via computer vision techniques) at 810 and 820, markers included in the pair of consecutive frames, at 805 and 815, respectively. The process 800 may further include matching (e.g., via k-nearest neighbors), the markers included in the pair of consecutive frames obtained at 805 and 815. The process 800 may further include, for each pair of the matched markers, computing motion displacement between the pair of consecutive frames obtained at 805 and 815. The process 800 may further include, for each of the first markers in the first frame obtained at 805, transferring (e.g., mapping) the first markers to the second frame obtained at 815. The transferring of the first markers to the second markers of the second frame is illustrated in FIG. 9, which depicts an example result for marker tracking across a tomosynthesis frame sequence (of two consecutive frames) on an image of a tomosynthesis board.
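One frame-to-frame step of process 800 can be sketched as follows: nearest-neighbor matching, one motion hypothesis per matched pair, transfer of the first markers into the second frame, and an inlier count against a distance threshold. The function name, the use of a single nearest neighbor, the pure-translation motion model, and the 3-pixel threshold are all assumptions made for illustration.

```python
import numpy as np


def propagate_matches(prev_pts, prev_ids, next_pts, tol=3.0):
    """Carry board IDs from one frame's markers to the next frame's markers.

    prev_pts: Nx2 marker centers already matched to the board.
    prev_ids: N board identifiers, one per prev_pts row.
    next_pts: Mx2 marker centers detected in the next frame.
    Returns ({next_index: board_id}, inlier_count).
    """
    prev_pts = np.asarray(prev_pts, dtype=float)
    next_pts = np.asarray(next_pts, dtype=float)
    # Initial matches: nearest neighbor of each previous marker.
    dist = np.linalg.norm(prev_pts[:, None, :] - next_pts[None, :, :], axis=2)
    nearest = dist.argmin(axis=1)
    # One motion displacement hypothesis per matched pair.
    displacements = next_pts[nearest] - prev_pts
    best_ids, best_count = {}, -1
    rows = np.arange(len(prev_pts))
    for motion in displacements:
        moved = prev_pts + motion  # transfer first markers to the second frame
        d = np.linalg.norm(moved[:, None, :] - next_pts[None, :, :], axis=2)
        j = d.argmin(axis=1)
        inlier = d[rows, j] <= tol
        if inlier.sum() > best_count:  # keep the motion with the most inliers
            best_count = int(inlier.sum())
            best_ids = {int(j[i]): prev_ids[i] for i in np.flatnonzero(inlier)}
    return best_ids, best_count
```

Calling this repeatedly, with the output of one frame as the input of the next, would propagate the matching of the best frame through the rest of the sweep.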

[0112] Referring again to FIG. 8, for each of the first markers transferred to the second frame, if the distance between the transferred first marker and the corresponding second marker satisfies a threshold (e.g., is less than or equal to a certain distance), then the match between the first marker and the second marker is an inlier at 825. The initial matches may be generated based on distance (e.g., all points within a distance are a match). In some cases, the best matching is the matching with the most inliers. The process 800 may be iterative or repetitive, transferring existing marker matches in a current frame to a next (e.g., consecutive) frame, repeating for all frames at 830, until markers in all frames are matched to the blobs (e.g., beads) on the tomosynthesis board at 835. In some cases, the operation 830 may comprise taking all the above matches, finding the motion that contains the greatest number of matched marker points, and calling these matched point pairs inliers.

Pose estimation techniques using markers in images

[0113] FIG. 10 shows an example diagram 1000 of a camera pose estimation. Reconstructing accurate camera pose and camera parameters may be a key aspect of both tomosynthesis image reconstruction and augmented fluoroscopy overlay. As previously discussed with respect to FIG. 3 and FIG. 5, the smTomo 314 and 500, respectively, may be responsible for estimating poses (e.g., via triangulation) for fluoroscopy images. The pose estimation systems, methods, and techniques described may be applied to one or more of the tomosynthesis or augmented fluoroscopy techniques also described herein.

[0114] In the diagram 1000, a pinhole camera model is illustrated. The pinhole camera model of the diagram 1000 may be used to describe the geometry of an X-ray projection. As illustrated in the diagram 1000, pose estimation may comprise one or more of recovering rotation and translation of the camera (camera poses) by minimizing reprojection error from 3D-2D point correspondences. In some cases, an optimization algorithm may be used to refine camera calibration parameters by minimizing the reprojection error. The optimization algorithm may be a least squares algorithm, such as the global Levenberg-Marquardt optimization.
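The reprojection error minimized in paragraph [0114] can be written out explicitly. The notation below is an assumption introduced for illustration: X_i are the known 3D marker positions, x_i their detected 2D projections, K the intrinsic matrix of the pinhole model, and (R, t) the rotation and translation of the camera.

```latex
\[
(\hat{R}, \hat{t}) = \arg\min_{R,\,t} \sum_{i=1}^{n}
\left\| \mathbf{x}_i - \pi\!\left( K \left( R\,\mathbf{X}_i + t \right) \right) \right\|^2,
\qquad
\pi\!\left( (a, b, c)^{\top} \right) = \left( a/c,\; b/c \right)^{\top}
\]
```

When the calibration parameters are refined as well, the entries of K would be added to the set of variables over which the same sum is minimized.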

[0115] Recovering the camera pose may further include estimating the pose of a calibrated camera given a set of n 3D points in the world and their corresponding 2D projections in the images. The camera pose may comprise one or more of 6 degrees-of-freedom with rotation (e.g., roll, pitch, yaw) and 3D translation of the camera with respect to the world. Perspective-n-Point (PnP) pose computation may be used to recover the camera poses from n pairs of point correspondences. In some cases, n=3; the minimal form of the PnP problem is therefore P3P, which may be solved with three point correspondences. For each tomosynthesis frame, there may be a plurality of marker matches and a RANSAC or other variant of the PnP solver may be used for the camera pose estimation. Once estimated, the pose may be further refined by minimizing the reprojection error using a non-linear minimization method and starting from the initial pose estimate with the PnP solver.
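The refinement stage of paragraph [0115] can be sketched as a small non-linear least-squares loop over the six pose parameters (roll, pitch, yaw, and a 3D translation). This is a sketch under stated assumptions, not the disclosed solver: it uses plain Gauss-Newton with a finite-difference Jacobian rather than Levenberg-Marquardt, it takes the initial pose as given instead of running a PnP solver, and all function names and the rotation convention (Rz Ry Rx) are chosen for illustration.

```python
import numpy as np


def rotation_from_rpy(roll, pitch, yaw):
    """Rotation matrix Rz(yaw) @ Ry(pitch) @ Rx(roll)."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return rz @ ry @ rx


def project(pose, k, pts3d):
    """Pinhole projection of Nx3 world points for pose = (rpy, translation)."""
    cam = pts3d @ rotation_from_rpy(*pose[:3]).T + pose[3:]
    uvw = cam @ k.T
    return uvw[:, :2] / uvw[:, 2:3]


def reprojection_rms(pose, k, pts3d, pts2d):
    """Root-mean-square distance between projected and observed points."""
    err = project(pose, k, pts3d) - pts2d
    return float(np.sqrt(np.mean(np.sum(err ** 2, axis=1))))


def refine_pose(pose0, k, pts3d, pts2d, n_iters=20, eps=1e-6):
    """Gauss-Newton refinement of the 6-DOF pose from an initial estimate."""
    pose = np.asarray(pose0, dtype=float).copy()
    for _ in range(n_iters):
        r0 = (project(pose, k, pts3d) - pts2d).ravel()
        jac = np.empty((r0.size, 6))
        for j in range(6):  # forward-difference Jacobian, one column per DOF
            step = np.zeros(6)
            step[j] = eps
            r1 = (project(pose + step, k, pts3d) - pts2d).ravel()
            jac[:, j] = (r1 - r0) / eps
        delta, *_ = np.linalg.lstsq(jac, -r0, rcond=None)
        pose += delta
    return pose
```

In practice the starting pose would come from a PnP solver run on the marker matches of each frame, and a damped method such as Levenberg-Marquardt would typically be preferred when the initial estimate is poor.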

[0116] Performing the camera pose estimation for tomosynthesis reconstruction may comprise one or more of obtaining undistorted images (e.g., from a robotic bronchoscopy system). The undistorted images may have some pre-processing done (e.g., image inpainting, etc.). The undistorted image may be normalized using a normalization algorithm. For example, the undistorted image may be normalized using a logarithmic normalization algorithm, such as Beer’s Law: $-\log(b + \delta)$, where $b$ is the input image and $\delta$ is an offset to avoid a zero logarithm.

Tomosynthesis reconstruction based on camera pose

[0117] The estimated camera poses or a directly measured camera pose may be utilized in reconstructing the 3D volume image (i.e., tomosynthesis reconstruction). In some cases, projection matrices (e.g., estimated camera pose matrices) may be obtained. Additionally, in some cases, physical parameters (e.g., size, resolution, position, volume, geometry, etc.) of the tomosynthesis reconstruction may be obtained. Inputs of one or more of the normalized images, the projection matrices, or the physical parameters, may allow generation of a reconstructed volume for the tomosynthesis reconstruction. To generate the reconstructed volume from the inputs, an algorithm (e.g., PM2vector algorithm) may convert the projection matrices in camera format to vector variables (e.g., in the ASTRA toolbox). Another algorithm may be the same as or similar to the ASTRA FDK Recon algorithm that may call the FDK (Feldkamp, Davis, and Kress) reconstruction module, where normalized projection images may be cosine weighted and ramp filtered, then back-projected to the volume according to the cone-beam geometry. Finally, in some cases, yet another algorithm may convert the reconstructed volume (as output from the ASTRA FDK Recon algorithm, for example) to an appropriate format. For example, a NIfTI processing algorithm may save the reconstructed volume as a NIfTI image with an affine matrix.

Augmented fluoroscopy with camera pose estimation

[0118] Performing the camera pose estimation for augmented fluoroscopy may allow for achieving a goal of projecting a lesion onto an X-ray image. The present disclosure provides methods for precisely projecting a 3D lesion onto the 2D fluoroscopic image with accurate camera pose and camera parameters. The camera calibration and pose estimation approaches for generating the augmented layer or overlay of the lesion(s) on the 2D image can be similar to those described for tomosynthesis reconstruction. However, the camera pose estimation for augmented fluoroscopy may be, in some cases, more difficult than the pose estimation for tomosynthesis reconstruction because only a single frame is available for the augmented fluoroscopy and motion information may not be available (e.g., for removing the ambiguity of the pose estimation). One or more criteria may be implemented to measure the uniqueness of the matching to the tomosynthesis board. If the matching satisfies the criteria, then the matching may be determined to be unique. Further, when the matching is unique, then the camera pose may be determined to be correctly estimated. The estimated camera poses and pre-calibrated camera parameters may be used to project the 3D lesion onto the fluoroscopic video frame (2D image). If the matching is not unique and the camera pose is not correctly estimated, the augmented fluoroscopy overlay may not be displayed.

[0119] In some embodiments, the augmentation layer or the overlay of the target / lesion is displayed over the live or near real-time fluoroscopic view or the 2D fluoroscopic images in the fluoroscopy mode. The overlay of the target / lesion (e.g., one or more lesions) may be modeled as 3D shapes (e.g., ellipsoids, prisms, spheres, etc.) whose projections on the fluoroscopy image are 2D shapes (e.g., ellipses, polygons, circles, etc.). In some cases, a shape, size, or appearance of an overlay of the one or more lesions may be based at least in part on a projection of a 3D lesion model (e.g., a 3D meshed model) onto the 2D fluoroscopic images.

[0120] FIG. 11 shows an example of augmented fluoroscopy projection 1100 with a 3D lesion model projected onto a 2D plane (e.g., an image plane), consistent with examples described herein. As shown in the example, the lesion may be modeled as a 3D mesh object with multiple corner points. The 3D mesh model may be generated from pre-operation CT or during planning. In the illustrated example, the corner points are projected to the 2D fluoroscopic image where the corner points form a projected polyline contour (from outermost points). Alternatively, the shape or appearance of the overlay for the lesion may be predetermined (e.g., circle, markers, etc.) and may not be based on a 3D meshed model from imaging.
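The projection of the mesh corner points and the contour formed from the outermost points can be sketched as follows. The sketch assumes a 3x4 projection matrix P built from the estimated pose and the pre-calibrated parameters, and it takes the convex hull of the projected points as the outline; using the convex hull for "outermost points" is an assumption, as are the function names.

```python
import numpy as np


def project_points(p, pts3d):
    """Project Nx3 points with a 3x4 projection matrix and de-homogenize."""
    pts_h = np.hstack([pts3d, np.ones((len(pts3d), 1))])
    uvw = pts_h @ p.T
    return uvw[:, :2] / uvw[:, 2:3]


def convex_hull(points):
    """Andrew's monotone chain. Returns hull vertices in counter-clockwise
    order with respect to the (x, y) axes as a list of (x, y) tuples."""
    pts = sorted(set(map(tuple, np.round(points, 6))))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for pt in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], pt) <= 0:
            lower.pop()
        lower.append(pt)
    for pt in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], pt) <= 0:
            upper.pop()
        upper.append(pt)
    return lower[:-1] + upper[:-1]


def lesion_contour(p, mesh_vertices):
    """Closed 2D outline of the lesion model as seen in the fluoroscopic image."""
    return convex_hull(project_points(p, np.asarray(mesh_vertices, dtype=float)))
```

The returned vertices can be drawn as a closed polyline on the augmented layer. A concave lesion outline would need a different contour method than the convex hull used here.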

[0121] In some cases, the location of the overlay may be determined based at least in part on the target / lesion location determined from the tomosynthesis or the reconstructed 3D tomosynthesis images and a pose estimation associated with the 2D fluoroscopic image.

Pose estimation techniques without using markers

[0122] Relative camera poses at which images are acquired are required inputs for tomosynthesis reconstruction of 3D volumes and also for augmented fluoroscopy. Methods and systems for accurately determining the relative camera poses at which images are acquired can be utilized to provide the pose inputs required for tomosynthesis and augmented fluoroscopy. In some embodiments, the camera pose may be obtained without markers. In some cases, methods herein may obtain camera poses without utilizing markers, which beneficially allows for higher quality images to be achieved as markers may partially obscure the images. For example, in tomosynthesis mode, a higher quality 3D reconstruction of the volume may be achieved without markers’ presence in the images, since prior to performing tomosynthesis the region of an image around each marker is typically excised from the image, reducing the overall amount of information available with which to generate the 3D reconstruction of the volume.

[0123] As illustrated in FIG. 13, the pose or motion of the fluoroscopy (tomosynthesis) imaging system may be measured directly using any suitable motion / location sensors 1310 disposed on the fluoroscopy (tomosynthesis) imaging system. The motion / location sensors may comprise one or more of, for example, inertial measurement units (IMUs), one or more gyroscopes, velocity sensors, accelerometers, magnetometers, location sensors (e.g., global positioning system (GPS) sensors), vision sensors (e.g., imaging devices capable of detecting visible, infrared, or ultraviolet light, such as cameras), proximity or range sensors (e.g., ultrasonic sensors, lidar, time-of-flight or depth cameras), altitude sensors, attitude sensors (e.g., compasses) or field sensors (e.g., magnetometers, electromagnetic sensors, radio sensors). In some cases, the fluoroscopy system may comprise rotary or linear encoders or similar means of measuring rotational motion of the arm with respect to the structure supporting and holding the arm in position. The encoders may also be used to provide pose of the imaging devices. In some cases, the one or more sensors for tracking the motion and location of the fluoroscopy (tomosynthesis) imaging station may be disposed on the imaging station or be located remotely from the imaging station, such as a wall-mounted camera 1320. The various poses may be captured by the one or more sensors as described above.

[0124] In some cases, when the source and detector relative poses are known from motion / location sensors, markers (e.g., a pattern of blobs or beads within a frame to estimate pose) may not be required to estimate the camera pose. In some cases, when pose information is available from multiple sources, such as both from a direct pose measurement (e.g., motion / location sensors) and pose estimation (e.g., image analysis of features within a frame), the pose information (e.g., direct measurement and estimated pose) from the multiple sources may be combined to provide a more accurate pose estimation. For example, the direct pose measurement and the estimated pose based on computer vision may be averaged (or weighted) to generate a final pose for the imaging system.
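The weighted combination of a directly measured pose and an image-based pose estimate can be sketched as follows. Representing each pose as a unit quaternion (w, x, y, z) plus a translation, and blending the rotations by normalized linear interpolation, are assumptions made for illustration; the disclosure states only that the two poses may be averaged or weighted. The weight would in practice reflect how much each source is trusted.

```python
import numpy as np


def fuse_poses(q_meas, t_meas, q_est, t_est, w_meas=0.5):
    """Blend a measured pose and an estimated pose.

    q_*: quaternions (w, x, y, z); t_*: 3D translations.
    w_meas: weight in [0, 1] given to the direct measurement.
    """
    q_meas = np.asarray(q_meas, dtype=float)
    q_est = np.asarray(q_est, dtype=float)
    q_meas = q_meas / np.linalg.norm(q_meas)
    q_est = q_est / np.linalg.norm(q_est)
    if np.dot(q_meas, q_est) < 0:  # q and -q encode the same rotation
        q_est = -q_est
    q = w_meas * q_meas + (1.0 - w_meas) * q_est
    q = q / np.linalg.norm(q)  # renormalize the blended quaternion
    t = w_meas * np.asarray(t_meas, dtype=float) \
        + (1.0 - w_meas) * np.asarray(t_est, dtype=float)
    return q, t
```

With equal weights the blended rotation lies exactly halfway between the two inputs. With unequal weights the normalized linear blend only approximates spherical interpolation, which is usually acceptable when the two poses are already close.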

[0125] In some cases, where the C-arm imaging system undergoes only rotations around an axis of rotation with no overall translations, the pose information required for each image for tomosynthesis reconstruction or augmented fluoroscopy may comprise one or more of only the relative angles between the images. The relative angles between images may be measured by many of the methods described above. For example, a 3D accelerometer may be mounted to the C-arm and the direction of the acceleration due to Earth’s gravity may be used to determine relative changes of the angle of the camera as the C-arm is rotated. For the case where the C-arm may be both rotating and translating, the complete 6 degrees of freedom (6DOF) of the camera may need to be known as inputs into tomography or augmented fluoroscopy. For this case, for example, a binocular optical “localizer” system 1320 along with localizer fiducial markers 1350 mounted to the C-arm 1340 may provide the complete 6DOF information for the (x, y, z) location and (Rx, Ry, Rz) orientation of the frame of the fiducial markers. A (one time) camera calibration process may be performed to know the translation and rotation transformations from the localizer fiducial marker frame to the camera frame. After calibration, the 6DOF pose of the camera may be known at the time each image is acquired based on captured data from the localizer.

Robotic Bronchoscopy System

[0126] FIG. 12 shows examples of robotic bronchoscopy systems 1200, 1230, in accordance with some examples. The robotic bronchoscopy system may implement the methods, subsystems and functional modules as described above. As shown in FIG. 12, the robotic bronchoscopy system 1200 may comprise a steerable catheter assembly 1220 and a robotic support system 1210, for supporting or carrying the steerable catheter assembly. The steerable catheter assembly can be a bronchoscope. In some embodiments, the steerable catheter assembly may be a single-use robotic bronchoscope. In some embodiments, the robotic bronchoscopy system 1200 may comprise an instrument driving mechanism 1213 that is attached to the arm of the robotic support system. The instrument driving mechanism may be provided by any suitable controller device (e.g., hand-held controller) that may or may not include a robotic system. The instrument driving mechanism may provide mechanical and electrical interface to the steerable catheter assembly 1220. The mechanical interface may allow the steerable catheter assembly 1220 to be releasably coupled to the instrument driving mechanism. For instance, a handle portion of the steerable catheter assembly can be attached to the instrument driving mechanism via quick install / release means, such as magnets, spring-loaded levers and the like. In some cases, the steerable catheter assembly may be coupled to or released from the instrument driving mechanism manually without using a tool.

[0127] The steerable catheter assembly 1220 may comprise a handle portion 1223 that may comprise one or more components configured to process image data, provide power, or establish communication with other external devices. For instance, the handle portion 1223 may comprise one or more of circuitry and communication elements that provide electrical communication between the steerable catheter assembly 1220 and the instrument driving mechanism 1213, and any other external system or devices. In another example, the handle portion 1223 may comprise circuitry elements such as power sources for powering the electronics (e.g., camera and LED lights) of the endoscope. In some cases, the handle portion may be in electrical communication with the instrument driving mechanism 1213 via an electrical interface (e.g., printed circuit board) so that image / video data or sensor data can be received by the communication module of the instrument driving mechanism and may be transmitted to other external devices / systems. Alternatively or in addition, the instrument driving mechanism 1213 may provide a mechanical interface only. The handle portion may be in electrical communication with a modular wireless communication device or any other user device (e.g., portable / hand-held device or controller) for transmitting sensor data or receiving control signals. Details about the handle portion are described later herein.

[0128] The steerable catheter assembly 1220 may comprise a flexible elongate member 1211 that is coupled to the handle portion. In some embodiments, the flexible elongate member may comprise a shaft, steerable tip, and a steerable section. The steerable catheter assembly may be a single use robotic bronchoscope. In some cases, only the elongate member may be disposable. In some cases, at least a portion of the elongate member (e.g., shaft, steerable tip, etc.) may be disposable. In some cases, the entire steerable catheter assembly 1220 including the handle portion and the elongate member can be disposable. The flexible elongate member and the handle portion are configured such that the entire steerable catheter assembly can be disposed of at low cost. Details about the flexible elongate member and the steerable catheter assembly are described later herein.

[0129] In some embodiments, the provided bronchoscope system may also comprise a user interface. As illustrated in the example system 1230, the bronchoscope system may comprise one or more of a treatment interface module 1231 (user console side) or a treatment control module 1233 (patient and robot side). The treatment interface module may allow an operator or user to interact with the bronchoscope during surgical procedures. In some embodiments, the treatment control module 1233 may be a hand-held controller. The treatment control module may, in some cases, comprise a proprietary user input device and one or more add-on elements removably coupled to an existing user device to improve user input experience. For instance, a physical trackball or roller can replace or supplement the function of at least one of the virtual graphical elements (e.g., navigational arrow displayed on touchpad) displayed on a graphical user interface (GUI) by giving it similar functionality to the graphical element which it replaces. Examples of user devices may comprise one or more of, but are not limited to, mobile devices, smartphones / cellphones, tablets, personal digital assistants (PDAs), laptop or notebook computers, desktop computers, media content players, and the like. Details about the user interface device and user console are described later herein.

[0130] The user console 1231 may be mounted to the robotic support system 1210. Alternatively or in addition, the user console or a portion of the user console (e.g., treatment interface module) may be mounted to a separate mobile cart.

[0131] The present disclosure provides a robotic endoluminal platform with integrated tool-in-lesion tomosynthesis technology. In some cases, the robotic endoluminal platform may be a bronchoscopy platform. The platform may be configured to perform one or more operations consistent with the method described herein. FIG. 13 shows an example of a robotic endoluminal platform and its components or subsystems, in accordance with some embodiments of the present disclosure. In some embodiments, the platform may comprise a robotic bronchoscopy system and one or more subsystems that can be used in combination with the robotic bronchoscopy system of the present disclosure.

[0132] In some embodiments, the one or more subsystems may comprise one or more of imaging systems such as a fluoroscopy imaging system for providing real-time imaging of a target site (e.g., comprising lesion). Multiple 2D fluoroscopy images may be used to create tomosynthesis or Cone Beam CT (CBCT) reconstruction to better visualize and provide 3D coordinates of the anatomical structures. FIG. 13 shows an example of a fluoroscopy (tomosynthesis) imaging system 1300. For example, the fluoroscopy (tomosynthesis) imaging system may perform accurate lesion location tracking or tool-in-lesion confirmation before or during a surgical procedure as described above. In some cases, lesion location may be tracked based on location data about the fluoroscopy (tomosynthesis) imaging system / station (e.g., C-arm) and image data captured by the fluoroscopy (tomosynthesis) imaging system. The lesion location may be registered with the coordinate frame of the robotic bronchoscopy system.

[0133] In some cases, a location, pose or motion of the fluoroscopy imaging system may be measured / estimated to register the coordinate frame of the image to the robotic bronchoscopy system, or for constructing the 3D model / image. In some cases, the pose of the imaging system may be estimated using the pose estimation methods as described elsewhere herein. For example, a pose estimation method based on the unique marker boards may be employed to obtain the imaging device pose associated with each 2D image.

[0134] Alternatively, the pose or motion of the fluoroscopy (tomosynthesis) imaging system may be measured directly using any suitable motion / location sensors 1310 disposed on the fluoroscopy (tomosynthesis) imaging system. The motion / location sensors may comprise one or more of, for example, inertial measurement units (IMUs), one or more gyroscopes, velocity sensors, accelerometers, magnetometers, location sensors (e.g., global positioning system (GPS) sensors), vision sensors (e.g., imaging devices capable of detecting visible, infrared, or ultraviolet light, such as cameras), proximity or range sensors (e.g., ultrasonic sensors, lidar, time-of-flight or depth cameras), altitude sensors, attitude sensors (e.g., compasses) or field sensors (e.g., magnetometers, electromagnetic sensors, radio sensors). In some cases, the fluoroscopy system may comprise rotary or linear encoders or similar means of measuring rotational motion of the arm with respect to the structure supporting and holding the arm in position. The encoders may also be used to provide the pose of the imaging devices. In some cases, the one or more sensors for tracking the motion and location of the fluoroscopy (tomosynthesis) imaging station may be disposed on the imaging station or be located remotely from the imaging station, such as a wall-mounted camera 1320. The various poses may be captured by the one or more sensors as described above. Where the relative poses of the source and detector are known from the motion / location sensors, it is not required to use a pattern of blobs or beads within a frame to estimate pose. In some cases, when pose information is available from multiple sources, such as both from a direct pose measurement (e.g., motion / location sensors) and pose estimation (e.g., image analysis of features within a frame), the pose information (e.g., direct measurement and estimated pose) from the multiple sources may be combined to provide a more accurate pose estimation. For example, the direct pose measurement and the estimated pose based on computer vision may be averaged (or weighted) to generate a final pose for the imaging system.
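
As an illustration only, the weighted combination of a directly measured pose and an image-based pose estimate may be sketched as follows. The function name fuse_poses, the quaternion ordering (w, x, y, z), the weight w_a and the sample readings are assumptions introduced for this sketch and are not part of the disclosure. Positions are averaged with the chosen weights, and orientations are combined with a normalized linear blend of unit quaternions, which may be a reasonable approximation when the two estimates are close to each other.

```python
import math

def fuse_poses(t_a, q_a, t_b, q_b, w_a=0.5):
    """Blend two pose estimates of the same imaging system.

    t_*: (x, y, z) positions; q_*: unit quaternions ordered (w, x, y, z).
    w_a: weight on pose A in [0, 1]; pose B receives 1 - w_a.
    """
    w_b = 1.0 - w_a
    # Positions: plain weighted average.
    t = tuple(w_a * a + w_b * b for a, b in zip(t_a, t_b))
    # q and -q describe the same rotation, so flip B into A's hemisphere first.
    if sum(a * b for a, b in zip(q_a, q_b)) < 0.0:
        q_b = tuple(-c for c in q_b)
    # Normalized linear blend of the two quaternions (an approximation).
    q = [w_a * a + w_b * b for a, b in zip(q_a, q_b)]
    norm = math.sqrt(sum(c * c for c in q))
    return t, tuple(c / norm for c in q)

# Hypothetical readings: an encoder-based pose and a marker-based estimate.
measured_t, measured_q = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0)
estimated_t = (2.0, 0.0, 0.0)
estimated_q = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))

t_fused, q_fused = fuse_poses(measured_t, measured_q, estimated_t, estimated_q)
```

With equal weights, the fused position lies midway between the two readings and the fused orientation is the rotation halfway between them. Unequal weights may be chosen where one source is known to be more reliable.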

[0135] In some embodiments, a location of a lesion may be segmented in the image data captured by the fluoroscopy (tomosynthesis) imaging system with aid of a signal processing unit 1330. One or more processors of the signal processing unit may be configured to further overlay treatment locations (e.g., lesion) on the real-time fluoroscopic image / video. For example, the processing unit may be configured to generate an augmented layer comprising augmented information such as the location of the treatment location or target site. In some cases, the augmented layer may also comprise a graphical marker indicating a path to this target site. The augmented layer may be a substantially transparent image layer comprising one or more graphical elements (e.g., box, arrow, etc.). The augmented layer may be superposed onto the optical view of the optical images or video stream captured by the fluoroscopy (tomosynthesis) imaging system, or displayed on the display device. The transparency of the augmented layer allows the optical image to be viewed by a user with the graphical elements overlaid on top of the optical image. In some cases, both the segmented lesion images and an optimum path for navigation of the elongate member to reach the lesion may be overlaid onto the real time tomosynthesis images. This may allow operators or users to visualize the accurate location of the lesion as well as a planned path of the bronchoscope movement. In some cases, the segmented and reconstructed images (e.g., CT images as described elsewhere) provided prior to the operation of the systems described herein may be overlaid on the real time images.
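
As a non-limiting sketch, superposing a substantially transparent augmented layer onto a grayscale frame may be expressed as per-pixel alpha blending. The function name blend_overlay, the nested-list image representation and the sample values are assumptions made for illustration; the disclosure does not specify a particular compositing method.

```python
def blend_overlay(frame, overlay, alpha):
    """Composite an augmented layer onto a grayscale frame.

    frame, overlay: 2D lists of intensities in 0..255.
    alpha: 2D list of opacities in [0, 1]; 0 where the layer has no graphics,
    so the underlying image remains fully visible there.
    """
    return [
        [round((1.0 - a) * f + a * o) for f, o, a in zip(f_row, o_row, a_row)]
        for f_row, o_row, a_row in zip(frame, overlay, alpha)
    ]

# Hypothetical 2 x 2 frame with a lesion marker drawn on one pixel.
frame = [[100, 100], [100, 100]]
overlay = [[0, 200], [0, 0]]
alpha = [[0.0, 0.5], [0.0, 0.0]]

augmented = blend_overlay(frame, overlay, alpha)
```

Pixels where the opacity is zero are passed through unchanged, which corresponds to the transparent regions of the augmented layer.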

[0136] In some embodiments, the one or more subsystems of the platform may comprise one or more treatment subsystems such as manual or robotic instruments (e.g., biopsy needles, biopsy forceps, biopsy brushes) or manual or robotic therapeutical instruments (e.g., RF ablation instrument, Cryo instrument, Microwave instrument, and the like).

[0137] In some embodiments, the one or more subsystems of the platform may comprise a navigation and localization subsystem. The navigation and localization subsystem may be configured to construct a virtual airway model based on the pre-operative image (e.g., pre-op CT image or tomosynthesis). The navigation and localization subsystem may be configured to identify the segmented lesion location in the 3D rendered airway model and based on the location of the lesion, the navigation and localization subsystem may generate an improved path from the main bronchi to the lesions with a recommended approaching angle towards the lesion for performing surgical procedures (e.g., biopsy).
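
For illustration, the topological part of path generation (which airway branches to traverse from the trachea to the branch containing the lesion) may be sketched as a breadth-first search over a branch graph. The function name airway_path, the branch labels and the sample airway tree are hypothetical; the sketch does not address the recommended approaching angle, which would require the 3D geometry of the model.

```python
from collections import deque

def airway_path(tree, start, target):
    """Breadth-first search over an airway graph.

    tree: dict mapping a branch name to its child branch names.
    Returns the list of branches from start to target, or None if unreachable.
    """
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parent links to recover the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for child in tree.get(node, []):
            if child not in parents:
                parents[child] = node
                queue.append(child)
    return None

# Hypothetical, heavily simplified airway tree.
airways = {
    "trachea": ["left_main", "right_main"],
    "right_main": ["RUL", "bronchus_intermedius"],
    "bronchus_intermedius": ["RML", "RLL"],
    "left_main": ["LUL", "LLL"],
}
path = airway_path(airways, "trachea", "RLL")
```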

[0138] In some cases, to assist with reaching the target tissue location, the location and movement of the medical instruments may be registered with intra-operative 3D images of the patient anatomy. In some cases, this may be accomplished by determining the transformation from the reference frame of the 3D images to the reference frame of the EM field or other navigation solution, allowing the location of the lesion within the 3D model of the patient anatomy to be updated based on data from the intra-operative 3D images of the patient anatomy. The transformation between the reference frame of the 3D image and the reference frame of the EM or other navigation system (co-registration) may comprise the three rotations between the frames and the three translations between the frames.

[0139] The present disclosure may provide co-registration methods to co-register the reference frame of the 3D images and the navigation reference frame (e.g., reference frame of the EM field). In some cases, the co-registration method may utilize markers visible within the image dataset to establish the reference frame for the 3D image. The marker reference frame and the EM reference frame (or other navigation reference frame) may have a known transformation relationship (e.g., rotations and translations between the marker reference frame and the EM reference frame are known from equipment setup or from device mechanical constraints). The location of the patient anatomy is found within the 3D image reference frame, a transformation from the 3D image reference frame to the navigation reference frame is obtained from mechanical construction or setup, and the position of the patient anatomy within the navigation reference frame may be updated based upon the measured patient location within the 3D image.

[0140] In some cases, instead of obtaining the rotations and translations between the marker reference frame and the navigation reference frame (e.g., EM reference frame) from equipment setup or from device mechanical constraints, only the rotations of the marker reference frame with respect to a navigation reference frame (e.g., with both a marker frame and an EM generator affixed to a bed with the (x, y, z) axes of the frames parallel to the principal axes of the bed) are obtained from the equipment setup or from device mechanical constraints, whereas the translations between the frames are obtained based on real-time measurements. For example, the translation relationship may be obtained by measuring the (x, y, z) position of a feature / structure (e.g., tip, any fiducial marker, part of the endoscope, etc.) in both the navigation and the imaging systems. For instance, with EM navigation, the (x, y, z) position of the tip of an endoscope is measured in the frame of the EM navigation system. The (x, y, z) position of that tip of the endoscope in the 3D image reference frame is measured by locating the tip within the 3D data set containing the tip concurrently with the EM measurement. It should be noted that any structure / feature (e.g., endoscope tip, tool, marker on the tool, etc.) with a position that can be measured in both the navigation system and the imaging system may be utilized to determine the translation relationship between the two frames.
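
A minimal sketch of this translation step is given below, assuming the rotation between the frames is already known and a single feature (here the endoscope tip) has been located concurrently in both frames. The relationship used is p_nav = R * p_img + t, so that t = p_nav - R * p_img. The function names and all coordinate values are hypothetical.

```python
def translation_between_frames(R_nav_from_img, p_img, p_nav):
    """Solve p_nav = R * p_img + t for t, given one point seen in both frames."""
    rotated = [sum(R_nav_from_img[i][j] * p_img[j] for j in range(3))
               for i in range(3)]
    return [p_nav[i] - rotated[i] for i in range(3)]

def image_to_nav(R_nav_from_img, t, p_img):
    """Map a point from the 3D image frame into the navigation frame."""
    return [sum(R_nav_from_img[i][j] * p_img[j] for j in range(3)) + t[i]
            for i in range(3)]

# Assumed setup: both frames affixed to the bed with parallel axes (identity rotation).
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
tip_in_image = [10.0, 20.0, 30.0]   # hypothetical tip location in the 3D data set
tip_in_nav = [12.0, 18.0, 35.0]     # hypothetical concurrent EM reading

t = translation_between_frames(R, tip_in_image, tip_in_nav)
lesion_in_nav = image_to_nav(R, t, [40.0, 25.0, 10.0])
```

Once the translation is known, any location found in the 3D image, such as the lesion, may be expressed in the navigation frame with the same rotation and translation.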

[0141] In some cases, both the rotation and translation relationship between the two frames can be obtained by using a structure or feature which can be located in (x, y, z) by both the marker reference frame and the navigation reference frame. For example, with EM navigation, both the (x, y, z) position and the (Rx, Ry, Rz) angular orientation of the tip of an endoscope in the frame of the EM navigation system are measured. Structures or features can be constructed into an endoscope that are opaque to x-rays and that allow the position and angular orientation of the structure to be determined by the 3D reconstruction. Based on the endoscope tip (x, y, z) and (Rx, Ry, Rz) determined in both the 3D image frame and the EM frame, the two reference frames can be co-registered.
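
As an illustrative sketch, when the full pose of the same feature is available in both frames as 4 x 4 rigid transforms, the frame-to-frame transform may be computed as EM_T_image = EM_T_tip * inverse(image_T_tip). The helper names, the naming convention (A_T_B maps coordinates in frame B to frame A) and the sample poses are assumptions for this example.

```python
def matmul4(A, B):
    """Product of two 4 x 4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rigid_inverse(T):
    """Invert a rigid transform: (R, t) becomes (R^T, -R^T t)."""
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(Rt[i][j] * t[j] for j in range(3)) for i in range(3)]
    return [Rt[i] + [t_inv[i]] for i in range(3)] + [[0, 0, 0, 1]]

# Hypothetical tip pose reported by the EM system (no rotation, offset 5, 5, 5).
EM_T_tip = [[1, 0, 0, 5],
            [0, 1, 0, 5],
            [0, 0, 1, 5],
            [0, 0, 0, 1]]
# Hypothetical tip pose recovered from the 3D reconstruction
# (90-degree rotation about z, offset 10 along x).
image_T_tip = [[0, -1, 0, 10],
               [1, 0, 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 1]]

EM_T_image = matmul4(EM_T_tip, rigid_inverse(image_T_tip))
```

A simple consistency check is that mapping the image-frame tip pose through EM_T_image reproduces the EM-frame tip pose.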

[0142] In some cases, a radio opaque marker affixed to the EM (or other navigation system) may be utilized to obtain the transformation matrix. For instance, by construction or calibration, the markers affixed to the EM system may have known translations and orientations with respect to the EM frame. The markers on the EM frame may be visible in the 3D images which can be used to determine the translations and orientations of the markers affixed to the EM system with respect to the markers in the 3D image. Next, the method may combine the transformations to determine the position of the physiology (e.g., a lesion) within the EM frame as EMframe_T_lesion = EMframe_T_EMmarkers * EMmarkers_T_3Dframe * 3Dframe_T_lesion, where each Frame2_T_Frame1 label means the 4 x 4 rotation and translation transformation matrix that provides the (x, y, z) position of a point in the Frame 2 coordinate system given the (x, y, z) position of the same point in the Frame 1 coordinate system.
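
The chaining of transformations described in this paragraph may be sketched with 4 x 4 homogeneous matrices as below. The numerical calibration values and the lesion location are hypothetical, and the lesion is treated as a single point in the 3D image frame rather than as a full transform, which is a simplification made for the example.

```python
def matmul4(A, B):
    """Product of two 4 x 4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(T, p):
    """Map an (x, y, z) point through a 4 x 4 homogeneous transform."""
    v = (p[0], p[1], p[2], 1)
    return tuple(sum(T[i][j] * v[j] for j in range(4)) for i in range(3))

# Hypothetical calibration: the EM markers sit 100 mm along x in the EM frame.
EMframe_T_EMmarkers = [[1, 0, 0, 100],
                       [0, 1, 0, 0],
                       [0, 0, 1, 0],
                       [0, 0, 0, 1]]
# Hypothetical result of locating the markers in the 3D image:
# a 90-degree rotation about z plus a 50 mm shift along y.
EMmarkers_T_3Dframe = [[0, -1, 0, 0],
                       [1, 0, 0, 50],
                       [0, 0, 1, 0],
                       [0, 0, 0, 1]]
lesion_in_3Dframe = (1, 2, 3)

EMframe_T_3Dframe = matmul4(EMframe_T_EMmarkers, EMmarkers_T_3Dframe)
lesion_in_EMframe = apply(EMframe_T_3Dframe, lesion_in_3Dframe)
```

The order of multiplication matters: each matrix converts coordinates from the frame named on its right to the frame named on its left, so adjacent frame names in the chain must match.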

[0143] In some cases, the co-registration method may comprise independently measuring the relative position and orientation of the C-arm camera with respect to the navigation reference frame, e.g., using one set of 3D localization tools 1310 or 1350 affixed to structures 1340 in physical connection to and with known orientation to the camera, and a second set of 3D localization tools affixed to structures in physical connection to and with known orientation to the EM frame.

[0144] In some cases, the co-registration method may comprise adding one or more structures or features to the C-arm that allow the EM navigation (or other navigation system) to measure the translations and angles of the camera with respect to the EM frame. For example, a 6-DOF EM sensor may be affixed to the C-arm such that the position and orientation of the sensor can be measured by the EM navigation system in the EM frame. The position and orientation of the EM sensor with respect to the camera may be known from construction or calibration, and so measurement of the position and orientation of the EM sensor provides the position and orientation of the camera in the EM frame. The 3D image may be reconstructed from multiple 2D projections using the individual camera poses as described elsewhere herein, and as the camera pose is already in the EM frame, the reconstructed 3D image frame is automatically co-registered to the EM reference frame (they are the same frame). This method beneficially eliminates the requirement of using computer vision or features in the image for co-registration.

[0145] With the image-guided instruments registered to the images, the instruments may navigate natural or surgically created passageways in anatomical systems such as the lungs, the colon, the intestines, the kidneys, the heart, the circulatory system, or the like. In some instances, after the medical instrument (e.g., needle, endoscope) reaches the target location or after a surgical operation is completed, 3D imaging may be performed to confirm that the instrument is at the target location or that the operation was performed at the target location.

[0146] At a registration step before driving the bronchoscope to the target site, the system may align the rendered virtual view of the airways to the patient airways. Image registration may consist of a single registration step or a combination of a single registration step and real-time sensory updates to registration information. The registration process may comprise one or more of finding a transformation that aligns an object (e.g., airway model, anatomical site) between different coordinate systems (e.g., EM sensor coordinates, and patient 3D model coordinates based on pre-operative CT imaging). Details about the registration are described later herein.

[0147] Once registered, all airways may be aligned to the pre-operative rendered airways. During robotic bronchoscope driving towards the target site, the location of the bronchoscope inside the airways may be tracked and displayed. In some cases, location of the bronchoscope with respect to the airways may be tracked using positioning sensors. Other types of sensors (e.g., camera) can also be used instead of or in conjunction with the positioning sensors using sensor fusion techniques. Positioning sensors such as electromagnetic (EM) sensors may be embedded at the distal tip of the catheter and an EM field generator may be positioned next to the patient torso during procedure. The EM field generator may locate the EM sensor position in 3D space or may locate the EM sensor position and orientation with 5 or 6 degrees of freedom (5DOF or 6DOF), consisting of the 3 spatial coordinates and 2 or 3 orientation angles. This may provide a visual guide to an operator when driving the bronchoscope towards the target site.

[0148] In real-time EM tracking, the EM sensor, comprising one or more sensor coils embedded in one or more locations and orientations in the medical instrument (e.g., tip of the endoscopic tool), measures the variation in the EM field created by one or more static EM field generators positioned at a location close to a patient. The location information detected by the EM sensors is stored as EM data. The EM field generator (or transmitter) may be placed close to the patient to create a low intensity, low frequency alternating magnetic field that the embedded sensor may detect. The alternating magnetic field induces small currents in the sensor coils of the EM sensor, which may be analyzed to determine the distance and angle between the EM sensor and the EM field generator. These distances and orientations may be intra-operatively registered to the patient anatomy (e.g., 3D model) to determine the registration transformation that aligns a single location in the coordinate system with a position in the pre-operative model of the patient's anatomy.

[0149] In some embodiments, the platform herein may utilize fluoroscopic imaging systems to determine the location and orientation of medical instruments and patient anatomy within the coordinate system of the surgical environment. In particular, the systems and methods herein may employ a mobile C-arm fluoroscopy as a low-cost and mobile real-time qualitative assessment tool. Fluoroscopy is an imaging modality that obtains real-time moving images of patient anatomy and medical instruments. Fluoroscopic systems may comprise one or more C-arm systems which provide positional flexibility and are capable of orbital, horizontal, or vertical movement via manual or automated control. Fluoroscopic image data from multiple viewpoints (i.e., with the fluoroscopic imager moved among multiple locations) in the surgical environment may be compiled to generate two-dimensional or three-dimensional tomographic images. When using a fluoroscopic imager system that includes a digital detector (e.g., a flat panel detector), the generated and compiled fluoroscopic image data may permit the sectioning of planar images in parallel planes according to tomosynthesis imaging techniques. The C-arm imaging system may comprise a source (e.g., an X-ray source) and a detector (e.g., an X-ray detector or X-ray imager). The X-ray detector may generate an image representing the intensities of received x-rays. The imaging system may reconstruct a 3D image based on multiple 2D images acquired from a wide range of angles. In some cases, the rotation angle range may be at least 120 degrees, 130 degrees, 140 degrees, 150 degrees, 160 degrees, 170 degrees, 180 degrees or greater. In some cases, the 3D image may be generated based on a pose of the X-ray imager.
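
To illustrate how projections from multiple viewpoints may be compiled into planes at different depths, a simplified shift-and-add sketch is given below. It is not the reconstruction method of the disclosure, which is not specified here. The sketch assumes a one-dimensional detector, parallel rays, and integer "slopes" (the lateral shift in pixels per unit of depth for each view, standing in for the tangent of the view angle); all names and values are hypothetical.

```python
def project(points, slopes, width):
    """Simulate one 1D projection per view for unit absorbers at (x, z)."""
    projections = []
    for s in slopes:
        row = [0.0] * width
        for x, z in points:
            u = x + s * z          # detector pixel hit by this absorber
            if 0 <= u < width:
                row[u] += 1.0
        projections.append(row)
    return projections

def shift_and_add(projections, slopes, z, width):
    """Reconstruct the plane at depth z by undoing each view's shift and averaging."""
    plane = [0.0] * width
    for row, s in zip(projections, slopes):
        for x in range(width):
            u = x + s * z
            if 0 <= u < width:
                plane[x] += row[u]
    return [v / len(slopes) for v in plane]

slopes = [-2, -1, 0, 1, 2]                 # five hypothetical view directions
projections = project([(10, 3)], slopes, width=21)

in_focus = shift_and_add(projections, slopes, z=3, width=21)
out_of_focus = shift_and_add(projections, slopes, z=0, width=21)
```

At the true depth of the absorber the shifted projections line up and produce a single sharp peak, whereas at other depths the same signal is spread over several pixels, which is the sectioning effect that tomosynthesis relies on.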

[0150] The bronchoscope or the catheter may be disposable. FIG. 14 illustrates an example of a flexible endoscope 1400, in accordance with some embodiments of the present disclosure. As shown in FIG. 14, the flexible endoscope 1400 may comprise a handle / proximal portion 1409 and a flexible elongate member to be inserted inside of a subject. The flexible elongate member can be the same as the one described above. In some embodiments, the flexible elongate member may comprise a proximal shaft (e.g., insertion shaft 1401), steerable tip (e.g., tip 1405), and a steerable section (active bending section 1403). The active bending section, and the proximal shaft section can be the same as those described elsewhere herein. The endoscope 1400 may also be referred to as steerable catheter assembly as described elsewhere herein. In some cases, the endoscope 1400 may be a single-use robotic endoscope. In some cases, the entire catheter assembly may be disposable. In some cases, at least a portion of the catheter assembly may be disposable. In some cases, the entire endoscope may be released from an instrument driving mechanism and can be disposed of. In some embodiments, the endoscope may contain varying levels of stiffness along the shaft, as to improve functional operation.

[0151] The endoscope or steerable catheter assembly 1400 may comprise a handle portion 1409 that may comprise one or more components configured to process image data, provide power, or establish communication with other external devices. For instance, the handle portion may comprise one or more of circuitry and communication elements that allow electrical communication between the steerable catheter assembly 1400 and an instrument driving mechanism (not shown), and any other external system or devices. In another example, the handle portion 1409 may comprise circuitry elements such as power sources for powering the electronics (e.g., camera, electromagnetic sensor, and LED lights) of the endoscope.

[0152] The one or more components located at the handle may be improved such that expensive and complicated components may be allocated to the robotic support system, a hand-held controller or an instrument driving mechanism, thereby reducing the cost and simplifying the design of the disposable endoscope. The handle portion or proximal portion may provide an electrical and mechanical interface to allow for electrical communication and mechanical communication with the instrument driving mechanism. The instrument driving mechanism may comprise a set of motors that are actuated to rotationally drive a set of pull wires of the catheter. The handle portion of the catheter assembly may be mounted onto the instrument drive mechanism so that its pulley / capstan assemblies are driven by the set of motors. The number of pulleys may vary based on the pull wire configurations. In some cases, one, two, three, four, or more pull wires may be utilized for articulating the flexible endoscope or catheter.

[0153] The handle portion may be configured to allow the robotic bronchoscope to be disposable at reduced cost. For instance, classic manual and robotic bronchoscopes may have a cable in the proximal end of the bronchoscope handle. The cable often includes illumination fibers, camera video cable, and other sensor fibers or cables such as electromagnetic (EM) sensors, or shape sensing fibers. Such complex cables can be expensive, adding to the cost of the bronchoscope. The provided robotic bronchoscope may have an improved design such that simplified structures and components can be employed while preserving the mechanical and electrical functionalities. In some cases, the handle portion of the robotic bronchoscope may employ a cable-free design while providing a mechanical / electrical interface to the catheter.

[0154] The electrical interface (e.g., printed circuit board) may allow image / video data or sensor data to be received by the communication module of the instrument driving mechanism and may be transmitted to other external devices / systems. In some cases, the electrical interface may establish electrical communication without cables or wires. For example, the interface may comprise pins soldered onto an electronics board such as a printed circuit board (PCB). For instance, a receptacle connector (e.g., the female connector) is provided on the instrument driving mechanism as the mating interface. This may beneficially allow the endoscope to be quickly plugged into the instrument driving mechanism or robotic support without utilizing extra cables. Such type of electrical interface may also serve as a mechanical interface such that when the handle portion is plugged into the instrument driving mechanism, both mechanical and electrical coupling is established. Alternatively or in addition, the instrument driving mechanism may provide a mechanical interface only. The handle portion may be in electrical communication with a modular wireless communication device or any other user device (e.g., portable / hand-held device or controller) for transmitting sensor data or receiving control signals.

[0155] In some cases, the handle portion 1409 may comprise one or more mechanical control modules such as a luer 1411 for interfacing the irrigation system / aspiration system. In some cases, the handle portion may comprise one or more levers / knobs for articulation control. Alternatively, the articulation control may be located at a separate controller attached to the handle portion via the instrument driving mechanism.

[0156] The endoscope may be attached to a robotic support system or a hand-held controller via the instrument driving mechanism. The instrument driving mechanism may be provided by any suitable controller device (e.g., hand-held controller) that may or may not include a robotic system. The instrument driving mechanism may provide a mechanical and electrical interface to the steerable catheter assembly 1400. The mechanical interface may allow the steerable catheter assembly 1400 to be releasably coupled to the instrument driving mechanism. For instance, the handle portion of the steerable catheter assembly can be attached to the instrument driving mechanism via quick install / release means, such as magnets, spring-loaded levers and the like. In some cases, the steerable catheter assembly may be coupled to or released from the instrument driving mechanism manually without using a tool.

[0157] In the illustrated example, the distal tip of the catheter or endoscope shaft is configured to be articulated / bent in two or more degrees of freedom to provide a desired camera view or control the direction of the endoscope. As illustrated in the example, an imaging device (e.g., camera) and position sensors (e.g., electromagnetic sensor) 1407 are located at the tip of the catheter or endoscope shaft 1405. For example, the line of sight of the camera may be controlled by controlling the articulation of the active bending section 1403. In some instances, the angle of the camera may be adjustable such that the line of sight can be adjusted without or in addition to articulating the distal tip of the catheter or endoscope shaft. For example, the camera may be oriented at an angle (e.g., tilt) with respect to the axial direction of the tip of the endoscope with aid of an optical component.

[0158] The distal tip 1405 may be a rigid component that allows positioning sensors such as electromagnetic (EM) sensors, imaging devices (e.g., camera) and other electronic components (e.g., LED light source) to be embedded at the distal tip.

[0159] In real-time EM tracking, the EM sensor, comprising one or more sensor coils embedded in one or more locations and orientations in the medical instrument (e.g., tip of the endoscopic tool), measures the variation in the EM field created by one or more EM field generators positioned at a location close to a patient. The location information detected by the EM sensors is stored as EM data. The EM field generator (or transmitter) may be placed close to the patient to create a low intensity alternating magnetic field that the embedded sensor may detect. The alternating magnetic field induces small currents in the sensor coils of the EM sensor, which may be analyzed to determine the distance and angle between the EM sensor and the EM field generator. For example, the EM field generator may be positioned close to the patient torso during the procedure to locate the EM sensor position in 3D space or may locate the EM sensor position and orientation in 5DOF or 6DOF. This may provide a visual guide to an operator when driving the bronchoscope towards the target site.

[0160] The endoscope may have a unique design in the elongate member. In some cases, the active bending section 1403, and the proximal shaft of the endoscope may consist of a single tube that incorporates a series of cuts (e.g., reliefs, slits, etc.) along its length to allow for improved flexibility, a desirable stiffness as well as the anti-prolapse feature (e.g., features to define a minimum bend radius).

[0161] As described above, the active bending section 1403 may be configured to allow for bending in two or more degrees of freedom (e.g., articulation). A greater bending degree such as 180 degrees and 270 degrees (or other articulation parameters for clinical indications) can be achieved by the unique structure of the active bending section. In some cases, a variable minimum bend radius along the axial axis of the elongate member may be provided such that an active bending section may comprise two or more different minimum bend radii.

[0162] The articulation of the endoscope may be controlled by applying force to the distal end of the endoscope via one or multiple pull wires. The one or more pull wires may be attached to the distal end of the endoscope. In the case of multiple pull wires, pulling one wire at a time may change the orientation of the distal tip to pitch up, down, left, right or any direction needed. In some cases, the pull wires may be anchored at the distal tip of the endoscope, running through the bending section, and entering the handle where they are coupled to a driving component (e.g., pulley). This handle pulley may interact with an output shaft from the robotic system.
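
As a rough illustration of how pull wire displacement may relate to articulation, the sketch below uses a constant-curvature approximation that is common for tendon-driven bending sections; it is an assumption made for this example and is not stated in the disclosure. Under this approximation, a wire located at angular position a around the shaft, at a distance r from the neutral axis, changes length by -r * theta * cos(phi - a) when the section bends by angle theta toward direction phi. The function name, the four-wire layout and the 1.5 mm offset are hypothetical.

```python
import math

def wire_length_changes(theta, phi, radius_mm, wire_angles):
    """Length change of each pull wire for a bend of theta (rad) toward phi (rad).

    Negative values mean the wire is pulled in (shortened); positive values
    mean the wire on the outside of the bend must be let out.
    """
    return [-radius_mm * theta * math.cos(phi - a) for a in wire_angles]

# Hypothetical layout: four wires spaced 90 degrees apart, 1.5 mm off-axis.
wire_angles = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]

# Bend 90 degrees toward the first wire.
deltas = wire_length_changes(math.pi / 2, 0.0, 1.5, wire_angles)
```

In this example the first wire shortens by about 2.36 mm, the opposite wire lengthens by the same amount, and the two side wires are essentially unchanged.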

[0163] In some embodiments, the proximal end or portion of one or more pull wires may be operatively coupled to various mechanisms (e.g., gears, pulleys, capstans, etc.) in the handle portion of the catheter assembly. The pull wire may be a metallic wire, cable, or thread, or it may be a polymeric wire, cable, or thread. The pull wire can also be made of natural or organic materials or fibers. The pull wire can be any type of suitable wire, cable, or thread capable of supporting various kinds of loads without significant deformation or breakage. The distal end / portion of one or more pull wires may be anchored or integrated to the distal portion of the catheter, such that operation of the pull wires by the control unit may apply force or tension to the distal portion which may steer or articulate (e.g., up, down, pitch, yaw, or any direction in-between) at least the distal portion (e.g., flexible section) of the catheter.

[0164] The pull wires may be made of any suitable material such as stainless steel (e.g., SS316), metals, alloys, polymers, nylons, or biocompatible material. Pull wires may be a wire, cable, or a thread. In some embodiments, different pull wires may be made of different materials for varying the load bearing capabilities of the pull wires. In some embodiments, different sections of the pull wires may be made of different materials to vary the stiffness or load bearing along the pull wire. In some embodiments, pull wires may be utilized for the transfer of electrical signals.

[0165] The proximal design may improve the reliability of the device without introducing extra cost, allowing for a low-cost single-use endoscope. In another aspect of the disclosure, a single-use robotic endoscope is provided. The robotic endoscope may be a bronchoscope and can be the same as the steerable catheter assembly as described elsewhere herein. Conventional endoscopes can be complex in design and are usually configured to be re-used after procedures, which requires thorough cleaning, disinfection, or sterilization after each procedure. The existing endoscopes are often configured with complex structures to ensure the endoscopes can endure the cleaning, disinfection, and sterilization processes. The provided robotic bronchoscope can be a single-use endoscope that may beneficially reduce cross-contamination between patients and infections. In some cases, the robotic bronchoscope may be delivered to the medical practitioner in a pre-sterilized package and is intended to be disposed of after a single use.

[0166] As shown in FIG. 15, a robotic bronchoscope 1510 may comprise a handle portion 1513 and a flexible elongate member 1511. In some embodiments, the flexible elongate member 1511 may comprise a shaft, steerable tip, and a steerable / active bending section. The robotic bronchoscope 1510 can be the same as the steerable catheter assembly as described in FIG. 14. The robotic bronchoscope may be a single-use robotic endoscope. In some cases, only the catheter may be disposable. In some cases, at least a portion of the catheter may be disposable. In some cases, the entire robotic bronchoscope may be released from the instrument driving mechanism and can be disposed of. In some cases, the bronchoscope may contain varying levels of stiffness along its shaft, so as to improve functional operation. In some cases, a minimum bend radius along the shaft may vary.

[0167] The robotic bronchoscope can be releasably coupled to an instrument driving mechanism 1520. The instrument driving mechanism 1520 may be mounted to the arm of the robotic support system or to any actuated support system as described elsewhere herein. The instrument driving mechanism may provide a mechanical and electrical interface to the robotic bronchoscope 1510. The mechanical interface may allow the robotic bronchoscope 1510 to be releasably coupled to the instrument driving mechanism. For instance, the handle portion of the robotic bronchoscope can be attached to the instrument driving mechanism via quick install / release means, such as magnets and spring-loaded levers. In some cases, the robotic bronchoscope may be coupled to or released from the instrument driving mechanism manually without using a tool.

[0168] FIG. 16 shows an example of an instrument driving mechanism 1600B providing mechanical interface to the handle portion 1613 of the robotic bronchoscope. As shown in the example, the instrument driving mechanism 1600B may comprise a set of motors that are actuated to rotationally drive a set of pull wires of the flexible endoscope or catheter. The handle portion 1613 of the catheter assembly may be mounted onto the instrument drive mechanism so that its pulley assemblies or capstans are driven by the set of motors. The number of pulleys may vary based on the pull wire configurations. In some cases, one, two, three, four, or more pull wires may be utilized for articulating the flexible endoscope or catheter.

[0169] The handle portion may be configured to allow the robotic bronchoscope to be disposable at reduced cost. For instance, classic manual and robotic bronchoscopes may have a cable in the proximal end of the bronchoscope handle. The cable often includes illumination fibers, camera video cable, and other sensor fibers or cables such as electromagnetic (EM) sensors, or shape sensing fibers. Such a complex cable can be expensive, adding to the cost of the bronchoscope. The provided robotic bronchoscope may have an improved design such that simplified structures and components can be employed while preserving the mechanical and electrical functionalities. In some cases, the handle portion of the robotic bronchoscope may employ a cable-free design while providing a mechanical / electrical interface to the catheter.

[0170] FIG. 17 shows an example of a distal tip 1700 of an endoscope. In some cases, the distal portion or tip of the catheter 1700 may be substantially flexible such that it can be steered into one or more directions (e.g., pitch, yaw). The catheter may comprise a tip portion, bending section, and insertion shaft. In some embodiments, the catheter may have variable bending stiffness along the longitudinal axis direction. For instance, the catheter may comprise multiple sections having different bending stiffness (e.g., flexible, semi-rigid, and rigid). The bending stiffness may be varied by selecting materials with different stiffness / rigidity, varying structures in different segments (e.g., cuts, patterns), adding additional supporting components or any combination of the above. In some embodiments, the catheter may have a variable minimum bend radius along the longitudinal axis direction. The selection of different minimum bend radii at different locations along the catheter may beneficially provide anti-prolapse capability while still allowing the catheter to reach hard-to-reach regions. In some cases, a proximal end of the catheter need not be bent to a high degree, thus the proximal portion of the catheter may be reinforced with additional mechanical structure (e.g., additional layers of materials) to achieve a greater bending stiffness. Such a design may provide support and stability to the catheter. In some cases, the variable bending stiffness may be achieved by using different materials during extrusion of the catheter. This may advantageously allow for different stiffness levels along the shaft of the catheter in an extrusion manufacturing process without additional fastening or assembling of different materials.

[0171] The distal portion of the catheter may be steered by one or more pull wires 1705. The distal portion of the catheter may be made of any suitable material such as co-polymers, polymers, metals, or alloys such that it can be bent by the pull wires. In some embodiments, the proximal end or terminal end of one or more pull wires 1705 may be coupled to a driving mechanism (e.g., gears, pulleys, capstan etc.) via the anchoring mechanism as described above.

[0172] The pull wire 1705 may be a metallic wire, cable, or thread, or it may be a polymeric wire, cable, or thread. The pull wire 1705 can also be made of natural or organic materials or fibers. The pull wire 1705 can be any type of suitable wire, cable, or thread capable of supporting various kinds of loads without significant deformation or breakage. The distal end or portion of one or more pull wires 1705 may be anchored or integrated to the distal portion of the catheter, such that operation of the pull wires by the control unit may apply force or tension to the distal portion which may steer or articulate (e.g., up, down, pitch, yaw, or any direction in-between) at least the distal portion (e.g., flexible section) of the catheter.

[0173] The catheter may have a dimension so that one or more electronic components can be integrated into the catheter. For example, the outer diameter of the distal tip may be around 4 to 4.4 millimeters (mm), and the diameter of the working channel may be around 2 mm such that one or more electronic components can be embedded into the wall of the catheter. However, it should be noted that based on different applications, the outer diameter can be in any range smaller than 4 mm or greater than 4.4 mm, and the diameter of the working channel can be in any range according to the tool dimensions or specific application.

[0174] The one or more electronic components may comprise an imaging device, illumination device, or sensors. In some embodiments, the imaging device may be a video camera 1713. The imaging device may comprise optical elements and an image sensor for capturing image data. The image sensors may be configured to generate image data in response to a broad range of wavelengths of light or to specific wavelengths of light. A variety of image sensors may be employed for capturing image data such as complementary metal oxide semiconductor (CMOS) or charge-coupled device (CCD). The imaging device may be a low-cost camera. In some cases, the image sensor may be provided on a circuit board. The circuit board may be an imaging printed circuit board (PCB). The PCB may comprise a plurality of electronic elements for processing the image signal. For instance, the circuit for a CCD sensor may comprise A / D converters and amplifiers to amplify and convert the analog signal provided by the CCD sensor. Optionally, the image sensor may be integrated with amplifiers and converters to convert the analog signal to a digital signal such that a circuit board may not be required. In some cases, the output of the image sensor or the circuit board may be image data (digital signals) that can be further processed by a camera circuit or processors of the camera. In some cases, the image sensor may comprise an array of optical sensors.

[0175] The illumination device may comprise one or more light sources 1711 positioned at the distal tip. The light source may be a light-emitting diode (LED), an organic LED (OLED), a quantum dot (QD), an array or combination of multiple LEDs, OLEDs, or QDs, or any other suitable light source. In some cases, the light source may comprise one or more of miniaturized LEDs for a compact design or Dual Tone Flash LED Lighting.

[0176] The imaging device and the illumination device may be integrated into the catheter. For example, the distal portion of the catheter may comprise suitable structures matching at least a dimension of the imaging device and the illumination device. The imaging device and the illumination device may be embedded into the catheter. FIG. 18 shows an example distal portion of the catheter with the integrated imaging device and illumination device. A camera may be located at the distal portion. The distal tip may have a structure to receive the camera, illumination device or the location sensor. For example, the camera may be embedded into a cavity 1810 at the distal tip of the catheter. The cavity 1810 may be integrally formed with the distal portion of the catheter and may have a dimension matching a length / width of the camera such that the camera may not move relative to the catheter. The camera may be adjacent to the working channel 1820 of the catheter to provide a near field view of the tissue or the organs. In some cases, the attitude or orientation of the imaging device may be controlled by controlling a rotational movement (e.g., roll) of the catheter.

[0177] The power to the camera may be provided by a wired cable. In some cases, the cable wire may be in a wire bundle providing power to the camera as well as illumination elements or other circuitry at the distal tip of the catheter. The camera or light source may be supplied with power from a power source located at the handle portion via wires, copper wires, or via any other suitable means running through the length of the catheter. In some cases, real-time images or video of the tissue or organ may be transmitted to an external user interface or display wirelessly. The wireless communication may be WiFi, Bluetooth, RF communication or other forms of communication. In some cases, images or videos captured by the camera may be broadcast to a plurality of devices or systems. In some cases, image or video data from the camera may be transmitted down the length of the catheter to the processors situated in the handle portion via wires, copper wires, or via any other suitable means. The image or video data may be transmitted via the wireless communication component in the handle portion to an external device / system. In some cases, the system may be configured such that no wires are visible or exposed to operators.

[0178] In conventional endoscopy, illumination light may be provided by fiber cables that transfer the light of a light source located at the proximal end of the endoscope to the distal end of the robotic endoscope. In some embodiments of the disclosure, miniaturized LED lights may be employed and embedded into the distal portion of the catheter to reduce the design complexity. In some cases, the distal portion may comprise a structure 1430 having a dimension matching a dimension of the miniaturized LED light source. As shown in the illustrated example, two cavities 1430 may be integrally formed with the catheter to receive two LED light sources. For instance, the outer diameter of the distal tip may be around 4 to 4.4 millimeters (mm) and the diameter of the working channel of the catheter may be around 2 mm such that two LED light sources may be embedded at the distal end. The outer diameter can be in any range smaller than 4 mm or greater than 4.4 mm, and the diameter of the working channel can be in any range according to the tool's dimensions or specific application. Any number of light sources may be included. The internal structure of the distal portion may be configured to fit any number of light sources.

[0179] In some cases, each of the LEDs may be connected to power wires which may run to the proximal handle. In some embodiments, the LEDs may be soldered to separate power wires that are later bundled together to form a single strand. In some embodiments, the LEDs may be soldered to pull wires that supply power. In other embodiments, the LEDs may be crimped or connected directly to a single pair of power wires. In some cases, a protection layer such as a thin layer of biocompatible glue may be applied to the front surface of the LEDs to provide protection while allowing light to be emitted. In some cases, an additional cover 1831 may be placed at the forward end face of the distal tip, providing precise positioning of the LEDs as well as sufficient room for the glue. The cover 1831 may be composed of transparent material with a refractive index similar to that of the glue so that the illumination light may not be obstructed.

Examples of User Interfaces

[0180] The systems, methods, and techniques described herein may be implemented at least in part with the use of a user interface that may be presented on a graphical user interface (e.g., the UI 2740 of FIG. 27). FIGs. 19-26 illustrate example user interfaces. At a high level, the user interfaces may be used for performing and interpreting tomosynthesis and augmented fluoroscopy.

[0181] The graphical user interface (GUI) may allow a user to switch between multiple modes in a guided workflow. In some cases, a user interface for tomosynthesis may be accessible from user interfaces for driving or navigation. For example, when a user drives the endoscope via the driving or navigation interface 2500 as shown in FIG. 25, the user may choose to enter tomosynthesis mode by clicking on the icon 2501. For example, upon clicking on the icon 2501 to switch to the tomosynthesis mode, a GUI (such as 1900 of FIG. 19) of the tomosynthesis mode may be displayed. The tomosynthesis mode GUI may allow a user to return to the driving mode at any point, such as by clicking the icon in the header 1901.

[0182] From the driving screen 2500, the user may continue viewing the camera feed 2505 from the bronchoscope and using the controller to drive through the lung. A user may choose to configure the driving GUI 2500 by adding or removing additional views. For example, the driving screen may be configured to display a virtual endoluminal view 2507 and virtual lungs 2509, which is a computer-generated 3D model of the lungs. The user may be permitted to add, remove, or swap out one or more other views such as the axial, coronal, and sagittal CTs and the like.

[0183] The virtual endoluminal view 2507 provides the user with a computer-recreated view of the camera feed along with a graphical element (e.g., ribbon) indicating the path to the currently selected target. In some cases, the path is also represented on the virtual lungs 2509. A user may switch to the tomosynthesis mode at any given time. For example, once the endoscope tip is within a biopsy range of the target, the user may enable the tomosynthesis mode to help verify the relative distance to the lesion by clicking on the icon 2501. Details about the tomosynthesis operations and GUI are described later herein. After the tomosynthesis process is complete, the user may return to the driving screen 2500.

[0184] In some cases, upon completion of the tomosynthesis, the virtual endoluminal view may display a floating target based on the results of the tomography scan. FIG. 26 shows an example of the virtual endoluminal view 2600 displaying a target 2601 along with a graphical element 2603 (e.g., ribbon) indicating a path to the target. The angle of the target 2615 is displayed as seen from the point of view of the working channel, where a tool (e.g., needle instrument) will exit the bronchoscope. In some cases, an exit axis of the working channel may not be aligned to the axial axis of the endoscope distal tip (an example in FIG. 17 shows the exit axis 1721 of the working channel 1703). The point of view of the working channel may be based on a known dimension, structure, or configuration of the distal tip (e.g., exit axis 1721 of the working channel with respect to the endoscope tip, the imaging device 1713) and / or a real-time orientation and location of the distal tip. The angle of the target 2615 relative to the exit axis of the working channel may be determined based at least in part on the layout of the working channel within the distal tip, a real-time location and orientation of the distal tip, and the location of the target. The target and the angle arrow 2615 may assist the user in lining up the tool with the lesion before taking a biopsy. The user may also choose to repeat the tomosynthesis process while the tool is expected to be in the lesion to increase confidence in the biopsy.
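As an illustration only, the angle between the working channel exit axis and the direction to the target can be sketched as the angle between two vectors. The function name, the argument names, and the assumption that the tip position, exit axis, and target position are already expressed in one common coordinate frame are hypothetical and are not taken from the disclosure.

```python
import math


def angle_to_target_deg(tip_position, exit_axis, target_position):
    """Angle in degrees between the exit axis and the tip-to-target direction.

    All three arguments are 3-element sequences in the same (assumed) frame.
    exit_axis is the working-channel exit direction after the real-time tip
    orientation has been applied; it does not need to be unit length.
    """
    to_target = [t - p for t, p in zip(target_position, tip_position)]
    norm_target = math.sqrt(sum(c * c for c in to_target))
    norm_axis = math.sqrt(sum(c * c for c in exit_axis))
    if norm_target == 0.0 or norm_axis == 0.0:
        raise ValueError("exit axis and tip-to-target vector must be non-zero")
    cosine = sum(a * b for a, b in zip(exit_axis, to_target)) / (norm_axis * norm_target)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))
```

An angle near zero would correspond to a tool that exits the working channel pointing at the target; a larger angle suggests the tip should be re-articulated before the biopsy.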

[0185] The virtual endoluminal panel displays a rendered view of the internal airways 2600. In some cases, the virtual endoluminal panel may allow a user to enter a targeting mode 2610. In some cases, once the user switches into targeting mode 2610, the rendered internal airways may disappear and the target 2611 may be displayed (e.g., depicted as a filled elliptical shape) in free space when the target is within a predetermined proximity range from the tip. The predetermined proximity range may be determined by the system or configurable by a user. In some cases, a graphical element (e.g., crosshair 2613, and arrow 2615) may display in the center of the panel with a triangular shaped indicator around its edge to show the target’s position relative to the direction the scope is facing.

[0186] In some cases, after tomosynthesis has been completed, the automated guided workflow may allow a user to adjust the position of the lesion (target) based at least in part on a tomosynthesis calculation. In some cases, a tomosynthesis calculation may comprise a relationship between the position of the lesion and the position of the scope tip. For instance, the position of the lesion may be automatically updated based on the relationship between the scope tip and lesion according to the tomosynthesis calculation. In some cases, a user may toggle the tomosynthesis calculated adjustments to the target via the graphical icon 2503 shown in the driving screen 2500 in FIG. 25. When the toggle is on, the position of the target in the virtual lung and virtual endoluminal panels may be adjusted or updated to reflect the calculations made by the tomosynthesis process based on the user-selected scope and lesion. When the toggle is off, such calculations may be disregarded, the position of the scope tip may rely solely on EM data, and the position of the target may rely solely on the planned target on the CT scans. In alternative cases, a user may choose to adjust the position of the scope instead of or in addition to adjusting the location of the lesion / target.
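The toggle behavior described above can be sketched as follows. This is a minimal sketch under stated assumptions: the class name, the field names, and the choice to model the tomosynthesis-derived relationship as a simple lesion-minus-tip offset vector are hypothetical and serve only to illustrate the on / off logic.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class TargetState:
    planned_target: Vec3                # target planned on the CT scans
    em_tip: Vec3                        # scope tip position from EM data
    tomo_offset: Optional[Vec3] = None  # assumed form: lesion minus tip from tomosynthesis
    use_tomo: bool = False              # state of the adjustment toggle

    def displayed_target(self) -> Vec3:
        """Return the target position to show in the virtual views."""
        if self.use_tomo and self.tomo_offset is not None:
            # Toggle on: place the target relative to the current tip position.
            return tuple(t + o for t, o in zip(self.em_tip, self.tomo_offset))
        # Toggle off, or no tomosynthesis result yet: fall back to the plan.
        return self.planned_target
```

With the toggle off, or before any tomosynthesis result exists, the sketch returns the planned CT target unchanged, which mirrors the fallback described in the paragraph above.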

[0187] In some cases, after tomosynthesis has been completed, an augmented fluoroscopy may be available in the fluoroscopic mode. A user may enable the augmented fluoroscopy, such as via the toggle 2401 displayed within the user interface 2400 of a fluoroscopy panel, to switch on the augmented fluoroscopy mode. The fluoroscopic view mode may be accessed from the driving mode during the entire navigation process. For example, a user may switch to the fluoroscopy view mode from the driving mode via the driving screen. The fluoroscopy view may provide real-time fluoroscopy images / video. The user interface 2400 of the fluoroscopy panel may display an augmented fluoroscopy feature allowing a user to enable / disable the augmentation to the fluoroscopy view. For instance, if the augmented fluoroscopy is toggled on (“Enabled”), an overlay of the target / lesion 2403 may be displayed on the fluoroscopy view. In some cases, the option to toggle on / off the augmented fluoroscopy may be available regardless of whether the tomosynthesis is completed. If the augmented fluoroscopy is toggled on (“Enabled”) prior to completion of tomosynthesis (when a target location is not available), there may not be a display of the overlay of the target / lesion. The availability of the target / lesion information from the tomosynthesis can be obtained as described above. For example, lesion information may be broadcast for the augmented fluoroscopy overlay through data contracts between the state machines as described above.

[0188] Existing endoscopic systems utilizing tomosynthesis techniques may not be compatible with all types of imaging apparatus (e.g., C-arm system). For example, current endoscopic systems may either be compatible only with selected C-arm systems or require cumbersome setup for each C-arm system. The endoscopic systems herein employ an improved tomosynthesis algorithm as described above that can be compatible with any type of C-arm with minimal or reduced information about the C-arm system. For example, the system herein may provide a user interface allowing easy and convenient setup of C-arm systems.

[0189] FIG. 19 shows the example user interface 1900 of a tomosynthesis process dashboard. As illustrated, the user interface 1900 includes a header, a camera panel, a step indicator, instructions, visual guidance, an exit tomography function, and progression buttons. The user interface 1900 for tomosynthesis may be accessible from user interfaces for driving or navigation. The header may remain present and the camera panel may remain visible for the entire tomosynthesis process. Users of the user interface 1900 may be guided through tomosynthesis by screens that may be broken down into a series of steps with an indication to the user of where they currently are in the tomosynthesis process (see the step indicator). Within each screen, the instructions and the visual guidance in the form of images or videos may be displayed. At any point during the tomosynthesis process, the user may be able to exit the tomosynthesis screens of the user interface 1900 and return to the driving user interfaces. The progression buttons may also allow the user to navigate through the steps of the tomosynthesis process as necessary.

[0190] FIG. 20 shows an example user interface 2000 of a C-arm settings dashboard. As illustrated, the user interface 2000 includes a C-arm drop-down and C-arm settings. At the user interface 2000, the user may select the connected and compatible C-arm from the drop-down. Once the C-arm is selected, possible settings for that model of the C-arm may be displayed. The displayed settings may be default settings, previous settings, recommended settings, optimal settings, improved settings, or the like. Once the C-arm settings are selected (e.g., by the user), the user may be instructed to adjust the C-arm to the selected settings.

[0191] Upon setting up the imaging devices in the GUI (e.g., user interface 2000), the user may be guided to capture fluoroscopic images using the imaging device and may be guided to select a scope location via a GUI. FIG. 21 shows an example user interface 2100 of a scope selection dashboard. As illustrated, the user interface 2100 includes a fluoroscopy image and angle controls. At the user interface 2100, the fluoroscopy image may be displayed. The fluoroscopy image may be a 2D image without augmentation. The user may be able to scroll through different angles of the scope captured from the C-arm using a slider shown in the user interface 2100 to choose one or more fluoroscopy images for selecting a location of the scope. For example, a user may click on the fluoroscopic image to indicate a location of the scope tip. FIG. 22 shows an example user interface 2200 of a selection crosshair panel. The user interface 2200 may show a more detailed illustration of the fluoroscopy image of the user interface 2100. The user interface 2200 may comprise one or more selection crosshairs. The user interface 2200 may display the selection crosshair upon the scope selection on the fluoroscopic image displayed within the user interface 2100, indicative of the location of the scope.

[0192] In some cases, upon selecting the location of the scope, the user may be guided to select a location for the target (e.g., lesion). FIG. 23 shows an example user interface 2300 of a lesion selection (target selection) dashboard. As illustrated, the user interface 2300 may display a reconstructed tomography 2310, CT panels 2320, a selection crosshair 2315, a scrollbar 2313, a reset button, a depth indicator 2311, instructions, brightness and contrast controls, and a view angle indicator. At the user interface 2300, once tomosynthesis scans have been captured, the user may be presented with pairs of CT 2320 and reconstructed tomography images 2310 from multiple orientations. As illustrated on the user interface 2300, within each set of images, the tomosynthesis images 2310 and the CT scans 2320 may be displayed with their corresponding view angle (e.g., the view angle is indicated in the upper left corner). Crosshairs 2315 may be displayed in the user interface 2300 across all scans for a user to mark the lesion selection. The layers of each scan may be parsed via the scrollbar 2313 overlaid on the tomosynthesis image, with the depth of the view 2311 being indicated within the image. The view may be reset to its default by clicking on the reset button. The instructions as well as the brightness and contrast controls may be provided underneath the scans to guide the user through the process and allow them to adjust the image views as needed.

[0193] FIG. 24 shows an example user interface 2400 of an augmented fluoroscopy panel. As illustrated, the user interface 2400 includes a user selected lesion location indicator and an augmented fluoroscopy toggle. In some cases, after the tomosynthesis process has been completed, an overlay of the target location is available (e.g., based on the target location determined from the tomosynthesis and projected onto the 2D fluoroscopic image as described in FIG. 11 and elsewhere herein), which may enable an augmented fluoroscopy feature 2401. The user selected lesion location will be indicated on the fluoroscopy panel as an overlay on the user interface 2400. The overlay can be toggled (e.g., by the user) via the augmented fluoroscopy toggle. However, if the augmented fluoroscopy toggle is enabled and no overlay is available (e.g., the camera pose may not be reconstructed), then no change will be displayed on the fluoroscopy view.

Detecting incorrect overlay location in augmented fluoroscopy

[0194] Performing augmented fluoroscopy alongside tomosynthesis may introduce unique challenges for maintaining accurate overlay alignment between the different imaging modalities. As described above, variations in physical conditions / states such as patient posture, breath-hold states, and the like may create discrepancies that affect procedural accuracy. The discrepancies may comprise, for example, a change of distance between a target area (e.g., lesion) and an anatomical structure (e.g., bone, airway, etc.) or other reference features, a change of size / shape of a target area, and the like due to the different physical states (e.g., different breath holding phases). For example, in bronchoscopy, a patient’s lung may be in different states between the 3D tomosynthesis imaging and the 2D live fluoroscopy, so that the lesion overlay location determined from the tomosynthesis is incorrect in the live fluoroscopy. In some cases, such discrepancy or misalignment between the tomosynthesis and live fluoroscopy caused by the different physical state of the subject may not be known or observable to the physician (i.e., user), and the physician may proceed with biopsy based on the incorrect lesion overlay in the live fluoroscopy.

[0195] The present disclosure provides systems and methods capable of determining whether the augmented live fluoroscopy displays the target overlay (e.g., lesion overlay) in a correct or incorrect location. The methods and / or algorithms herein may be able to quantitatively determine whether it is safe or acceptable to proceed with a surgical operation based on the augmented live fluoroscopy. In some cases, the methods herein may determine whether a location change associated with the target object is beyond an acceptable range. In some cases, the method may generate suggested actions to change a physical condition (e.g., breath holding state) of a subject prior to the user proceeding with a surgical operation based on the augmented live fluoroscopy.

[0196] In some embodiments, the method may determine whether the location change is within or beyond a predetermined threshold by tracking an anatomical reference feature in both the 3D tomosynthesis image and the 2D live fluoroscopy image. The reference feature may be selected to be an anatomical feature whose location may change with a change in physical state of the subject, for example, the diaphragm or a portion of the diaphragm. The reference feature may comprise any anatomical landmark, marker, fiducial, lesion, or any other visually identifiable structure that is visible across multiple imaging frames. Examples of a reference feature may include a naturally occurring anatomical landmark such as a diaphragm or a portion of a diaphragm, a distinctive vascular pattern, or a radiopaque marker introduced into the patient’s anatomy. The reference feature may have a movement or location change due to a physical condition / state change (e.g., breath holding state).

[0197] The method may then determine whether the lesion overlay’s location in the real-time or live fluoroscopy is correct or incorrect. By identifying and monitoring these reference features, the disclosed systems and methods can determine whether the displayed lesion overlay remains accurately placed, or whether a new fluoroscopy image / tomosynthesis image should be acquired as patient anatomy shifts due to changes in breath-hold states or other clinical conditions.

[0198] The present disclosure provides systems and methods for real-time fluoroscopy image motion detection. In particular, the motion detected in the live fluoroscopy may indicate an undesired location change of a target object (e.g., lesion) due to changes of physical conditions of the subject between a three-dimensional (3D) imaging and a two-dimensional (2D) live imaging. The method may be capable of identifying a location change of a feature in a 3D image and 2D image caused by a physical condition change rather than other factors (e.g., different viewing angles, image distortions, etc.). In some cases, the physical condition change may be identified by selecting, from a sequence of 2D images (that are processed for constructing the 3D image), a 2D image whose associated angle substantially matches a viewing angle of the 2D live image, and comparing the location of a feature in both the selected 2D image and the 2D live image.

[0199] In some embodiments, the disclosed systems and methods are configured for acquiring a sequence of two-dimensional (2D) fluoroscopy images containing a target feature. In some instances, the sequence of 2D fluoroscopy images is acquired at various angles. In some cases, the disclosed systems and methods are configured for reconstructing a three-dimensional (3D) image based on the sequence of 2D fluoroscopy images and the various angles. In some cases, the disclosed systems and methods are configured for acquiring live fluoroscope image frames at a specific angle. In some instances, the live fluoroscope image frames contain the target feature. In some instances, the disclosed systems and methods are configured for displaying an overlay of a projection of the target feature onto the live fluoroscope image frames. In some instances, the projection of the target feature is based at least in part on the target feature in the reconstructed 3D image. In some instances, the disclosed systems and methods are configured for determining whether the overlay is displayed at a correct location within the live fluoroscope image frames based at least in part on a displacement of a reference feature identified in one of the sequence of 2D fluoroscopy images and / or identified in the live fluoroscope image frames.
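The displacement-based determination can be sketched as a simple threshold test. This is a minimal sketch, assuming the reference feature (e.g., a point on the diaphragm) has already been located as a pixel coordinate in both the angle-matched 2D fluoroscopy image and the live frame; the function name, the pixel units, and the single scalar threshold are hypothetical choices and are not specified in the disclosure.

```python
import math


def overlay_is_valid(ref_in_matched_frame, ref_in_live_frame, threshold_px):
    """Compare reference-feature locations between two frames.

    ref_in_matched_frame: (x, y) of the reference feature in the 2D image
        from the tomosynthesis sweep whose angle matches the live view.
    ref_in_live_frame: (x, y) of the same feature in the live frame.
    threshold_px: assumed acceptable displacement, in pixels.

    Returns (is_valid, displacement_px).
    """
    dx = ref_in_live_frame[0] - ref_in_matched_frame[0]
    dy = ref_in_live_frame[1] - ref_in_matched_frame[1]
    displacement = math.hypot(dx, dy)
    return displacement <= threshold_px, displacement
```

A result of `False` would correspond to the case where the overlay is flagged as incorrectly located and reacquisition or a change in breath-hold state may be suggested.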

[0200] In some embodiments, determining the angle at which each two-dimensional fluoroscopy image is acquired during tomosynthesis may be facilitated by the use of markers such as bead patterns or other reference markers attached to the patient support structure. In some cases, these bead patterns may be displayed in each acquired fluoroscopy image, allowing the system to derive the imaging angle through geometric and computational analysis, rather than relying solely on mechanical angle sensors. In some instances, this angle determination ensures that each acquired 2D fluoroscopy image can be accurately associated with its corresponding viewpoint. The method may estimate the angle associated with each of the sequence of 2D fluoroscopy images for constructing the 3D tomosynthesis image. The method may select, from the sequence of 2D fluoroscopy images acquired at the various angles, a 2D fluoroscopy image with an angle that best matches the live fluoroscopy viewpoint, to determine whether the physical conditions between the tomosynthesis and the live fluoroscopy are consistent.
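Selecting the sweep image whose angle best matches the live viewpoint can be sketched as a nearest-angle search. This is an illustrative sketch only: it assumes each image's pose has already been reduced to a single estimated angle in degrees, which simplifies a full C-arm pose, and the function name and the optional `max_diff_deg` cutoff are hypothetical.

```python
def select_matching_frame(sweep_angles_deg, live_angle_deg, max_diff_deg=None):
    """Return (index, angle_difference_deg) of the best-matching sweep image.

    sweep_angles_deg: estimated angle of each 2D image in the sweep.
    live_angle_deg: estimated angle of the live fluoroscopy view.
    max_diff_deg: optional cutoff; if the best match differs by more than
        this, None is returned to signal that no image matches well enough.
    """
    if not sweep_angles_deg:
        raise ValueError("sweep_angles_deg must not be empty")

    def angular_diff(a, b):
        # Smallest absolute difference between two angles, with wrap-around.
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)

    best_index = min(
        range(len(sweep_angles_deg)),
        key=lambda i: angular_diff(sweep_angles_deg[i], live_angle_deg),
    )
    best_diff = angular_diff(sweep_angles_deg[best_index], live_angle_deg)
    if max_diff_deg is not None and best_diff > max_diff_deg:
        return None
    return best_index, best_diff
```

Comparing the reference feature only against an angle-matched image helps attribute any displacement to a physical state change rather than to a difference in viewing angle.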

[0201] FIG. 31 shows an example of inconsistent breath-hold states affecting lesion localization and guidance alignment. As depicted in FIG. 31, the left image 3101 is an augmented live fluoroscopy view showing a lesion overlay (e.g., a circle) 3102 with an indicator such as a (green) “Guidance Match” bar 3104, confirming correct location or good conditions for accurate lesion localization. In some instances, such stability indicates that patient posture and breathing patterns remain steady enough to correlate the target feature (e.g., lesion) identified in the tomosynthesis data with the one appearing in the live fluoroscopy frames. For example, maintaining a reliable breath-hold state reduces misalignment and enhances procedural confidence. In further examples, subtle patient coaching or minor imaging parameter adjustments may achieve this favorable alignment. As an example, real-time system feedback (e.g., to an operator) helps reinforce improved breath-hold states, ultimately enhancing lesion targeting accuracy.

[0202] In some embodiments, inconsistent breath-hold conditions produce noticeable lesion displacement and reduced localization reliability. As depicted in FIG. 31, the right image 3106 presents a lesion overlay (e.g., a blue circle) 3103 displayed at an incorrect location accompanied by an indicator such as a (red) “Guidance match” bar 3105, indicating deteriorated imaging conditions caused by patient movement or varying respiratory patterns. The contrast between stable conditions in the left image 3101 and unstable conditions in the right image 3106 underscores the importance of maintaining consistent breath-hold states during both pre-acquisition and live imaging. In some instances, real-time diaphragm monitoring, along with adjustments to imaging angles, patient posture, or real-time breath coaching, helps restore breath-hold consistency.

[0203] In some embodiments, the system verifies whether the subject’s physical condition corresponds to a breath-holding state by evaluating the subject’s respiratory parameters against a second physical condition threshold. In some cases, the system uses respiratory monitors to determine if movement remains below this threshold, thus maintaining stable imaging references. In some instances, by confirming that a breath-hold state is achieved, the system may reject inaccurate overlays that may otherwise mislead procedural decisions (e.g., FIG. 31, where stable breath-hold aligns with “GUIDANCE MATCH”). For example, if breath-hold parameters are not met, the system may discard current overlay assessments to prevent erroneous localization. In further examples, if breath parameters exceed the threshold, the system may initiate corrective actions, such as reacquiring images or generating alerts, to help re-establish a suitable breath-hold state. As an example, this approach may help ensure consistent positioning, accurate lesion targeting, and improved procedural outcomes.

[0204] FIG. 32 shows an example of a method 3200 for integrating tomosynthesis with real-time augmented fluoroscopy imaging and detecting a physical condition / state change (e.g., breath hold state 1, breath hold state 2) between the tomosynthesis and the augmented fluoroscopy. The method can be the same as those described above, such as with respect to FIG. 2. In some embodiments, the method 3200 comprises processing 2D X-ray or fluoroscopy images 3202 and associated poses 3203 during a first breath hold state 3201. In some cases, the system may also acquire live fluoroscopy frames 3206 during a second breath hold state 3204. The poses associated with a sequence of 2D X-ray or fluoroscopy images 3202 can be estimated using markers as described above.

[0205] For example, in the first breath hold state 3201, the method 3200 processes the sequence of 2D fluoroscope images 3202 and the associated poses 3203, and feeds them into a tomosynthesis module 3209, which reconstructs the acquired data into a 3D reconstructed tomosynthesis image 3208. The tomosynthesis module 3209 may perform 3D reconstruction using a sequence of 2D X-ray images 3202 taken from different orientations, different viewpoints and / or different angles as described elsewhere herein. In some instances, augmented fluoroscopy 3206 comprises projecting a target object (e.g., lesion) in the 3D reconstruction 3208 onto the live fluoroscopic images. For example, projecting the 3D lesion onto the 2D fluoroscopy image at a location in the 2D fluoroscopy image and displaying the projection as a lesion overlay can be the same as described with respect to FIG. 11. In some cases, the parameter “t” represents a timing or temporal parameter that correlates each acquired image 3202 with its corresponding X-ray pose 3203 at a particular moment during the first breath hold state 3201.

[0206] In the example of FIG. 32, during the second breath hold state 3204, an X-ray pose 3205 or viewpoint for acquiring the 2D live fluoroscopy is obtained and processed by an augmented fluoroscopy module 3206. This X-ray pose may be utilized for projecting the 3D lesion 3208 into the live fluoroscopy frames. Based on the lesion overlay displayed on the augmented fluoroscopy, a physician may perform various surgical operations such as biopsy procedures 3207 (e.g., lesion localization). In some instances, the output from the augmented fluoroscopy module 3206 guides a biopsy 3207, permitting accurate and precise procedures with confidence, as the displayed lesion location in the second breath hold state 3204 aligns with the lesion location in the first breath hold state 3201. However, the projection of the 3D lesion into the 2D live fluoroscopy image assumes that the first breath-hold state and the second breath-hold state are substantially the same. A change in the breath hold state can result in a change in the location of the lesion overlay. The method herein can automatically detect such a location change caused by the breath hold state change and determine whether new images should be acquired with a consistent breath hold state.
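
By way of a non-limiting illustration, the projection of a 3D lesion location into a 2D live fluoroscopy frame using an X-ray pose may be sketched as follows. The sketch assumes a simple pinhole model in which the pose is reduced to a single C-arm rotation angle about the patient's long axis; the function name, the geometry parameters (source-to-isocenter distance, source-to-detector distance, pixel spacing), and the sign convention of the rotation are hypothetical and are not taken from the disclosure.

```python
import math


def project_lesion(lesion_xyz_mm, c_arm_angle_deg,
                   source_to_iso_mm=700.0, source_to_detector_mm=1000.0,
                   pixel_spacing_mm=0.5, detector_center_px=(512.0, 512.0)):
    """Project a 3D lesion point (isocenter coordinates, mm) to detector pixels.

    Assumed geometry (illustrative only): the C-arm rotates about the y axis,
    and the beam axis is the rotated z axis passing through the isocenter.
    """
    x, y, z = lesion_xyz_mm
    theta = math.radians(c_arm_angle_deg)
    # Rotate the point into the frame of the X-ray source for this pose.
    x_rot = x * math.cos(theta) + z * math.sin(theta)
    z_rot = -x * math.sin(theta) + z * math.cos(theta)
    # Distance from the source to the point along the beam axis.
    depth_mm = source_to_iso_mm + z_rot
    if depth_mm <= 0:
        raise ValueError("lesion lies behind the X-ray source for this pose")
    # Perspective magnification onto the detector plane.
    magnification = source_to_detector_mm / depth_mm
    u = detector_center_px[0] + magnification * x_rot / pixel_spacing_mm
    v = detector_center_px[1] + magnification * y / pixel_spacing_mm
    return u, v
```

In this sketch, a lesion located exactly at the isocenter projects to the detector center for every angle, which is why any breath-hold-induced shift of the lesion away from its reconstructed position moves the true lesion away from the displayed overlay.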

[0207] In some embodiments, the method may comprise detecting such a location change or physical condition change by comparing the location of an anatomical reference feature captured in the 3D tomosynthesis image and the 2D live fluoroscopy image. Rather than projecting a 3D reconstructed mesh model of the anatomical reference feature into the 2D live fluoroscopy image, the method may identify the reference feature in a 2D fluoroscope image selected from the sequence of 2D fluoroscope images acquired at various angles, and compare the location of the same reference feature in the 2D live fluoroscope image and the selected 2D fluoroscope image. The 2D fluoroscope image is selected to have an angle best matching the angle or viewpoint of the live 2D fluoroscopy. Details about identifying the reference feature from the image are described later herein.

[0208] In some embodiments, the system may comprise a processor configured to establish a displacement threshold that defines a permissible range of displacement values for the reference feature. The threshold may be a predetermined limit on the acceptable positional shift of a reference feature between the 2D fluoroscope image acquired during tomosynthesis and a live fluoroscopy image. If the measured displacement exceeds this threshold, the system may determine that the target object overlay in the augmented live fluoroscopy is incorrect and prompt corrective actions or imaging adjustments.

[0209] In some embodiments, the method herein may be capable of identifying the location change caused by the subject’s physical state change. The method may comprise selecting the 2D fluoroscope image from the sequence of 2D fluoroscope images and / or registering the images to rule out location mismatch caused by other factors (e.g., different viewing angles, image distortion). For example, the method may register the selected 2D fluoroscope image to the 2D live fluoroscope image when there is no 2D fluoroscope image matching the viewpoint of the 2D live fluoroscope image, to eliminate discrepancy due to the angle difference. This beneficially allows for generating a recommendation for adjusting the physical state for retaking or re-acquiring the medical image data. In some cases, the system and method herein may generate a quantitative breath hold state difference. By comparing the breath hold state difference against a “breath-hold difference threshold”, the system may output an indicator of the validity or quality of the augmented live fluoroscopy imaging. As described later herein, the system may display, for example, a “POOR” indicator indicating that the lesion overlay does not accurately represent the lesion’s actual location when the breath hold state difference is above the breath-hold difference threshold. Details about the user interface for displaying the indicator and recommended corrective action are described later herein.

[0210] The breath-hold difference threshold is a predetermined limit used to evaluate whether the patient’s second respiratory state during the second imaging (e.g., augmented fluoroscopy) matches the state present during the first imaging (e.g., tomosynthesis imaging). Exceeding this threshold indicates a change in internal anatomy positioning due to altered breathing, potentially prompting re-establishment of the desired breath-hold state or re-acquisition of images to restore accurate overlay alignment.

[0211] In some cases, the system may comprise a memory storing instructions that compute a displacement metric between a location of the reference feature in the live fluoroscope image frame and the location of the reference feature in the selected 2D fluoroscope image from the tomosynthesis, and compare it to the displacement threshold. In some instances, the system may comprise a display configured to present the overlay as correct if the measured displacement remains at or below the displacement threshold.
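
As a minimal sketch of the comparison described above, the displacement metric may be computed as the mean distance between corresponding points of the reference feature in the two images. The choice of a mean point-to-point Euclidean distance, the function names, and the assumption that corresponding points are already matched are all illustrative and are not specified by the disclosure.

```python
import math


def displacement_metric(reference_points, live_points):
    """Mean Euclidean distance (pixels) between matched reference-feature points.

    reference_points: (x, y) points of the feature in the selected tomosynthesis frame.
    live_points: the corresponding (x, y) points in the live fluoroscope frame.
    """
    if not reference_points or len(reference_points) != len(live_points):
        raise ValueError("point lists must be non-empty and of equal length")
    total = sum(math.dist(p, q) for p, q in zip(reference_points, live_points))
    return total / len(reference_points)


def overlay_is_correct(reference_points, live_points, displacement_threshold_px):
    """The overlay is treated as correct when displacement is at or below the threshold."""
    return displacement_metric(reference_points, live_points) <= displacement_threshold_px
```

For instance, a diaphragm edge sampled at two points that moved by 3 and 5 pixels yields a metric of 4 pixels, which would pass a 4-pixel threshold and fail a 3.9-pixel threshold.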

[0212] In some embodiments, the location displacement may be visually displayed on a GUI for a user to visualize the difference. Alternatively, the location displacement may be calculated in the backend and not presented on the GUI.

[0213] FIGs. 29A-29C show an example of a user interface (UI) 2900 displaying the location displacement information. FIG. 29A shows an example of a user interface (UI) 2900 for a live or real-time fluoroscopy image along with the location displacement information or motion detection information. In some embodiments, the UI 2900 may comprise one or more image panels to display whether an overlay of a target object aligns correctly with a live fluoroscopy frame. For example, the UI may display information regarding whether a lesion overlay 2904 in the augmented fluoroscopy view 2901 is displayed at the correct location.

[0214] In the example of FIG. 29A, the GUI 2900 may include a display (e.g., a bottom panel) of an augmented live fluoroscopy view 2901 with a lesion overlay 2904 (e.g., red circle). The lesion overlay may be color coded (e.g., red) to indicate an incorrect location of the lesion overlay (i.e., the displacement is above the breath-hold difference threshold). In some cases, such conditions indicate that the patient’s breath-hold state differs from that of the pre-acquired tomosynthesis data, making the circumstances unsuitable to proceed with procedures such as biopsies. The UI may display an indicator such as “POOR” indicating that the current augmented fluoroscopy is not suitable for a biopsy procedure.

[0215] In the example of FIG. 29A, the GUI may comprise a display (e.g., an upper left panel) of a computed difference 2902 between the reference fluoroscopy frame (e.g., a 2D fluoroscopy frame selected from a tomosynthesis sequence that has an angle best matching the viewpoint of the augmented live fluoroscopy) and the live fluoroscopy frame 2901. The difference map 2902 may be a monochrome image showing the difference between the two images. In some cases, the difference map 2902 may be generated by computing a location difference of a reference feature (e.g., diaphragm) in both the reference fluoroscopy frame and the augmented live fluoroscopy. For example, the reference feature may be at least a portion of a diaphragm detected from the reference fluoroscopy frame and the augmented live fluoroscopy, and the locations of the detected reference feature are compared to generate the difference map.

[0216] In the example of FIG. 29A, the GUI may comprise a display (e.g., an upper right panel) of the reference fluoroscopy frame 2903 (e.g., a 2D fluoroscopy frame selected from a tomosynthesis sequence) overlaid with a contour 2905 (e.g., blue outline) of the location difference of the reference feature (e.g., diaphragm). The contour 2905 represents the location difference between the reference frame and the live frame, providing a visual measurement of displacement of the target feature (e.g., lesion) location or the reference feature (e.g., diaphragm). Any suitable methods and algorithms (e.g., contour-based, deep learning-based feature identification, feature-based motion detection, etc.) may be employed to compute the location difference or breath holding state difference. Details about computing the difference in the physical condition change (e.g., breath holding state change) are described later herein.

[0217] In some embodiments, the system determines viewpoints (angles) of the reference fluoroscopy frame 2903 based on the pose estimation (e.g., a tomosynthesis board and bead patterns as described above) to select a best matched reference frame 2903. The viewpoint of the live fluoroscopy frame 2901 may be obtained from the system (e.g., setup of the live fluoroscopy angle or configured by a user).

[0218] As illustrated, the “REFERENCE + TILT MATCH: POOR” indicator associated with the computed difference 2902 may indicate that the lesion overlay in 2901 may not accurately represent the lesion’s actual location due to the change in the physical state (e.g., breath holding state). For example, when the difference exceeds a predetermined threshold, “REFERENCE + TILT MATCH: POOR” indicates a misalignment, meaning that the displayed lesion location may be unreliable. As used herein, “POOR,” “GOOD,” and “GREAT” conditions refer to qualitative indicators of alignment quality. “POOR” may signify that measured displacement or angular discrepancy greatly exceeds thresholds, “GOOD” indicates improved but not ideal alignment, and “GREAT” represents minimal discrepancies and nearly ideal overlay placement. As described above, the thresholds determining the different misalignment levels can be manually set up by a user or automatically determined by the system.
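
As an illustration only, the mapping from a measured displacement to the qualitative indicators may be sketched with two thresholds. The two-threshold scheme, the numeric defaults, and the function names are hypothetical; the overlay colors follow the examples given for FIGs. 29A-29C (red, blue, and green circles).

```python
# Overlay colors as given in the examples of FIGs. 29A-29C.
OVERLAY_COLOR = {"POOR": "red", "GOOD": "blue", "GREAT": "green"}


def alignment_level(displacement_px, great_limit_px=5.0, good_limit_px=15.0):
    """Classify a displacement into "GREAT", "GOOD", or "POOR".

    great_limit_px and good_limit_px are assumed, user-configurable limits:
    at or below great_limit_px is "GREAT", at or below good_limit_px is "GOOD",
    and anything larger is "POOR".
    """
    if displacement_px < 0:
        raise ValueError("displacement must be non-negative")
    if great_limit_px > good_limit_px:
        raise ValueError("great_limit_px must not exceed good_limit_px")
    if displacement_px <= great_limit_px:
        return "GREAT"
    if displacement_px <= good_limit_px:
        return "GOOD"
    return "POOR"


def overlay_color(displacement_px, **limits):
    """Return the color used to draw the lesion overlay for this displacement."""
    return OVERLAY_COLOR[alignment_level(displacement_px, **limits)]
```

Keeping the limits as parameters reflects that the thresholds may be set by a user or determined by the system rather than fixed in advance.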

[0219] In some cases, the system may issue an alert indicating that conditions are not suitable for biopsy. In some instances, by employing the pose estimation algorithm (e.g., using a tomosynthesis module and bead patterns), the system may select a reference frame 2903 that best matches the live fluoroscopy angle based on the estimated poses and generate the difference based on a reference feature in the reference frame and the live fluoroscopy.

[0220] In some embodiments, the GUI may display both quantitative difference such as the difference map 2902, and difference contour 2905 as well as qualitative difference such as the indicator of poor, good or great match, or the color-coded target feature 2904. Displaying the visual indicators (e.g., 2902, 2904, and 2905) alongside the reference fluoroscopy frame 2903 allows the system to assess the patient’s breath-hold state and determine if improved alignment is needed before proceeding.

[0221] FIG. 29B illustrates a scenario showing a “GOOD” condition, representing reduced discrepancies in the computed difference 2902, contour 2905, and location difference 2904. In the example of FIG. 29B, the reduced discrepancies indicate accurate alignment between the reference frame and the live fluoroscopy frames 2901. In some cases, the lesion overlay circle 2904 may be color-coded to indicate an alignment level (e.g., a blue circle), confirming that the current breath-hold state matches the one during reference image acquisition. For example, under these “GOOD” conditions, the lesion is accurately aligned within the contour 2905, allowing image-based guidance tasks to proceed confidently. In further examples, the system may adjust parameters or re-check thresholds to refine lesion localization accuracy. As an example, with stable alignment and consistent breath-hold parameters, the target lesion location is reliably represented, allowing the biopsy or other interventional procedures to continue as planned.

[0222] FIG. 29C shows an example of a “GREAT” condition 2902, representing the highest level of consistent physical conditions. Under the great condition 2902, the computed difference 2902, contour 2905, and location difference 2904 may display minimal variation, indicating highly consistent physical states between the tomosynthesis imaging and the live fluoroscopy 2901. In some cases, the lesion overlay circle 2904 may be color-coded to indicate a great alignment (e.g., a green circle), showing that the current breath-hold state matches the one present during the reference image acquisition. In some instances, this stable alignment facilitates precise lesion localization, permitting the system to maintain high accuracy during interventional procedures such as biopsies.

[0223] In some cases, the system may comprise feedback circuitry that continuously monitors displacement over time. In some instances, the system operates in real time to verify alignment whenever updated measurements approach a first displacement threshold, ensuring ongoing stability and reliability in lesion localization. For example, the alignment indicator (e.g., the lesion overlay circle 2904) dynamically updates its color or position in response to real-time measurements of displacement or misalignment, providing immediate visual feedback to clinicians regarding the suitability of current conditions. As an example, the system may identify deviations from a predetermined displacement threshold and initiate necessary adjustments or recalibrations in real time, maintaining improved alignment and ensuring procedural accuracy and reliability.

[0224] In alternative embodiments, the UI may not display the quantitative difference information and / or the qualitative difference information. The displacement or physical state evaluation may be performed in the backend. FIG. 30 illustrates a user interface (UI) 3000 for an augmented fluoroscopy system. In some cases, the user interface 3000 displays a live fluoroscopy view integrated with a CT cross-section and a 3D airway reconstruction. In some embodiments, UI 3000 displays a real-time fluoroscopy frame, a target positioning indicator, and a 3D anatomical reference such as an airway reconstruction. In some instances, UI 3000 does not display backend difference maps or tomosynthesis overlays, as these processes occur in the background. In other instances, UI 3000 does display the backend difference maps or tomosynthesis overlays even though these processes occur in the background.

[0225] The UI 3000 may display any visual element (e.g., 3001, 3004, 3005, and 3006) that may assist with navigation accuracy and interventional procedures. In some instances, the visual element may integrate fluoroscopic, CT, and 3D anatomical references. Additionally, in some instances, the real-time positional data and distance measurements displayed, such as near 3005, may provide guidance for target lesion localization. In some cases, when a physical state change is detected or when the overlay location 3002 is determined to be incorrect, the UI 3000 may display a warning message or indicator prompting adjustment of the physical state.

[0226] In some embodiments, the system comprises at least one imaging module configured to acquire a sequence of two-dimensional (2D) fluoroscopy images containing a target feature. In some cases, the sequence of 2D fluoroscopy images is acquired at various angles. In some cases, the system comprises a reconstruction module configured to reconstruct a three-dimensional (3D) image based on the sequence of 2D fluoroscopy images and the various angles.

[0227] In some cases, the system comprises at least one fluoroscopy acquisition module configured to acquire live fluoroscope image frames at a specific angle. In some instances, the live fluoroscope image frames contain the target feature. For example, the system may comprise a display module configured to present an overlay of a projection of the target feature onto the live fluoroscope image frames. As an example, the projection of the target feature is based, at least in part, on the target feature in the reconstructed 3D image.

[0228] In some cases, the system may comprise at least one analysis module configured to determine whether the overlay is displayed at a correct location within the live fluoroscope image frames based at least in part on a displacement of a reference feature identified in both the sequence of 2D fluoroscopy images and the live fluoroscope image frames. In some instances, the system may comprise at least one condition verification module to verify that the correct location occurs when the patient’s physical condition — such as a breath-hold state — during acquisition of the sequence of 2D fluoroscopy images and acquisition of the live fluoroscope image frames is substantially the same.

[0229] In some embodiments, the system measures a displacement or location change of a reference feature (e.g., a preselected anatomical feature or any feature identified in the image), and compares the measured displacement values against a predetermined displacement threshold to detect a change in the physical state or detect an incorrect location of the target feature in the augmented fluoroscope image. For example, if the measured displacement value surpasses the predetermined displacement threshold, the system may present indicators (such as color-coded overlays, text labels, or warning messages) indicating that the overlay is not at the correct location. In further examples, this feedback provides immediate, visually interpretable information about overlay accuracy, prompting further analysis, adjustments, or corrections as needed to restore the consistent physical state (e.g., breath hold state).

[0230] In some cases, the system may dynamically adjust parameters or re-check thresholds to achieve and maintain these improved conditions. The displacement threshold or the breath-hold difference threshold may be manually set up by a physician. In some cases, the threshold may be manually configurable via a graphical user interface (GUI). Additionally or alternatively, the threshold may be automatically generated by the system. For instance, the threshold may be generated based on empirical data, and may be adjusted dynamically based on the specific application, types of organ / tissue imaged, the subsequent surgical operations to be performed (e.g., different operations may have different accuracy requirements) and / or the patient condition. In some cases, the threshold may be predicted by a machine learning algorithm trained model. The model may take as input the information about the medical imaging (e.g., at least one image frame, target tissue, operation type, imaging apparatus parameters, etc.) and output a threshold. In some cases, the model may be automatically updated based on feedback data. For instance, a model predicted threshold may be displayed on the GUI, and a user may adjust the threshold. The difference between the predicted threshold and the user inputted threshold may be used as feedback to further update the model parameters. This beneficially allows for a threshold adaptive to a user preference (e.g., different physicians may have different training or tolerances for variations in the physical conditions) or adaptive to a specific surgical operation (e.g., different surgical operations have different accuracy requirements).
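
The feedback loop described above may be sketched, under simplifying assumptions, as follows. A per-operation lookup table updated by a fixed learning rate stands in for the trained machine learning model, which the disclosure does not specify; the class name, the default threshold, and the learning rate are hypothetical.

```python
class ThresholdModel:
    """Toy stand-in for a threshold-predicting model, keyed by operation type."""

    def __init__(self, default_threshold_px=10.0, learning_rate=0.5):
        self.default_threshold_px = default_threshold_px
        self.learning_rate = learning_rate
        self._thresholds = {}  # operation type -> learned threshold (pixels)

    def predict(self, operation_type):
        """Return the threshold that would be displayed on the GUI."""
        return self._thresholds.get(operation_type, self.default_threshold_px)

    def update(self, operation_type, user_threshold_px):
        """Move the prediction toward the user's adjusted value.

        The difference between the predicted and user-entered threshold is the
        feedback signal; learning_rate controls how much of it is applied.
        """
        predicted = self.predict(operation_type)
        error = user_threshold_px - predicted
        updated = predicted + self.learning_rate * error
        self._thresholds[operation_type] = updated
        return updated
```

With a learning rate of 0.5, a prediction of 10 pixels that a physician repeatedly corrects to 6 pixels moves to 8 and then to 7, converging toward the preferred value without jumping to it after a single correction.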

[0231] In some instances, the system periodically recalibrates displacement thresholds to account for variations in patient posture, respiratory patterns, or environmental factors. For example, by continuously evaluating and updating these thresholds, the system maintains clinically useful imaging guidance and stable alignment throughout image-guided procedures. In some embodiments, the system adjusts the pre-determined threshold over time based at least in part on patient-specific or environmental factors, applying gradual calibration changes to maintain imaging reliability as conditions evolve. In some cases, the pre-determined threshold comprises a numerical value or range representing acceptable displacement limits for a reference feature, such as a diaphragm contour, relative to a previously acquired reference position. In some instances, the pre-determined threshold is selected from a set of pre-determined thresholds, each tailored to different clinical scenarios (e.g., varying patient conditions, procedural stages, or imaging equipment configurations). For example, the system may detect slight variations in patient posture or breathing patterns, prompting threshold refinements. The system may monitor patient trends over multiple imaging sessions and dynamically modify the pre-determined threshold accordingly, implementing small increments or decrements as needed to preserve imaging quality and overlay accuracy. In further examples, by continuously evaluating the physiological variations that affect breath-hold stability and lesion localization precision, the system remains responsive and effective. As an example, adapting the pre-determined threshold in this manner ensures that imaging guidance remains clinically relevant, reliable, and capable of accommodating changing conditions over the course of image-guided interventions.

[0232] In some embodiments, selecting an image from a sequence of two-dimensional (2D) fluoroscopy images comprises comparing each candidate image’s angle to a predetermined angle threshold. As used herein, an “angle threshold” is a predetermined acceptable angular difference between pre-acquired fluoroscopy images and the real-time fluoroscopy viewpoint. Selecting images with angular differences below this threshold ensures minimal distortion, reducing misalignment and improving lesion localization accuracy.

[0233] In some cases, by identifying images whose angles lie within the pre-determined angle threshold relative to the angle of the live fluoroscope image frames, the system ensures minimal mismatch and reduces the risk of “POOR” alignment conditions that are not due to a physical state change. As a result, the chosen image closely matches the current viewpoint, diminishing discrepancies and improving imaging states, as illustrated by scenarios in which conditions transition from “POOR” to “GOOD” or better. Moreover, by periodically adjusting the angle threshold and logging selected images, the system may dynamically refine accuracy and stability in lesion localization over time.

[0234] In some cases, the system may also consider at least one physical condition state, such as a stable breath-hold. Combining angle-based image selection with verification of the patient’s breath-hold state may further enhance overlay accuracy, ensuring that the displayed lesion location aligns with the patient’s actual anatomy. In some instances, by simultaneously verifying angular alignment and respiratory stability, the system maintains clinically relevant, live imaging guidance for precise and reliable lesion targeting during image-guided procedures.

[0235] The method may comprise selecting the 2D fluoroscope image from the sequence of 2D fluoroscope images and / or registering the images to rule out location mismatch caused by other factors (e.g., different viewing angles, image distortion). For example, the method may register the selected 2D fluoroscope image to the 2D live fluoroscope image when there is no 2D fluoroscope image matching the viewpoint of the 2D live fluoroscope image, to eliminate discrepancy due to the angle difference. This beneficially allows for generating a recommendation for adjusting the physical state for retaking or re-acquiring the medical image data. In some cases, the registration algorithm applies geometric transformation procedures to correct orientation mismatches and improve alignment fidelity once the angle difference surpasses the pre-determined angle threshold.

Location difference algorithm

[0236] As described above, the methods herein may quantitatively measure a physical state change between tomosynthesis imaging and augmented fluoroscopy based on a location difference or displacement of a reference feature identified in the image. The methods herein may employ algorithms (e.g., contour-based, deep learning-based feature identification, feature-based motion detection, etc.) to compute the location difference or breath holding state difference.

[0237] FIG. 33 shows an example of a system 3300 for detecting differences using a contour-based method or a feature-based method. In the example of FIG. 33, the system 3300 is configured to determine a physical state change by measuring discrepancies between 3D tomosynthesis and 2D augmented fluoroscopy (AF). As described above, the method may comprise selecting a 2D fluoroscope reference image from the sequence of fluoroscope reference images from a tomosynthesis sweep 3301. The 2D fluoroscope reference image may be selected to have an angle best matching the viewing angle of the AF. In this example, a find best match algorithm 3301 may be executed to select, from a sequence of fluoroscopy images, the reference image that best matches the augmented fluoroscopy (AF) viewpoint. For instance, the algorithm may determine how closely the live viewpoint (e.g., AF viewpoint or view angle) matches the angle associated with a reference image. The angles associated with the sequence of fluoroscopy images may be obtained using the pose estimation algorithm as described above. The find best match algorithm may find the estimated pose that is closest to the AF viewpoint and use the 2D fluoroscope image associated with the estimated pose as the reference image. In some cases, the find best match algorithm may further determine a difference between the closest estimated pose and the AF viewpoint, and if the difference is above an angle threshold, the find best match algorithm may perform image registration between the reference image and the AF image prior to computing the image difference 3304 or correcting global motion 3308.
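
A minimal sketch of the find best match step is given below. It assumes, for illustration, that each estimated pose is reduced to a single rotation angle in degrees; the function name, the return format, and the use of an absolute angular difference are assumptions rather than details of the disclosed algorithm.

```python
def find_best_match(sweep_angles_deg, af_angle_deg, angle_threshold_deg):
    """Select the tomosynthesis frame whose estimated angle is closest to the AF view.

    Returns (index of the reference frame, angular difference in degrees,
    flag indicating whether registration is recommended before differencing).
    """
    if not sweep_angles_deg:
        raise ValueError("the tomosynthesis sweep contains no frames")
    # Pick the frame with the smallest absolute angular difference.
    best_index = min(range(len(sweep_angles_deg)),
                     key=lambda i: abs(sweep_angles_deg[i] - af_angle_deg))
    angle_difference = abs(sweep_angles_deg[best_index] - af_angle_deg)
    # Registration is only suggested when the residual angle exceeds the threshold.
    needs_registration = angle_difference > angle_threshold_deg
    return best_index, angle_difference, needs_registration
```

For a sweep estimated at -20, -10, 0, 10, and 20 degrees and a 2 degree angle threshold, an AF view at 9.5 degrees selects the 10 degree frame without registration, while an AF view at 7 degrees selects the same frame but flags it for registration.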

[0238] A difference map 3307 may be generated by a contour-based method 3302, quantitatively showing the difference in location of a reference feature. As shown, the contour-based method 3302 may visually represent misalignment by analyzing the difference in location of a reference feature in the tomosynthesis reference image and the AF image. As described elsewhere herein, the reference feature may be an anatomical feature (e.g., at least a portion of a diaphragm) whose location is affected by a change in the physical state of a patient (e.g., different breath holding states result in different distances between tissues). The reference feature may have a known shape, size or general location, and thus the reference feature may be identified from the tomosynthesis reference image and the AF image using any suitable segmentation or object recognition method. For instance, the reference feature’s boundary may be identified by applying edge detection algorithms to the image frame, creating a "contour" around the reference feature. The system may attempt to match the detected contour in the two images, and measure an image difference 3304.

[0239] In some embodiments, the disclosed systems and methods employ a morphology-based segmentation approach that refines anatomical structures in fluoroscopy images 3305. As used herein, “morphology-based segmentation” involves using image processing techniques, such as erosion and dilation, to refine structures within fluoroscopy frames. These operations remove small noise elements and isolate stable anatomical landmarks suitable as reference features. Alternatively, deep learning-based segmentation may be employed utilizing advanced neural network models trained on diverse patient datasets, allowing robust feature detection and segmentation even in complex or variable imaging conditions. Integrating both approaches offers a balanced segmentation pipeline that can adapt to a wide range of anatomical and clinical scenarios.

[0240] In some cases, conventional image processing operations like erosion or dilation remove noise and isolate stable reference features. In some instances, these reference features serve as anchors for tracking, allowing the operator to maintain accurate overlays even if patient conditions shift. For example, once the system identifies a reliable reference feature, it may monitor that feature’s position over time. In further examples, relying on morphology-based segmentation results in improved stability in overlay placement. As an example, continuous refinement of segmented features supports sustained accuracy in lesion localization.

[0241] In some embodiments, the disclosed systems and methods incorporate deep learning-based segmentation techniques to handle complex anatomical scenarios and adapt to patient-specific imaging conditions. In some cases, these advanced models detect and segment reference features with a high degree of accuracy, even under challenging conditions. In some instances, this ensures consistent reference points for calculating displacement and maintaining correct overlay alignment. For example, using deep learning models may improve robustness against patient movements or variations in imaging angles. In further examples, integrating both morphology-based and deep learning-based approaches provides a balanced segmentation pipeline that remains effective across diverse clinical settings. As an example, the synergy of conventional and deep learning methods provides, in real time or near real time, improved reliability and sustained guidance accuracy.

[0242] In some instances, the contour-based approach comprises performing image differencing 3304 to detect pixel-level disparities of the reference feature, then performing morphological operations 3305 to enhance meaningful regions and remove noise. In some instances, the system 3300 then draws a contour around the resultant mask 3306, producing a drawn contour 3307 that directly outlines areas of positional deviation. In some cases, quantitative information such as the contour region of the difference (e.g., contour area, pixels enclosed by contour, etc.) may be displayed on a GUI allowing a user to visualize the physical state change in each augmented fluoroscope frame. FIG. 34 and FIG. 35 show examples of the GUI displaying the location difference information using the contour-based method.
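By way of non-limiting illustration only, the contour-based steps described above (image differencing 3304, morphological operations 3305, and contour generation 3306, 3307) may be sketched as follows. The function names, the nested-list image representation, the intensity threshold of 30, and the 3x3 structuring element are assumptions introduced for this example and are not taken from the disclosure.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale frame or binary mask as rows of ints


def difference_mask(ref: Image, live: Image, thresh: int = 30) -> Image:
    """Image differencing: 1 where |ref - live| exceeds thresh, else 0."""
    return [[1 if abs(a - b) > thresh else 0 for a, b in zip(r_row, l_row)]
            for r_row, l_row in zip(ref, live)]


def _morph3x3(mask: Image, erode: bool) -> Image:
    """3x3 erosion (erode=True) or dilation (erode=False); outside the image counts as 0."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[y + dy][x + dx]
                      if 0 <= y + dy < h and 0 <= x + dx < w else 0
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(all(window)) if erode else int(any(window))
    return out


def open_mask(mask: Image) -> Image:
    """Morphological opening: erosion then dilation, which removes small specks."""
    return _morph3x3(_morph3x3(mask, erode=True), erode=False)


def contour_pixels(mask: Image) -> List[Tuple[int, int]]:
    """Mask pixels (row, col) that touch background in a 4-neighbourhood."""
    h, w = len(mask), len(mask[0])

    def is_background(y: int, x: int) -> bool:
        return not (0 <= y < h and 0 <= x < w) or mask[y][x] == 0

    return [(y, x) for y in range(h) for x in range(w)
            if mask[y][x] and any(is_background(y + dy, x + dx)
                                  for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)))]


def enclosed_area(mask: Image) -> int:
    """Number of mask pixels, i.e. the quantity a GUI could plot per frame."""
    return sum(map(sum, mask))
```

In such a sketch, the value returned by `enclosed_area` corresponds to the pixel count that a GUI could display for each augmented fluoroscopy frame, and `contour_pixels` gives the outline that could be drawn over the image.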

[0243] FIG. 34 shows an example of a user interface 3400 for monitoring and assessing alignment conditions during imaging procedures. In the example of FIG. 34, the user interface 3400 displays the number of pixels enclosed by the contour 3401, providing a quantitative measurement of the detected area. In some instances, a plot of the contour area over time 3402 shows changes across sequential frames to track alignment stability. In some instances, a scale for controlling the frame rate 3403 allows the operator to adjust imaging speed for live monitoring, while a button to start the procedure 3404 initiates the imaging process and associated alignment evaluations. In some instances, an alignment indicator 3405 provides immediate feedback on imaging quality, displaying conditions such as an alignment level of “GREAT” or “POOR.” In some instances, a visualization of the tomosynthesis and augmented fluoroscopy frame differences 3406 is provided to assess alignment. In some instances, a contour overlay 3407 highlights positional discrepancies on the tomosynthesis image, and the current augmented fluoroscopy frame 3408 presents real-time imaging data for comparison. As shown, an ellipse indicator 3409 visually signals alignment status, where red indicates poor alignment and green indicates good alignment. Collectively, this interface allows the operator to monitor alignment parameters, assess discrepancies, and maintain imaging accuracy throughout the procedure.

[0244] FIG. 35 illustrates an example of a user interface 3500 for monitoring alignment and contour conditions during imaging procedures. In the example of FIG. 35, the system status and alignment condition are displayed at 3501, where the indicator (e.g., “GREAT”) signifies improved reference and tilt match alignment. In the example of FIG. 35, the current augmented fluoroscopy frame 3502 includes a lesion overlay (e.g., red circle) for tracking and localization, while a grid pattern 3503 provides spatial references for motion assessment. In the example of FIG. 35, a contour overlay 3504 is applied to the tomosynthesis image, highlighting positional discrepancies or shifts in alignment, and the current tomosynthesis reference frame 3505 displays the contour for visual comparison to ensure alignment accuracy. In some instances, this combination of visual feedback, overlays, and quantitative indicators supports precise lesion localization and real-time system alignment during imaging procedures.

[0245] Alternatively, the system may employ a feature-based method 3303 to quantify displacement based on features detected in the image (rather than a pre-known feature), complementing the contour-based visualization with numerical data. In some embodiments, the systems and methods disclosed herein employ feature-based methods to identify reference points or descriptors from live fluoroscopy frames. In some cases, quantifying the displacement of these reference points allows the system to determine when overlay adjustments are necessary. In some instances, real-time semantic extraction provides immediate feedback if patient movements, breath-hold variations, or angle misalignments occur. For example, detecting reference point shifts guides the operator to correct conditions before misalignment becomes severe. In further examples, this immediate awareness aids the operator in maintaining stable procedural conditions and improved lesion targeting outcomes. As an example, continuous reference point monitoring complements segmentation strategies and ensures timely operator intervention when needed.

[0246] The feature-based method may comprise correcting a global motion 3308 between tomosynthesis and fluoroscopy to address large-scale misalignments. Next, the feature-based method finds feature matches around the scope tip 3309, establishing reference pairs that allow accurate reference point (key point) displacement computation 3310, thus determining how specific points have moved. Unlike approaches that rely on a known anatomical feature (e.g., the diaphragm), the reference features or key points herein can be any salient features such as edges, corners, or other distinct points that are near the scope tip. The key points may be identified using any suitable method such as Canny edge detection, Harris corner detection, or other algorithms designed to find areas with high intensity gradients or local variations.

[0247] In some instances, the system 3300 then generates a feature-based map 3311 that consolidates computed displacements into a structured output, guiding further adjustments and ensuring stable, accurate lesion localization. After the feature extraction and optional morphological refinement (refine the feature shapes, remove small noise points, and improve the feature boundaries for better tracking), the features from the reference image are matched with corresponding features in the AF image based on their spatial proximity and intensity similarity. By analyzing the displacement of matched features between frames, the motion vectors representing the movement of key points 3310, 3311 can be calculated.
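By way of non-limiting illustration only, matching key points by spatial proximity and intensity similarity and deriving motion vectors from the matched pairs may be sketched as follows. The greedy nearest-match strategy, the `KeyPoint` structure, the distance and intensity limits, and the weighting factor are assumptions introduced for this example; the disclosure does not specify a particular matching algorithm.

```python
import math
from typing import List, NamedTuple, Tuple


class KeyPoint(NamedTuple):
    x: float          # column position in pixels
    y: float          # row position in pixels
    intensity: float  # local intensity used as a simple similarity descriptor


Vector = Tuple[float, float]


def match_and_displace(ref: List[KeyPoint], live: List[KeyPoint],
                       max_dist: float = 10.0,
                       max_intensity_gap: float = 20.0,
                       w_intensity: float = 0.1) -> List[Tuple[KeyPoint, Vector]]:
    """Greedily pair each reference key point with the closest, most similar live key point.

    Returns (reference key point, (dx, dy)) for every pair that passes both limits.
    """
    used = set()
    vectors = []
    for kp in ref:
        best, best_cost = None, math.inf
        for j, cand in enumerate(live):
            if j in used:
                continue
            dist = math.hypot(cand.x - kp.x, cand.y - kp.y)
            gap = abs(cand.intensity - kp.intensity)
            if dist > max_dist or gap > max_intensity_gap:
                continue  # too far away or too dissimilar to be the same feature
            cost = dist + w_intensity * gap
            if cost < best_cost:
                best, best_cost = j, cost
        if best is not None:
            used.add(best)
            vectors.append((kp, (live[best].x - kp.x, live[best].y - kp.y)))
    return vectors


def mean_magnitude(vectors: List[Tuple[KeyPoint, Vector]]) -> float:
    """Average motion-vector length in pixels (0.0 when nothing matched)."""
    if not vectors:
        return 0.0
    return sum(math.hypot(dx, dy) for _, (dx, dy) in vectors) / len(vectors)
```

The list returned by `match_and_displace` is one possible form of a feature-based map: each entry can be rendered as an arrow anchored at the reference key point, and `mean_magnitude` gives a single number summarizing the motion near the scope tip.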

[0248] FIG. 36 and FIG. 37 show examples of the feature-based method and the motion map generated by the method. FIG. 36 illustrates how motion field visualization modifies lesion displacement assessment. In the top scenario, smaller (blue) arrows 3601 indicate minimal movement and stable alignment, resulting in unnoticeable lesion displacement 3604. In the bottom scenario, larger (blue) arrows 3602 represent more substantial movement and increased misalignment, with 3603 showing noticeable lesion displacement. In some instances, by comparing these conditions, the system detects alignment discrepancies in real-time or near real-time, allowing timely adjustments to maintain accurate lesion localization.

[0249] FIG. 37 presents a sequence of images 3700 at different time points. The motion field 3701 (blue arrows at t=0) measures movement within the augmented fluoroscopy frame, while the tomosynthesis frame 3702 and augmented fluoroscopy frame 3703 provide reference and live data, respectively. A computed difference map 3704 highlights pixel-level discrepancies, and a threshold difference map 3705 isolates areas of significant misalignment. In the subsequent row, a motion field 3706, tomosynthesis frame 3707, augmented fluoroscopy frame 3708, difference map 3709, and threshold difference map 3710 repeat the analysis at another time point. By comparing these sets of images over time, the system refines imaging conditions, ensuring stable alignment and precise lesion targeting as clinical factors evolve.

[0250] In some cases, the method may combine the contour-based method’s visual delineation of discrepancies with the feature-based method’s quantitative analysis to improve a measurement result. For example, the system 3300 and its “find best match” module 3301 achieve refined alignment and reliable conditions for image-guided interventions. In some instances, the contour-based method alone can highlight misalignments and facilitate corrections. In some instances, the feature-based method alone can numerically confirm object positions; each approach individually provides a sufficient measure of alignment accuracy if needed. In some instances, employing both methods together offers a robust, adaptable solution that utilizes their complementary strengths to further enhance accuracy and reliability during image-guided procedures.

[0251] In some embodiments, the system includes an alignment evaluation module configured to assess and correct misalignment between tomosynthesis-based reference images and real-time fluoroscopy images. This module may combine a contour-based component, which identifies and visualizes positional discrepancies through image differencing, morphological operations, and contour delineation, with a feature-based component, which computes reference point displacements, applies global motion correction, and performs feature matching to refine alignment calculations. In some instances, by integrating these complementary techniques — visual cues from contour-based methods and quantitative analyses from feature-based methods — the system can detect misalignment, dynamically adjust imaging parameters or thresholds, retrigger image acquisitions, and produce final displacement outputs that guide corrective actions. As a result, alignment conditions are continuously monitored and maintained in real-time or near real-time, ensuring that the displayed lesion location remains accurate and suitable for interventional procedures.

[0252] In some embodiments, the reference feature comprises a diaphragm and / or another anatomical landmark suitable for detecting positional changes within the patient.

[0253] In some embodiments, the system measures the displacement of a reference feature using a contour-based object tracking algorithm. In some cases, the contour-based object tracking algorithm is configured for detecting edges and tracking shifts in the reference feature’s shape or position in real-time or near real-time. In some instances, if the measured displacement surpasses a predetermined contour threshold, the system recognizes misalignment and initiates adjustments.
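By way of non-limiting illustration only, comparing a measured reference-feature displacement against a predetermined contour threshold may be sketched as follows. The use of the mask centroid as the tracked position, the function names, and the threshold value are assumptions introduced for this example; a contour-based tracker could equally track edges or shape changes.

```python
import math
from typing import List, Tuple

Mask = List[List[int]]  # binary segmentation of the reference feature


def centroid(mask: Mask) -> Tuple[float, float]:
    """Centre of mass (x, y) of the nonzero pixels in a binary mask."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        raise ValueError("reference feature not found in mask")
    n = len(points)
    return sum(x for x, _ in points) / n, sum(y for _, y in points) / n


def measure_displacement(ref_mask: Mask, live_mask: Mask,
                         threshold_px: float) -> Tuple[float, bool]:
    """Return (displacement in pixels, True if it surpasses the threshold)."""
    (x0, y0), (x1, y1) = centroid(ref_mask), centroid(live_mask)
    displacement = math.hypot(x1 - x0, y1 - y0)
    return displacement, displacement > threshold_px
```

A caller could treat a `True` flag as the misalignment condition that initiates adjustments.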

[0254] In some embodiments, the system refines thresholds for patient anatomy to enhance contour-based tracking accuracy. In some cases, this calibration accounts for variations in patient structure or imaging conditions. In some instances, the system archives contour displacement calculations over time to inform dynamic threshold adjustments. For example, as repeated measurements accumulate, the system may analyze historical data to identify patterns and adapt thresholds accordingly. In further examples, continuously monitoring trends in contour-based outputs allows the system to maintain stringent quality assurance protocols. As an example, iterative refinement of contour thresholds leads to robust, high-fidelity alignment throughout imaging and procedural workflows, ensuring stable conditions that support accurate lesion localization.
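By way of non-limiting illustration only, adapting a contour threshold from archived displacement measurements may be sketched as follows. The rule used here (mean plus a multiple of the standard deviation over a rolling window), the class name, and all default values are assumptions introduced for this example; the disclosure does not specify how historical data are converted into a threshold.

```python
from collections import deque
from statistics import mean, pstdev


class AdaptiveThreshold:
    """Rolling threshold derived from archived displacement measurements."""

    def __init__(self, default_px: float = 5.0, k: float = 2.0,
                 window: int = 50, min_samples: int = 5) -> None:
        self.default_px = default_px      # used until enough history exists
        self.k = k                        # how many standard deviations to allow
        self.min_samples = min_samples
        self.history = deque(maxlen=window)

    def record(self, displacement_px: float) -> None:
        """Archive one displacement measurement."""
        self.history.append(displacement_px)

    def current(self) -> float:
        """Threshold in pixels given the history archived so far."""
        if len(self.history) < self.min_samples:
            return self.default_px
        return mean(self.history) + self.k * pstdev(self.history)

    def exceeds(self, displacement_px: float) -> bool:
        """True when a new measurement surpasses the current threshold."""
        return displacement_px > self.current()
```

In a sketch of this kind, it may be preferable to record only measurements taken while alignment is known to be acceptable, so that episodes of misalignment do not inflate the threshold.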

[0255] In some embodiments, the system comprises at least one workflow management module. In some cases, this module includes a motion assessment and correction component that integrates global motion correction with localized lesion alignment methods. By continuously assessing and correcting motion while aligning the lesion with tomosynthesis-based reference data in real-time or near real-time, the workflow management module maintains stable imaging conditions and precise lesion targeting. This adaptive approach improves efficiency and enhances patient outcomes.

[0256] In some embodiments, the system comprises at least one motion field visualization module. In some cases, this module includes a motion representation component that displays motion fields with arrows (e.g., blue) indicating movement magnitude and direction. Differentiating minimal motion (smaller arrows) from substantial movement (larger arrows) provides immediate insight into alignment quality. This visualization facilitates timely adjustments to maintain accuracy, guiding parameter refinement for stable alignment. Examples of the motion field are illustrated in FIG. 36 and FIG. 37.

[0257] FIG. 38 illustrates a system 3800 for evaluating augmented fluoroscopy imaging accuracy under respiratory gating. A CT reference image 3801, combined with a reference point 3802, establishes a baseline for comparing real-time imaging data. Two fluoroscopy frames are displayed side-by-side: the left panel 3803 indicates a “Guidance Match” condition (e.g., green bar) signifying stable imaging, while the right panel 3806 indicates a “Guidance Mismatch” condition (e.g., red bar) signifying lesion displacement due to respiratory motion or misalignment. A caption at the top may help operators easily identify these conditions. The lesion overlay 3807 may be displayed in both panels, allowing immediate assessment of the target location’s consistency across varying breath-hold states. By integrating CT reference data with live augmented fluoroscopy, the system promptly detects respiratory-induced misalignment, providing essential feedback for accurate lesion localization and improved procedural outcomes.

[0258] In some embodiments, the system comprises at least one alignment verification module. In some cases, the at least one alignment verification module comprises at least one respiratory gating component. In some instances, the at least one respiratory gating component comprises comparing augmented fluoroscopy frames with a reference computed tomography (CT) image 3801. For example, the at least one respiratory gating component may be configured to maintain consistent lesion localization under varying breath-hold states. In further examples, the system verifies alignment accuracy by assessing real-time imaging conditions against pre-acquired CT reference data.

[0259] In some embodiments, the system comprises at least one fluoroscopy display module. In some cases, the at least one fluoroscopy display module comprises at least one lesion indicator component. In some instances, the at least one lesion indicator component comprises presenting a fluoroscopy frame 3802 with a blue lesion overlay, indicating the target’s position. In some cases, the at least one fluoroscopy display module comprises at least one guidance evaluation component. In some instances, the at least one guidance evaluation component comprises displaying a guidance scale 3804 to reflect alignment accuracy. For example, when alignment matches established references, the system shows a “GUIDANCE MATCH” near the green end of the scale. In further examples, if misalignment occurs 3803, the system may adjust imaging parameters, issue alerts, or re-trigger imaging, shifting the guidance scale toward red to reflect challenging conditions.

[0260] In some embodiments, the system comprises at least one adaptive respiratory evaluation module. In some cases, the at least one adaptive respiratory evaluation module comprises at least one breathing pattern evaluation component. In some instances, the at least one breathing pattern evaluation component comprises analyzing changes in patient breathing to adapt alignment conditions. For example, the at least one adaptive respiratory evaluation module may continuously assess augmented fluoroscopy data to sustain accurate lesion localization. In further examples, this adaptive approach ensures reliable real-time imaging, maintaining precise lesion targeting and improved procedural outcomes despite respiratory variations.

[0261] In some embodiments, the system comprises at least one physical state comparison module. In some cases, this module includes at least one state difference detection component. In some instances, the component comprises determining differences in the subject’s physical state (e.g., a breath-hold state) between tomosynthesis image and live 2D fluoroscopy images. For example, detecting changes in respiratory conditions informs the system about alignment quality. In further examples, recognizing these differences allows for adaptive adjustments to maintain procedural accuracy.

[0262] In some embodiments, the system compares the subject’s physical state against a physical state difference threshold. As used herein, a “breath-hold difference threshold” is a predetermined limit that evaluates whether the patient’s current respiratory pattern matches the state present during pre-acquisition imaging. Exceeding this threshold indicates a change in internal anatomy positioning due to altered breathing, potentially prompting re-establishment of the desired breath-hold state or re-acquisition of images to restore accurate overlay alignment.
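By way of non-limiting illustration only, evaluating a breath-hold difference threshold using the diaphragm as the reference feature may be sketched as follows. Locating the diaphragm as the strongest step in a vertical intensity profile, the pixel spacing parameter, and the advisory strings are assumptions introduced for this example and are not taken from the disclosure.

```python
from typing import Sequence, Tuple


def edge_row(profile: Sequence[float]) -> int:
    """Row index just above the largest intensity step in a vertical profile.

    The profile is assumed to hold one mean intensity per image row, taken over
    a band of columns that crosses the diaphragm.
    """
    if len(profile) < 2:
        raise ValueError("profile needs at least two rows")
    steps = [abs(profile[i + 1] - profile[i]) for i in range(len(profile) - 1)]
    return max(range(len(steps)), key=steps.__getitem__)


def breath_hold_check(ref_profile: Sequence[float], live_profile: Sequence[float],
                      pixel_spacing_mm: float,
                      threshold_mm: float) -> Tuple[float, bool, str]:
    """Compare diaphragm position between the reference and the live frame."""
    shift_mm = abs(edge_row(live_profile) - edge_row(ref_profile)) * pixel_spacing_mm
    exceeded = shift_mm > threshold_mm
    advice = ("re-establish breath-hold or re-acquire images" if exceeded
              else "overlay alignment acceptable")
    return shift_mm, exceeded, advice
```

The returned advice string stands in for whatever prompt a GUI might present to the operator.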

[0263] In some cases, if variations exceed the threshold, the system indicates misalignment and prompts corrective measures, such as adjusting imaging parameters or re-acquiring images. In further examples, reconfiguring the threshold accommodates patients struggling to maintain breath-holds, preserving the clinical usefulness of imaging guidance. As an example, documenting threshold exceedances provides iterative refinement of imaging strategies, continually improving outcomes. The system and method herein may be capable of identifying location displacement of the target feature due to a physical state change and generating a recommendation (on the GUI) prompting a physician to adjust imaging parameters or instruct a patient to properly hold breath prior to acquiring the augmented fluoroscopy image.

Computer System

[0264] The present disclosure provides computer systems that are programmed to implement methods of the disclosure. FIG. 27 shows a computer system 2701 that is programmed or otherwise configured to operate any method, system, process, or technique described herein (such as systems or methods of generating tomosynthesis reconstructions or augmented fluoroscopy, described herein). For example, the user interface 2740 may present one or more of the user interfaces described with respect to FIGs. 19-38.

[0265] The computer system 2701 can regulate various aspects of the present disclosure, such as, for example, techniques for tomosynthesis (e.g., tomosynthesis reconstruction) or fluoroscopy (e.g., augmented fluoroscopy). The computer system 2701 can be an electronic device of a user or a computer system that is remotely located with respect to the electronic device. The electronic device can be a mobile electronic device.

[0266] The computer system 2701 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 2705, which can be a single core or multi core processor, or a plurality of processors for parallel processing. The computer system 2701 also includes memory or memory location 2710 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 2715 (e.g., hard disk), communication interface 2720 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 2725, such as cache, other memory, data storage or electronic display adapters. The memory 2710, storage unit 2715, interface 2720 and peripheral devices 2725 are in communication with the CPU 2705 through a communication bus (solid lines), such as a motherboard. The storage unit 2715 can be a data storage unit (or data repository) for storing data. The computer system 2701 can be operatively coupled to a computer network (“network”) 2730 with the aid of the communication interface 2720. The network 2730 can be the Internet, an internet or extranet, or an intranet or extranet that is in communication with the Internet. The network 2730 in some cases is a telecommunication or data network. The network 2730 can include one or more computer servers, which can allow distributed computing, such as cloud computing. The network 2730, in some cases with the aid of the computer system 2701, can implement a peer-to-peer network, which may allow devices coupled to the computer system 2701 to behave as a client or a server.

[0267] The CPU 2705 can execute instructions on computer-readable media, which can be embodied in a program or software. The instructions may be stored in a memory location, such as the memory 2710. The instructions can be directed to the CPU 2705, which can subsequently program or otherwise configure the CPU 2705 to implement methods of the present disclosure. Examples of operations performed by the CPU 2705 can include fetch, decode, execute, and writeback.

[0268] The CPU 2705 can be part of a circuit, such as an integrated circuit. One or more other components of the system 2701 can be included in the circuit. In some cases, the circuit is an application specific integrated circuit (ASIC).

[0269] The storage unit 2715 can store files, such as drivers, libraries, and saved programs. The storage unit 2715 can store user data, e.g., user preferences and user programs. The computer system 2701 in some cases can include one or more additional data storage units that are external to the computer system 2701, such as located on a remote server that is in communication with the computer system 2701 through an intranet or the Internet.

[0270] The computer system 2701 can communicate with one or more remote computer systems through the network 2730. For instance, the computer system 2701 can communicate with a remote computer system of a user (e.g., a medical device operator). Examples of remote computer systems include personal computers (e.g., portable PC), slate or tablet PCs (e.g., Apple® iPad, Samsung® Galaxy Tab), telephones, smart phones (e.g., Apple® iPhone, Android-enabled device, Blackberry®), or personal digital assistants. The user can access the computer system 2701 via the network 2730.

[0271] Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the computer system 2701, such as, for example, on the memory 2710 or electronic storage unit 2715. The instructions, which may be code stored on the computer-readable media, can be provided in the form of software. During use, the code can be executed by the processor 2705. In some cases, the code can be retrieved from the storage unit 2715 and stored on the memory 2710 for ready access by the processor 2705. In some situations, the electronic storage unit 2715 can be precluded, and machine-executable instructions are stored on memory 2710.

[0272] The code can be pre-compiled and configured for use with a machine having a processor adapted to execute the code or can be compiled during runtime. The code can be supplied in a programming language that can be selected to allow the code to execute in a pre-compiled or as-compiled fashion.

[0273] Aspects of the systems and methods provided herein, such as the computer system 2701, can be embodied in programming. Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of computer-readable media storing instructions as code or associated data that is carried on or embodied in a type of computer-readable media. Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk. “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may allow loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server. Thus, another type of media that may bear the software elements includes optical, electrical, and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links. The physical elements that carry such waves, such as wired or wireless links, optical links, or the like, also may be considered as media bearing the software. As used herein, unless restricted to non-transitory, tangible “storage” media, terms such as computer or machine “readable media” refer to any medium or media that participates in providing instructions to a processor for execution.

[0274] Hence, a computer-readable medium, such as computer-executable code, may take many forms, including but not limited to, a tangible storage medium, a carrier wave medium or physical transmission medium. Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as may be used to implement the databases, etc. shown in the drawings. Volatile storage media include dynamic memory, such as main memory of such a computer platform. Tangible transmission media include coaxial cables; copper wire and fiber optics, including the wires that comprise a bus within a computer system. Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media therefore include for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code or data. Many of these forms of computer-readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution.

[0275] The computer system 2701 can include or be in communication with an electronic display 2735 that comprises a user interface (UI) 2740 for providing, for example, tomosynthesis (e.g., tomosynthesis reconstruction) or fluoroscopy (e.g., augmented fluoroscopy) data, such as text, video, images, etc. Examples of UIs include, without limitation, a graphical user interface (GUI), a web-based user interface, or an Application Programming Interface (API). The UI 2740 may, in some cases, also be used for input via touchscreen capabilities.

[0276] Methods, systems, instructions, and techniques of the present disclosure can be implemented by way of one or more algorithms. An algorithm can be implemented by way of software upon execution by the central processing unit 2705. The algorithm may comprise: (a) providing a first graphical user interface (GUI) for a tomosynthesis mode and a second GUI for a fluoroscopic view mode for viewing a portion of the endoscopic device and a target within a subject; (b) receiving a sequence of fluoroscopic image frames containing the portion of the endoscopic device, a marker, and the target, where the sequence of fluoroscopic image frames correspond to various poses of an imaging system acquiring the sequence of fluoroscopic image frames; (c) upon switching to the tomosynthesis mode, i) performing a uniqueness check on the sequence of fluoroscopic image frames and ii) generating a reconstructed 3D tomosynthesis image based at least in part on the poses of the imaging system estimated using the marker; and (d) upon switching to the fluoroscopic view mode, i) generating an estimated pose of the imaging system associated with a fluoroscopic image frame from the sequence of fluoroscopic image frames based at least in part on the marker contained in the fluoroscopic image frame and ii) generating an overlay of the target displayed onto the fluoroscopic image frame based at least in part on the estimated pose. In some embodiments, the fluoroscopic images for the tomosynthesis and augmented fluoroscopy model may be acquired utilizing a Cone Beam CT (CBCT).
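By way of non-limiting illustration only, the uniqueness check of step (c)(i) may be sketched as follows, assuming the check is an intensity comparison between each incoming frame and the most recently retained frame. The mean-absolute-difference metric, the function names, and the tolerance value are assumptions introduced for this example.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[float]]  # one grayscale fluoroscopic image frame


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel intensity difference between two equally sized frames."""
    total = sum(abs(p - q) for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b))
    return total / (len(a) * len(a[0]))


def unique_frame_indices(frames: Sequence[Frame], min_diff: float = 1.0) -> List[int]:
    """Indices of frames that differ enough from the last retained frame.

    A frame whose mean absolute difference from the previously retained frame is
    below min_diff is treated as a duplicate and skipped.
    """
    kept: List[int] = []
    for i, frame in enumerate(frames):
        if not kept or mean_abs_diff(frames[kept[-1]], frame) >= min_diff:
            kept.append(i)
    return kept
```

Only the retained frames would then be passed on to pose estimation and reconstruction, which avoids weighting one viewpoint more than once when the imaging system repeats a frame.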

[0277] In some embodiments, the algorithm may implement operations including: (a) in a navigation mode of a graphical user interface (GUI), navigating the endoscopic device towards a target within a subject, the GUI displays a virtual view with visual elements to guide navigating the endoscopic device; (b) upon switching to a tomosynthesis mode of the GUI, i) receiving a sequence of fluoroscopic image frames containing a portion of the endoscopic device and the target, where the sequence of fluoroscopic image frames correspond to various poses of an imaging system acquiring the sequence of fluoroscopic image frames, ii) generating a reconstructed 3D tomosynthesis image based at least in part on the poses of the imaging system and iii) determining a location of the target based at least in part on the reconstructed 3D tomosynthesis image; and (c) upon switching to a fluoroscopic view mode of the GUI, i) obtaining a pose of the imaging system associated with a fluoroscopic image frame acquired in the fluoroscopic view mode, and ii) generating an overlay of the target displayed onto the fluoroscopic image frame based at least in part on the pose of the imaging system and the location of the target determined in (b).

[0278] In some embodiments, the virtual view in the navigation mode comprises upon determining a distal tip of the endoscopic device is within a predetermined proximity of the target, rendering a graphical representation of the target and an indicator indicative of an angle of the target relative to an exit axis of a working channel of the endoscopic device. In some embodiments, a location of the target displayed in the navigation mode is updated based on the location of the target determined in (b). In some embodiments, the poses of the imaging system in the tomosynthesis mode are estimated using a marker contained in the sequence of fluoroscopic image frames. In some embodiments, the poses of the imaging system in the tomosynthesis mode are measured by one or more sensors.

[0279] In some embodiments, the pose of the imaging system associated with the fluoroscopic image frame in the fluoroscopic view mode is estimated using a marker contained in the fluoroscopic image frame. In some cases, the marker has a 3D pattern. In some instances, the marker comprises a plurality of features placed on at least two different planes. In some cases, the marker has a plurality of features of different sizes arranged in a coded pattern. In some instances, the coded pattern comprises a plurality of sub-areas each having a unique pattern. In some instances, the pose of the imaging system is estimated by matching a patch of the plurality of features in the fluoroscopic image frame to the coded pattern.
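By way of non-limiting illustration only, the step of matching an observed patch of marker features to a coded pattern with unique sub-areas may be sketched as follows. The binary size code (0 for a small feature, 1 for a large feature), the 4x4 example pattern, the 2x2 sub-area size, and the function names are assumptions introduced for this example. The sketch covers only the correspondence lookup, not the subsequent pose computation.

```python
from typing import Dict, Optional, Sequence, Tuple

Patch = Tuple[Tuple[int, ...], ...]

# Hypothetical coded pattern: every 2x2 sub-area is unique.
PATTERN = [
    [0, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
]


def sub_area(pattern: Sequence[Sequence[int]], row: int, col: int, k: int) -> Patch:
    """The k x k block of feature sizes whose top-left corner is (row, col)."""
    return tuple(tuple(pattern[row + r][col:col + k]) for r in range(k))


def build_patch_index(pattern: Sequence[Sequence[int]],
                      k: int = 2) -> Dict[Patch, Tuple[int, int]]:
    """Map every k x k sub-area to its position, rejecting non-unique patterns."""
    index: Dict[Patch, Tuple[int, int]] = {}
    for row in range(len(pattern) - k + 1):
        for col in range(len(pattern[0]) - k + 1):
            patch = sub_area(pattern, row, col, k)
            if patch in index:
                raise ValueError(f"sub-area at {(row, col)} duplicates {index[patch]}")
            index[patch] = (row, col)
    return index


def locate(index: Dict[Patch, Tuple[int, int]],
           observed: Sequence[Sequence[int]]) -> Optional[Tuple[int, int]]:
    """Position of an observed patch within the pattern, or None if unrecognized."""
    return index.get(tuple(tuple(r) for r in observed))
```

Because each sub-area is unique, seeing only part of the marker is enough to identify which features are in view; those correspondences could then feed a pose solver.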

[0280] In some embodiments, the pose of the imaging system associated with the fluoroscopic image frame in the fluoroscopic view mode is measured by one or more sensors. In some embodiments, in the tomosynthesis mode the sequence of fluoroscopic image frames is processed by performing a uniqueness check on the sequence of fluoroscopic image frames. In some cases, the uniqueness check comprises determining whether a fluoroscopic image frame from the sequence of fluoroscopic image frames is unique based at least in part on an intensity comparison.

Example Method

[0281] FIG. 28A illustrates an example method 2800 for real-time fluoroscopy imaging for a robotic system with real-time quality assessment. The method may comprise: acquiring a sequence of two-dimensional (2D) fluoroscopy images containing a target feature, and the sequence of 2D fluoroscopy images is acquired at various angles 2801; reconstructing a three-dimensional (3D) image based on the sequence of 2D fluoroscopy images and the various angles 2803; acquiring live fluoroscope image frames at a specific angle, wherein the live fluoroscope image frames contain the target feature 2805; displaying an overlay of a projection of the target feature onto the live fluoroscope image frames, wherein the projection of the target feature is based at least in part on the target feature in the reconstructed 3D image 2807; and determining whether the overlay is displayed at a correct location within the live fluoroscope image frames based at least in part on a displacement of a reference feature identified in a 2D fluoroscopy image selected from the sequence of 2D fluoroscopy images and identified in the live fluoroscope image frames 2809.
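By way of non-limiting illustration only, projecting a target location from the reconstructed 3D image onto a live frame for the overlay of step 2807 may be sketched as follows. The idealized cone-beam (pinhole) geometry, a single rotation about the patient's long axis, the source-to-isocenter and source-to-detector distances, the pixel size, and the detector center are all assumptions introduced for this example; an actual system would use its calibrated or estimated pose.

```python
import math
from typing import Tuple


def project_target(point_mm: Tuple[float, float, float],
                   angle_deg: float,
                   sod_mm: float = 500.0,     # source-to-isocenter distance (assumed)
                   sdd_mm: float = 1000.0,    # source-to-detector distance (assumed)
                   pixel_mm: float = 0.5,     # detector pixel size (assumed)
                   center_px: Tuple[float, float] = (256.0, 256.0)
                   ) -> Tuple[float, float]:
    """Project a 3D point (isocenter coordinates, mm) to detector pixels (u, v)."""
    x, y, z = point_mm
    t = math.radians(angle_deg)
    # Rotate the point into the imaging frame (rotation about the y axis).
    xr = x * math.cos(t) + z * math.sin(t)
    zr = -x * math.sin(t) + z * math.cos(t)
    depth = sod_mm + zr  # distance from the source along the central ray
    if depth <= 0:
        raise ValueError("point lies at or behind the X-ray source")
    magnification = sdd_mm / depth
    u = center_px[0] + magnification * xr / pixel_mm
    v = center_px[1] + magnification * y / pixel_mm
    return u, v
```

The returned pixel coordinates indicate where an overlay marker (such as a lesion circle) could be drawn on the live frame for the given viewing angle.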

[0282] As described elsewhere herein, the system acquires a tomosynthesis sweep comprising a temporally ordered sequence of 2D fluoroscopic projection images while the C-arm or detector traverses a limited angular range around the patient. Alternatively, the 3D reconstructed image may be acquired using CBCT that has a wider angle range for the sweep. Each acquired 2D projection is time-stamped and associated with an estimated pose (such as a viewpoint angle and translation) obtained either from encoder / robot logs or from a pose-estimation routine that uses an imaged bead board, radiopaque markers, or natural image features. In some cases, metadata describing the sweep (start / end pose, number of frames, imaging geometry) is stored with the sequence to allow for later selection of a reference projection that best matches an AF viewpoint.

[0283] Reconstruction of a three-dimensional (3D) image from the 2D sequence and associated poses 2803 may comprise using the acquired fluoroscopic sequence and the associated pose information to generate a 3D volumetric representation of the anatomy in the swept region. Reconstruction may be implemented using any suitable limited-angle tomography algorithm, such as filtered backprojection (FBP) adapted for limited angles, iterative algebraic methods (SART, SIRT), model-based iterative reconstruction (MBIR) with regularization (total variation, edge-preserving), or compressed-sensing approaches.

[0284] During augmented fluoroscopy (AF) guided navigation, a live fluoroscope acquires 2D frames at a fixed or user-configured viewpoint (the AF viewpoint) 2805. The live frames are acquired in real time and buffered for analysis. They are corrected for detector artifacts and optionally aligned temporally to tool telemetry (robot pose, EM tracker). The AF viewpoint is specified by the C-arm angle or by an operator selection in the GUI. In some cases, the AF viewpoint may also be represented internally as a pose in the same coordinate reference frame used by the tomosynthesis acquisition. Live acquisition parameters (e.g., frame rate, exposure) are chosen to balance image quality and radiation dose. To reduce computational load, downstream analysis may operate on a region-of-interest (ROI) around the projected target location and scope tip. Live frames are made available to the overlay rendering and condition verification modules with minimal latency to support near-real-time feedback.

[0285] A display module renders an augmented overlay of the target feature onto the incoming live fluoroscopy frames 2807. The overlay is generated by projecting the 3D target coordinates from the 3D reconstruction into the 2D AF image plane using the estimated AF pose and the calibrated imaging model (e.g., intrinsics and extrinsics). Projection accounts for detector distortion and any geometric corrections applied during reconstruction. In some cases, the overlay may include a target centroid, boundary contour, uncertainty ellipse, or color coding (for example, green / amber / red) that represents overlay confidence or alignment quality. Rendering parameters include line thickness, translucency, and dynamic repositioning to compensate for small tool-induced deformations or system latency. In some cases, the method may employ a latency compensation method, which may use short-term motion prediction (linear or Kalman filtering).
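The projection of a 3D target coordinate into the 2D AF image plane can be illustrated with a minimal sketch. It assumes an ideal pinhole model with a single focal length in pixels and no detector distortion, which is a simplification of the calibrated imaging model described above. The function name `project_point` and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def project_point(
    point_3d: Sequence[float],
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
    focal_px: float,
    principal_point: Tuple[float, float],
) -> Tuple[float, float]:
    """Project a 3D point from the reconstruction frame to detector pixels."""
    # Extrinsics: rotate and translate the point into the imaging-system frame.
    cam = [
        sum(rotation[r][c] * point_3d[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    if cam[2] <= 0:
        raise ValueError("point is not in front of the source; cannot project")
    # Intrinsics: perspective divide, then scale and shift to pixel coordinates.
    u = focal_px * cam[0] / cam[2] + principal_point[0]
    v = focal_px * cam[1] / cam[2] + principal_point[1]
    return (u, v)


# Hypothetical AF pose: identity rotation, target 1000 mm from the source.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
overlay_px = project_point(
    [10.0, -5.0, 0.0], identity, [0.0, 0.0, 1000.0], 2000.0, (512.0, 512.0)
)
```

In this sketch the returned pixel coordinate is where an overlay marker for the target centroid would be drawn; a distortion correction step, if needed, would be applied to `(u, v)` afterwards.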

[0286] In some embodiments, an analysis module evaluates the validity of an augmented-fluoroscopy (AF) overlay by measuring a displacement of one or more reference features between a selected two-dimensional (2D) projection from a tomosynthesis sweep and live AF frames, and by comparing the measured displacement to one or more thresholds 2809. The determination comprises selection of an appropriate reference projection, optional 2D-2D registration when required, localization of one or more reference features in both the reference projection and the live AF frames, computation of one or more displacement metrics together with corresponding uncertainty estimates, comparison of the displacement metrics to configurable thresholds using decision logic that accounts for temporal stability, and generation of operator feedback and machine actions based on the decision.

[0287] In some cases, the analysis module selects a candidate reference 2D projection from the tomosynthesis sequence whose viewpoint most closely matches the AF viewpoint. The selection may use logged pose metadata, where the projection having minimal angular and translational difference to the AF pose is chosen. When logged pose metadata are unavailable or insufficient, the module may estimate projection poses using image-based pose estimation such as bead-pattern analysis, marker detection, or feature-based pose refinement, and select the projection with minimal estimated viewpoint difference. If the angular and translational differences between the chosen projection and the AF viewpoint are within a configurable angle and translation threshold, the selected projection is designated as the reference for subsequent comparison. If the minimum angular difference exceeds the configurable threshold, the module indicates that registration is required prior to difference computation.
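The reference selection logic can be sketched as follows. This is a minimal illustration that assumes each projection's pose is reduced to a single sweep angle and a single translation magnitude. The `ProjectionPose` type, the `select_reference` function, the tie-breaking rule, and the default thresholds are hypothetical and not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ProjectionPose:
    frame_index: int
    angle_deg: float
    translation_mm: float


def select_reference(
    poses: List[ProjectionPose],
    af_angle_deg: float,
    af_translation_mm: float,
    max_angle_deg: float = 2.0,
    max_translation_mm: float = 5.0,
) -> Tuple[ProjectionPose, bool]:
    """Return the closest projection and whether 2D registration is required."""
    if not poses:
        raise ValueError("no tomosynthesis projections available")

    def difference(p: ProjectionPose) -> Tuple[float, float]:
        # Angular difference is compared first; translation breaks ties.
        return (
            abs(p.angle_deg - af_angle_deg),
            abs(p.translation_mm - af_translation_mm),
        )

    best = min(poses, key=difference)
    angle_diff, translation_diff = difference(best)
    # Outside either threshold: flag that registration is needed first.
    needs_registration = (
        angle_diff > max_angle_deg or translation_diff > max_translation_mm
    )
    return best, needs_registration


sweep = [ProjectionPose(i, a, 0.0) for i, a in enumerate([-20.0, -10.0, 0.0, 10.0, 20.0])]
ref_close, reg_close = select_reference(sweep, af_angle_deg=8.5, af_translation_mm=0.0)
ref_far, reg_far = select_reference(sweep, af_angle_deg=14.0, af_translation_mm=0.0)
```

With the hypothetical sweep above, an AF viewpoint at 8.5 degrees selects the 10-degree projection directly, while an AF viewpoint at 14 degrees still selects the 10-degree projection but flags that registration is required.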

[0288] In some cases, when the angular / viewpoint difference between the chosen reference projection and the AF view exceeds the configured threshold, the system performs a 2D registration to compensate for global transform differences prior to displacement computation. Registration may be intensity-based (for example, mutual information or cross-correlation) or feature-based (for example, descriptor matching with robust transform estimation). The registration may estimate and apply similarity, affine, or perspective (homography) transforms as appropriate to the estimated misalignment. The registration may return parameters describing the transform and a registration confidence metric.

[0289] The method 2800 may comprise determination whether the overlay is at a correct location based on displacement of a reference feature between a selected 2D tomosynthesis projection and live AF frames 2809. In some embodiments, an analysis module evaluates the validity of an augmented-fluoroscopy (AF) overlay by measuring a displacement of one or more reference features between a selected two-dimensional (2D) projection from a tomosynthesis sweep and live AF frames, and by comparing the measured displacement to one or more thresholds. The determination comprises selection of an appropriate reference projection, optional 2D-2D registration when required, localization of one or more reference features in both the reference projection and the live AF frames, computation of one or more displacement metrics together with corresponding uncertainty estimates, comparison of the displacement metrics to configurable thresholds using decision logic that accounts for temporal stability, and generation of operator feedback and machine actions based on the decision.

[0290] The system selects a candidate reference 2D projection from the tomosynthesis sequence whose viewpoint most closely matches the AF viewpoint. The selection may use logged pose metadata, where the projection having minimal angular and translational difference to the AF pose is chosen. When logged pose metadata are unavailable or insufficient, the module may estimate projection poses using image-based pose estimation such as bead-pattern analysis, marker detection, or feature-based pose refinement, and select the projection with minimal estimated viewpoint difference. If the angular and translational differences between the chosen projection and the AF viewpoint are within a configurable angle and translation threshold, the selected projection is designated as the reference for subsequent comparison. If the minimum angular difference exceeds the configurable threshold, the module indicates that registration is required prior to difference computation.

[0291] When the angular / viewpoint difference between the chosen reference projection and the AF view exceeds the configured threshold, the system performs a 2D registration to compensate for global transform differences prior to displacement computation. Registration may be intensity-based (for example, mutual information or cross-correlation) or feature-based (for example, descriptor matching with robust transform estimation). The registration may estimate and apply similarity, affine, or perspective (homography) transforms as appropriate to the estimated misalignment. The registration routine returns parameters describing the transform and a registration confidence metric; low registration confidence is propagated to downstream decision logic and may downgrade alignment status or prompt operator confirmation.

[0292] The system may, in some cases, identify one or more reference features in both the selected reference projection and in live AF frames. Exemplary reference features include diaphragm contours, rib or bone landmarks, implanted fiducials, vessel bifurcations, or automatically chosen salient keypoints near the target region. Reference feature localization may be implemented using contour-based segmentation, deep learning segmentation, feature / keypoint detection, or a combination thereof. Contour-based segmentation may employ edge detection (for example, Canny), gradient analysis, active-contour or level-set refinement, and morphological operations (for example, erosion, dilation, opening, closing) to produce a clean binary mask and extract contour centroids, apex points, and boundary curves. Deep learning segmentation may employ convolutional neural networks (for example, U-Net variants) trained on annotated fluoroscopy datasets to output per-pixel probability maps, segmentation confidence scores, and uncertainty maps for the diaphragm or other landmarks. Feature / keypoint detection may employ classical descriptors (for example, Harris, FAST, ORB, SIFT) or learned keypoint detectors to extract and describe salient points for matching.

[0293] Once corresponding reference features are localized in the reference projection and the live AF frames, the system computes one or more displacement metrics. Candidate metrics include boundary displacement defined as the mean, median, or maximum perpendicular distance between paired contour boundaries; centroid offset defined as the Euclidean vector distance between contour or landmark centroids expressed in pixels or converted to millimeters using the imaging geometry; area overlap metrics such as Jaccard index or percent mask overlap; motion-field based measures derived from dense optical flow (for example, Farneback or TV-L1) or from sparse motion vectors produced by matched keypoints; and composite metrics that combine magnitude and spatial extent, for example the area of a contour region where displacement exceeds a specified displacement d0. The module also computes an uncertainty estimate for each displacement metric that accounts for segmentation confidence, registration confidence, imaging noise, and pose estimation uncertainty.
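Two of the candidate metrics, centroid offset and Jaccard index, can be illustrated with a short sketch. It assumes the reference feature has already been segmented into binary masks, represented here as sets of (row, column) pixels, and that a single effective pixel pitch converts pixels to millimeters. The function names and the 0.5 mm pitch are hypothetical.

```python
import math
from typing import Set, Tuple

Pixel = Tuple[int, int]


def centroid(mask: Set[Pixel]) -> Tuple[float, float]:
    """Mean (row, col) position of the pixels in a binary mask."""
    if not mask:
        raise ValueError("empty mask has no centroid")
    n = len(mask)
    return (sum(r for r, _ in mask) / n, sum(c for _, c in mask) / n)


def centroid_offset_mm(ref: Set[Pixel], live: Set[Pixel], pixel_pitch_mm: float) -> float:
    """Euclidean distance between mask centroids, converted to millimeters."""
    (r0, c0), (r1, c1) = centroid(ref), centroid(live)
    return math.hypot(r1 - r0, c1 - c0) * pixel_pitch_mm


def jaccard(ref: Set[Pixel], live: Set[Pixel]) -> float:
    """Intersection-over-union of two masks (1.0 when both are empty)."""
    union = ref | live
    return len(ref & live) / len(union) if union else 1.0


# Hypothetical 4x4 reference mask and the same mask shifted down by 2 rows.
ref_mask = {(r, c) for r in range(0, 4) for c in range(0, 4)}
live_mask = {(r, c) for r in range(2, 6) for c in range(0, 4)}
offset_mm = centroid_offset_mm(ref_mask, live_mask, pixel_pitch_mm=0.5)
overlap = jaccard(ref_mask, live_mask)
```

In practice the pixel-to-millimeter conversion depends on the imaging geometry (for example, magnification at the feature depth), so the constant pitch used here is only a placeholder.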

[0294] The system compares the computed displacement metrics and their uncertainties to one or more thresholds to determine alignment quality. Thresholds may be absolute values (for example, 5 mm), relative values (for example, percent area change), or multi-level ranges that map to qualitative statuses such as GREAT, GOOD, and POOR. Thresholds may be static and user-set, procedurally adaptive to the instrument tolerance or procedure type, or generated by a trained machine-learning model that takes inputs such as imaging modality, target size, and historical performance to output a recommended threshold with a confidence measure. Decision logic may incorporate hysteresis and temporal smoothing to avoid flicker, for example by requiring N consecutive frames that exceed or fall below a threshold before changing qualitative status. When a measured displacement metric exceeds a configured limit, the system flags the overlay as unreliable and invokes the prescribed downstream actions.
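The N-consecutive-frame rule can be sketched as a small state machine. This is an illustrative sketch only: the 2 mm boundary between GREAT and GOOD is an assumption, the 5 mm boundary reuses the example value given above, and the names `classify` and `DebouncedStatus` are hypothetical.

```python
def classify(displacement_mm: float, great_below_mm: float = 2.0, poor_at_mm: float = 5.0) -> str:
    """Map a displacement to a qualitative status using multi-level ranges."""
    if displacement_mm < great_below_mm:
        return "GREAT"
    if displacement_mm < poor_at_mm:
        return "GOOD"
    return "POOR"


class DebouncedStatus:
    """Change the reported status only after N consecutive agreeing frames."""

    def __init__(self, n_consecutive: int = 3, initial: str = "GREAT") -> None:
        self.n = n_consecutive
        self.status = initial
        self._candidate = initial
        self._count = 0

    def update(self, displacement_mm: float) -> str:
        observed = classify(displacement_mm)
        if observed == self.status:
            # Frame agrees with the current status: discard any pending change.
            self._candidate, self._count = observed, 0
        elif observed == self._candidate:
            self._count += 1
        else:
            # A different status appeared: start counting it from one.
            self._candidate, self._count = observed, 1
        if self._candidate != self.status and self._count >= self.n:
            self.status = self._candidate
            self._count = 0
        return self.status


tracker = DebouncedStatus(n_consecutive=3)
trace = [1.0, 6.0, 1.0, 6.0, 6.0, 6.0, 1.0]
statuses = [tracker.update(d) for d in trace]
```

In this trace the isolated 6.0 mm spike in the second frame does not change the status; only the run of three consecutive 6.0 mm frames switches it to POOR, and a single later good frame does not switch it back.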

[0295] The system issues multi-modal outputs that inform the operator and other subsystems of overlay validity. Outputs may include visual overlays such as contours showing displacement, color-coded target overlays (for example, green for GREAT, amber for GOOD, red for POOR), numerical readouts of displacement magnitude, contour area, registration confidence, and a qualitative alignment indicator. The graphical user interface may present actionable recommendations such as “Request breath-hold and retake Tomo,” “Proceed with caution,” or “Abort biopsy attempt,” and may provide an auditable log of measurements, timestamps, poses, and decisions for retrospective analysis. The system may also emit programmatic signals to workflow and robotic control modules to pause tool advancement or to request re-acquisition when alignment status falls below defined safety thresholds.

[0296] In optional embodiments, the system executes both contour-based and feature-based pipelines in parallel and fuses their outputs into a single alignment metric. Fusion may weight each pipeline’s output by its respective confidence score and produce a composite metric that exploits the intuitive visualization of contour outputs and the dense quantitative motion information of feature maps.
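One simple way to realize such a fusion is a confidence-weighted mean, sketched below. The weighted mean is an assumption made for illustration, since the disclosure does not fix a particular fusion formula, and the function name `fuse_estimates` and the sample values are hypothetical.

```python
from typing import List, Tuple


def fuse_estimates(estimates: List[Tuple[float, float]]) -> float:
    """Confidence-weighted mean of (displacement_mm, confidence) pairs."""
    total_weight = sum(conf for _, conf in estimates)
    if total_weight <= 0:
        raise ValueError("at least one pipeline must report positive confidence")
    return sum(value * conf for value, conf in estimates) / total_weight


# Hypothetical outputs: contour pipeline (4.0 mm, confidence 0.9) and
# feature pipeline (6.0 mm, confidence 0.3).
fused_mm = fuse_estimates([(4.0, 0.9), (6.0, 0.3)])
```

Because the contour pipeline reports the higher confidence in this example, the fused value of about 4.5 mm sits closer to its estimate.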

[0297] In some cases, to limit sensitivity to noise and transient artifacts, displacement metrics are computed over a sliding temporal window and smoothed using techniques such as Kalman filtering or exponential moving averages. The GUI can display trend plots (for example, contour area over time), and the system can expose controls to adjust frame-rate sampling to balance responsiveness against stability.
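The exponential moving average option can be sketched in a few lines. The smoothing factor `alpha` and the sample displacement trace are hypothetical; a larger `alpha` tracks new frames more closely, and a smaller one suppresses transients more strongly.

```python
from typing import List


def exponential_moving_average(values: List[float], alpha: float = 0.5) -> List[float]:
    """Smooth a displacement trace; the first sample seeds the filter state."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    state = None
    for value in values:
        state = value if state is None else alpha * value + (1.0 - alpha) * state
        smoothed.append(state)
    return smoothed


# Hypothetical per-frame displacement (mm) with a one-frame transient spike.
smoothed_mm = exponential_moving_average([2.0, 2.0, 10.0, 2.0], alpha=0.5)
```

The transient 10.0 mm sample is attenuated to 6.0 mm in the smoothed trace and decays to 4.0 mm on the following frame, which illustrates the trade-off between responsiveness and stability.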

[0298] In some cases, the system monitors segmentation and registration confidence to detect occlusions or tool interference. When the robotic tool or other high-contrast objects occlude the selected reference feature, or when segmentation confidence abruptly declines, the system may switch to an alternative preselected reference feature, suspend alignment assessment, or signal the operator to temporarily retract or reposition the tool until reliable reference visibility is restored.

[0299] In some cases, uncertainties from pose estimation, registration, and segmentation are propagated to an overall confidence score. For critical automated actions, for example safety interlocks that prevent instrument deployment, the system enforces conservative gating rules that require high confidence and a GREAT alignment status before permitting the action. For advisory outputs, lower confidence levels may produce recommendations presented with appropriate caveats.
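A conservative gating rule of this kind can be sketched as follows. The sketch assumes, purely for illustration, that the three confidences are independent values in [0, 1] and combines them by multiplication; the disclosure does not specify the propagation formula. The function names and the 0.8 minimum confidence are hypothetical.

```python
def overall_confidence(pose_conf: float, registration_conf: float, segmentation_conf: float) -> float:
    """Combine per-stage confidences by multiplication (independence assumed)."""
    for conf in (pose_conf, registration_conf, segmentation_conf):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidences must lie in [0, 1]")
    return pose_conf * registration_conf * segmentation_conf


def permit_deployment(status: str, confidence: float, min_confidence: float = 0.8) -> bool:
    """Safety-critical gate: requires GREAT status and high overall confidence."""
    return status == "GREAT" and confidence >= min_confidence


high = overall_confidence(0.95, 0.95, 0.95)
low = overall_confidence(0.90, 0.90, 0.90)
allowed = permit_deployment("GREAT", high)
blocked_by_status = permit_deployment("GOOD", high)
blocked_by_confidence = permit_deployment("GREAT", low)
```

A product is a deliberately conservative choice because any single weak stage lowers the overall score; an advisory output could apply a lower `min_confidence` to the same score.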

[0300] In some cases, threshold values and decision parameters may be updated dynamically using logged outcomes. For example, if a POOR flag followed by re-acquisition consistently yields improved alignment, the system can update model parameters or threshold policies for similar future scenarios. User overrides and operator feedback are logged and may be used as labels for supervised retraining of threshold prediction models.

[0301] In some cases, the alignment assessment interoperates with robotic control and higher-level workflow managers. Motion assessment may be automatically invoked when the tool is within a specified approach distance to a planned biopsy target, and the robotic workflow manager may pause or limit tool advancement if alignment degrades below a predefined safety threshold. The workflow manager may also orchestrate re-acquisition prompts, notify anesthesia personnel of required breath-hold maneuvers, and record events in the procedural timeline.

[0302] The system includes visualization components that render motion fields with arrows indicating direction and magnitude, contour overlays, difference heatmaps, and color-coded target overlays. Visual conventions, such as green / amber / red color schemes and contour styles, are standardized and may be configurable by the operator.

[0303] Algorithms are engineered to satisfy near-real-time latency requirements. Computationally intensive operations such as segmentation and feature extraction may be GPU-accelerated and executed within low-latency pipelines. When full-resolution processing exceeds latency budgets, the system may process downsampled regions-of-interest and scale or refine results for final display.

[0304] The system supports periodic calibration routines to verify pose estimation accuracy using bead board imaging or equivalent fiducials and to confirm geometric calibration between tomosynthesis reconstruction space and AF projection geometry. Calibration status is logged and the system issues alerts when calibration drift exceeds tolerances.

[0305] The graphical user interface permits the operator to select or confirm the reference projection, change the reference feature, adjust thresholds, and enable or disable automatic gating and recommendations. Operators can accept or reject system recommendations, and such user actions are logged for audit and for use in model retraining.

[0306] The method 2800 beneficially provides an automated, confidence-aware verification of AF overlay placement by measuring reference-feature displacement between a best-matched tomosynthesis projection and live AF frames, and by providing timely feedback and workflow integration to support safe and accurate image-guided interventions.

[0307] FIG. 28B shows another method 2810 for real-time fluoroscopy imaging in a robotic system. The method comprises: acquiring a sequence of two-dimensional (2D) fluoroscopy images containing a target feature, wherein the sequence of 2D fluoroscopy images is acquired at various angles 2811; reconstructing a three-dimensional (3D) image based on the sequence of 2D fluoroscopy images and the various angles 2813; selecting a 2D fluoroscopy image from the sequence of 2D fluoroscopy images based at least in part on the angle associated with the sequence of 2D fluoroscopy images and a viewpoint angle for acquiring 2D live fluoroscope imaging 2815; determining a displacement of a reference feature captured in the selected 2D fluoroscopy image and captured in the 2D live fluoroscope imaging 2817; and determining a difference in a physical state of a subject captured in the sequence of 2D fluoroscopy images and the 2D live fluoroscope imaging 2819.

[0308] The operations 2811, 2813 can be the same as those described in the method 2800. In the step of selection of a 2D fluoroscopy image from the tomosynthesis sequence for use as a reference for live augmented fluoroscopy (AF) comparison 2815, the selection may be performed by matching viewpoint information between the available sequence frames and the AF viewpoint. Selection may use logged pose metadata to compute angular and translational differences and select the tomosynthesis projection with minimal viewpoint difference. When logged metadata are incomplete or of insufficient accuracy, the system estimates pose for each projection using image-based pose estimation, for example by detecting and fitting radiopaque bead patterns, fiducial markers, bone features, or other unique image features, and computing the relative pose using PnP or bundle adjustment techniques. The selection step may apply an angle threshold and a translation threshold. In some cases, when the minimum difference between any tomosynthesis projection and the AF viewpoint is within the thresholds the corresponding projection is designated as the reference frame for direct comparison. If no projection satisfies the thresholds, a best-candidate projection is chosen and an explicit 2D-2D registration is scheduled between that candidate and the live AF frames prior to displacement computation. The selection of a reference projection is implemented to minimize viewpoint-induced differences unrelated to patient physiological state. The system may compute a similarity metric for each tomosynthesis projection relative to the AF viewpoint. For example, the metric combines angular difference, translational offset, and an optional intensity similarity score between projected anatomy regions. When the metric indicates a projection within a configured acceptance window, the projection is designated as the reference and further processing proceeds without geometric correction. 
If the metric indicates a viewpoint mismatch outside the acceptance window, the system performs a 2D-2D registration between the selected tomosynthesis projection and a representative AF frame. Registration may be intensity based (for example, mutual information, normalized cross-correlation) when global intensity relationships are preserved, or feature based (for example, keypoint detection plus descriptor matching and robust transform estimation) when localized salient features are present. Transform models applied during registration include similarity transforms for modest viewpoint differences, affine transforms for combined scale / rotation / shear differences, and projective (homography) transforms for perspective distortion. The registration may return transform parameters and a registration confidence score that quantifies residual misalignment. In some cases, the registration confidence is propagated to later decision stages and may cause the system to mark a lower alignment certainty or request operator confirmation when confidence is below a configured threshold.

[0309] Determination and computation of displacement of a reference feature between the selected tomosynthesis projection and live AF imaging 2817 may comprise, once the reference projection is selected and geometrically aligned (if necessary) to the AF viewpoint, the system identifying one or more reference features in both the reference projection and in one or more live AF frames. Exemplary reference features include diaphragm contour (or other landmarks such as rib or vertebral landmarks, implanted fiducials, vessel bifurcations), or automatically chosen salient keypoints near the target region. Localization of the reference feature may employ multiple complementary algorithms. A contour-based pipeline performs edge detection (for example, Canny), gradient-based pre-filtering, and active contour or level-set refinement, followed by morphological operations (erosion, dilation, opening, closing) to produce a stable binary mask and an extracted boundary curve. The contour pipeline yields geometric descriptors such as boundary centroid, principal axis, apex points, and signed distance fields. A feature-based pipeline extracts point landmarks using classical detectors (for example, Harris, FAST, ORB, SIFT) or learned keypoint detectors, computes descriptors, and performs descriptor matching between the reference projection and AF frame(s) to establish correspondences and sparse motion vectors. A deep learning segmentation pipeline may optionally be used to generate per-pixel probability maps for anatomical structures such as the diaphragm. For example, the network outputs include segmentation confidence and uncertainty maps that inform subsequent weighting.

[0310] After corresponding features are established, the system computes one or more displacement metrics. Displacement metrics include boundary displacement measures such as mean, median, and maximum perpendicular distance between paired contours; centroid offset as a Euclidean vector in image coordinates convertible to physical distance by application of geometric calibration; area overlap metrics such as intersection-over-union (Jaccard index); dense or sparse motion field statistics derived from optical flow algorithms (for example, Farneback, TV-L1) or from matched keypoints; and composite measures that combine magnitude with spatial extent (for example, the area of contour for which local displacement exceeds a specified threshold d0). The system computes an uncertainty estimate for each metric using inputs including segmentation confidence, registration confidence, and image noise estimates, and produces both scalar and spatially varying displacement representations (for example, a displacement heatmap or contour difference overlay).

[0311] In some cases, acquisition of a sequence of two-dimensional (2D) fluoroscopy images containing a target feature is performed as part of a tomosynthesis sweep. The tomosynthesis sweep comprises a temporally ordered sequence of projection frames acquired while the imaging source and / or detector traverses a limited angular arc around the patient. Each projection is associated with imaging metadata including an estimated or logged pose (viewpoint angle and translation), acquisition timestamp, exposure parameters, and geometry parameters such as source-to-detector distance and detector pixel pitch. Prior to downstream processing, projections may be corrected for detector non-uniformity and noise, and may be intensity-normalized to reduce appearance variation across angles.

[0312] The method 2810 may comprise determination of a physical state difference between the tomosynthesis acquisition and live AF imaging based on the computed displacement 2819. The system maps measured displacement metrics and their uncertainties to a determination of whether a subject’s physical state (such as breath-hold level, lung inflation, or general antero-posterior tissue positioning) has changed between the tomosynthesis sweep and live AF imaging. This mapping is performed by comparing the computed displacement metrics against threshold policies that may be static, procedurally adaptive, or produced by a machine-learning model. Static thresholds are values configured by the operator (for example, a centroid offset of 5 mm or an overlap below 80%), whereas procedurally adaptive thresholds vary according to parameters such as target size, organ type, instrument tolerance, and clinical task. A learned threshold model ingests imaging metadata (for example, modality, exposure, reconstruction confidence), target descriptors (for example, lesion diameter), and historical outcome data and outputs a recommended threshold with an associated confidence estimate; user overrides of model outputs are logged and may be used for model refinement. Decision logic applies the thresholds to displacement metrics together with uncertainty propagation and temporal smoothing to avoid spurious state change indications: for instance, the system may require that a displacement exceed threshold for N consecutive frames or for a minimum cumulative duration before declaring a state change.

[0313] The system categorizes the result into qualitative statuses (for example, GREAT, GOOD, POOR) that correspond to multi-level threshold ranges, and associates each status with defined actions. For example, a GREAT status indicates that the measured displacement and uncertainty are well within acceptable limits and that the subject’s physical state is substantially unchanged; a POOR status indicates substantial displacement beyond tolerance and suggests that the AF overlay is unreliable. When the system determines a physical state change has occurred, it generates downstream outputs and recommended actions that may include visual warnings, color-coded overlays indicating the degree of misalignment, textual recommendations to request a breath-hold and retake the tomosynthesis sweep, suspension of robotic tool advancement via an interlock or advisory signal, or automated logging of the event for audit. When the determination is marginal, the system may present suggested mitigations such as performing a 2D-2D registration refinement, prompting the operator to select an alternate reference feature, or temporarily increasing frame sampling for higher confidence measurement. All determinations, thresholds used, and operator responses are recorded in procedural logs for retrospective review and for potential supervised updating of adaptive threshold models.

[0314] In some embodiments the system comprises an adaptive threshold module, also referred to as the threshold model or learned threshold predictor, that automatically determines one or more displacement thresholds used by the decision logic of steps 2809, 2815 and 2817. The adaptive threshold module receives imaging metadata, target descriptors such as target size, reconstruction and segmentation confidence measures, and historical alignment outcomes recorded in system logs. The module outputs one or more recommended thresholds together with an associated confidence score. The recommended thresholds may be a single scalar value such as a maximum allowable centroid offset in millimeters, a vector of separate thresholds for centroid offset, boundary displacement and area overlap, or a mapping that defines numeric boundaries for multi level qualitative categories such as GREAT, GOOD and POOR. The module supports offline batch training and online incremental updating by consuming logged operator overrides and observed post action outcomes.

[0315] In some cases, the adaptive threshold model ingests a feature vector composed of any combination of imaging metadata, reconstruction metadata, reference projection and pose features, segmentation and localization confidences, target descriptors, historical and context features, and temporal context. Imaging metadata includes modality identifier such as tomo or CBCT, detector type, frame rate, exposure parameters including kVp and mA, source to detector distance, acquisition angular extent, and angular sampling density as acquired during 2801. Reconstruction metadata includes the reconstruction algorithm identifier such as FBP, SART or MBIR, reconstruction confidence maps, per slice point spread function estimates and volumetric signal to noise or contrast to noise statistics produced at 2803. Reference projection and pose features include the angular difference between the selected projection and the AF viewpoint, registration residuals and registration confidence returned during the selection and registration process at 2815. Segmentation and localization confidences include per frame segmentation probability scores, contour stability metrics such as contour smoothness and curvature variance, and keypoint matching confidences computed at 2817. Target descriptors include lesion diameter, lesion contrast, distance to pleura, proximity to ribs or large vessels, lesion orientation and the clinical task type such as diagnostic biopsy or tool in lesion confirmation extracted from the 3D reconstruction at 2803 or specified by the operator. Historical and context features include a pseudonymized patient identifier, prior alignment outcomes such as whether a prior retake produced improved alignment, operator identifier, anesthesia mode such as controlled ventilation or spontaneous breathing, ventilator parameters when available such as tidal volume and PEEP, and prior override occurrences. Temporal context includes time since tomosynthesis acquisition, number of frames averaged and recent trend statistics such as a moving average displacement and displacement variance computed at 2817. Feature pre processing includes normalization for numeric features, categorical encoding by one hot encoding or embeddings for device and model identifiers, and missing value imputation where ventilator features or other contextual signals are not available.
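The feature pre-processing steps named above (normalization, one hot encoding, and missing value imputation) can be sketched for a small record. This is an illustrative sketch only: the field names, the normalization statistics, mean imputation with an added missing-value flag, and the function names are all hypothetical choices rather than details of the disclosure.

```python
from typing import Dict, List, Optional, Tuple


def encode_numeric(value: Optional[float], mean: float, std: float) -> List[float]:
    """Z-score a numeric feature; impute the mean and flag it when missing."""
    missing = value is None
    filled = mean if missing else value
    z = (filled - mean) / std if std > 0 else 0.0
    return [z, 1.0 if missing else 0.0]


def one_hot(value: str, categories: List[str]) -> List[float]:
    """One hot encode a categorical value against a fixed category list."""
    return [1.0 if value == category else 0.0 for category in categories]


def build_feature_vector(
    record: Dict[str, object],
    numeric_stats: Dict[str, Tuple[float, float]],
    modalities: List[str],
) -> List[float]:
    vector: List[float] = []
    for name, (mean, std) in numeric_stats.items():
        vector.extend(encode_numeric(record.get(name), mean, std))
    vector.extend(one_hot(str(record["modality"]), modalities))
    return vector


# Hypothetical training statistics as (mean, std) and a record without ventilator data.
stats = {"kvp": (80.0, 10.0), "tidal_volume_ml": (450.0, 50.0)}
record = {"kvp": 90.0, "tidal_volume_ml": None, "modality": "tomo"}
features = build_feature_vector(record, stats, ["tomo", "CBCT"])
```

The resulting vector holds, in order, the normalized kVp with its missing flag, the imputed tidal volume with its missing flag set, and the one hot modality encoding.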

[0316] The adaptive threshold module may be implemented with supervised learning architectures suited to tabular and mixed input sets. Representative model families include gradient boosted regression trees such as XGBoost or LightGBM that produce continuous threshold predictions with optional quantile outputs for uncertainty estimation, random forest regressors or classifiers that provide ensemble mean and variance estimates, feedforward neural networks or multi layer perceptrons that accept mixed inputs and output thresholds and per metric confidence scores, hybrid systems that combine classification of qualitative alignment categories with conditional regressors that predict numeric thresholds, and Bayesian or probabilistic neural networks that provide calibrated posterior distributions over threshold values. Model outputs include one or more recommended numeric displacement thresholds for metrics such as centroid offset and area overlap, per threshold confidence or uncertainty measures such as a standard deviation or credible interval, and an optional recommended decision policy such as a required number of consecutive frames exceeding a threshold and suggested actions such as retake tomosynthesis or proceed with caution.

[0317] The adaptive threshold model improves the mapping performed at 2819 by providing thresholds that reflect the current imaging conditions, anatomy and historical clinical outcomes. Integration of the model into the decision logic reduces unnecessary tomosynthesis retakes while preserving safety by relaxing thresholds when reconstruction confidence is high and target size is large and by tightening thresholds when segmentation confidence is low or the lesion is small.
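The relax/tighten behavior can be made concrete with a simple rule-based stand-in. This is not the trained model described in the specification; it is a hand-written approximation of the stated direction of adjustment, and every cutoff and scale factor below is a hypothetical placeholder.

```python
def adjust_threshold(base_mm, reconstruction_conf, segmentation_conf, target_diameter_mm,
                     high_conf=0.9, low_conf=0.5, large_mm=20.0, small_mm=10.0,
                     relax=1.25, tighten=0.75):
    """Return a displacement threshold adjusted for current conditions.

    All default cutoffs and factors are illustrative assumptions.
    """
    # Tightening is checked first so that the safety-preserving rule wins
    # whenever both conditions could apply.
    if segmentation_conf < low_conf or target_diameter_mm < small_mm:
        return base_mm * tighten
    if reconstruction_conf >= high_conf and target_diameter_mm >= large_mm:
        return base_mm * relax
    return base_mm
```

A learned model would replace these fixed rules with predictions conditioned on the full feature set, but the ordering shown (tighten before relax) reflects the same safety preference.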

[0318] In some cases, the selection, displacement computation, and physical-state determination steps operate within a higher-level pipeline that applies temporal filtering (for example, Kalman or exponential smoothing) to displacement traces, fuses contour-based and feature-based outputs weighted by confidence, detects and manages occlusion or tool interference by switching reference features or suspending analysis when confidence drops, and propagates uncertainties to enforce conservative gating for safety-critical actions. The system exposes controls through a graphical user interface that allow operators to confirm or override reference projection selection, change reference features, adjust thresholds, and accept or reject automated recommendations. The system also supports adaptive learning of thresholds and policies using logged instances where retake and correction led to improved alignment, enabling progressive refinement of decision logic in clinical deployment.
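A minimal sketch of three of the pipeline stages described above (temporal filtering, confidence-weighted fusion, and conservative gating) is given below. Exponential smoothing is used as the filter; the function names, the `min_conf` cutoff, the multiplier `k`, and the millimetre units are assumptions for illustration.

```python
def exponential_smoothing(trace, alpha=0.3):
    """Smooth a displacement trace; alpha near 1 follows the raw signal closely."""
    smoothed = []
    for x in trace:
        prev = smoothed[-1] if smoothed else x
        smoothed.append(alpha * x + (1.0 - alpha) * prev)
    return smoothed

def fuse(contour_mm, contour_conf, feature_mm, feature_conf, min_conf=0.2):
    """Confidence-weighted mean of contour-based and feature-based displacement.

    Returns None when the combined confidence is too low, which a caller can
    treat as "suspend analysis" (for example during occlusion by a tool).
    """
    total = contour_conf + feature_conf
    if total < min_conf:
        return None
    return (contour_conf * contour_mm + feature_conf * feature_mm) / total

def gate(fused_mm, sigma_mm, threshold_mm, k=2.0):
    """Allow a safety-critical action only if the upper uncertainty bound
    on the displacement stays within the threshold."""
    if fused_mm is None:
        return False
    return fused_mm + k * sigma_mm <= threshold_mm
```

Propagating the uncertainty into the gate, rather than comparing the point estimate alone, means a noisy measurement blocks the action even when its mean looks acceptable.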

[0319] In an example navigation process, a user may navigate an endoscopic device towards a target via a first UI, such as the driving UI described elsewhere herein. Upon receiving an instruction to switch to a tomosynthesis imaging mode, the system may provide a second UI displaying a tomosynthesis reconstruction. The tomosynthesis reconstruction is generated by acquiring one or more fluoroscopic images or 2D scans over a region of interest of a patient, where at least part of the fluoroscopic images over the region of interest includes first image data corresponding to a plurality of markers, and the reconstructed tomosynthesis image comprises a plurality of tomosynthesis slices. The second UI may display an indicator indicative of the target location and an angle indicator for aligning a tool to the target, where the target location and the angle are determined based at least in part on user input received via the second UI. The tomosynthesis image is reconstructed based on the fluoroscopic images and the plurality of markers.

[0320] The method may comprise receiving a user input to switch to a fluoroscopy mode. The fluoroscopy mode may provide a third UI displaying an augmented fluoroscopy feature for enabling or disabling an augmented overlay displayed over the fluoroscopy view. The augmented fluoroscopy overlay is generated based at least in part on the target location identified in the tomosynthesis imaging. The third UI may be accessed from the first UI. In some embodiments, the fluoroscopic images for the tomosynthesis and the fluoroscopy images for the fluoroscopic view may be acquired utilizing a Cone Beam CT (CBCT).

[0321] In some cases, upon completion of the tomosynthesis, the navigation mode UI or driving UI may be automatically updated. For example, a virtual endoluminal view of the driving UI may display a floating target based on the results of the tomography scan. The virtual endoluminal view can be the same as those illustrated in FIG. 26 where a target along with a graphical element (e.g., ribbon) indicating a path to the target is displayed. The angle of the target is also displayed as seen from the point of view of the working channel, where a tool (e.g., needle instrument) will exit the bronchoscope. The angle of the target relative to the exit axis of the working channel may be determined based at least in part on the layout of the working channel within the distal tip, a real-time location and orientation of the distal tip and location of the target obtained from the tomosynthesis result. The target and the angle arrow may help to assist the user in lining up the tool with the lesion before taking a biopsy. The user may also choose to repeat the tomosynthesis process while the tool is expected to be in the lesion to increase confidence in the biopsy.
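The angle between the working-channel exit axis and the target can be sketched as the angle between two 3D vectors. The sketch assumes that the exit point and exit axis have already been expressed in the same coordinate frame as the tomosynthesis-derived target location (that is, the tip pose and working-channel layout have already been applied); the function and parameter names are hypothetical.

```python
import math

def angle_to_target_deg(exit_point, exit_axis, target_position):
    """Angle in degrees between the tool exit axis and the direction to the target.

    exit_point, exit_axis, target_position: 3-element sequences in one shared frame.
    """
    to_target = [t - p for t, p in zip(target_position, exit_point)]
    norm_target = math.sqrt(sum(c * c for c in to_target))
    norm_axis = math.sqrt(sum(c * c for c in exit_axis))
    if norm_target == 0.0 or norm_axis == 0.0:
        raise ValueError("exit axis and target direction must be non-zero")
    cosine = sum(a * b for a, b in zip(exit_axis, to_target)) / (norm_axis * norm_target)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))
```

An angle near zero would indicate that a tool advanced along the exit axis is heading toward the target; larger values suggest that the tip should be re-articulated before advancing.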

[0322] The navigation mode UI or the driving UI may also provide a user a targeting mode as described in FIG. 26. A user may switch into targeting mode in which the rendered internal airways may disappear, and the target may be displayed (e.g., depicted as a filled elliptical shape) in free space when the target is within a predetermined proximity range from the tip. The predetermined proximity range may be determined by the system or configurable by a user. In some cases, a graphical element (e.g., crosshair and arrow) may display in the center of the panel with a triangular shaped indicator around its edge to show the target’s position relative to the direction the scope is facing. As described above, the visual indicators, such as the location of the crosshair and the arrow, may be determined based at least in part on the tomosynthesis result.

[0323] However, because the augmented fluoroscopy (AF) lesion overlay is generated from the lesion location determined in the most recent tomosynthesis acquisition, the geometric accuracy of that overlay depends on the patient being in the same respiratory / physical state (typically the same breath-hold state) during the tomosynthesis acquisition and during the live AF imaging. If the breath-hold states differ, the AF overlay can be displaced relative to the true lesion position and can provide misleading guidance to the operator.

[0324] The method 2800, 2810 may automatically determine whether the physical condition or breath-hold state of a subject during a tomosynthesis acquisition is consistent with the state during subsequent live augmented fluoroscopy, and may notify the operator when an inconsistency is detected. The verification is not intended as generic motion detection; it specifically confirms equivalence of the breath-hold (or other relevant) states used to generate and apply AF lesion overlays.
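One way to picture the consistency check is to compare a segmented reference feature (for example, the diaphragm) in the selected tomosynthesis projection against the same feature in a live frame, using centroid offset and area overlap as the displacement metrics. The sketch below assumes binary masks of equal size are already available from a segmentation step; the function names, pixel spacing parameter, and both limits are hypothetical.

```python
import math

def centroid(mask):
    """Centroid (row, col) of the non-zero pixels of a binary mask."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not points:
        raise ValueError("mask contains no foreground pixels")
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def area_overlap(mask_a, mask_b):
    """Intersection over union of two equally sized binary masks."""
    pairs = [(a, b) for row_a, row_b in zip(mask_a, mask_b) for a, b in zip(row_a, row_b)]
    intersection = sum(1 for a, b in pairs if a and b)
    union = sum(1 for a, b in pairs if a or b)
    return intersection / union if union else 0.0

def breath_hold_consistent(ref_mask, live_mask, pixel_mm, max_offset_mm, min_overlap):
    """Return (consistent, offset_mm, overlap) for a reference/live mask pair."""
    (r0, c0), (r1, c1) = centroid(ref_mask), centroid(live_mask)
    offset_mm = math.hypot(r1 - r0, c1 - c0) * pixel_mm
    overlap = area_overlap(ref_mask, live_mask)
    consistent = offset_mm <= max_offset_mm and overlap >= min_overlap
    return consistent, offset_mm, overlap
```

When `consistent` is false, a caller could raise the operator notification described above, for example by suggesting a tomosynthesis retake.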

[0325] While preferred embodiments of the present disclosure have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. It is not intended that the disclosure be limited by the specific examples provided within the specification. While the disclosure has been described with reference to the aforementioned specification, the descriptions and illustrations of the embodiments herein are not meant to be construed in a limiting sense. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the disclosure. Furthermore, it shall be understood that all aspects of the disclosure are not limited to the specific depictions, configurations or relative proportions set forth herein which depend upon a variety of conditions and variables. It should be understood that various alternatives to the embodiments of the disclosure described herein may be employed in practicing the disclosure. It is therefore contemplated that the disclosure shall also cover any such alternatives, modifications, variations, or equivalents. It is intended that the following claims define the scope of the disclosure and that methods and structures within the scope of these claims and their equivalents be covered thereby.

[0326] While various embodiments of the disclosure have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions may occur to those skilled in the art without departing from the disclosure. It should be understood that various alternatives to the embodiments of the disclosure described herein may be employed.

[0327] Whenever the term “at least,” “greater than,” or “greater than or equal to” precedes the first numerical value in a series of two or more numerical values, the term “at least,” “greater than” or “greater than or equal to” applies to each of the numerical values in that series of numerical values. For example, greater than or equal to 1, 2, or 3 is equivalent to greater than or equal to 1, greater than or equal to 2, or greater than or equal to 3.

[0328] Whenever the term “no more than,” “less than,” or “less than or equal to” precedes the first numerical value in a series of two or more numerical values, the term “no more than,” “less than,” or “less than or equal to” applies to each of the numerical values in that series of numerical values. For example, less than or equal to 3, 2, or 1 is equivalent to less than or equal to 3, less than or equal to 2, or less than or equal to 1.

[0329] It should be understood that any reference herein to the term “or” is intended to mean an “inclusive or” or what is also known as a “logical OR”, wherein when used as a logic statement, the expression “A or B” is true if either A or B is true, or if both A and B are true, and when used as a list of elements, the expression “A, B or C” is intended to include all combinations of the elements recited in the expression, for example, any of the elements selected from the group consisting of A, B, C, (A, B), (A, C), (B, C), and (A, B, C); and so on if additional elements are listed. Furthermore, it should also be understood that the indefinite articles “a” or “an”, and the corresponding associated definite articles “the” or “said”, are each intended to mean one or more unless otherwise stated, implied, or physically impossible. Yet further, it should be understood that the expressions “at least one of A and B, etc.”, “at least one of A or B, etc.”, “selected from A and B, etc.” and “selected from A or B, etc.” are each intended to mean either any recited element individually or any combination of two or more elements, for example, any of the elements from the group consisting of “A”, “B”, and “A AND B together”, etc.

[0330] Certain inventive embodiments herein contemplate numerical ranges. When ranges are present, the ranges include the range endpoints. Additionally, every sub range and value within the range is present as if explicitly written out. The term “about” or “approximately” may mean within an acceptable error range for the value, which will depend in part on how the value is measured or determined, e.g., the limitations of the measurement system. For example, “about” may mean within 1 or more than 1 standard deviation, per the practice in the art. Alternatively, “about” may mean a range of up to 20%, up to 10%, up to 5%, or up to 1% of a given value. Where values are described in the application and claims, unless otherwise stated the term “about” meaning within an acceptable error range for the particular value may be assumed.

[0331] It should be noted that various illustrative or indicated ranges set forth herein are specific to their example embodiments and are not intended to limit the scope or range of disclosed technologies, but, again, merely provide example ranges for frequency, amplitudes, etc. associated with their respective embodiments or use cases.

Claims

WHAT IS CLAIMED IS:

1. A method for real-time fluoroscopy imaging for a robotic system, the method comprising: (a) acquiring a sequence of two-dimensional (2D) fluoroscopy images containing a target feature, wherein the sequence of 2D fluoroscopy images is acquired at various angles; (b) reconstructing a three-dimensional (3D) image based on the sequence of 2D fluoroscopy images and the various angles; (c) acquiring live fluoroscope image frames at a specific angle, wherein the live fluoroscope image frames contain the target feature; (d) displaying an overlay of a projection of the target feature onto the live fluoroscope image frames, wherein the projection of the target feature is based at least in part on the target feature in the reconstructed 3D image; and (e) determining whether the overlay is displayed at a correct location within the live fluoroscope image frames based at least in part on a displacement of a reference feature identified in a 2D fluoroscopy image selected from the sequence of 2D fluoroscopy images and identified in the live fluoroscope image frames.

2. The method of claim 1, further comprising comparing the displacement against a predetermined threshold and upon determining the displacement is greater than the threshold, displaying an indicator indicating the overlay is not at the correct location.

3. The method of claim 2, wherein the correct location is indicative of a physical condition of a subject during (a) and (c) being substantially the same.

4. The method of claim 3, wherein the indicator is indicative of an alignment level of the physical condition of the subject between (a) and (c).

5. The method of claim 3, wherein the physical condition is a breath holding state.

6. The method of claim 1, wherein the 2D fluoroscopy image is selected based on an angle associated with the 2D fluoroscopy images and the specific angle for acquiring the live fluoroscope image frames.

7. The method of claim 6, wherein the angle associated with the 2D fluoroscopy image is closest to the specific angle for acquiring the live fluoroscope image frames.

8. The method of claim 7, further comprising determining a difference between the angle associated with the 2D fluoroscopy image and the specific angle is no greater than a predetermined threshold value.

9. The method of claim 7, further comprising performing registration between the 2D fluoroscopy image and the live fluoroscope image frames when a difference between the angle and the specific angle is greater than a predetermined threshold value.

10. The method of claim 1, wherein the reference feature is a diaphragm.

11. The method of claim 10, wherein the displacement of the reference feature is measured using a contour-based object tracking algorithm.

12. The method of claim 1, wherein the reference feature comprises one or more features identified in the sequence of 2D fluoroscopy images and the live fluoroscope image frames.

13. The method of claim 12, wherein the displacement of the reference feature is measured using a feature-based object tracking algorithm.

14. The method of claim 1, further comprising displaying an indicator indicating whether the overlay is at the correct location.

15. The method of claim 14, wherein the indicator is indicative of an alignment level of a physical condition of a subject between (a) and (c).

16. The method of claim 15, further comprising displaying information related to retaking the sequence of fluoroscopy images upon determining the overlay is not displayed at the correct location.

17. The method of claim 1, wherein the target feature is a target tissue to be operated by a tool of the robotic system.

18. The method of claim 17, wherein the live fluoroscope image frames show a distal portion of the tool and the overlay of the target feature for guiding a navigation of the tool approaching the target feature.

19. The method of claim 17, wherein the tool is a biopsy needle inserted through a robotic endoscope of the robotic system.

20. A non-transitory computer-readable media storing instructions which, when executed by at least one processor, cause the at least one processor to perform operations comprising: (a) acquiring a sequence of two-dimensional (2D) fluoroscopy images containing a target feature, wherein the sequence of 2D fluoroscopy images is acquired at various angles; (b) reconstructing a three-dimensional (3D) image based on the sequence of 2D fluoroscopy images and the various angles; (c) acquiring live fluoroscope image frames at a specific angle, wherein the live fluoroscope image frames contain the target feature; (d) displaying an overlay of a projection of the target feature onto the live fluoroscope image frames, wherein the projection of the target feature is based at least in part on the target feature in the reconstructed 3D image; and (e) determining whether the overlay is displayed at a correct location within the live fluoroscope image frames based at least in part on a displacement of a reference feature identified in a 2D fluoroscopy image selected from the sequence of 2D fluoroscopy images and identified in the live fluoroscope image frames.

21. A method for real-time fluoroscopy imaging for a robotic system, the method comprising: (a) acquiring a sequence of two-dimensional (2D) fluoroscopy images containing a target feature, wherein the sequence of 2D fluoroscopy images are acquired at various angles; (b) reconstructing a three-dimensional (3D) image based on the sequence of 2D fluoroscopy images and the various angles; (c) selecting a 2D fluoroscopy image from the sequence of 2D fluoroscopy images based at least in part on the angle associated with the sequence of 2D fluoroscopy images and a viewpoint angle for acquiring 2D live fluoroscope imaging; (d) determining a displacement of a reference feature captured in the selected 2D fluoroscopy image and captured in the 2D live fluoroscope imaging; and (e) determining a difference in a physical state of a subject captured in the sequence of 2D fluoroscopy images and the 2D live fluoroscope imaging.

22. The method of claim 21, wherein the physical state is a breath holding state.

23. The method of claim 21, further comprising selecting the 2D fluoroscopy image as the projection from the sequence of 2D fluoroscopy images having a minimal angular difference with the viewpoint angle.

24. The method of claim 23, further comprising: when an angular difference between the selected 2D fluoroscopy image and the viewpoint angle exceeds a predetermined angle threshold, performing a 2D-2D registration between the selected 2D fluoroscopy image and the 2D live fluoroscope image prior to determining the displacement.

25. The method of claim 21, wherein the reference feature comprises at least one anatomical landmark selected from the group consisting of a diaphragm contour, a rib landmark, a vertebral landmark, a vessel bifurcation, and an implanted fiducial.

26. The method of claim 21, wherein determining the displacement comprises using a contour-based object tracking algorithm to segment and extract a contour of the reference feature in both the selected 2D fluoroscopy image and the 2D live fluoroscope image and computing one or more contour displacement metrics.

27. The method of claim 21, wherein determining the displacement comprises using a feature-based method including detection of keypoints in the selected 2D fluoroscopy image and the 2D live fluoroscope image, matching corresponding keypoints, and computing displacement as a function of matched keypoint motion vectors.

28. The method of claim 21, further comprising comparing the displacement to a predetermined displacement threshold and, upon determining the displacement is greater than the predetermined threshold, generating an indicator that the physical state difference exceeds an acceptable tolerance.

29. The method of claim 21, further comprising, upon determining that the physical state difference exceeds a threshold, automatically signaling a robotic control module for the robotic system to pause or limit advancement of a tool of the robotic system until alignment is restored or operator confirmation is received.

30. The method of claim 21, further comprising adaptively determining one or more displacement thresholds using a trained machine-learning model that receives inputs comprising imaging metadata, target size, reconstruction confidence, and historical alignment outcomes, and updating the model based on logged operator overrides and subsequent alignment corrections.