Transformer cell tracking method and system with adaptive thresholds and temporal consistency constraints
The Transformer cell tracking method with adaptive threshold and temporal consistency constraints solves the problems of detection stability, reliability of new cell identification, and trajectory continuity in complex microscopic imaging scenarios, thereby improving the accuracy and robustness of cell tracking.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHANGCHUN UNIV OF SCI & TECH
- Filing Date
- 2026-05-21
- Publication Date
- 2026-06-19
AI Technical Summary
Existing cell tracking methods struggle to balance detection stability, reliability of new cell identification, and trajectory continuity in complex microscopic imaging scenarios. In particular, they are prone to trajectory breakage and identity drift when there are fluctuations in illumination, changes in noise, and drastic changes in cell morphology.
The method adopts a Transformer cell tracking scheme with adaptive thresholds and temporal consistency constraints. Dual-threshold hysteresis screening of candidate cells, segmentation mask post-processing with reliability filtering, and temporal consistency determination with spatial deduplication together form a composite screening mechanism that eliminates low-reliability instances and confirms new cells.
It significantly improves the accuracy and robustness of cell tracking, reduces the risk of trajectory breakage and identity drift, and makes the tracking results more stable and more robust to segmentation noise.
Figure CN122244100A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of computer vision and biomedical image analysis technology, specifically to cell tracking methods.
Background Technology
[0002] Cell tracking is an important research direction in biomedical image analysis. Its purpose is to establish a consistent identity for the same cell in a microscopic imaging sequence at different times, thereby providing support for the quantitative analysis of cell migration, proliferation, division and death.
[0003] Existing cell tracking methods typically involve two basic steps: first, detecting or segmenting cells in each frame of an image, and second, correlating the detection or segmentation results between adjacent frames.
[0004] With the development of deep learning technology, especially the introduction of attention mechanisms and Transformer structures, some methods have integrated the detection, segmentation and association processes into a unified network framework, and modeled cell targets through query mechanisms, which has improved the tracking ability in complex scenes to a certain extent.
[0005] Despite the progress made by the above methods at the structural level, many challenges remain in practical microscopic imaging applications, such as: First, existing methods generally rely on fixed thresholds or weakly adaptive thresholds to filter target confidence scores from the network output. However, microscopic videos often involve fluctuations in illumination, changes in noise, focus shifts, and drastic changes in cell morphology, leading to significant differences in target confidence distribution between different frames. Fixed thresholds easily cause candidate targets to frequently appear and disappear in adjacent frames, resulting in trajectory breaks and identity drift. For example, the Cell-SORT method described in the Chinese patent document "A Cell-SORT Cell Tracking Method and Device Based on Deep Learning" (CN117115205A), although improving the matching recall rate in the data association stage by dividing high and low score detection boxes with dual thresholds, its threshold mechanism is still mainly used for intra-frame matching optimization and fails to effectively address the impact of cross-frame confidence fluctuations on trajectory continuity in the target screening stage.
[0006] Secondly, many methods directly identify targets that do not match existing trajectories as new cells, lacking effective constraints on temporal continuity and spatial consistency. This can easily lead to misclassification of noise, fragmentation, or local morphological changes as new cells, resulting in duplicate numbering and counting errors.
[0007] In addition, the segmentation results often contain fragments of multiple connected components, empty masks, or pseudo-masks with too small an area. Once these unreliable instances enter the association process, they will further exacerbate the instability of the tracking results.
[0008] In summary, the core problem facing existing cell tracking technologies is that they still struggle to balance the three core dimensions of detection stability, reliability of new cell identification, and trajectory continuity in complex microscopic imaging scenarios.
Summary of the Invention
[0009] This invention solves the problem in the prior art that it is still difficult to balance the three core dimensions of detection stability, reliability of new cell identification, and trajectory continuity in complex microscopic imaging scenarios.
[0010] A Transformer cell tracking method with adaptive threshold and temporal consistency constraints, the cell tracking method comprising the following steps:
Step 1: Obtain the microscopic image sequence of the cells to be processed, preprocess each frame of the microscopic image, and input the preprocessed microscopic image into the cell tracking network for tracking processing;
Step 2: The cell tracking network outputs the prediction results for each frame of the microscopic image, including candidate cell confidence, candidate bounding boxes, and segmentation masks;
Step 3: Perform dual-threshold hysteresis candidate cell screening on the prediction results to obtain an initial candidate cell set;
Step 4: Perform segmentation mask post-processing and reliability filtering on each candidate cell in the initial candidate cell set to remove low-reliability instances and obtain a reliable candidate cell set;
Step 5: Associate and match the candidate cells in the reliable candidate cell set with the existing cell trajectories to obtain matched cells and unmatched cells; wherein, for the first frame image, the existing cell trajectories are empty, and all candidate cells are treated as unmatched cells; for non-first frame images, the existing cell trajectories are the cell trajectories updated after processing the previous frame image;
Step 6: Perform temporal consistency determination and spatial deduplication on the unmatched cells, identify newly generated cells, and establish new trajectories for them;
Step 7: Update the trajectories of all cells, output the cell identification number and trajectory result for each frame, and save the cell tracking data.
[0011] In a further optimized scheme, in step 1, the cell tracking network introduces a focal loss function for supervising new cells during the training phase. The corresponding label is a binary label indicating whether the candidate is a new cell. The focal loss function includes an exponent parameter and a balance parameter. The loss terms for the tracking task and the segmentation task in the cell tracking network are each given an adjustable weight coefficient to balance the losses of the two tasks.
[0012] In a further optimized scheme, in step 3, the dual-threshold hysteresis candidate cell screening adopts different strategies for different types of queries: a high-threshold conservative screening is used for the query of tracked cells used to maintain existing trajectories, and a low-threshold adsorption strategy is used for the query of newly emerging candidate cells used to discover new targets.
[0013] In a further optimized approach, the dual-threshold hysteresis candidate cell screening in step 3 is specifically as follows: for each candidate cell in each frame of the microscopic image, the following determination is made based on its candidate cell confidence. If the confidence of the candidate cell is greater than or equal to the high threshold $\tau_{high}$, the candidate cell is directly included in the initial candidate cell set. If the confidence of the candidate cell is greater than the low threshold $\tau_{low}$ and less than the high threshold $\tau_{high}$, then the candidate cell is included in the initial candidate cell set if it was included in the initial candidate cell set in the previous frame, or if it is spatially continuous with a tracked cell in the previous frame and has the same category attribute; otherwise, the candidate cell is removed. If the confidence of the candidate cell is less than the low threshold $\tau_{low}$, the candidate cell is directly eliminated.
[0014] In a further preferred option, the high threshold $\tau_{high}$ and the low threshold $\tau_{low}$ are adaptively adjusted based on the confidence distribution of candidate cells in the current frame. The adaptive adjustment includes: sorting the candidate cells by confidence, taking the mean confidence of the top $K$ candidate cells as a reference value, and limiting the high threshold $\tau_{high}$ or the low threshold $\tau_{low}$ to a preset value range, with $\tau_{high} > \tau_{low}$.
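The adaptive adjustment can be illustrated with a minimal sketch. The source states only that the mean of the top-ranked confidences serves as a reference value, that the thresholds are limited to preset ranges, and that the high threshold exceeds the low threshold. The function name `adaptive_thresholds`, the parameter `top_k`, the scale factors `high_scale` and `low_scale`, and the numeric ranges below are all hypothetical choices for illustration, not values from the application.

```python
def adaptive_thresholds(confidences, top_k=5,
                        high_scale=0.9, low_scale=0.4,
                        high_range=(0.5, 0.9), low_range=(0.1, 0.4)):
    """Derive (high, low) thresholds from the current frame's confidences.

    Assumption: each threshold is a scaled copy of the top-k mean confidence,
    clamped to its preset range. The scaling rule is illustrative only.
    """
    # Non-overlapping ranges guarantee high > low after clamping.
    assert low_range[1] < high_range[0], "ranges must keep high > low"
    if not confidences:
        # No candidates in this frame: fall back to the range minima.
        return high_range[0], low_range[0]
    top = sorted(confidences, reverse=True)[:top_k]
    reference = sum(top) / len(top)  # mean confidence of the top-k candidates
    high = min(max(high_scale * reference, high_range[0]), high_range[1])
    low = min(max(low_scale * reference, low_range[0]), low_range[1])
    return high, low
```

Keeping the two clamp ranges disjoint is one simple way to guarantee that the high threshold stays above the low threshold in every frame.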
[0015] In a further optimized approach, step 4, the post-processing of the segmentation mask and reliability filtering, includes the following steps: Step 41: Map the segmentation mask back to the original microscope image size to obtain a binary mask aligned with the coordinates of the original microscope image; Step 42: Perform connected component analysis on the binary mask: if the mask contains multiple connected regions, only the connected region with the largest area is retained as the effective mask for the candidate cell, and the remaining connected regions are removed; if the mask is empty, the effective mask is empty. Step 43: Set filtering conditions based on the geometric properties of the effective mask. The geometric properties include area, roundness, or shape factor, and correspond to preset minimum area threshold, roundness threshold, or shape constraint threshold. If the effective mask is empty, or the area of the effective mask is less than the preset minimum area threshold, or does not meet the preset roundness threshold, or does not meet the shape constraint threshold, then the candidate cell is determined to be a low-reliability instance and is removed from the initial candidate cell set; otherwise, the candidate cell is retained and included in the reliable candidate cell set.
[0016] In a further optimized approach, the temporal consistency determination and spatial deduplication in step 6 are as follows: the unmatched cells are used as newly generated candidate cells. If the intersection over union (IoU) of the bounding box of a newly generated candidate cell with that of any tracked cell in the current frame is greater than the first preset threshold, or the distance between its center point and the center point of any tracked cell is less than the second preset threshold, then the candidate cell is determined to be a false detection or a fragment of an existing cell and is not treated as a newly generated cell. Otherwise, a temporal buffer record is established or updated for the newly generated candidate cell.
[0017] A further preferred embodiment is that the establishment or updating of the time-series buffer record for newly generated candidate cells is as follows: If the newly generated candidate cell matches an existing temporal buffer record, the matching record is updated, including updating the position and mask area to the current value, incrementing the cumulative number of detections by 1, and updating the most recent detection frame number to the current frame number; If the newly generated candidate cell does not match any existing record, a new time buffer record is created for it, the cumulative detection count is initialized to 1, the most recently detected frame number is the current frame number, and the current position and mask area are recorded. The time-series buffer records are stored using a key-value structure, where the key contains at least the frame number and the candidate index, and the value contains at least the candidate bounding box, the mask area, the cumulative number of detections, and the most recent detection frame number.
[0018] In a further preferred embodiment, newly generated cells are confirmed as follows: for each temporal buffer record, if the cumulative number of detections of the corresponding candidate cell over consecutive or intermittent frames reaches the preset patience frame threshold, and its mask area meets the preset minimum area condition, then the candidate cell is identified as a new cell, a new identity number is assigned to it, a new trajectory is established, and the buffer record is cleared at the same time. If a buffer record fails to meet the confirmation criteria within the preset number of expiry frames, the buffer record is cleared.
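The buffer logic described above can be sketched as follows. This is an illustration under stated assumptions, not the application's implementation: the class names `BufferRecord` and `NewCellBuffer`, the center-distance rule used to match a candidate to an existing record (`match_dist`), the interpretation of expiry as "not re-detected for more than `max_gap` frames", and all default values are hypothetical. The key layout (frame number, candidate index) and the stored fields follow the text.

```python
from dataclasses import dataclass


@dataclass
class BufferRecord:
    box: tuple        # candidate bounding box (cx, cy, w, h), normalized
    mask_area: float  # area of the effective mask
    hits: int         # cumulative number of detections
    last_frame: int   # most recent detection frame number


class NewCellBuffer:
    def __init__(self, patience=3, max_gap=2, min_area=20.0, match_dist=0.05):
        self.patience = patience      # detections required for confirmation
        self.max_gap = max_gap        # frames a record may go unseen (assumed rule)
        self.min_area = min_area      # minimum mask area for confirmation
        self.match_dist = match_dist  # assumed record-matching radius
        self.records = {}             # key: (frame number, candidate index)
        self.next_id = 1

    def _find(self, box):
        # Match a candidate to a record by center distance (an assumption).
        for key, rec in self.records.items():
            dx, dy = box[0] - rec.box[0], box[1] - rec.box[1]
            if (dx * dx + dy * dy) ** 0.5 < self.match_dist:
                return key
        return None

    def update(self, frame, index, box, mask_area):
        """Register one deduplicated candidate; return a new ID or None."""
        key = self._find(box)
        if key is None:
            key = (frame, index)
            self.records[key] = BufferRecord(box, mask_area, 1, frame)
        else:
            rec = self.records[key]
            rec.box, rec.mask_area = box, mask_area
            rec.hits += 1
            rec.last_frame = frame
        rec = self.records[key]
        if rec.hits >= self.patience and rec.mask_area >= self.min_area:
            del self.records[key]  # confirmed: clear the buffer record
            new_id = self.next_id
            self.next_id += 1
            return new_id
        return None

    def expire(self, frame):
        """Drop records that have gone unseen for more than max_gap frames."""
        stale = [k for k, r in self.records.items()
                 if frame - r.last_frame > self.max_gap]
        for k in stale:
            del self.records[k]
```

Calling `update` once per surviving candidate and `expire` once per frame keeps short-lived false detections from ever receiving an identity number.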
[0019] A Transformer cell tracking system with adaptive threshold and temporal consistency constraints, the cell tracking system comprising the following modules:
The image acquisition and preprocessing module is used to acquire the microscopic image sequence of the cells to be processed, to preprocess each frame of the microscopic image, and to input the preprocessed microscopic image into the cell tracking network for tracking processing;
The network prediction module is used to output the prediction results of each frame of the microscopic image from the cell tracking network, the prediction results including candidate cell confidence, candidate bounding boxes, and segmentation masks;
The dual-threshold screening module is used to perform dual-threshold hysteresis candidate cell screening on the prediction results to obtain an initial candidate cell set;
The mask post-processing and filtering module is used to perform segmentation mask post-processing and reliability filtering on each candidate cell in the initial candidate cell set, eliminating low-reliability instances and obtaining a reliable candidate cell set;
The association matching module is used to associate and match candidate cells in the reliable candidate cell set with the currently existing cell trajectories to obtain matched cells and unmatched cells; wherein, for the first frame image, the currently existing cell trajectories are empty, and all candidate cells are treated as unmatched cells; for non-first frame images, the currently existing cell trajectories are the cell trajectories updated after processing the previous frame image;
The new cell confirmation module is used to perform temporal consistency determination and spatial deduplication on the unmatched cells, confirm the new cells, and establish new trajectories for them;
The trajectory update and output module is used to update the trajectories of all cells, output the cell identification number and trajectory results for each frame, and save the cell tracking data.
[0020] The beneficial effects of this invention compared to the prior art are as follows: The method described in this application introduces a dual-threshold hysteresis screening mechanism to filter the confidence level of the network output in the time dimension during the target screening stage: once a target is confirmed because it exceeds the high threshold, even if its confidence level fluctuates in subsequent frames, as long as it does not fall below the low threshold, it is still regarded as the same target. This mechanism makes the target insensitive to confidence level fluctuations in adjacent frames, effectively overcoming the defect that fixed thresholds easily cause frequent target jitter, disappearance and reappearance, thereby greatly improving trajectory continuity and reducing the risk of identity drift and trajectory breakage.
[0021] The method described in this application performs connected component analysis and geometric reliability filtering on the segmentation mask to remove unreliable instances such as empty masks, small-area fragments, and noisy pseudo-masks before the data enters the association stage. This operation purifies the input data from the source and significantly reduces the interference of low-quality segmentation results on the tracking association and numbering process, thereby effectively enhancing the robustness of the tracking system to segmentation noise and the stability of the final result.
[0022] The method described in this application introduces a temporal consistency determination and spatial deduplication mechanism. Candidate targets that do not match existing trajectories are not directly identified as new cells. Instead, they are required to continuously hit the target in multiple consecutive frames and are confirmed only after spatial deduplication. This mechanism effectively avoids misidentifying short-lived false detections or cell division fragments as new targets, thereby significantly improving the accuracy and reliability of new cell identification and reducing the risk of duplicate numbering and counting errors.
[0023] The method described in this application solves the problems of poor detection stability, unreliable identification of new cells, and poor trajectory continuity in existing technologies by constructing a three-in-one composite screening mechanism of "input purification-time filtering-logic verification". Under complex microscopic imaging conditions with light fluctuations, large noise, dense cells, or obvious morphological changes, this application can significantly improve the accuracy and robustness of cell tracking.
[0024] The invention described herein is applicable to fields such as basic biomedical research, intelligent analysis of microscopic images, and quantification of dynamic cell behavior.
Attached Figure Description
[0025] Figure 1 is a flowchart of the cell tracking method described in Implementation Method 1;
Figure 2 is a schematic diagram of the algorithm logic for the dual-threshold hysteresis screening and temporal consistency determination described in Implementation Method 5;
Figure 3 is a schematic diagram illustrating the working principle of the dual-threshold hysteresis candidate cell screening described in Implementation Method 7;
Figure 4 is a schematic diagram of the core mechanism of new cell confirmation and noise suppression in the temporal consistency determination described in Implementation Method Twelve.
Detailed Implementation
[0026] Various embodiments of the present invention will now be clearly and completely described with reference to the accompanying drawings. The embodiments described with reference to the drawings are exemplary and intended to explain the present invention, and should not be construed as limiting the present invention.
[0027] Implementation Method 1: This implementation method provides a Transformer cell tracking method with adaptive threshold and temporal consistency constraints. As shown in Figure 1, the cell tracking method includes the following steps:
Step 1: Obtain the microscopic image sequence of the cells to be processed, preprocess each frame of the microscopic image, and input the preprocessed microscopic image into the cell tracking network for tracking processing;
Step 2: The cell tracking network outputs the prediction results for each frame of the microscopic image, including candidate cell confidence, candidate bounding boxes, and segmentation masks;
Step 3: Perform dual-threshold hysteresis candidate cell screening on the prediction results to obtain an initial candidate cell set;
Step 4: Perform segmentation mask post-processing and reliability filtering on each candidate cell in the initial candidate cell set to remove low-reliability instances and obtain a reliable candidate cell set;
Step 5: Associate and match the candidate cells in the reliable candidate cell set with the existing cell trajectories to obtain matched cells and unmatched cells; wherein, for the first frame image, the existing cell trajectories are empty, and all candidate cells are treated as unmatched cells; for non-first frame images, the existing cell trajectories are the cell trajectories updated after processing the previous frame image;
Step 6: Perform temporal consistency determination and spatial deduplication on the unmatched cells, identify newly generated cells, and establish new trajectories for them;
Step 7: Update the trajectories of all cells, output the cell identification number and trajectory result for each frame, and save the cell tracking data.
[0028] This implementation is based on the PyTorch deep learning framework. The experimental hardware platform consists of a 16-vCPU Intel(R) Xeon(R) Platinum 8352V CPU @ 2.10 GHz and an NVIDIA RTX 4090 GPU (24 GB).
[0029] Implementation Method 2: This implementation method is a further limitation of Implementation Method 1, and provides an example of the preprocessing in step 1.
[0030] The microscopic image sequence of the cells to be processed can be obtained from public datasets or laboratory-collected datasets, with each sample containing a chronologically ordered sequence of $T$ frames; the $t$-th frame image is denoted $I_t$, and the image size is $H \times W$.
[0031] For each frame $I_t$, the cell tracking result set $R_t$ is output:
$$R_t = \{(id_i, M_i, b_i)\}$$
[0032] where $id_i$ is the assigned cell identification number, $M_i$ is the segmentation mask, and $b_i$ is the bounding box; the cell identification number of the same cell is required to remain consistent across different frames.
[0033] Implementation Method 3: This implementation method is a further limitation of Implementation Method 1, and provides an example of the preprocessing in step 1.
[0034] The preprocessing includes the following steps: Step 11: For each frame of the microscopic image $I_t$, perform intensity normalization to obtain $\hat{I}_t$:
$$\hat{I}_t = \frac{I_t}{\max(I_t)}$$
[0035] where $\max(I_t)$ is the maximum pixel value of image $I_t$; Step 12: Resize each frame to the input size of the cell tracking network and standardize it per channel:
$$\tilde{I}_t = \frac{\hat{I}_t - \mu}{\sigma}$$
[0036] where $\mu$ and $\sigma$ are the mean and standard deviation of each channel; Step 13: Construct a DataLoader to divide the microscopic image samples into training and validation sets.
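Steps 11 and 12 can be sketched with NumPy as below. This is a minimal illustration: the function name `preprocess_frame` and the `eps` guard against all-zero frames are assumptions, the per-channel statistics are supplied by the caller, and the resize to the network input size is omitted because the application does not state the size or the interpolation method.

```python
import numpy as np


def preprocess_frame(image, mean, std, eps=1e-8):
    """Normalize one H x W x C frame by its maximum, then standardize per channel.

    `mean` and `std` are per-channel sequences of length C (caller-supplied).
    Resizing to the network input size is intentionally left out of this sketch.
    """
    img = image.astype(np.float32)
    # Step 11: intensity normalization by the frame's maximum pixel value.
    img = img / max(float(img.max()), eps)
    # Step 12 (standardization part): subtract channel mean, divide by channel std.
    mean = np.asarray(mean, dtype=np.float32).reshape(1, 1, -1)
    std = np.asarray(std, dtype=np.float32).reshape(1, 1, -1)
    return (img - mean) / std
```

Dividing by the per-frame maximum makes frames with different exposure comparable before the channel statistics are applied.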
[0037] Implementation Method Four: This implementation method further defines Implementation Method One and provides an example of the cell tracking network in step 1.
[0038] During the training phase, the cell tracking network introduces a focal loss function for supervising new cells, with the corresponding label being a binary label indicating whether the candidate is a new cell. The focal loss function includes an exponent parameter and a balance parameter. The loss terms for the tracking task and the segmentation task in the cell tracking network are each given an adjustable weight coefficient to balance the losses of the two tasks.
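As an illustration, the widely used binary focal loss has exactly one exponent parameter (often written gamma) and one balance parameter (often written alpha). The sketch below assumes that standard form; the application does not give its formula, and the default values, the function names, and the weight names `w_track` and `w_seg` are hypothetical.

```python
import math


def binary_focal_loss(logit, label, gamma=2.0, alpha=0.25):
    """Standard binary focal loss for one prediction (assumed form).

    label is 1 for a new cell and 0 otherwise.
    """
    # Numerically stable sigmoid.
    if logit >= 0:
        p = 1.0 / (1.0 + math.exp(-logit))
    else:
        e = math.exp(logit)
        p = e / (1.0 + e)
    p_t = p if label == 1 else 1.0 - p
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # avoid log(0)
    # (1 - p_t) ** gamma down-weights easy, well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def total_loss(track_loss, seg_loss, w_track=1.0, w_seg=1.0):
    """Weighted sum balancing the tracking and segmentation loss terms."""
    return w_track * track_loss + w_seg * seg_loss
```

New cells are rare compared with tracked cells, which is the usual motivation for a loss that emphasizes hard examples.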
[0039] Implementation Method 5: This implementation method further defines Implementation Method 1 and provides examples of the candidate cell confidence, candidate bounding box, and segmentation mask in step 2.
[0040] For the $i$-th query in the $t$-th frame, the candidate cell confidence is defined as:
$$s_{t,i} = \sigma(z_{t,i})$$
[0041] where $z_{t,i}$ is the raw confidence score (logit) output by the network and $\sigma(\cdot)$ is the sigmoid function. The queries are divided into two categories: tracking queries and target queries. Let the total number of queries be $N$, the number of tracking queries be $N_{trk}$, and the number of target queries be $N_{obj}$, satisfying:
$$N = N_{trk} + N_{obj}$$
[0042] The candidate bounding box adopts a normalized center point and width/height format:
$$b_{t,i} = (c_x, c_y, w, h)$$
[0043] where $c_x$ is the normalized x-coordinate of the bounding box center point in the image, $c_y$ is the normalized y-coordinate of the bounding box center point in the image, $w$ is the normalized width of the bounding box, and $h$ is the normalized height of the bounding box. The segmentation mask prediction is as follows:
$$\hat{M}_{t,i} = \sigma(m_{t,i})$$
[0044] where $m_{t,i}$ is the raw logit of the segmentation mask and $\hat{M}_{t,i}$ is the mask probability value after normalization using the sigmoid function.
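The decoding of these outputs can be sketched in plain Python. The helper names `sigmoid`, `decode_box`, and `binarize_mask` are hypothetical, and the 0.5 probability cut-off used to binarize the mask is an assumption, since the application does not state one.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function mapping a logit to (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def decode_box(box, width, height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return ((cx - w / 2) * width, (cy - h / 2) * height,
            (cx + w / 2) * width, (cy + h / 2) * height)


def binarize_mask(mask_logits, prob_threshold=0.5):
    """Apply the sigmoid to each mask logit and threshold it (cut-off assumed)."""
    return [[1 if sigmoid(v) >= prob_threshold else 0 for v in row]
            for row in mask_logits]
```

For example, `decode_box((0.5, 0.5, 0.2, 0.4), 100, 50)` describes a box centered in a 100 x 50 image that spans 20 pixels horizontally and 20 pixels vertically.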
[0045] Figure 2 is a schematic diagram of the algorithm logic for dual-threshold hysteresis screening and temporal consistency determination in this invention.
[0046] Implementation Method Six: This implementation method is a further limitation of Implementation Method One, and provides an example of the dual-threshold hysteresis candidate cell screening in step 3.
[0047] The dual-threshold hysteresis candidate cell screening employs different strategies for different types of queries: a high-threshold conservative screening is used for tracked cell queries used to maintain existing trajectories, while a low-threshold adsorption strategy is used for newly emerging candidate cell queries used to discover new targets.
[0048] Implementation Method Seven: This implementation method is a further limitation of Implementation Method One, and provides an example of the dual-threshold hysteresis candidate cell screening in step 3.
[0049] For each candidate cell in each frame of the microscopic image, the following determination is made based on its candidate cell confidence. If the confidence of the candidate cell is greater than or equal to the high threshold $\tau_{high}$, the candidate cell is directly included in the initial candidate cell set. If the confidence of the candidate cell is greater than the low threshold $\tau_{low}$ and less than the high threshold $\tau_{high}$, then the candidate cell is included in the initial candidate cell set if it was included in the initial candidate cell set in the previous frame, or if it is spatially continuous with a tracked cell in the previous frame and has the same category attribute; otherwise, the candidate cell is removed. If the confidence of the candidate cell is less than the low threshold $\tau_{low}$, the candidate cell is directly eliminated.
[0050] Figure 3 illustrates the working principle of dual-threshold hysteresis candidate cell screening. Two key thresholds are set in the figure, a high threshold $\tau_{high}$ = 0.7 and a low threshold $\tau_{low}$ = 0.3, dividing the confidence of candidate cells into three regions.
[0051] In the activation region, the confidence of the candidate cell is greater than or equal to $\tau_{high}$; candidate cells in this region are directly included in the initial candidate cell set. In the hysteresis region, the confidence is greater than or equal to $\tau_{low}$ and less than $\tau_{high}$; candidate cells in this region undergo hysteresis adsorption based on historical information: if the candidate cell was included in the initial candidate cell set in the previous frame, or is spatially continuous with a cell already tracked in the previous frame, it is included in the initial candidate cell set; otherwise, it is discarded. In the inactivation region, the confidence is less than $\tau_{low}$; candidate cells in this region are directly eliminated.
[0052] Figure 3 presents the confidence of candidate cells in different frames. By comparing the confidence with the thresholds, it shows that the hysteresis mechanism avoids repeated rejection and re-admission caused by small fluctuations in confidence. For example, when the confidence falls below the high threshold between frame 20 and frame 30, the candidate cell is still retained as long as it remains in the hysteresis region, because it was included in the previous frame; this provides temporal stability and reduces the probability of trajectory breakage.
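The three-region rule can be sketched as a small function. It uses the example thresholds 0.7 and 0.3 from Figure 3; the function name `hysteresis_screen`, the use of query indices to identify the same candidate across frames, and the precomputed `near_tracked` set (candidates judged spatially continuous with a tracked cell of the same category) are assumptions made for illustration.

```python
def hysteresis_screen(confidences, kept_prev, near_tracked,
                      tau_high=0.7, tau_low=0.3):
    """Return the set of query indices kept in the current frame.

    confidences:  dict mapping query index -> confidence in this frame
    kept_prev:    indices included in the candidate set in the previous frame
    near_tracked: indices spatially continuous with a tracked cell
    """
    kept = set()
    for idx, score in confidences.items():
        if score >= tau_high:
            # Activation region: accept directly.
            kept.add(idx)
        elif score >= tau_low and (idx in kept_prev or idx in near_tracked):
            # Hysteresis region: accept only with support from history.
            kept.add(idx)
        # Otherwise (inactivation region or no history): reject.
    return kept
```

Feeding each frame's result back in as `kept_prev` lets a cell that was once activated survive moderate confidence dips, while a candidate that never reached the high threshold and has no spatial support is never admitted.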
[0053] Implementation Method Eight: This implementation method further defines Implementation Method Seven and provides an example of the high threshold $\tau_{high}$ and the low threshold $\tau_{low}$.
[0054] The high threshold $\tau_{high}$ and the low threshold $\tau_{low}$ are adaptively adjusted based on the confidence distribution of candidate cells in the current frame. The adaptive adjustment includes: sorting the candidate cells by confidence, taking the mean confidence of the top $K$ candidate cells as a reference value, and limiting the high threshold $\tau_{high}$ or the low threshold $\tau_{low}$ to a preset value range, with $\tau_{high} > \tau_{low}$. Specifically, the screening rule is applied to the confidence $s_{t,i}$ of the $i$-th query in the $t$-th frame, and a binary retention variable $k_{t,i}$ is defined:
$$k_{t,i} = \begin{cases} 1, & s_{t,i} \ge \tau_{high} \\ 1, & \tau_{low} < s_{t,i} < \tau_{high} \text{ and the previous-frame hysteresis condition holds} \\ 0, & \text{otherwise} \end{cases}$$
and the thresholds satisfy the hysteresis constraint $\tau_{high} > \tau_{low}$. The hysteresis constraint introduces temporal stability: low-threshold adsorption is allowed for queries of newly generated candidate cells to reduce missed detections, while queries of already tracked cells are combined with the state of the previous frame to suppress drift. This implementation introduces an adaptive threshold strategy to adapt to frames of varying difficulty. For the set of confidences of the newly generated candidate cell queries, the mean of the top $K$ confidences is taken as the reference value $\bar{s}_t$:
$$\bar{s}_t = \frac{1}{K} \sum_{k=1}^{K} s_{t,(k)}$$
[0055] where $s_{t,(k)}$ is the $k$-th highest confidence after sorting in descending order, from which the adaptive threshold is obtained:
[0056] The final high and low thresholds used for querying newborn candidate cells in this frame are updated as follows:
[0057] Implementation Method Nine: This implementation method is a further limitation of Implementation Method One, and provides an example of the post-processing of the segmentation mask and the reliability filtering in step 4.
[0058] The post-processing and reliability filtering of the segmentation mask includes the following steps:
Step 41: Map the segmentation mask back to the original microscope image size to obtain a binary mask aligned with the coordinates of the original microscope image;
Step 42: Perform connected component analysis on the binary mask: if the mask contains multiple connected regions, only the connected region with the largest area is retained as the effective mask for the candidate cell, and the remaining connected regions are removed; if the mask is empty, the effective mask is empty;
Step 43: Set filtering conditions based on the geometric properties of the effective mask. The geometric properties include area, roundness, or shape factor, and correspond to a preset minimum area threshold, roundness threshold, or shape constraint threshold. If the effective mask is empty, or the area of the effective mask is less than the preset minimum area threshold, or the mask does not meet the preset roundness threshold, or it does not meet the shape constraint threshold, then the candidate cell is determined to be a low-reliability instance and is removed from the initial candidate cell set; otherwise, the candidate cell is retained and included in the reliable candidate cell set.
Candidate cell elimination based on the minimum area threshold involves calculating the effective mask area $A$:
$$A = \sum_{x,y} M^{eff}(x, y)$$
[0059] where $M^{eff}$ is the effective mask of the candidate cell and $A_{min}$ is the preset minimum area condition; when $A < A_{min}$, the candidate cell is removed.
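Steps 42 and 43 can be sketched in pure Python as follows. The function names, the use of 4-connectivity, and the example `min_area` default are assumptions; the roundness and shape-factor checks are omitted because the application does not give their formulas.

```python
from collections import deque


def largest_component(mask):
    """Keep only the largest 4-connected region of a binary mask (list of lists)."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    best = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                # Breadth-first search over one connected region.
                comp, queue = [], deque([(y, x)])
                seen[y][x] = True
                while queue:
                    cy, cx = queue.popleft()
                    comp.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(comp) > len(best):
                    best = comp
    out = [[0] * w for _ in range(h)]
    for y, x in best:
        out[y][x] = 1
    return out


def is_reliable(mask, min_area=3):
    """Return (keep, effective_mask); reject empty or too-small effective masks."""
    effective = largest_component(mask)
    area = sum(sum(row) for row in effective)  # effective mask area A
    return area >= min_area, effective
```

An empty mask yields an area of zero and is therefore rejected by the same comparison, provided `min_area` is at least 1.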
[0060] Implementation Method 10: This implementation method is a further limitation of Implementation Method 5, and provides an example of the timing consistency determination and spatial deduplication in step 6.
[0061] The temporal consistency determination and spatial deduplication are as follows: The unmatched cells are used as newly generated candidate cells. If the intersection-over-union (IoU) of the bounding box of a newly generated candidate cell with that of any tracked cell in the current frame is greater than the first preset threshold, or the distance between its center point and the center point of any tracked cell is less than the second preset threshold, the candidate cell is determined to be a false detection or a fragment of an existing cell and is not treated as a newly generated cell; otherwise, the temporal buffer record for the newly generated candidate cell is established or updated. Specifically, it includes the following steps: Step 61, IoU deduplication: Let the set of tracked cell bounding boxes be $\{B_j\}$. For a candidate bounding box $B_c$, calculate its IoU with each tracked cell bounding box:
[0062] $\mathrm{IoU}(B_c, B_j) = \dfrac{|B_c \cap B_j|}{|B_c \cup B_j|}$
[0063] When $\mathrm{IoU}(B_c, B_j) > \tau_1$ for any $j$, where $\tau_1$ is the first preset threshold, the candidate is considered a false detection of an existing cell or a fragment from cell division and is not treated as a new cell. Step 62, center distance deduplication: The distance between the center $c_c$ of the candidate bounding box and the center $c_j$ of a tracked cell bounding box is:
[0064] $d(B_c, B_j) = \lVert c_c - c_j \rVert_2$
[0065] When $d(B_c, B_j) < \tau_2$ for any $j$, where $\tau_2$ is the second preset threshold, the candidate is not treated as a newly generated cell; Step 63, temporal buffer establishment: Establish a temporal buffer record for each candidate that has passed the deduplication in Steps 61 and 62, recording the cumulative number of detections $n$ and the most recently detected frame number $t_{\mathrm{last}}$; newly generated cells are then confirmed through Implementation Method Twelve.
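The spatial deduplication of Steps 61 and 62 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: boxes are taken to be `(x1, y1, x2, y2)` tuples, the center distance is taken to be Euclidean, and the names `iou`, `center`, and `is_duplicate` as well as the threshold values in the usage note are hypothetical.

```python
import math


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def center(box):
    """Center point of a box."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def is_duplicate(candidate, tracked, iou_thresh, dist_thresh):
    """True if the candidate overlaps or sits too close to any tracked box."""
    cx, cy = center(candidate)
    for box in tracked:
        tx, ty = center(box)
        if iou(candidate, box) > iou_thresh:
            return True  # Step 61: IoU above the first preset threshold
        if math.hypot(cx - tx, cy - ty) < dist_thresh:
            return True  # Step 62: centers closer than the second preset threshold
    return False
```

With one tracked box `(0, 0, 10, 10)`, an IoU threshold of 0.5, and a distance threshold of 5, the candidate `(1, 1, 11, 11)` is rejected as a duplicate, while `(30, 30, 40, 40)` passes on to the temporal buffer.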
[0066] Implementation Method Eleven: This implementation method is a further limitation of Implementation Method Seven, and provides an example of establishing or updating the time-series buffer record for newly generated candidate cells.
[0067] The establishment or updating of the temporal buffer record for newly generated candidate cells is as follows: If the newly generated candidate cell matches an existing temporal buffer record, the matching record is updated, including updating the position and mask area to the current values, incrementing the cumulative number of detections by 1, and updating the most recent detection frame number to the current frame number; if the newly generated candidate cell does not match any existing record, a new temporal buffer record is created for it, the cumulative detection count is initialized to 1, the most recent detection frame number is set to the current frame number, and the current position and mask area are recorded. The temporal buffer record is stored using a key-value structure, wherein the key contains at least the frame number and the candidate index, and the value contains at least the candidate bounding box, the mask area, the cumulative number of detections, and the most recent detection frame number. Specifically, the candidate cells retained after association matching inherit the cell identification number from the previous frame and update the candidate bounding box and segmentation mask; new cell identification numbers are assigned to the newly generated cells identified in Implementation Method 7. Let the current maximum cell ID be $\mathrm{ID}_{\max}$; the new number is then: $\mathrm{ID}_{\mathrm{new}} = \mathrm{ID}_{\max} + 1$
[0068] The final output is a cell segmentation result image $S_t$ for each frame, whose pixel value is the cell identification number:
[0069] $S_t(x, y) = \mathrm{ID}_k$ when the pixel $(x, y)$ belongs to the effective mask $M_k$ of the $k$-th cell. At the same time, a cell trajectory record table (TrackTable) is output, which records the start and end frames of each cell trajectory and its relationship with its parent cell.
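The identifier assignment and per-frame output described above could be sketched as follows. This is a hedged illustration only: it assumes that new identification numbers continue consecutively from the current maximum, that pixels outside every mask are written as 0, and that a missing parent is stored as `None`; the names `assign_new_ids`, `render_label_image`, and `open_track` are hypothetical.

```python
def assign_new_ids(current_max_id, count):
    """Return `count` fresh IDs following current_max_id, and the new maximum."""
    new_ids = list(range(current_max_id + 1, current_max_id + count + 1))
    return new_ids, current_max_id + count


def render_label_image(height, width, masks_by_id):
    """Paint each cell's effective mask with its ID (0 is assumed to mean background)."""
    image = [[0] * width for _ in range(height)]
    for cell_id, mask in masks_by_id.items():
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    image[y][x] = cell_id  # later masks overwrite on overlap
    return image


def open_track(track_table, cell_id, frame, parent=None):
    """Add a TrackTable row holding start frame, end frame, and parent ID."""
    track_table[cell_id] = {"start": frame, "end": frame, "parent": parent}
```

For example, with a current maximum ID of 4, confirming two new cells yields IDs 5 and 6, and each new ID receives a TrackTable row whose start and end frames both equal the frame of confirmation until the trajectory is extended.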
[0070] Implementation Method Twelve: This implementation method is a further limitation of Implementation Method Ten, and provides an example of the confirmed new cells.
[0071] The confirmation of newly generated cells is as follows: For each temporal buffer record, if the cumulative number of detections of the corresponding candidate cell, over consecutive or intermittent frames, reaches the preset patience frame threshold, and its mask area meets the preset minimum area condition, the candidate cell is identified as a new cell, a new identification number is assigned to it, a new trajectory is established, and the buffer record is cleared at the same time. If the buffer record fails to meet the confirmation criteria within the preset number of failure frames, the buffer record is cleared. The confirmation criterion is: $n \ge P$
[0072] where $n$ is the cumulative number of detections and $P$ is the preset patience frame threshold. The cleanup condition is: $t - t_{\mathrm{last}} > F$
[0073] where $t$ is the current frame number, $t_{\mathrm{last}}$ is the most recently detected frame number, and $F$ is the preset number of failure frames.
[0074] Figure 4 illustrates the core mechanism of new cell identification and noise suppression in the temporal consistency determination. Two key parameters are set in the figure: a preset patience frame threshold P=3 frames, indicating that a candidate cell must be detected a cumulative total of 3 times to be identified as a new cell; and a preset failure frame threshold of 2 frames, indicating the maximum number of consecutive frames that a candidate cell is allowed to remain undetected in the temporal buffer record.
[0075] Figure 4(a) shows the processing flow for a real newly generated cell: the candidate cell enters the temporal buffer record in one frame (cumulative detection count = 1), is detected again in a later frame (cumulative detection count increases to 2), and is detected a third time (cumulative detection count increases to 3, reaching the preset patience frame threshold P); it is then confirmed as a new cell, assigned the cell identification number ID=1, and enters a stable tracking state.
[0076] Figure 4(b) shows the processing flow for noise or a false detection: the candidate cell enters the temporal buffer record in frame t (cumulative detection count = 1), is not detected in a subsequent frame (cumulative detection count remains unchanged), and is again not detected in a later frame, at which point the preset number of failure frames has been exceeded since the most recently detected frame number. Finally, the temporal buffer record is cleared.
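The life cycle of a temporal buffer record (create or update, confirm, clear) can be sketched as below, using the parameter values quoted for Figure 4 (patience of 3 detections, failure threshold of 2 frames) in the usage note. The sketch rests on several assumptions: the record key is supplied by the caller, since how a detection is matched to an existing record is not detailed here; a record is cleared once more frames than the failure threshold have elapsed since its last detection; and the names `BufferRecord`, `update_record`, and `resolve` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BufferRecord:
    box: tuple       # candidate bounding box
    area: int        # mask area
    hits: int        # cumulative number of detections
    last_frame: int  # most recent detection frame number


def update_record(buffer, key, frame, box, area):
    """Create the record on first sight; otherwise refresh it and count the hit."""
    rec = buffer.get(key)
    if rec is None:
        buffer[key] = BufferRecord(box, area, 1, frame)
    else:
        rec.box, rec.area = box, area
        rec.hits += 1
        rec.last_frame = frame


def resolve(buffer, frame, patience, max_missed, min_area):
    """Return keys confirmed as new cells; remove confirmed and expired records."""
    confirmed = []
    for key in list(buffer):
        rec = buffer[key]
        if rec.hits >= patience and rec.area >= min_area:
            confirmed.append(key)  # enough detections and a large enough mask
            del buffer[key]
        elif frame - rec.last_frame > max_missed:
            del buffer[key]        # unseen for too long: treated as noise
    return confirmed
```

Calling `update_record` for the same key in three frames and then `resolve(buffer, frame, 3, 2, min_area)` confirms the candidate on the third detection. A candidate seen only in frame 0 survives `resolve` at frame 2 and is removed at frame 3.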
[0077] Implementation Method Thirteen: This implementation method provides a Transformer cell tracking system with adaptive threshold and temporal consistency constraints. The cell tracking system includes the following modules: The image acquisition and preprocessing module is used to acquire the microscopic image sequence of the cells to be processed, preprocess each frame of the microscopic image, and input the preprocessed microscopic image into the cell tracking network for tracking processing; The network prediction module is used to output the prediction results of each frame of the microscopic image from the cell tracking network, the prediction results including candidate cell confidence, candidate bounding boxes, and segmentation masks; The dual-threshold screening module is used to perform dual-threshold hysteresis candidate cell screening on the prediction results to obtain an initial candidate cell set; The mask post-processing and filtering module is used to perform segmentation mask post-processing and reliability filtering on each candidate cell in the initial candidate cell set, eliminating low-reliability instances and obtaining a reliable candidate cell set;
The association matching module is used to associate and match candidate cells in the reliable candidate cell set with the currently existing cell trajectories to obtain matched cells and unmatched cells; wherein, for the first frame image, the currently existing cell trajectories are empty, and all candidate cells are treated as unmatched cells; for non-first frame images, the currently existing cell trajectories are the cell trajectories updated after processing the previous frame image; The new cell confirmation module is used to perform temporal consistency determination and spatial deduplication on the unmatched cells, confirm the new cells, and establish new trajectories for them; The trajectory update and output module is used to update the trajectories of all cells, output the cell identification number and trajectory results for each frame, and save the cell tracking data.
Claims
1. A Transformer cell tracking method with adaptive threshold and temporal consistency constraints, characterized in that the cell tracking method includes the following steps: Step 1: Obtain the microscopic image sequence of the cells to be processed, preprocess each frame of the microscopic image, and input the preprocessed microscopic image into the cell tracking network for tracking processing; Step 2: The cell tracking network outputs the prediction results for each frame of the microscopic image, including candidate cell confidence, candidate bounding boxes, and segmentation masks; Step 3: Perform dual-threshold hysteresis candidate cell screening on the prediction results to obtain an initial candidate cell set; Step 4: Perform segmentation mask post-processing and reliability filtering on each candidate cell in the initial candidate cell set to remove low-reliability instances and obtain a reliable candidate cell set; Step 5: Associate and match the candidate cells in the reliable candidate cell set with the existing cell trajectories to obtain matched cells and unmatched cells; wherein, for the first frame image, the existing cell trajectories are empty, and all candidate cells are treated as unmatched cells; for non-first frame images, the existing cell trajectories are the cell trajectories updated after processing the previous frame image; Step 6: Perform temporal consistency determination and spatial deduplication on the unmatched cells, identify newly generated cells, and establish new trajectories for them; Step 7: Update the trajectories of all cells, output the cell identification number and trajectory result for each frame, and save the cell tracking data.
2. The Transformer cell tracking method with adaptive threshold and temporal consistency constraints according to claim 1, characterized in that, in step 1, during the training phase, the cell tracking network introduces a focal loss function for supervising newborn cells, with the corresponding label being a binary label indicating whether a cell is a newborn cell. The focal loss function includes an exponential parameter and a balance parameter. The loss function terms for the tracking task and the segmentation task in the cell tracking network are each set with adjustable weight coefficients to achieve a loss balance between the tracking task and the segmentation task.
3. The Transformer cell tracking method with adaptive threshold and temporal consistency constraints according to claim 1, characterized in that, In step 3, the dual-threshold hysteresis candidate cell screening employs different strategies for different types of queries: a high-threshold conservative screening is used for queries of tracked cells used to maintain existing trajectories, while a low-threshold adsorption strategy is used for queries of newly emerging candidate cells used to discover new targets.
4. The Transformer cell tracking method with adaptive threshold and temporal consistency constraints according to claim 1, characterized in that, in step 3, the dual-threshold hysteresis candidate cell screening specifically involves: for each candidate cell in each frame of the microscopic image, the following determination is made based on its candidate cell confidence: if the confidence of the candidate cell is greater than or equal to the high threshold, the candidate cell is directly included in the initial candidate cell set; if the confidence of the candidate cell is greater than the low threshold and less than the high threshold, hysteresis adsorption is performed by combining the historical information of the candidate cell: if the candidate cell was included in the initial candidate cell set in the previous frame, or if the candidate cell is spatially continuous with a cell tracked in the previous frame and has the same category attribute, the candidate cell is included in the initial candidate cell set; otherwise, the candidate cell is discarded; if the confidence of the candidate cell is less than the low threshold, the candidate cell is directly eliminated.
5. The Transformer cell tracking method with adaptive threshold and temporal consistency constraints according to claim 4, characterized in that the high threshold and the low threshold are adaptively adjusted based on the confidence distribution of candidate cells in the current frame, the adaptive adjustment including: sorting the candidate cells by confidence, taking the average confidence of the top-ranked candidate cells as a reference value, and limiting the high threshold or the low threshold to a preset value range, with the high threshold greater than the low threshold.
6. The Transformer cell tracking method with adaptive threshold and temporal consistency constraints according to claim 1, characterized in that, Step 4, the post-processing of the segmentation mask and reliability filtering includes the following steps: Step 41: Map the segmentation mask back to the original microscope image size to obtain a binary mask aligned with the coordinates of the original microscope image; Step 42: Perform connected component analysis on the binary mask: if the mask contains multiple connected regions, only the connected region with the largest area is retained as the effective mask for the candidate cell, and the remaining connected regions are removed; if the mask is empty, the effective mask is empty. Step 43: Set filtering conditions based on the geometric properties of the effective mask. The geometric properties include area, roundness, or shape factor, and correspond to preset minimum area threshold, roundness threshold, or shape constraint threshold. If the effective mask is empty, or the area of the effective mask is less than the preset minimum area threshold, or does not meet the preset roundness threshold, or does not meet the shape constraint threshold, then the candidate cell is determined to be a low-reliability instance and is removed from the initial candidate cell set; otherwise, the candidate cell is retained and included in the reliable candidate cell set.
7. The Transformer cell tracking method with adaptive threshold and temporal consistency constraints according to claim 1, characterized in that, in step 6, the temporal consistency determination and spatial deduplication are as follows: The unmatched cells are used as newly generated candidate cells. If the intersection-over-union of the bounding box of a newly generated candidate cell with that of any tracked cell in the current frame is greater than the first preset threshold, or the distance between its center point and the center point of any tracked cell is less than the second preset threshold, the candidate cell is determined to be a false detection or a fragment of an existing cell and is not treated as a newly generated cell; otherwise, the temporal buffer record for the newly generated candidate cell is established or updated.
8. The Transformer cell tracking method with adaptive threshold and temporal consistency constraints according to claim 7, characterized in that the establishment or updating of the temporal buffer record for newly generated candidate cells is as follows: If the newly generated candidate cell matches an existing temporal buffer record, the matching record is updated, including updating the position and mask area to the current values, incrementing the cumulative number of detections by 1, and updating the most recent detection frame number to the current frame number; if the newly generated candidate cell does not match any existing record, a new temporal buffer record is created for it, the cumulative detection count is initialized to 1, the most recent detection frame number is set to the current frame number, and the current position and mask area are recorded. The temporal buffer records are stored using a key-value structure, where the key contains at least the frame number and the candidate index, and the value contains at least the candidate bounding box, the mask area, the cumulative number of detections, and the most recent detection frame number.
9. The Transformer cell tracking method with adaptive threshold and temporal consistency constraints according to claim 7, characterized in that the confirmation of newly generated cells is as follows: For each temporal buffer record, if the cumulative number of detections of the corresponding candidate cell, over consecutive or intermittent frames, reaches the preset patience frame threshold, and its mask area meets the preset minimum area condition, the candidate cell is identified as a new cell, a new identification number is assigned to it, a new trajectory is established, and the buffer record is cleared at the same time. If the buffer record fails to meet the confirmation criteria within the preset number of failure frames, the buffer record is cleared.
10. A Transformer cell tracking system with adaptive threshold and temporal consistency constraints, characterized in that the cell tracking system includes the following modules: The image acquisition and preprocessing module is used to acquire the microscopic image sequence of the cells to be processed, preprocess each frame of the microscopic image, and input the preprocessed microscopic image into the cell tracking network for tracking processing; The network prediction module is used to output the prediction results of each frame of the microscopic image from the cell tracking network, the prediction results including candidate cell confidence, candidate bounding boxes, and segmentation masks; The dual-threshold screening module is used to perform dual-threshold hysteresis candidate cell screening on the prediction results to obtain an initial candidate cell set; The mask post-processing and filtering module is used to perform segmentation mask post-processing and reliability filtering on each candidate cell in the initial candidate cell set, eliminating low-reliability instances and obtaining a reliable candidate cell set;
The association matching module is used to associate and match candidate cells in the reliable candidate cell set with the currently existing cell trajectories to obtain matched cells and unmatched cells; wherein, for the first frame image, the currently existing cell trajectories are empty, and all candidate cells are treated as unmatched cells; for non-first frame images, the currently existing cell trajectories are the cell trajectories updated after processing the previous frame image; The new cell confirmation module is used to perform temporal consistency determination and spatial deduplication on the unmatched cells, confirm the new cells, and establish new trajectories for them; The trajectory update and output module is used to update the trajectories of all cells, output the cell identification number and trajectory results for each frame, and save the cell tracking data.
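For illustration, the three-way decision of the dual-threshold hysteresis screening recited in claim 4 can be written as a small Python function. This is a sketch under assumptions, not the claimed implementation: the function name `keep_candidate` is hypothetical, the two history conditions are passed in as booleans computed elsewhere, and a confidence exactly equal to the low threshold, which the claim does not address, is rejected here.

```python
def keep_candidate(conf, high, low, kept_prev_frame, continuous_with_tracked):
    """Decide whether a candidate enters the initial candidate cell set.

    conf                    -- candidate cell confidence
    high, low               -- high and low thresholds, with high > low
    kept_prev_frame         -- candidate was in the initial set in the previous frame
    continuous_with_tracked -- candidate is spatially continuous with a cell tracked
                               in the previous frame and has the same category
    """
    if conf >= high:
        return True  # confident detection: accept directly
    if conf > low:
        # Ambiguous band: accept only with supporting history (hysteresis adsorption).
        return kept_prev_frame or continuous_with_tracked
    return False     # at or below the low threshold: eliminate (assumed at equality)
```

With thresholds of 0.7 and 0.3, a candidate at 0.5 is kept only if one of the two history flags is set, while a candidate at 0.2 is eliminated regardless of history.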