An eye-tracking method and system based on dynamic trajectory calibration
By employing adaptive dual-mode pupil detection, an improved RANSAC algorithm, and intelligent data cleaning, combined with multi-sensor fusion, the reliability and stability issues of traditional eye-tracking technology in aviation applications have been resolved, achieving high-precision gaze estimation and stable eye-tracking performance.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHINA AVIATION LIFESAVING INST
- Filing Date
- 2026-03-17
- Publication Date
- 2026-06-30
AI Technical Summary
Traditional eye-tracking technology struggles to meet the reliability, accuracy and stability requirements of practical use in aviation applications, especially in environments with strong interference such as vibration and changing lighting. The calibration process is unreliable, the ability to process abnormal data is weak, and the environmental adaptability is insufficient.
Adaptive dual-mode pupil detection, improved RANSAC algorithm for eye center estimation, dynamic trajectory calibration, and intelligent data cleaning are employed, combined with multi-sensor fusion, to achieve high-precision and robust eye tracking.
It significantly improves calibration accuracy and system stability, maintains high-precision line-of-sight estimation in complex environments, and provides an excellent user experience and system performance.
Figure CN122313554A_ABST
Description
Technical Field
[0001] This invention relates to, but is not limited to, the fields of computer vision, human-computer interaction and aerospace equipment technology, and particularly to an eye-tracking method and system based on dynamic trajectory calibration.
Background Technology
[0002] Eye-tracking technology has extremely high application value in the aviation field and is one of the key technologies for improving aviation safety and human-machine ergonomics. It can be used for pilot attention allocation analysis, human-machine interface optimization, and real-time fatigue monitoring.
[0003] However, traditional eye-tracking technology faces severe challenges in the special application environment of aviation, and its reliability, accuracy, and stability struggle to meet the requirements of practical use.
Summary of the Invention
The purpose of this invention is to provide an eye-tracking method and system based on dynamic trajectory calibration, so as to solve the problem that the reliability, accuracy and stability of traditional eye-tracking technology cannot meet the practical needs of aviation applications.
[0004] The technical solution of the present invention is as follows: In a first aspect, the present invention provides an eye-tracking method based on dynamic trajectory calibration, comprising:
Step 1: Perform pupil detection using an adaptive dual-mode pupil detection method, and collect pupil ellipse parameters from multiple frames of eye images; during the pupil detection process, automatically select either the direct detection mode or the tracking detection mode based on the pupil confidence level of the eye images, the tracking status, and the available computing resources.
Step 2: Based on the pupil ellipse parameters of the multiple frames of eye images collected in Step 1, obtain multiple frames of pupil observation data for estimating the eyeball center. Use the robust eyeball center estimation algorithm based on improved RANSAC to estimate the eyeball center, and calculate the eyeball radius to establish an eyeball model.
Step 3: Perform the calibration process: collect calibration data pairs formed by the visual axis direction of the human eye and the coordinates of the calibration point, clean the calibration data pairs with an intelligent data cleaning algorithm, and establish a mapping model from the visual axis direction to the screen coordinates.
Step 4: During the continuous operation of eye tracking, based on the established eyeball model and the calibrated mapping model, estimate the binocular fusion gaze point through multi-sensor fusion from the results of real-time pupil detection and the real-time gaze direction, and output the gaze point sequence after smoothing.
[0005] Optionally, in the eye-tracking method based on dynamic trajectory calibration as described above, step 1 includes: During the initialization, tracking failure recovery, and periodic recalibration phases of the eye-tracking system, a direct detection mode is used to perform fine processing on the global eye image to ensure accurate pupil detection results. When the eye-tracking system is successfully initialized and the detection results of the direct detection mode meet the preset conditions, the tracking detection mode is activated, and rapid tracking detection is performed within the region of interest (ROI) based on the high-confidence detection results of the previous frame.
[0006] Optionally, in the eye-tracking method based on dynamic trajectory calibration as described above, step 1 includes:
Step 11: Acquire an eye image and perform initialization before pupil detection; the initialization includes setting confidence thresholds A and B, and setting the initial pupil confidence.
Step 12: Determine whether the stored pupil queue is empty; if it is empty, proceed to step 13; if it is not empty, further determine whether the cumulative running time of the tracking detection mode exceeds a fixed period; if it does not, proceed to step 13; if it does, delete the pupil queue information and then proceed to step 13.
Step 13: Determine whether the current pupil confidence is less than or equal to the confidence threshold A. If yes, proceed to step 14; if no, proceed to step 15.
Step 14: Perform global eye image detection using the direct detection mode; during the detection process, if the current pupil confidence is less than the confidence threshold B and the pupil queue is not empty, switch to the tracking detection mode to perform tracking detection within the ROI region, and output the pupil detection result with the higher confidence of the two detection modes; otherwise, output the pupil detection result of the direct detection mode; store the output pupil detection result in the pupil queue, and collect the pupil ellipse parameters obtained in this detection; then proceed to step 16.
Step 15: Perform tracking detection within the ROI region using the tracking detection mode; during the detection process, if the current pupil confidence is less than the confidence threshold B, switch to the direct detection mode to perform global eye image detection, and output the pupil detection result with the higher confidence of the two detection modes; otherwise, output the pupil detection result of the tracking detection mode; store the output pupil detection result in the pupil queue, and collect the pupil ellipse parameters obtained in this detection; then proceed to step 16.
Step 16: Determine whether a preset number of pupil ellipse parameters have been collected. If yes, proceed to step 2 to establish the eyeball model. If no, return to step 11 and repeat pupil detection until the preset number of pupil ellipse parameters have been collected.
[0007] Optionally, in the eye-tracking method based on dynamic trajectory calibration as described above, the detection process in step 1 using the direct detection mode includes:
S11, Multi-level image preprocessing, including: downsampling the input eye image, dynamically adjusting the contrast stretching range according to the global grayscale distribution of the eye image, and enhancing the pupil-iris contrast.
S12, Improved Canny edge detection, including: calculating the gradient magnitude and direction of the eye image in the x and y directions, thinning the edge response by non-maximum suppression, and using dual thresholding to connect strong edges and suppress weak edges and isolated noise points, so as to obtain an accurate single-pixel-width edge map.
S13, Orthogonal edge filtering, including: scanning the single-pixel-width edge map along the four directions of 0°, 45°, 90° and 135° respectively, calculating the support of each edge point within a specific neighborhood along its normal direction, and identifying and removing orthogonal line segment interference that does not conform to the circular contour feature of the pupil.
S14, Edge contour extraction and filtering, including: performing connected component analysis on the edges filtered by S13 to extract all closed or nearly closed contours; filtering each contour to make an initial selection of candidate pupil contours.
S15, Candidate contour segment processing and merging, including: calculating the ratio of the intersection area to the union area of the minimum bounding rectangles of different candidate contour segments; if the ratio exceeds the adaptive threshold, these candidate contour segments are determined to belong to the same pupil and are merged.
S16, High-precision ellipse fitting, including: performing ellipse fitting on the candidate contour edge point set merged in S15 using the weighted least squares method, and outputting the detected pupil ellipse parameters; the weights in this step are allocated according to the gradient magnitude of the candidate contour edge points and the signal-to-noise ratio at their location.
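The merge criterion in S15 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes axis-aligned bounding rectangles given as (x_min, y_min, x_max, y_max), a fixed threshold in place of the adaptive one, and a simple greedy grouping. The names `rect_iou` and `group_segments` are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed axis-aligned


def rect_iou(a: Rect, b: Rect) -> float:
    """Ratio of the intersection area to the union area of two rectangles."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def group_segments(rects: List[Rect], threshold: float) -> List[List[int]]:
    """Greedily group rectangle indices whose ratio with a group member exceeds the threshold.

    A single pass is used for brevity, so two existing groups are not joined
    when a later rectangle overlaps both of them.
    """
    groups: List[List[int]] = []
    for i, rect in enumerate(rects):
        for group in groups:
            if any(rect_iou(rect, rects[j]) > threshold for j in group):
                group.append(i)
                break
        else:
            groups.append([i])
    return groups
```

Each returned group lists the contour segments that would be merged into one point set before the ellipse fitting of S16.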
[0008] Optionally, in the eye-tracking method based on dynamic trajectory calibration as described above, the detection process using the tracking detection mode in step 1 includes:
S21, Dynamic ROI setting, including: using the pupil center of the previous frame as a reference point, dynamically setting a rectangular region of interest (ROI) whose size adapts to the pupil's movement speed and size.
S22, Multimodal bright and dark mask processing, including: generating a bright pupil mask and a dark pupil mask for the ROI region respectively.
S23, Local contrast enhancement, including: enhancing the local contrast of the ROI region.
S24, Intelligent intersection ratio judgment, including: calculating the ratio of the intersection area of the candidate pupil region and the ROI region in the current frame to the area of the ROI region; when the calculated ratio is lower than the set threshold, it is determined that the pupil may be moving rapidly or that tracking is about to be lost, and the confidence decrease mechanism is triggered to provide a decision basis for mode switching.
S25, Pixel clustering, including: performing edge extraction or binarization on the ROI region image enhanced by S23; determining the neighborhood radius and the minimum number of points, and grouping the obtained pixels with a clustering algorithm, thereby separating the pupil region from noise points and filtering out pixel clusters that are too small or have unreasonable shapes.
S26, Ellipse fitting, including: calculating the convex hulls of the pixel clusters obtained by clustering, performing least-squares ellipse fitting on the convex hull point set, and outputting the tracked pupil ellipse parameters.
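As an illustration of S21 and S24, the sketch below sizes a square ROI from the previous pupil radius and speed, then lowers the confidence when the candidate region overlaps the ROI too little. The gains, the threshold, the decay factor, and all function names are assumptions made for this example; the source does not give concrete values or formulas.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def dynamic_roi(center: Tuple[float, float], pupil_radius: float, speed: float,
                size_gain: float = 2.0, speed_gain: float = 1.5) -> Rect:
    """S21 sketch: square ROI whose half-size grows with pupil size and speed.

    speed is assumed to be in pixels per frame; both gains are hypothetical.
    """
    half = size_gain * pupil_radius + speed_gain * speed
    cx, cy = center
    return (cx - half, cy - half, cx + half, cy + half)


def overlap_ratio(candidate: Rect, roi: Rect) -> float:
    """S24 sketch: intersection area of candidate and ROI divided by the ROI area."""
    w = max(0.0, min(candidate[2], roi[2]) - max(candidate[0], roi[0]))
    h = max(0.0, min(candidate[3], roi[3]) - max(candidate[1], roi[1]))
    roi_area = (roi[2] - roi[0]) * (roi[3] - roi[1])
    return (w * h) / roi_area if roi_area > 0 else 0.0


def update_confidence(confidence: float, ratio: float,
                      ratio_threshold: float = 0.1, decay: float = 0.5) -> float:
    """Lower the confidence when the overlap ratio falls below the threshold."""
    return confidence * decay if ratio < ratio_threshold else confidence
```

A lowered confidence then feeds the mode-switching decision, so a pupil drifting out of the ROI pushes the system back toward global detection.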
[0009] Optionally, in the eye-tracking method based on dynamic trajectory calibration as described above, the method for establishing the eye model in step 2 includes: Step 21: Collect and calculate multiple frames of pupil observation data, including: calculating the pupil ellipse parameters on the image plane based on the collected multiple frames of eye images, calculating the back projection circle and the two-dimensional line of sight, and obtaining pupil observation data for estimating the center of the eyeball; Step 22: Perform eye center estimation based on the robust eye center estimation algorithm of improved RANSAC to solve for the eye center; Step 23: After determining the center of the eyeball, a robust estimation algorithm is used to calculate the radius of the eyeball, and the median of multiple independent calculations is taken as the final value of the radius of the eyeball to establish the eyeball model. Step 24: By introducing the Kappa angle for calibration, the calculated optical axis direction is accurately converted into the true visual axis direction, i.e., the line of sight direction.
[0010] Optionally, in the eye-tracking method based on dynamic trajectory calibration as described above, the eyeball center estimation in step 22 includes two schemes:
Scheme 1 transforms the problem of finding the eyeball center into finding the center of a three-dimensional sphere such that the sphere is tangent to the multiple pupil circle planes obtained by back projection, with the tangency points satisfying preset geometric constraints. In Scheme 1, pupil ellipse parameters from several frames are randomly sampled to generate a candidate eyeball center. The distance from the pupil circle plane of every frame to the candidate eyeball center is calculated, and a consistency metric function combining a pupil physiological radius constraint and a spatial consistency constraint is constructed to identify inliers. The eyeball model with the most inliers and the smallest consistency error is selected as the optimal solution for eyeball center estimation, which gives the coordinates of the eyeball center.
Scheme 2 transforms the problem of solving the eyeball center projection into finding a point on a two-dimensional plane such that the sum of the distances from that point to all back-projected lines of sight is minimized. In Scheme 2, lines of sight from several frames are randomly sampled, and the intersection points of their combinations are calculated as candidate points. The distances from all lines of sight to the candidate point are calculated, and inliers are determined by combining a distance threshold constraint and a spatial consistency constraint. After iteration, the optimal two-dimensional eyeball center projection point is solved by the least squares method using all inlier lines of sight. This two-dimensional projection point is then back-projected into three-dimensional space to obtain the final eyeball center coordinates.
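Scheme 2 can be sketched in simplified form as a plain RANSAC over two-dimensional lines. The sketch makes several assumptions not stated in the source: each line is stored as a point plus a unit direction, two lines are sampled per iteration, only the distance threshold is used for the inlier test (the spatial consistency constraint is omitted), and the final refinement minimizes the sum of squared distances. All names are hypothetical, and the back projection to three dimensions is not shown.

```python
import math
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Line = Tuple[Point, Point]  # (point on the line, unit direction)


def make_line(p: Point, q: Point) -> Line:
    """Build a line through p and q with a normalized direction."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    norm = math.hypot(dx, dy)
    return (p, (dx / norm, dy / norm))


def distance(line: Line, c: Point) -> float:
    """Perpendicular distance from point c to the line."""
    (px, py), (dx, dy) = line
    return abs((c[0] - px) * dy - (c[1] - py) * dx)


def intersect(l1: Line, l2: Line) -> Optional[Point]:
    """Intersection of two lines, or None if they are (nearly) parallel."""
    (p1x, p1y), (d1x, d1y) = l1
    (p2x, p2y), (d2x, d2y) = l2
    det = d1x * d2y - d1y * d2x
    if abs(det) < 1e-9:
        return None
    t = ((p2x - p1x) * d2y - (p2y - p1y) * d2x) / det
    return (p1x + t * d1x, p1y + t * d1y)


def least_squares_point(lines: List[Line]) -> Point:
    """Point minimizing the sum of squared distances to the lines.

    Solves sum(I - d d^T) c = sum(I - d d^T) p with Cramer's rule.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), (dx, dy) in lines:
        m11, m12, m22 = 1.0 - dx * dx, -dx * dy, 1.0 - dy * dy
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * px + m12 * py
        b2 += m12 * px + m22 * py
    det = a11 * a22 - a12 * a12
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def ransac_center(lines: List[Line], threshold: float,
                  iterations: int = 200, seed: int = 0) -> Tuple[Point, int]:
    """Return the refined center and the number of inlier lines."""
    rng = random.Random(seed)
    best: List[Line] = []
    for _ in range(iterations):
        candidate = intersect(*rng.sample(lines, 2))
        if candidate is None:
            continue
        inliers = [ln for ln in lines if distance(ln, candidate) <= threshold]
        if len(inliers) > len(best):
            best = inliers
    if len(best) < 2:
        raise ValueError("not enough consistent lines to estimate a center")
    return least_squares_point(best), len(best)
```

Because the final estimate uses only inlier lines, a few badly detected frames do not pull the center away, which is the stated motivation for preferring RANSAC over a plain least squares fit.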
[0011] Optionally, in the eye-tracking method based on dynamic trajectory calibration described above, the implementation of dynamic trajectory calibration and intelligent data cleaning in step 3 includes: Step 31, dynamic trajectory design, includes: controlling the calibration point to move smoothly on the virtual screen according to a predefined composite trajectory of "rectangular border + two diagonals", guiding the user's eye movement through the movement of the calibration point, so as to collect massive, continuous and spatially uniform calibration data pairs (φ,θ,x,y); where x,y are the two-dimensional coordinates of the calibration point on the virtual screen, and φ,θ are the azimuth and pitch angles of the human eye; Step 32: The calibration data is processed using an intelligent data cleaning algorithm to remove noise points that do not conform to the motion law and retain high-quality smooth tracking data. Step 33: Using the cleaned and smoothed tracking data, establish a mapping model from the view axis direction (φ,θ) to the screen coordinates (x,y).
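A possible way to generate the "rectangular border + two diagonals" trajectory of Step 31 is sketched below. The traversal order, the margin, and the number of points per segment are assumptions; the source does not specify them. Because the four corners cannot be joined by the border and both diagonals in one continuous stroke without repeating an edge, this sketch retraces the right edge once. The function name is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def composite_trajectory(width: float, height: float, margin: float = 0.0,
                         points_per_segment: int = 50) -> List[Point]:
    """Calibration point positions along the border and both diagonals.

    Positions are linearly interpolated between corner waypoints, so the
    point moves continuously with no jumps between segments.
    """
    tl = (margin, margin)
    tr = (width - margin, margin)
    br = (width - margin, height - margin)
    bl = (margin, height - margin)
    # Border (tl, tr, br, bl, tl), first diagonal (tl to br),
    # retraced right edge (br to tr), second diagonal (tr to bl).
    waypoints = [tl, tr, br, bl, tl, br, tr, bl]
    path: List[Point] = [waypoints[0]]
    for start, end in zip(waypoints, waypoints[1:]):
        for k in range(1, points_per_segment + 1):
            t = k / points_per_segment
            path.append((start[0] + t * (end[0] - start[0]),
                         start[1] + t * (end[1] - start[1])))
    return path
```

With a fixed number of points per segment, the point moves faster along the longer diagonals than along the edges; a constant-speed variant would scale the point count by segment length.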
[0012] Optionally, in the eye-tracking method based on dynamic trajectory calibration as described above, the intelligent data cleaning algorithm in step 32 includes the following processing steps:
Step 32-1, Invalid data removal, including: filtering out invalid data caused by detection failure.
Step 32-2, First-order difference calculation, including: for the calibration data in time series, calculating the first-order differences of the azimuth angle φ and the pitch angle θ between adjacent data points, respectively: Δφ[i]=φ[i]-φ[i-1] and Δθ[i]=θ[i]-θ[i-1]; these first-order differences represent the instantaneous angular velocity of the line-of-sight motion.
Step 32-3, Adaptive threshold filtering, including: calculating the standard deviations σ_φ and σ_θ of all Δφ and Δθ, and multiplying each standard deviation by a preset empirical coefficient (0.8 in the condition below) to obtain the adaptive dynamic threshold; traversing all data points and removing the abnormal data points that satisfy |Δφ[i]|>0.8*σ_φ or |Δθ[i]|>0.8*σ_θ.
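Steps 32-1 to 32-3 can be sketched directly from the formulas above. The sketch assumes that failed detections are marked as `None`, that the population standard deviation is intended, and that differences are taken on the sequence left after invalid data removal; these details and the function name are assumptions, not part of the source.

```python
import math
from statistics import pstdev
from typing import List, Optional, Tuple

Sample = Tuple[float, float, float, float]  # (phi, theta, x, y)


def clean_calibration_data(samples: List[Optional[Sample]],
                           coefficient: float = 0.8) -> List[Sample]:
    """Remove invalid samples, then samples whose first-order difference is too large."""
    # Step 32-1: drop failed detections (None) and non-finite values.
    valid = [s for s in samples
             if s is not None and all(math.isfinite(v) for v in s)]
    if len(valid) < 3:
        return valid
    # Step 32-2: first-order differences of azimuth and pitch.
    dphi = [valid[i][0] - valid[i - 1][0] for i in range(1, len(valid))]
    dtheta = [valid[i][1] - valid[i - 1][1] for i in range(1, len(valid))]
    # Step 32-3: thresholds are the standard deviations scaled by the coefficient.
    thr_phi = coefficient * pstdev(dphi)
    thr_theta = coefficient * pstdev(dtheta)
    kept = [valid[0]]  # the first sample has no difference and is kept
    for i in range(1, len(valid)):
        if abs(dphi[i - 1]) > thr_phi or abs(dtheta[i - 1]) > thr_theta:
            continue
        kept.append(valid[i])
    return kept
```

Because the differences are computed on the original sequence, the sample immediately after a spike is also removed, since its difference from the spike is equally large. How much ordinary motion survives depends on the coefficient and on the spread of the data, so the coefficient would need tuning in practice.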
[0013] Optionally, in the eye-tracking method based on dynamic trajectory calibration as described above, the processing method for the continuous operation phase of eye tracking in step 4 includes: Step 41, perform pupil detection in real time, including: calculate the two-dimensional pupil ellipse parameters of the left and right eyes in real time based on the eye images captured by the left and right eye cameras in real time; and convert the two-dimensional pupil ellipse of each frame into a three-dimensional pupil circle tangent to the sphere of the eye ball model by using inverse projection based on the eye ball model, thereby obtaining the center position of the three-dimensional pupil circle in real time. Step 42: Determine the real-time line of sight direction. Without considering the Kappa angle, use the optical axis as the line of sight direction. With consideration of the Kappa angle, use the visual axis as the line of sight direction. Step 43: Use multi-sensor fusion to estimate binocular gaze to obtain binocular fusion gaze points; Step 44: Smoothing of binocular fusion fixation points. Based on the real-time performance requirements and application scenarios of the eye-tracking system, a sliding window strategy is used to perform mean filtering and output the fixation point sequence.
[0014] Optionally, in the eye-tracking method based on dynamic trajectory calibration as described above, step 44 includes:
Step 44-1: Perform sliding window mean filtering on the binocular fusion fixation point, using a fixed-length or adaptive variable-length sliding window.
Step 44-2: Use a Kalman filter prediction model to predict the current binocular fusion fixation point position, and then perform a weighted fusion of the predicted value and the filtered value.
Step 44-3: Output the binocular fusion fixation point obtained by fusing the sliding window mean filtering result with the prediction model result.
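The structure of steps 44-1 to 44-3 can be sketched as follows. To keep the example short, a constant-velocity extrapolation of the two previous outputs stands in for the Kalman filter prediction model named in the source; the window length, the fusion weight, and the class name are likewise assumptions.

```python
from collections import deque
from typing import Deque, Optional, Tuple

Point = Tuple[float, float]


class GazeSmoother:
    """Sliding window mean fused with a simple motion prediction."""

    def __init__(self, window: int = 5, predict_weight: float = 0.3) -> None:
        self.buffer: Deque[Point] = deque(maxlen=window)  # fixed-length window
        self.weight = predict_weight
        self.prev: Optional[Point] = None
        self.prev_prev: Optional[Point] = None

    def update(self, x: float, y: float) -> Point:
        # Step 44-1: mean of the most recent fixation points.
        self.buffer.append((x, y))
        n = len(self.buffer)
        mean = (sum(p[0] for p in self.buffer) / n,
                sum(p[1] for p in self.buffer) / n)
        if self.prev is not None and self.prev_prev is not None:
            # Step 44-2 (stand-in): constant-velocity prediction from the last
            # two outputs, then weighted fusion with the filtered value.
            pred = (2.0 * self.prev[0] - self.prev_prev[0],
                    2.0 * self.prev[1] - self.prev_prev[1])
            out = (self.weight * pred[0] + (1.0 - self.weight) * mean[0],
                   self.weight * pred[1] + (1.0 - self.weight) * mean[1])
        else:
            out = mean
        # Step 44-3: output the fused point and remember it for the next frame.
        self.prev_prev, self.prev = self.prev, out
        return out
```

A real Kalman filter would also track uncertainty and adapt the fusion weight per frame; the fixed weight here only shows where the prediction enters the pipeline.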
[0015] In a second aspect, embodiments of the present invention also provide an eye-tracking system based on dynamic trajectory calibration, comprising: an eye-tracking device, and a camera device, an eye-tracking processor, and a display configured in the eye-tracking device. The camera device includes left and right eye cameras disposed on both sides of the eye-tracking device, used to acquire images of the user's left and right eyes in real time and transmit them to the eye-tracking processor, so that the eye-tracking processor can execute the eye-tracking method as described above and display the calibration point trajectory on the display during the execution of the eye-tracking method. The eye-tracking processor includes an adaptive dual-mode pupil detection module and a fixation point estimation module; the adaptive dual-mode pupil detection module is used to output a preset number of pupil ellipse parameters by executing step 1; the fixation point estimation module is used to execute steps 2 to 4 based on the output pupil ellipse parameters to output a fixation point sequence.
[0016] The beneficial effects of this invention are as follows: This invention provides an eye-tracking method and system based on dynamic trajectory calibration. On the one hand, it employs an adaptive dual-mode pupil detection method to perform pupil detection. During pupil detection, based on the pupil confidence level of the eye image, tracking status, and computational resource availability, it automatically determines whether to use at least one of a direct detection mode or a tracking detection mode to perform pupil detection. This achieves a balance between detection accuracy and efficiency in achieving high-precision detection results and rapid tracking detection. On the other hand, by establishing an eye model, performing dynamic trajectory calibration and intelligent data cleaning, and using multi-sensor fusion for gaze direction estimation, high-precision gaze point estimation is achieved. Compared with the prior art, the technical solution provided by the embodiments of this invention has the following significant beneficial effects: First, it has achieved a revolutionary improvement in calibration accuracy and reliability: by adopting dynamic trajectory calibration combined with intelligent data cleaning, the quality and quantity of input data are fundamentally guaranteed, resulting in a significant improvement in the accuracy of the mapping model in the screen area. Second, excellent environmental robustness: In the eye modeling stage, the improved RANSAC algorithm is used to perform eye center estimation, and an intelligent filtering mechanism is used in the intelligent data cleaning process. This can ensure that the system can maintain stable output under strong interference environments such as vibration and light changes, meeting the stringent requirements of aviation applications. 
Third, excellent user experience and efficiency: the dynamic trajectory calibration process is more natural and faster, and the gaze point output is smooth and fluid, which greatly enhances the pilot's willingness to use the system and the human-machine efficiency of the system.
[0017] Fourth, systematic innovation: From data collection and cleaning to modeling, a complete technical loop is formed, with each link working together to ensure the superior performance of the final product.
Attached Figure Description
[0018] The accompanying drawings are provided to further understand the technical solutions of the present invention and constitute a part of the specification. They are used together with the embodiments of this application to explain the technical solutions of the present invention and do not constitute a limitation on the technical solutions of the present invention.
[0019] Figure 1 is a flowchart of an eye-tracking method based on dynamic trajectory calibration provided in an embodiment of the present invention;
Figure 2 is a flowchart of another eye-tracking method based on dynamic trajectory calibration provided in an embodiment of the present invention;
Figure 3 is a schematic diagram of the eye center estimation process in the eye-tracking method provided in an embodiment of the present invention;
Figure 4 is a schematic diagram of a composite trajectory for dynamic trajectory calibration in the eye-tracking method provided in an embodiment of the present invention;
Figure 5 is a flowchart of the intelligent data cleaning algorithm in the eye-tracking method provided in an embodiment of the present invention;
Figure 6 is a flowchart of the gaze point smoothing process in the eye-tracking method provided in an embodiment of the present invention.
Detailed Implementation
[0020] To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that, unless otherwise specified, the embodiments and features described in this application can be arbitrarily combined with each other.
[0021] As explained in the background section above, in aviation applications the reliability, accuracy, and stability of traditional eye-tracking technology are insufficient to meet the demands of practical use, specifically in the following aspects: (1) The calibration process is unreliable and inefficient: Existing eye-tracking solutions generally adopt a static point calibration mode (such as discrete nine-point calibration), which is tedious and yields sparse, unevenly distributed data points. Data quality is easily degraded when users anticipate the next point or become distracted, making it difficult to establish an accurate gaze mapping model, especially in nonlinear areas such as the screen edge, where the error is significant. In addition, the cumbersome calibration process seriously affects user experience and operating efficiency.
[0022] (2) Weak ability to handle abnormal data and suppress noise: During calibration and real-time tracking, abnormal data points frequently occur due to factors such as violent head movements, sudden changes in ambient light, blinking, and momentary occlusion. Existing eye-tracking schemes lack effective online cleaning and compensation mechanisms, which leads to deviations in the estimation of eye model parameters, significant jitter in the gaze point output, and reduced system reliability.
[0023] (3) Insufficient environmental adaptability and robustness: The aviation operating environment is subject to strong interference factors such as continuous high-frequency vibration and complex, variable lighting conditions. The algorithms of traditional eye-tracking schemes (such as the least squares method) are highly sensitive to outliers. Existing systems struggle to maintain stable and reliable performance in such high-noise environments, and their pupil detection accuracy and line-of-sight estimation accuracy drop significantly.
[0024] The aforementioned problems are particularly pronounced in the specialized application scenarios of pilot helmet-mounted equipment, severely limiting the large-scale application and promotion of eye-tracking technology in the aviation field. Therefore, the industry urgently needs an eye-tracking solution that can fundamentally adapt to the unique aviation environment and possesses high precision, robustness, and real-time performance.
[0025] Based on the technical requirements for eye tracking in aviation applications, this invention provides an eye tracking method and system based on dynamic trajectory calibration, specifically an eye tracking technical solution based on dynamic trajectory calibration and intelligent data cleaning.
[0026] The present invention provides the following specific embodiments, which can be combined with each other. For the same or similar concepts or processes, they may not be described again in some embodiments.
[0027] Figure 1 is a flowchart of an eye-tracking method based on dynamic trajectory calibration provided in an embodiment of the present invention. As shown in Figure 1, the eye-tracking method based on dynamic trajectory calibration provided by this invention systematically addresses the three core problems of insufficient data collection, large noise interference, and low model accuracy by introducing a dynamic trajectory calibration strategy, a statistical intelligent data cleaning strategy, and an improved robust parameter estimation strategy. This significantly improves the accuracy, stability, and practicality of eye-tracking technology in real aviation environments.
[0028] The eye-tracking method provided by this invention includes the following steps: Step 1: Perform pupil detection using an adaptive dual-mode pupil detection method, and collect pupil ellipse parameters from multiple frames of eye images. The adaptive dual-mode pupil detection method adopted in this invention employs an intelligent collaborative working mechanism. During the pupil detection process, it automatically selects either the direct detection mode or the tracking detection mode based on image quality confidence, tracking status, and computing resource availability. For each frame of eye image detected, it outputs the pupil ellipse parameters obtained from the direct detection mode and / or the tracking detection mode.
[0029] Step 2: Based on the pupil ellipse parameters of the multiple frames of eye images collected in Step 1, obtain multiple frames of pupil observation data for estimating the eyeball center. Use the robust eyeball center estimation algorithm based on improved RANSAC to estimate the eyeball center, and calculate the eyeball radius to establish an eyeball model.
Step 3: Perform the calibration process: collect calibration data pairs formed by the visual axis direction of the human eye and the coordinates of the calibration point, clean the calibration data pairs with an intelligent data cleaning algorithm, and establish a mapping model from the visual axis direction to the screen coordinates.
Step 4: During the continuous operation of eye tracking, based on the established eyeball model and the calibrated mapping model, estimate the binocular fusion fixation point from the results of real-time pupil detection and the real-time gaze direction, and output the fixation point sequence after smoothing.
[0030] Each step of the eye-tracking method based on dynamic trajectory calibration provided in the embodiments of the present invention is described in detail below. Figure 2 is a flowchart of another eye-tracking method based on dynamic trajectory calibration provided in an embodiment of the present invention.
[0031] In one implementation of this invention, as shown in Figure 2, the dual-mode pupil detection in step 1 above includes a direct detection mode (detect) and a tracking detection mode (track). During the detection process, the system automatically and seamlessly switches between the two modes to balance accuracy and efficiency. The switching strategy for the two modes is as follows: during the initialization, tracking failure recovery, and periodic recalibration stages of the eye-tracking system, the direct detection mode is used to perform fine processing on the global eye image to ensure accurate pupil detection results; when the eye-tracking system is successfully initialized and the detection results of the direct detection mode meet the preset conditions, the tracking detection mode is activated, and rapid tracking detection is performed within the Region of Interest (ROI) based on the high-confidence detection results of the previous frame.
[0032] As shown in Figure 2, the implementation process of pupil detection in step 1 is as follows:
Step 11: Acquire an eye image and perform initialization before pupil detection; the initialization includes setting confidence thresholds A and B, and setting the initial pupil confidence.
Step 12: Determine whether the stored pupil queue is empty; if it is empty, proceed to step 13; if it is not empty, further determine whether the cumulative running time of the tracking detection mode exceeds a fixed period; if it does not, proceed to step 13; if it does, delete the pupil queue information and then proceed to step 13.
Step 13: Determine whether the current pupil confidence is less than or equal to the confidence threshold A. If yes, proceed to step 14; if no, proceed to step 15.
[0033] It should be noted that the confidence thresholds A and B set in this step are both between 0 and 1, and the confidence threshold B is greater than the confidence threshold A. The initial settings ensure that the initial pupil confidence is less than or equal to the confidence threshold A, so that the first frame after initialization is detected in at least the direct detection mode. After one or more frames of eye images have been detected, if the current pupil confidence is greater than the confidence threshold A, detection is performed in at least the tracking detection mode.
[0034] Step 14: Perform global eye image detection using the direct detection mode; during the detection process, if the current pupil confidence is less than the confidence threshold B and the pupil queue is not empty, switch to the tracking detection mode to perform tracking detection within the ROI region, and output the pupil detection result with the higher confidence of the two detection modes; otherwise, output the pupil detection result of the direct detection mode; store the output pupil detection result in the pupil queue, and collect the pupil ellipse parameters obtained in this detection; then proceed to step 16.
Step 15: Perform tracking detection within the ROI region using the tracking detection mode; during the detection process, if the current pupil confidence is less than the confidence threshold B, switch to the direct detection mode to perform global eye image detection, and output the pupil detection result with the higher confidence of the two detection modes; otherwise, output the pupil detection result of the tracking detection mode; store the output pupil detection result in the pupil queue, and collect the pupil ellipse parameters obtained in this detection; then proceed to step 16.
Step 16: Determine whether a preset number of pupil ellipse parameters have been collected. If yes, proceed to step 2 to establish the eyeball model. If no, return to step 11 and repeat pupil detection until the preset number of pupil ellipse parameters have been collected.
[0035] It should be noted that during the detection processes in steps 14 and 15, if the pupil confidence score obtained with single-mode detection is greater than or equal to confidence threshold B, the confidence of the single-mode detection is high, and the detection result of that single mode is output directly and stored in the pupil queue. If the pupil confidence score of the current single-mode detection is less than confidence threshold B, the single-mode detection has not reached the required confidence level. In this case, the other detection mode is also used to perform pupil detection on the eye image of the current frame, and the pupil detection result with the higher confidence of the two detection modes is stored in the pupil queue and used as the pupil ellipse parameters acquired for the current frame. After a certain number of pupil ellipse parameters that meet the requirements have been acquired through multi-frame detection, the eyeball model establishment step is initiated.
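The switching rule of steps 14 and 15 can be sketched as follows. This is a minimal illustration only: the `Detection` tuple, the detector callables, and the function names `detect_frame`, `step_14`, and `step_15` are hypothetical and do not appear in the original text.

```python
from collections import deque
from typing import Callable, NamedTuple, Optional


class Detection(NamedTuple):
    ellipse: tuple      # pupil ellipse parameters; the layout is an assumption
    confidence: float   # pupil confidence in [0, 1]


Detector = Callable[[object], Detection]


def detect_frame(frame, primary: Detector, fallback: Optional[Detector],
                 threshold_b: float, pupil_queue: deque) -> Detection:
    """Run the primary mode; below threshold B, also try the other mode."""
    result = primary(frame)
    if result.confidence < threshold_b and fallback is not None:
        other = fallback(frame)
        # Keep whichever of the two modes is more confident.
        if other.confidence > result.confidence:
            result = other
    pupil_queue.append(result)
    return result


def step_14(frame, direct, tracking, threshold_b, pupil_queue):
    # Direct mode first; tracking is only a fallback if the queue is not empty.
    fallback = tracking if pupil_queue else None
    return detect_frame(frame, direct, fallback, threshold_b, pupil_queue)


def step_15(frame, direct, tracking, threshold_b, pupil_queue):
    # Tracking mode first; direct (global) detection is the fallback.
    return detect_frame(frame, tracking, direct, threshold_b, pupil_queue)
```

With stub detectors that return fixed confidences, `step_14` on an empty queue returns the direct-mode result regardless of its confidence, which matches the condition stated in step 14.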
[0036] It should also be noted that, during the pupil detection process in this step, when step 12 determines that the pupil queue is not empty, the quantity evaluated next is the cumulative running time of the tracking detection mode, that is, the time accumulated in the tracking detection mode during the cyclic execution of steps 13 to 15.
[0037] The specific detection methods under the two detection modes described above are explained below:
(1) Direct detection mode
The direct detection mode is used to perform fine processing on the acquired global image during the initialization, tracking failure recovery, and periodic recalibration phases of the eye-tracking system to ensure detection accuracy. The specific detection steps are as follows:
S11, Multi-level image preprocessing: The input eye image is downsampled to improve processing speed. Then, an adaptive min-max normalization algorithm is used to dynamically adjust the contrast stretching range according to the global grayscale distribution of the eye image, which enhances the pupil-iris contrast while avoiding excessive amplification of background noise.
[0038] S12, Improved Canny Edge Detection: An adaptive Gaussian filter kernel is employed, with its size adjusted dynamically according to the image noise level (e.g., by calculating the variance of the image gradient magnitude) to balance noise smoothing against the preservation of edge details. Specifically, the gradients of the eye image in the x and y directions are computed (e.g., using the Sobel operator) to obtain the gradient magnitude and direction. The edge response is refined through non-maximum suppression, and a dual-threshold process is used (the high threshold is determined automatically by the Otsu algorithm, and the low threshold is 0.4 to 0.5 times the high threshold) to connect strong edges and suppress weak edges and isolated noise points, ultimately yielding an accurate single-pixel-width edge map.
[0039] S13, Orthogonal Edge Filtering: To eliminate linear interference from eyelids, eyelashes, etc., this embodiment of the invention designs a four-directional traversal mechanism. The single-pixel width edge map is scanned along four directions: 0°, 45°, 90°, and 135°, respectively. The support of each edge point in a specific neighborhood along its normal direction is calculated, effectively identifying and eliminating orthogonal line segment interference that does not conform to the circular contour features of the pupil.
[0040] S14, Edge Contour Extraction and Filtering: Connectivity analysis is performed on the edges filtered by S13 to extract all closed or nearly closed contours. Each contour is then filtered using a comprehensive heuristic function, with criteria including contour perimeter, area, roundness, convexity, and the aspect ratio of its minimum bounding rectangle, to initially select candidate pupil contours.
[0041] S15, Candidate contour segment processing and merging: For broken pupil edges caused by partial occlusion or uneven lighting, a judgment strategy based on dynamic intersection area threshold is adopted; the specific implementation is as follows: calculate the ratio of the minimum bounding rectangle intersection area to the union area of different candidate contour segments. If the ratio exceeds the adaptive threshold (usually 0.2 to 0.3), they are determined to belong to the same pupil and are merged to provide complete data for subsequent ellipse fitting.
[0042] S16, High-precision ellipse fitting: For the final merged set of candidate contour edge points, the weighted least squares method is used to fit the ellipse, and the detected pupil ellipse parameters are finally output. In this step, the weights are allocated according to the gradient magnitude of the candidate contour edge points and the signal-to-noise ratio of their location, which is used to reduce the contribution of noise points and significantly improve the fitting accuracy of the pupil center and ellipse parameters.
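The merge test of S15 can be sketched as follows. This is a simplified illustration that assumes axis-aligned bounding rectangles given as `(x, y, w, h)`; the text speaks of minimum bounding rectangles, which may be rotated. The function names and the default threshold of 0.25 (inside the stated 0.2 to 0.3 range) are chosen for illustration.

```python
def rect_iou(a, b):
    """Intersection-over-union of two axis-aligned rectangles (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlap, clamped to zero when disjoint.
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def should_merge(a, b, threshold=0.25):
    """Treat two contour segments as the same pupil if their IoU exceeds the threshold."""
    return rect_iou(a, b) > threshold
```

For example, two 2×2 rectangles offset by one unit horizontally overlap in an area of 2 out of a union of 6, so their ratio of about 0.33 exceeds the threshold and the segments would be merged.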
[0043] (2) Tracking detection mode
The tracking detection mode is activated after the eye-tracking system has been successfully initialized. Based on the high-confidence detection result of the previous frame, it performs rapid tracking within a region of interest (ROI), greatly improving processing efficiency. The specific detection steps are as follows:
S21, Dynamic ROI Setting: Using the pupil center of the previous frame as a reference point, a rectangular ROI is set dynamically according to the pupil's movement speed and size. The size of the ROI is adjusted adaptively, expanding when the movement is intense and shrinking when it is stable, in order to balance the search range against the computational load.
[0044] S22, Multimodal Bright / Dark Mask Processing: For the ROI region, bright pupil masks and dark pupil masks are generated separately, and morphological opening operations are used to eliminate minor noise.
[0045] S23: Adaptive histogram equalization is then applied to enhance the local contrast of the ROI area, so that the pupil area stands out clearly under different lighting conditions.
[0046] S24, Intelligent intersection ratio judgment: Calculate the ratio of the intersection area of the candidate pupil region and the ROI region in the current frame to the area of the ROI region; if this ratio is lower than the set threshold, it is determined that the pupil may be moving rapidly or the tracking is about to be lost, triggering the confidence decrease mechanism to provide a decision basis for mode switching.
[0047] S25, Advanced Topology Analysis: Edge extraction or binarization is performed on the ROI region image enhanced by S23. The resulting pixels are then grouped using an improved DBSCAN clustering algorithm. This algorithm can adaptively determine the neighborhood radius and minimum number of points, effectively separating the pupil region from noise points and filtering out pixel clusters that are too small or have unreasonable shapes.
[0048] S26, Efficient Convex Hull Combination and Ellipse Fitting: For the pixel clusters obtained by clustering, calculate their convex hulls to obtain the complete contour shape; in specific implementation, the optimized Graham scan algorithm is used to quickly calculate the convex hull, and the least squares ellipse fitting is performed on the convex hull point set to finally output the tracked pupil ellipse parameters.
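The convex hull computation in S26 can be illustrated as follows. Note that this sketch uses Andrew's monotone chain algorithm, a close relative of the Graham scan, rather than the optimized Graham scan named in the text; the function names are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order, starting from the lowest x."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Pop points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]
```

The resulting hull point set would then be passed to the least squares ellipse fitting described in S26.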
[0049] In one implementation of this invention, as shown in Figure 2 and Figure 3, the method for establishing the eyeball model in step 2 above is described below.
[0050] This step aims to convert the two-dimensional pupil ellipse parameters into the gaze direction in three-dimensional space, and includes the following specific processes:
Step 21, Collect and calculate multiple frames of pupil observation data: Based on the camera intrinsic parameters and distortion coefficients, calculate the pupil ellipse parameters on the image plane from the collected multiple frames of eye images, and further calculate the back-projected circle, the two-dimensional line of sight, and so on, to obtain the pupil observation data used for estimating the eyeball center.
Step 22, Eye center estimation: In this embodiment of the invention, a robust eye center estimation algorithm based on improved RANSAC is used to perform eye center estimation. Figure 3 is a flowchart of the eye center estimation process in the eye-tracking method provided in this embodiment of the invention.
[0051] This invention provides a robust eye center estimation algorithm, the core of which lies in using an improved RANSAC (random sample consensus) framework to robustly solve for the spatial location of the eye center from pupil observation data containing noise and outliers. For eye center estimation, this invention provides two optional implementation methods, which can be selected flexibly according to the system sensor configuration and accuracy requirements.
[0052] Implementation Method A (Three-Dimensional Spherical Fitting - General Model)
1) Problem Modeling: The problem of finding the center of the eyeball is transformed into finding a three-dimensional sphere (whose center is the eyeball center) such that this sphere is tangent to the multiple pupil circle planes obtained by back projection, and the tangency points satisfy geometric constraints. This method is based on a three-dimensional geometric model and has a theoretically higher upper limit of accuracy.
[0053] 2) Adaptive sampling strategy: The traditional RANSAC algorithm uses a fixed number of sampling iterations. This invention dynamically adjusts the maximum number of iterations based on the noise level of the input data. When the noise is high, the number of samples is increased to maintain the probability of finding a valid solution; when the noise is low, the number of samples is reduced to improve computational efficiency.
[0054] 3) Generation and verification of multiple constraints: In each iteration, a candidate eyeball center is generated by randomly sampling the pupil ellipse parameters of a small number of frames (e.g., 2-3 frames). The distance from the pupil circle plane to the candidate eyeball center is calculated for all frames, and a comprehensive consistency metric function is constructed for inlier judgment by combining the eyeball physiological radius constraint (usually 11-13 mm) with the spatial consistency constraint (minimal displacement of the eyeball center between consecutive frames).
[0055] 4) Iterative Optimization and Quality Assessment: The eye model with the largest number of inliers and the smallest consistency error is selected as the optimal solution for eye center estimation. Then, the Levenberg-Marquardt algorithm is used to perform nonlinear optimization using all inlier data to further refine the solution. Optionally, a quality assessment mechanism can be introduced to provide a confidence score for the solution based on the inlier ratio and the final reprojection error, for reference by subsequent modules.
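The adaptive sampling strategy in item 2) above can be illustrated with the standard RANSAC iteration-count formula N = log(1 - p) / log(1 - w^s), where w is the estimated inlier ratio, s the sample size, and p the desired probability of drawing at least one outlier-free sample. The text does not state which formula is used; mapping "noise level" to an inlier ratio, the function name, and the default values below are assumptions made for illustration.

```python
import math


def ransac_iterations(inlier_ratio, sample_size=3, success_prob=0.99, max_cap=2000):
    """Number of RANSAC iterations needed for the given inlier ratio.

    A low inlier ratio (noisy data) yields more iterations,
    a high inlier ratio (clean data) yields fewer.
    """
    if inlier_ratio <= 0.0:
        return max_cap
    # Probability that one random sample contains only inliers.
    all_inliers = inlier_ratio ** sample_size
    if all_inliers >= 1.0:
        return 1
    n = math.log(1.0 - success_prob) / math.log(1.0 - all_inliers)
    return min(max_cap, max(1, math.ceil(n)))
```

In an adaptive loop, the inlier ratio would typically be re-estimated from the best model found so far and the remaining iteration budget updated accordingly.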
[0056] Implementation Method B (Two-Dimensional Line Intersection - High-Efficiency Model)
1) Problem Modeling: The problem of solving the projection of the eye center is transformed into finding a point on a two-dimensional plane (i.e., the projection of the eye center) such that the sum of the distances from this point to all back-projected line-of-sight vectors (two-dimensional lines) is minimized. This method reduces the three-dimensional problem to two-dimensional processing, significantly improving computational efficiency, and is especially suitable for embedded platforms.
[0057] 2) Adaptive sampling and multi-constraint verification: The RANSAC framework is adopted. The maximum number of iterations can be adjusted dynamically according to the noise level, or a fixed number of iterations can be used. In each iteration, the lines of sight from a small number of frames (e.g., 2 to 4 frames) are randomly sampled and their combined intersection is calculated as a candidate point. The distances from all lines to the candidate point are then calculated, and inliers are judged by combining a distance threshold constraint with a spatial consistency constraint (the candidate point must lie within a reasonable region of the image).
[0058] 3) Iterative optimization and back projection: After the iterations, the optimal two-dimensional eye center projection point is obtained by applying the least squares method to all inlier lines. Then, according to the camera focal length (focal_length) and the preset eye depth (eye_z), the two-dimensional eye center projection point is back projected into three-dimensional space to obtain the final eye center coordinates (eye_center_proj*eye_z / focal_length, eye_z).
[0059] 4) Quality Assessment and Pupil Disambiguation: A confidence score is given based on the inlier ratio and the distance error; this confidence score is optional. The solved projection point of the eye center is then used to assist in disambiguating the back projection of the pupil circle: the correct back-projection solution is the one that is consistent with the finally determined line-of-sight direction.
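The least squares refinement and back projection in item 3) of Implementation Method B can be sketched as follows. The sketch minimizes the sum of squared perpendicular distances, represents each line by a point and a direction, and assumes image coordinates measured relative to the principal point; these conventions and the function names are assumptions. The RANSAC sampling and inlier selection are omitted.

```python
import math


def intersect_lines_2d(lines):
    """Least-squares intersection of 2D lines given as ((px, py), (dx, dy))."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), (dx, dy) in lines:
        norm = math.hypot(dx, dy)
        dx, dy = dx / norm, dy / norm
        # I - d d^T projects onto the line normal; accumulate the normal system.
        m11, m12, m22 = 1.0 - dx * dx, -dx * dy, 1.0 - dy * dy
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * px + m12 * py
        b2 += m12 * px + m22 * py
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("lines are (nearly) parallel; no unique intersection")
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def back_project(eye_center_proj, focal_length, eye_z):
    """Back project the 2D eye center to 3D at the preset depth eye_z."""
    x, y = eye_center_proj
    return (x * eye_z / focal_length, y * eye_z / focal_length, eye_z)
```

For three lines that all pass through the point (1, 2), `intersect_lines_2d` returns that point up to rounding error; with noisy inlier lines it returns the point closest to all of them in the squared-distance sense.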
[0060] Application of the eyeball model: After the eyeball center has been solved using either of the above implementation methods, the following steps are performed to finalize the gaze direction:
Step 23, Calculate the eyeball radius to optimize the eyeball model: After the eyeball center is determined, the robust estimation algorithm is used to calculate the eyeball radius from the inlier data, and the median of multiple independent calculations is taken as the final value to eliminate the influence of abnormal estimates.
Step 24, Optical axis to visual axis conversion: By introducing a personalized Kappa angle (the angle between the optical and visual axes, typically about 5° horizontally and 1.5° vertically) for calibration, the calculated optical axis direction is converted into the actual visual axis direction (i.e., the direction of gaze), thus obtaining the final eyeball model. It should be noted that step 24 is optional; without the optical axis to visual axis conversion, the optical axis and visual axis are assumed to be approximately the same.
[0061] In one implementation of this invention, as shown in Figure 2, Figure 4 and Figure 5, the implementation process of dynamic trajectory calibration and intelligent data cleaning in step 3 above is explained below.
[0062] Step 31, Dynamic Trajectory Design: Control the calibration point to move smoothly on the AR virtual screen along a predefined composite trajectory of "rectangular border + two diagonals". Figure 4 shows the composite trajectory for dynamic trajectory calibration in the eye-tracking method provided in this embodiment of the invention. This composite trajectory covers all core and edge regions, and the movement speed of the calibration point matches the optimal speed range (10-20 deg / s) for human smooth pursuit eye movements. This design guides users to produce natural and focused eye movements, enabling the collection of a large number of continuous, spatially uniform, high-quality calibration data pairs (φ, θ, x, y), where x and y are the two-dimensional coordinates of the calibration point on the virtual screen, and φ and θ are the azimuth and pitch angles of the human eye.
[0063] Step 32: The calibration data is processed using an intelligent data cleaning algorithm to remove noise points that do not conform to the motion patterns. Figure 5 is a flowchart of the intelligent data cleaning algorithm in the eye-tracking method provided in this embodiment of the invention, which includes the following processes:
Step 32-1, Invalid data removal: First, filter out invalid data caused by detection failure, for example, data points whose screen coordinates (x,y) are (0,0) or whose calculated viewing angles (φ,θ) exceed the physiological range of human eye movement (such as pitch angles exceeding ±30°).
[0064] Step 32-2, First-order difference calculation: For the calibration data in the time series, calculate the first-order difference values Δφ[i]=φ[i]-φ[i-1] and Δθ[i]=θ[i]-θ[i-1] of the azimuth (φ) and pitch (θ) of adjacent data points; this difference value physically represents the instantaneous angular velocity of the line of sight motion.
[0065] Step 32-3, Adaptive Threshold Filtering: Calculate the standard deviations (σ_φ, σ_θ) of all Δφ and Δθ. Multiply each standard deviation by an empirical coefficient k (usually 0.6–0.8) to obtain the adaptive dynamic thresholds. Iterate through all data points. If |Δφ[i]| > k * σ_φ or |Δθ[i]| > k * σ_θ (for example, with k = 0.8), the point is considered an anomaly caused by blinking, momentary occlusion, detection jitter, or user distraction, and is removed. This method efficiently preserves high-quality smooth pursuit data and removes noise points that do not conform to the motion patterns.
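Steps 32-1 to 32-3 can be sketched as follows. This is a literal, minimal reading of the text: angles are assumed to be in degrees, only the pitch limit given as an example is checked, the population standard deviation is used, and the first valid sample (which has no predecessor) is kept. The function name and parameter names are hypothetical.

```python
import statistics


def clean_calibration(samples, k=0.8, max_pitch_deg=30.0):
    """Clean time-ordered calibration samples (phi, theta, x, y)."""
    # Step 32-1: drop detection failures and physiologically implausible pitch.
    valid = [s for s in samples
             if (s[2], s[3]) != (0, 0) and abs(s[1]) <= max_pitch_deg]
    if len(valid) < 3:
        return valid
    # Step 32-2: first-order differences of azimuth and pitch.
    d_phi = [valid[i][0] - valid[i - 1][0] for i in range(1, len(valid))]
    d_theta = [valid[i][1] - valid[i - 1][1] for i in range(1, len(valid))]
    # Step 32-3: thresholds are k times the standard deviation of the differences.
    thr_phi = k * statistics.pstdev(d_phi)
    thr_theta = k * statistics.pstdev(d_theta)
    kept = [valid[0]]
    for i in range(1, len(valid)):
        if abs(d_phi[i - 1]) > thr_phi or abs(d_theta[i - 1]) > thr_theta:
            continue  # treated as blink, occlusion, jitter, or distraction
        kept.append(valid[i])
    return kept
```

Because the differences are taken on the original sequence, the sample immediately following a spike is also removed, since its difference to the spike is large as well.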
[0066] Step 33, Higher-order polynomial mapping modeling: Using the cleaned, high-quality smooth pursuit data, a mapping model is established from the visual axis direction (azimuth φ, pitch θ) to screen coordinates (x, y). In specific embodiments, for example, a third-order 10-parameter polynomial is used to describe the complex nonlinear mapping relationship, which is particularly effective in fitting distortions in the screen edge regions.
[0067] x = a0 + a1*φ + a2*θ + a3*φ² + a4*φθ + a5*θ² + a6*φ³ + a7*φ²θ + a8*φθ² + a9*θ³;
y = b0 + b1*φ + b2*θ + b3*φ² + b4*φθ + b5*θ² + b6*φ³ + b7*φ²θ + b8*φθ² + b9*θ³.
[0068] To solve for the polynomial coefficients, an overdetermined linear system of equations A*X=B is constructed from the cleaned calibration data. The numerically stable QR decomposition method is employed to solve the resulting least-squares problem without explicitly forming the normal equations, which mitigates matrix ill-conditioning and yields a robust optimal solution, i.e., the values of a0 to a9 and b0 to b9.
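The fit can be sketched as follows in plain Python. The sketch builds one row of the matrix A per calibration sample from the ten monomials above and solves the least-squares problem with a modified Gram-Schmidt QR factorization. The text does not specify which QR variant is used, and the function names are hypothetical. It is assumed that A has full column rank; in practice the angles may need to be scaled to a comparable range so that the cubic terms do not dominate.

```python
def poly_features(phi, theta):
    """Ten monomials of the third-order polynomial, in the order a0..a9."""
    return [1.0, phi, theta, phi * phi, phi * theta, theta * theta,
            phi ** 3, phi * phi * theta, phi * theta * theta, theta ** 3]


def qr_least_squares(A, b):
    """Solve min ||A x - b|| by modified Gram-Schmidt QR (A is m x n, m >= n)."""
    m, n = len(A), len(A[0])
    Q = [[A[i][j] for i in range(m)] for j in range(n)]  # stored by columns
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j):
            # Remove the component along the already orthonormal column k.
            R[k][j] = sum(Q[k][i] * Q[j][i] for i in range(m))
            Q[j] = [Q[j][i] - R[k][j] * Q[k][i] for i in range(m)]
        R[j][j] = sum(v * v for v in Q[j]) ** 0.5
        if R[j][j] < 1e-12:
            raise ValueError("design matrix is rank deficient")
        Q[j] = [v / R[j][j] for v in Q[j]]
    qtb = [sum(Q[j][i] * b[i] for i in range(m)) for j in range(n)]
    x = [0.0] * n
    for j in reversed(range(n)):  # back substitution on R x = Q^T b
        x[j] = (qtb[j] - sum(R[j][k] * x[k] for k in range(j + 1, n))) / R[j][j]
    return x


def fit_mapping(samples):
    """samples: (phi, theta, x, y) tuples. Returns (a, b) coefficient lists."""
    A = [poly_features(phi, theta) for phi, theta, _, _ in samples]
    a = qr_least_squares(A, [s[2] for s in samples])
    b = qr_least_squares(A, [s[3] for s in samples])
    return a, b
```

At run time, a gaze direction (φ, θ) is mapped to the screen by taking the dot product of `poly_features(phi, theta)` with `a` for x and with `b` for y.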
[0069] In one implementation of this invention, as shown in Figure 2 and Figure 6, the implementation process of step 4 above during the continuous operation phase of eye tracking is described below. The implementation process of step 4 includes the following steps:
Step 41, Perform pupil detection in real time: Based on real-time eye images captured by the left and right eye cameras, the two-dimensional pupil ellipse parameters of the left and right eyes are calculated in real time. Based on the established eyeball model, the two-dimensional pupil ellipse of each frame is converted into a three-dimensional pupil circle tangent to the eyeball model sphere by using the back projection method and the geometric tangency condition. That is, the three-dimensional pupil circle lies strictly on the sphere whose center is the eyeball center and whose radius is the eyeball radius. This yields the real-time center position of the three-dimensional pupil circle.
[0070] Step 42, Determine the real-time line-of-sight direction: The real-time optical axis of the eye is the direction of the line connecting the center of the eyeball and the center of the real-time three-dimensional pupil circle, and this direction is perpendicular to the plane of the three-dimensional pupil circle. If the Kappa angle is not considered, the optical axis is taken as the line-of-sight direction; if the Kappa angle is considered, the visual axis is taken as the line-of-sight direction.
[0071] It should be noted that steps 41 and 42 above are for pupil detection and direction determination for a single eye, while step 43 below is for the fusion estimation of binocular gaze.
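The single-eye direction computation of step 42 can be sketched as follows. The optical axis is the unit vector from the eyeball center to the three-dimensional pupil circle center. The conversion to azimuth and pitch assumes a coordinate system with x to the right, y vertical, and z along the forward direction; this convention and the function names are assumptions, since the text does not fix them. The Kappa angle correction is omitted.

```python
import math


def optical_axis(eye_center, pupil_center):
    """Unit vector from the eyeball center to the 3D pupil circle center."""
    d = [p - e for p, e in zip(pupil_center, eye_center)]
    norm = math.sqrt(sum(c * c for c in d))
    if norm == 0.0:
        raise ValueError("pupil center coincides with eyeball center")
    return tuple(c / norm for c in d)


def to_angles(direction):
    """Convert a unit direction to (azimuth, pitch) in degrees."""
    dx, dy, dz = direction
    azimuth = math.degrees(math.atan2(dx, dz))
    # Clamp to guard against rounding slightly outside [-1, 1].
    pitch = math.degrees(math.asin(max(-1.0, min(1.0, dy))))
    return azimuth, pitch
```

For an eyeball center at the origin and a pupil center at (12, 0, 12), the axis points 45° to the side with zero pitch under this convention.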
[0072] Step 43, Binocular gaze estimation via multi-sensor fusion:
Step 43-1, Multi-coordinate system collaborative transformation: Using a pre-calibrated precise rotation and translation matrix, the gaze vectors calculated in the coordinate systems of the left and right eye cameras are uniformly transformed to the coordinate system of the eye-tracking device (e.g., a helmet), providing a unified reference for binocular fusion. It should be noted that this step is not mandatory.
[0073] Step 43-2, Intelligent binocular gaze fusion: In this embodiment of the invention, either of the following two gaze fusion schemes can be selected:
Implementation Method A: Confidence-based intelligent binocular gaze fusion. This is not a simple averaging process. First, the confidence level of each eye's solution (from the RANSAC quality assessment) is determined. Then, weighted least squares is used, with the confidence level as the weight, to find the best estimate of the center of each eyeball in the helmet coordinate system, which serves as the starting point of the gaze vector. Subsequently, the two gaze vectors are projected onto the virtual screen plane, and their intersection is taken as the fused binocular gaze point.
[0074] Implementation Method B: Binocular gaze fusion based on average values. Binocular gaze fusion is performed by averaging. The pupil information of the eye images acquired in real time is obtained through the eye feature extraction algorithm. The gaze direction (optical axis direction or visual axis direction) of each eye is obtained by combining this information with the simplified eyeball model of that eye. The coordinates of the gaze points of the left and right eyes on the screen are then obtained from the azimuth and pitch angles of the gaze direction using the parameters of the high-order polynomial mapping model. The average of the gaze points of the left and right eyes is taken as the binocular fusion gaze point.
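The averaging of Implementation Method B can be sketched in a few lines. For comparison, a confidence-weighted mean of the two screen points is also shown; this weighted variant is only a simplified stand-in and not Implementation Method A itself, which operates on the eyeball centers and gaze vectors in the helmet coordinate system. The function names are hypothetical.

```python
def fuse_average(left, right):
    """Method B: plain average of the left and right screen gaze points."""
    return ((left[0] + right[0]) / 2.0, (left[1] + right[1]) / 2.0)


def fuse_weighted(left, right, conf_left, conf_right):
    """Simplified stand-in: confidence-weighted mean of the two gaze points."""
    total = conf_left + conf_right
    if total <= 0.0:
        # No usable confidence information; fall back to the plain average.
        return fuse_average(left, right)
    wl, wr = conf_left / total, conf_right / total
    return (wl * left[0] + wr * right[0], wl * left[1] + wr * right[1])
```

With equal confidences the weighted variant reduces to the plain average, so the two functions agree whenever both eyes are tracked equally well.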
[0075] Step 44, Smoothing of the binocular fusion gaze point: This invention provides an efficient gaze smoothing scheme for eye tracking. Depending on the real-time performance requirements of the eye-tracking system and the application scenario, either a fixed-length or an adaptive variable-length sliding window strategy can be selected to output the gaze point sequence. This sliding window strategy balances smoothing effect against response speed. Figure 6 is a flowchart of the gaze point smoothing process in the eye-tracking method provided in an embodiment of the present invention.
[0076] Step 44-1, Sliding Window Mean Filtering: The core filter is a sliding window mean filter. Two preferred implementation methods are provided in this embodiment of the invention: Implementation Method A (Adaptive Variable Length Window): The window size N is dynamically adjusted based on the real-time estimated eye movement type. When a smooth pursuit motion is detected, a larger window (e.g., N=12~15) is used to fully smooth the jitter; when a saccade is detected, the window size is rapidly reduced (e.g., N=3~7) to reduce the delay introduced by filtering and ensure the response speed of gaze point jumps in binocular fusion. This method is suitable for scenarios with extremely high requirements for output smoothness and real-time performance.
[0077] Implementation method B (fixed length window): The window size N is a fixed value preset based on experience (usually selected between 5 and 15). This method has stable computational resource overhead and low algorithm complexity, and is suitable for scenarios with limited computational resources or high requirements for latency consistency.
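Both window strategies of step 44-1 can be sketched with one small filter class. The eye movement classification itself is not described in the text, so the sketch takes the movement type as an input label; the class name, the label strings, and the default window sizes (taken from the ranges stated above) are assumptions.

```python
from collections import deque


class SlidingMean:
    """Sliding window mean filter for 2D gaze points."""

    def __init__(self, max_window=15):
        # Keep only as many points as the largest window can use.
        self.buf = deque(maxlen=max_window)

    def update(self, point, window):
        """Add a gaze point and return the mean of the last `window` points."""
        self.buf.append(point)
        n = max(1, min(window, len(self.buf)))
        recent = list(self.buf)[-n:]
        return (sum(p[0] for p in recent) / n, sum(p[1] for p in recent) / n)


def window_for(movement, pursuit_window=12, saccade_window=3):
    """Method A: small window during saccades, large window otherwise."""
    return saccade_window if movement == "saccade" else pursuit_window
```

Implementation Method B corresponds to calling `update` with the same `window` value on every frame, while Implementation Method A passes `window_for(movement)` for the movement type estimated on that frame.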
[0078] Step 44-2, Motion Prediction Model Assistance: A simplified Kalman filter prediction model is used to predict the current binocular fusion gaze point position in one step. Weighted fusion of the predicted and filtered values further suppresses jitter while improving the system's response characteristics. This module can be used in conjunction with any of the aforementioned windowing strategies.
[0079] Step 44-3, Hybrid Filtering Output: The final output is the result after the above sliding window mean filtering and prediction model fusion, which can provide a stable, smooth and timely gaze point coordinate sequence under various eye movement states.
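The weighted fusion of a one-step prediction with the filtered value (steps 44-2 and 44-3) can be sketched as follows. For brevity, a constant-velocity extrapolation from the two previous outputs stands in for the simplified Kalman filter prediction model named in the text; it is not a Kalman filter. The class name and the blending weight are hypothetical.

```python
class PredictiveBlend:
    """Blend a constant-velocity one-step prediction with the filtered gaze point."""

    def __init__(self, weight_pred=0.3):
        self.w = weight_pred   # weight of the predicted value in the fusion
        self.prev = None       # last output
        self.prev2 = None      # output before the last one

    def update(self, filtered):
        if self.prev is None or self.prev2 is None:
            # Not enough history to predict; pass the filtered value through.
            out = tuple(float(v) for v in filtered)
        else:
            # Constant-velocity prediction: last output plus last displacement.
            pred = tuple(2.0 * a - b for a, b in zip(self.prev, self.prev2))
            out = tuple(self.w * p + (1.0 - self.w) * f
                        for p, f in zip(pred, filtered))
        self.prev2, self.prev = self.prev, out
        return out
```

In a pipeline, the value passed to `update` would be the output of the sliding window mean filter of step 44-1, and the returned point is the hybrid filtering output of step 44-3.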
[0080] Based on the eye-tracking method based on dynamic trajectory calibration provided in the above embodiments of the present invention, the present invention also provides an eye-tracking system based on dynamic trajectory calibration, the system comprising: an eye-tracking device (e.g., a wearable helmet), a camera device, an eye-tracking processor, and a display configured in the eye-tracking device.
[0081] In this embodiment of the invention, the camera device can be left and right eye cameras located on both sides of the eye tracking device, used to collect images of the user's left and right eyes in real time and transmit them to the eye tracking processor, so that the eye tracking processor can execute the eye tracking method provided in any of the above embodiments, and display the calibration point trajectory on the display during the execution of the eye tracking method.
[0082] In this embodiment of the invention, the eye-tracking processor includes: an adaptive dual-mode pupil detection module and a fixation point estimation module; the adaptive dual-mode pupil detection module is used to output a preset number of pupil ellipse parameters by executing step 1; the fixation point estimation module is used to execute steps 2 to 4 based on the output pupil ellipse parameters to output a fixation point sequence.
[0083] This invention provides an eye-tracking method and system based on dynamic trajectory calibration. On one hand, it employs an adaptive dual-mode pupil detection approach. During pupil detection, based on the pupil confidence level of the eye image, the tracking status, and computational resource availability, it automatically determines whether to use at least one of a direct detection mode or a tracking detection mode. This achieves a balance between high-precision detection results and rapid tracking detection. On the other hand, by establishing an eye model, performing dynamic trajectory calibration and intelligent data cleaning, and using multi-sensor fusion for gaze direction estimation, it achieves high-precision gaze point estimation. Compared with existing technologies, the technical solution provided by this invention has the following significant advantages:
First, a substantial improvement in calibration accuracy and reliability: by adopting dynamic trajectory calibration combined with intelligent data cleaning, the quality and quantity of input data are fundamentally guaranteed, resulting in a significant improvement in the accuracy of the mapping model in the screen area.
Second, excellent environmental robustness: in the eye modeling stage, the improved RANSAC algorithm is used to perform eye center estimation, and an intelligent filtering mechanism is used in the intelligent data cleaning process. This ensures that the system can maintain stable output in strong interference environments such as vibration and light changes, meeting the stringent requirements of aviation applications.
Third, excellent user experience and efficiency: the dynamic trajectory calibration process is more natural and faster, and the gaze point output is smooth, which greatly enhances the pilot's willingness to use the system and the human-machine efficiency of the system.
[0084] Fourth, systematic innovation: From data collection and cleaning to modeling, a complete technical loop is formed, with each link working together to ensure the superior performance of the final product.
[0085] The following example illustrates the implementation of the eye-tracking method and system based on dynamic trajectory calibration provided by the present invention.
[0086] Implementation Example
In this embodiment, the physical device of the eye-tracking system is a pilot's helmet, which integrates an eye-tracking processing device for executing the eye-tracking method based on dynamic trajectory calibration provided in this embodiment. The hardware configuration and software implementation of the eye-tracking system in this embodiment are described below.
[0087] (1) Hardware configuration:
1) Imaging unit: Near-infrared cameras (850 nm) integrated on both sides of the helmet, with a resolution of 400×400 at 60 Hz, equipped with adjustable-power IR-LED illumination sources to ensure clear eye images under different lighting conditions.
[0088] 2) Processing unit: It adopts an embedded GPU processor, which is responsible for running all image processing and solving algorithms.
[0089] 3) Display: An AR perspective display is used to show flight information and calibration point trajectories.
[0090] (2) Software implementation process:
1) System Startup and Initialization: After startup, the camera begins acquiring images. The adaptive dual-mode pupil detection module first enters the direct detection mode to lock the pupil position; then, following the pupil detection process of step 1 shown in Figure 2, dual-mode pupil detection is performed, and a certain number of pupil ellipse parameters that meet the conditions are ultimately output.
[0091] 2) Eye model establishment: Collect multiple frames of natural gaze data, use the improved RANSAC algorithm to initially estimate the eye center (following the estimation process shown in Figure 3), and then calculate the eyeball radius to construct the eyeball model.
[0092] 3) User calibration: The eye-tracking system controls a calibration point on the AR display so that it moves smoothly along the predefined composite trajectory of a rectangular border and two diagonals (the trajectory shown in Figure 4) for approximately 40 seconds.
[0093] The user's line-of-sight angle (φ,θ) and the screen coordinates (x,y) of the calibration point are collected simultaneously to form the original calibration dataset.
[0094] The intelligent data cleaning algorithm (the process shown in Figure 5) is called to filter the original calibration dataset and remove invalid and outlier points.
[0095] Using the cleaned data, a third-order polynomial mapping model is fitted via QR decomposition.
[0096] 4) Real-time tracking and output: The eye-tracking system switches to tracking and detection mode to estimate pupil ellipse parameters in real time.
[0097] Using the eyeball center and radius established by the eyeball model, combined with the pupil ellipse parameters detected in the current frame, the real-time three-dimensional pupil center is calculated by back projection, and then the optical axis direction and the visual axis direction (i.e., the line of sight) are obtained in sequence.
[0098] Using a calibrated polynomial model, the gaze direction is converted into the coordinates of the gaze point on the screen.
[0099] Finally, the gaze point smoothing process (the process shown in Figure 6) outputs a stable and smooth gaze point sequence. Depending on the actual system configuration, the processor can employ either Implementation Method A (adaptive variable-length window) or Implementation Method B (fixed-length window). In application scenarios with abundant computing resources, Implementation Method A is preferred to obtain optimal performance; in application scenarios sensitive to power consumption and computational latency, Implementation Method B can be used to balance performance and efficiency.
[0100] While the embodiments disclosed in this invention are as described above, they are merely illustrative of the embodiments to facilitate understanding of the invention and are not intended to limit the invention. Any person skilled in the art to which this invention pertains may make any modifications and variations in the form and details of the implementation without departing from the spirit and scope disclosed herein; however, the scope of patent protection for this invention shall still be determined by the scope defined in the appended claims.
Claims
1. An eye-tracking method based on dynamic trajectory calibration, characterized in that it includes: Step 1: Perform pupil detection using an adaptive dual-mode pupil detection method, and collect pupil ellipse parameters from multiple frames of eye images; during pupil detection, the system automatically selects between the direct detection mode and the tracking detection mode based on the pupil confidence level of the eye image, the tracking status, and the available computing resources. Step 2: Based on the pupil ellipse parameters collected in Step 1, obtain multiple frames of pupil observation data for estimating the eyeball center; use a robust eyeball center estimation algorithm based on improved RANSAC to estimate the eyeball center, and calculate the eyeball radius to establish an eyeball model. Step 3: Carry out the calibration process: collect calibration data pairs formed by the visual axis direction of the human eye and the coordinates of the calibration point, clean the calibration data pairs with an intelligent data cleaning algorithm, and establish a mapping model from the visual axis direction to the screen coordinates. Step 4: During continuous eye-tracking operation, based on the established eyeball model and the calibrated mapping model, and using the results of real-time pupil detection and the real-time gaze direction, estimate the binocular fusion gaze point through multi-sensor fusion, and output the gaze point sequence after smoothing.
2. The eye-tracking method based on dynamic trajectory calibration according to claim 1, characterized in that Step 1 includes: During the initialization, tracking failure recovery, and periodic recalibration phases of the eye-tracking system, the direct detection mode is used to perform fine processing on the global eye image to ensure accurate pupil detection results. Once the eye-tracking system has been successfully initialized and the detection results of the direct detection mode meet the preset conditions, the tracking detection mode is activated, and rapid tracking detection is performed within the region of interest (ROI) based on the high-confidence detection result of the previous frame.
3. The eye-tracking method based on dynamic trajectory calibration according to claim 2, characterized in that Step 1 includes: Step 11: Acquire an eye image and perform initialization before pupil detection; the initialization includes setting confidence thresholds A and B and setting the initial pupil confidence. Step 12: Determine whether the stored pupil queue is empty; if it is empty, proceed to Step 13; if it is not empty, determine whether the cumulative running time of the tracking detection mode exceeds a fixed period; if it does not, proceed to Step 13; if it does, delete the pupil queue information and then proceed to Step 13. Step 13: Determine whether the current pupil confidence is less than or equal to confidence threshold A; if yes, proceed to Step 14; if no, proceed to Step 15. Step 14: Perform global eye image detection using the direct detection mode; during detection, if the current pupil confidence is less than confidence threshold B and the pupil queue is not empty, switch to the tracking detection mode to perform tracking detection within the ROI, and output whichever pupil detection result of the two modes has the higher confidence; otherwise, output the pupil detection result of the direct detection mode; store the output pupil detection result in the pupil queue, and collect the pupil ellipse parameters obtained in this detection; then proceed to Step 16. Step 15: Perform tracking detection within the ROI using the tracking detection mode; during detection, if the current pupil confidence is less than confidence threshold B, switch to the direct detection mode to perform global eye image detection, and output whichever pupil detection result of the two modes has the higher confidence; otherwise, output the pupil detection result of the tracking detection mode; store the output pupil detection result in the pupil queue, and collect the pupil ellipse parameters obtained in this detection; then proceed to Step 16. Step 16: Determine whether a preset number of pupil ellipse parameters have been collected; if yes, proceed to Step 2 to establish the eyeball model; if no, return to Step 11 and repeat pupil detection until the preset number of pupil ellipse parameters have been collected.
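The mode-selection logic of Steps 12 to 15 can be sketched in Python as below. This is a hedged illustration, not the claimed implementation: `DetectorState`, `detect_frame`, the two detector callables, and the frame counter used in place of cumulative tracking time are hypothetical names introduced for this example. Two interpretations are assumed: "current pupil confidence" during detection is read as the confidence of the result just produced, and clearing the pupil queue also resets the stored confidence so that the next frame uses the direct detection mode.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple

Detection = Tuple[float, Any]  # (confidence, pupil ellipse parameters)


@dataclass
class DetectorState:
    queue: List[Detection] = field(default_factory=list)  # stored pupil queue
    confidence: float = 0.0   # initial pupil confidence (Step 11)
    tracking_frames: int = 0  # stand-in for cumulative tracking time


def detect_frame(image: Any, state: DetectorState,
                 detect_direct: Callable[[Any], Detection],
                 detect_tracking: Callable[[Any, Detection], Detection],
                 threshold_a: float, threshold_b: float,
                 max_tracking_frames: int) -> Tuple[str, Detection]:
    # Step 12: periodically discard the queue after long tracking runs.
    if state.queue and state.tracking_frames > max_tracking_frames:
        state.queue.clear()
        state.tracking_frames = 0
        state.confidence = 0.0  # assumption: force direct detection next

    # Step 13: choose the primary mode (an empty queue also forces direct mode).
    if state.confidence <= threshold_a or not state.queue:
        # Step 14: direct detection on the global image, tracking as fallback.
        result, mode = detect_direct(image), "direct"
        if result[0] < threshold_b and state.queue:
            alt = detect_tracking(image, state.queue[-1])
            if alt[0] > result[0]:
                result, mode = alt, "tracking"
    else:
        # Step 15: tracking detection in the ROI, direct detection as fallback.
        result, mode = detect_tracking(image, state.queue[-1]), "tracking"
        if result[0] < threshold_b:
            alt = detect_direct(image)
            if alt[0] > result[0]:
                result, mode = alt, "direct"

    if mode == "tracking":
        state.tracking_frames += 1
    state.queue.append(result)      # store the output detection
    state.confidence = result[0]
    return mode, result
```

Calling `detect_frame` repeatedly until the desired number of results has accumulated in `state.queue` corresponds to the loop closed by Step 16.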
4. The eye-tracking method based on dynamic trajectory calibration according to claim 1, characterized in that establishing the eyeball model in Step 2 includes: Step 21: Collect and calculate multiple frames of pupil observation data, including: calculating the pupil ellipse parameters on the image plane from the collected multiple frames of eye images, calculating the back-projected circle and the two-dimensional line of sight, and obtaining pupil observation data for estimating the eyeball center; Step 22: Estimate the eyeball center using the robust eyeball center estimation algorithm based on improved RANSAC; Step 23: After the eyeball center is determined, calculate the eyeball radius using a robust estimation algorithm, taking the median of multiple independent calculations as the final eyeball radius, to establish the eyeball model; Step 24: Introduce the Kappa angle for calibration, so that the calculated optical axis direction is converted into the true visual axis direction, i.e., the line-of-sight direction.
5. The eye-tracking method based on dynamic trajectory calibration according to claim 4, characterized in that the eyeball center estimation in Step 22 includes two schemes: Scheme 1 transforms the problem of finding the eyeball center into finding the center of a three-dimensional sphere such that the sphere is tangent to the multiple pupil circle planes obtained by back projection, and the tangent points satisfy preset geometric constraints. In Scheme 1, pupil ellipse parameters from several frames are randomly sampled to generate a candidate eyeball center; the distance from the pupil circle plane of every frame to the candidate eyeball center is calculated, and a consistency metric function combining pupil physiological radius constraints and spatial consistency constraints is constructed for inlier judgment; the eyeball model with the most inliers and the smallest consistency error is selected as the optimal solution for eyeball center estimation, yielding the coordinates of the eyeball center. Scheme 2 transforms the problem of solving for the eyeball center projection into finding a point on a two-dimensional plane such that the sum of the distances from that point to all back-projected lines of sight is minimized. In Scheme 2, lines of sight from several frames are randomly sampled, and the intersection points of their combinations are calculated as candidate points; the distances from all lines of sight to the candidate point are calculated, and inliers are determined by combining distance threshold constraints and spatial consistency constraints; after iteration, the optimal two-dimensional eyeball center projection point is solved by the least squares method using all inlier lines of sight; this two-dimensional projection point is then back-projected into three-dimensional space to obtain the final eyeball center coordinates.
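A plain RANSAC version of Scheme 2 can be sketched as follows, as a minimal illustration under assumptions rather than the improved algorithm of this claim. Each two-dimensional line is given as a point and a direction, two lines are sampled per iteration, inliers are selected by a distance threshold only (the spatial consistency constraints are omitted because their form is not given here), and the final point is the least-squares point for the inlier lines. The names `ransac_center`, `intersect`, `least_squares_point`, and the parameters `threshold`, `iterations` and `seed` are hypothetical.

```python
import random
from typing import List, Optional, Sequence, Tuple

Line = Tuple[float, float, float, float]  # (px, py, dx, dy), a point and a direction
Point = Tuple[float, float]


def _normal_form(line: Line) -> Tuple[float, float, float]:
    """Return (nx, ny, c) with unit normal n such that n . x = c on the line."""
    px, py, dx, dy = line
    norm = (dx * dx + dy * dy) ** 0.5
    nx, ny = -dy / norm, dx / norm
    return nx, ny, nx * px + ny * py


def point_line_distance(pt: Point, line: Line) -> float:
    nx, ny, c = _normal_form(line)
    return abs(nx * pt[0] + ny * pt[1] - c)


def intersect(l1: Line, l2: Line) -> Optional[Point]:
    n1x, n1y, c1 = _normal_form(l1)
    n2x, n2y, c2 = _normal_form(l2)
    det = n1x * n2y - n1y * n2x
    if abs(det) < 1e-9:
        return None  # (nearly) parallel lines give no usable candidate
    return ((c1 * n2y - n1y * c2) / det, (n1x * c2 - n2x * c1) / det)


def least_squares_point(lines: Sequence[Line]) -> Point:
    """Point minimising the sum of squared distances to the given lines."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for line in lines:
        nx, ny, c = _normal_form(line)
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        b1 += nx * c
        b2 += ny * c
    det = a11 * a22 - a12 * a12
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def ransac_center(lines: Sequence[Line], threshold: float = 0.1,
                  iterations: int = 200, seed: int = 0) -> Tuple[Point, List[Line]]:
    rng = random.Random(seed)
    best: List[Line] = []
    for _ in range(iterations):
        l1, l2 = rng.sample(list(lines), 2)
        cand = intersect(l1, l2)
        if cand is None:
            continue
        inliers = [l for l in lines if point_line_distance(cand, l) < threshold]
        if len(inliers) > len(best):
            best = inliers
    if len(best) < 2:
        raise ValueError("no consistent pair of lines found")
    return least_squares_point(best), best
```

The back projection of the resulting two-dimensional point into three-dimensional space depends on the camera model and is not shown.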
6. The eye-tracking method based on dynamic trajectory calibration according to claim 1, characterized in that the dynamic trajectory calibration and intelligent data cleaning in Step 3 include: Step 31, dynamic trajectory design, including: controlling the calibration point to move smoothly on the virtual screen along a predefined composite trajectory of "rectangular border + two diagonals", and guiding the user's eye movement through the movement of the calibration point, so as to collect a large number of continuous and spatially uniform calibration data pairs (φ,θ,x,y), where x,y are the two-dimensional coordinates of the calibration point on the virtual screen and φ,θ are the azimuth and pitch angles of the human eye; Step 32: Process the calibration data using the intelligent data cleaning algorithm to remove noise points that do not conform to the motion law and retain high-quality smooth tracking data; Step 33: Using the cleaned smooth tracking data, establish a mapping model from the visual axis direction (φ,θ) to the screen coordinates (x,y).
7. The eye-tracking method based on dynamic trajectory calibration according to claim 6, characterized in that the processing steps of the intelligent data cleaning algorithm in Step 32 include: Step 32-1, invalid data removal, including: filtering out invalid data caused by detection failure; Step 32-2, first-order difference calculation, including: for the time-ordered calibration data, calculating the first-order differences of the azimuth angle φ and the pitch angle θ between adjacent data points: Δφ[i]=φ[i]-φ[i-1] and Δθ[i]=θ[i]-θ[i-1]; these first-order differences represent the instantaneous angular velocity of the line-of-sight motion; Step 32-3, adaptive threshold filtering, including: calculating the standard deviations σ_φ and σ_θ of all Δφ and Δθ, and multiplying each standard deviation by a preset empirical coefficient to obtain the adaptive dynamic threshold; traversing all data points and removing the abnormal data points that satisfy |Δφ[i]|>0.8*σ_φ or |Δθ[i]|>0.8*σ_θ.
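Steps 32-1 to 32-3 can be sketched in Python as below. The sketch follows the stated rule literally, with the coefficient 0.8 as the default. The function name `clean_calibration`, the convention that a failed detection is stored as `None` or contains NaN, the use of the population standard deviation, and the decision to always keep the first valid sample (which has no predecessor) are assumptions made for this example.

```python
import math
from statistics import pstdev
from typing import List, Optional, Sequence, Tuple

Sample = Tuple[float, float, float, float]  # (phi, theta, x, y)


def clean_calibration(samples: Sequence[Optional[Sample]],
                      coeff: float = 0.8) -> List[Sample]:
    # Step 32-1: drop invalid data (assumed to be None or to contain NaN).
    valid = [s for s in samples
             if s is not None and not any(math.isnan(v) for v in s)]
    if len(valid) < 2:
        return valid

    # Step 32-2: first-order differences between adjacent samples.
    d_phi = [valid[i][0] - valid[i - 1][0] for i in range(1, len(valid))]
    d_theta = [valid[i][1] - valid[i - 1][1] for i in range(1, len(valid))]

    # Step 32-3: adaptive thresholds from the standard deviations.
    thr_phi = coeff * pstdev(d_phi)
    thr_theta = coeff * pstdev(d_theta)

    cleaned = [valid[0]]  # the first sample has no difference value
    for i in range(1, len(valid)):
        if abs(d_phi[i - 1]) > thr_phi or abs(d_theta[i - 1]) > thr_theta:
            continue  # angular velocity too large: treat as an outlier
        cleaned.append(valid[i])
    return cleaned
```

Because the differences are taken between adjacent samples, a single spike produces two large differences, so both the spike and the sample that follows it are removed under this rule.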
8. The eye-tracking method based on dynamic trajectory calibration according to claim 1, characterized in that the processing during the continuous operation phase of eye tracking in Step 4 includes: Step 41, real-time pupil detection, including: calculating the two-dimensional pupil ellipse parameters of the left and right eyes in real time from the eye images captured by the left-eye and right-eye cameras; and, based on the eyeball model, converting the two-dimensional pupil ellipse of each frame by inverse projection into a three-dimensional pupil circle tangent to the sphere of the eyeball model, thereby obtaining the center position of the three-dimensional pupil circle in real time; Step 42: Determine the real-time line-of-sight direction: when the Kappa angle is not considered, the optical axis is used as the line-of-sight direction; when the Kappa angle is considered, the visual axis is used as the line-of-sight direction; Step 43: Estimate the binocular gaze through multi-sensor fusion to obtain the binocular fusion gaze point; Step 44: Smooth the binocular fusion gaze point: according to the real-time performance requirements and application scenario of the eye-tracking system, apply mean filtering with a sliding window strategy and output the gaze point sequence.
9. The eye-tracking method based on dynamic trajectory calibration according to claim 8, characterized in that Step 44 includes: Step 44-1: Perform sliding-window mean filtering on the binocular fusion gaze point, using either a fixed-length or an adaptive variable-length sliding window; Step 44-2: Use a Kalman filter prediction model to predict the current position of the binocular fusion gaze point, and perform a weighted fusion of the predicted value and the filtered value; Step 44-3: Output the binocular fusion gaze point obtained from the sliding-window mean filtering and the fusion with the prediction model.
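Steps 44-1 to 44-3 can be sketched as follows. This is an illustrative sketch under assumptions, not the claimed implementation: a constant-velocity Kalman filter per screen axis is assumed as the prediction model, the filter is updated with the window-mean value, and the fusion is a fixed convex combination. The names `AxisKalman` and `smooth_with_prediction` and the parameters `q`, `r`, `window` and `weight` are hypothetical.

```python
from collections import deque
from typing import Deque, List, Sequence, Tuple

Point = Tuple[float, float]


class AxisKalman:
    """Constant-velocity Kalman filter for one axis (state: position, velocity)."""

    def __init__(self, z0: float, q: float = 0.01, r: float = 1.0) -> None:
        self.x, self.v = z0, 0.0
        self.p = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q, self.r = q, r              # process / measurement noise

    def predict(self, dt: float = 1.0) -> float:
        (p00, p01), (p10, p11) = self.p
        self.x += self.v * dt
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]]
        self.p = [
            [p00 + dt * (p01 + p10) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]
        return self.x

    def update(self, z: float) -> None:
        (p00, p01), (p10, p11) = self.p
        s = p00 + self.r                   # innovation variance (H = [1, 0])
        k0, k1 = p00 / s, p10 / s          # Kalman gain
        resid = z - self.x
        self.x += k0 * resid
        self.v += k1 * resid
        self.p = [
            [(1 - k0) * p00, (1 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]


def smooth_with_prediction(points: Sequence[Point], window: int = 3,
                           weight: float = 0.3) -> List[Point]:
    """Sliding-window mean fused with a Kalman prediction (weight on prediction)."""
    buf: Deque[Point] = deque(maxlen=window)
    kx = ky = None
    out: List[Point] = []
    for p in points:
        buf.append(p)
        fx = sum(b[0] for b in buf) / len(buf)   # Step 44-1: window mean
        fy = sum(b[1] for b in buf) / len(buf)
        if kx is None:
            kx, ky = AxisKalman(fx), AxisKalman(fy)
            out.append((fx, fy))
            continue
        px, py = kx.predict(), ky.predict()      # Step 44-2: prediction
        out.append((weight * px + (1 - weight) * fx,
                    weight * py + (1 - weight) * fy))
        kx.update(fx)
        ky.update(fy)
    return out
```

Setting `weight` to 0 reduces the output to the plain sliding-window mean, which gives a simple way to compare the two behaviors on recorded data.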
10. An eye-tracking system based on dynamic trajectory calibration, characterized in that it includes: an eye-tracking device, and a camera device, an eye-tracking processor, and a display configured therein; the camera device includes left-eye and right-eye cameras disposed on the two sides of the eye-tracking device, which acquire images of the user's left and right eyes in real time and transmit them to the eye-tracking processor, so that the eye-tracking processor executes the eye-tracking method according to any one of claims 1 to 9, the calibration point trajectory being shown on the display during execution of the eye-tracking method; the eye-tracking processor includes an adaptive dual-mode pupil detection module and a gaze point estimation module; the adaptive dual-mode pupil detection module is used to output a preset number of pupil ellipse parameters by executing Step 1; the gaze point estimation module is used to execute Steps 2 to 4 based on the output pupil ellipse parameters to output a gaze point sequence.