A scene text tracking method and system based on semantic guidance and structure correction
CN121963219B · Active · Publication Date: 2026-06-26 · NANKAI UNIV
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: NANKAI UNIV
- Filing Date: 2026-04-03
- Publication Date: 2026-06-26

Abstract
The application discloses a scene text tracking method and system based on semantic guidance and structure correction, belonging to the technical field of image or video recognition. The method comprises the following steps: performing feature interaction between a template token sequence and a search-area token sequence to output an original search-area visual feature map; performing predictive token correction to obtain a corrected feature map; performing cross-expert calibration on a template frame and a search frame to generate a text semantic calibration mask; multiplying the corrected feature map by the text semantic calibration mask element-wise to obtain a final fused feature, which is input to a prediction head that performs position prediction and outputs a response map containing the target center coordinates and size; and adaptively adjusting the search scale, outputting a prediction result, and fusing it with the result of a Kalman filter based on a constant-velocity motion model to update the final target position. The application addresses structural imbalance and deformation and achieves stable motion prediction, thereby enabling efficient, high-precision tracking of a specific text instance.
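Two of the steps named in the abstract, the element-wise fusion of the corrected feature map with the calibration mask and the fusion of the tracker output with a constant-velocity Kalman filter, can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `fuse_features` and `ConstantVelocityKalman1D`, the noise values, and the choice to treat the tracker's predicted center as the filter's measurement are all assumptions made for illustration, and the abstract does not specify how the fusion is weighted.

```python
from dataclasses import dataclass


def fuse_features(feature_map, mask):
    """Element-wise product of a 2-D feature map and a same-shaped mask.

    Hypothetical helper: stands in for multiplying the corrected feature
    map by the text semantic calibration mask.
    """
    if len(feature_map) != len(mask) or any(
        len(f_row) != len(m_row) for f_row, m_row in zip(feature_map, mask)
    ):
        raise ValueError("feature map and mask must have the same shape")
    return [[f * m for f, m in zip(f_row, m_row)]
            for f_row, m_row in zip(feature_map, mask)]


@dataclass
class ConstantVelocityKalman1D:
    """Kalman filter for one coordinate with state (position, velocity).

    Assumption: one filter per coordinate (e.g. center x and center y),
    with the tracker's predicted position used as the measurement.
    """
    pos: float
    vel: float = 0.0
    # Entries of the symmetric covariance P = [[p00, p01], [p01, p11]].
    p00: float = 1.0
    p01: float = 0.0
    p11: float = 1.0
    process_var: float = 1e-2  # hypothetical tuning value
    measure_var: float = 1.0   # hypothetical tuning value

    def predict(self, dt=1.0):
        # State transition x <- F x with F = [[1, dt], [0, 1]].
        self.pos += dt * self.vel
        # Covariance P <- F P F^T + Q, using Q = process_var * I
        # (a simplified noise model chosen for brevity).
        p00 = self.p00 + 2 * dt * self.p01 + dt * dt * self.p11 + self.process_var
        p01 = self.p01 + dt * self.p11
        p11 = self.p11 + self.process_var
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.pos

    def update(self, measured_pos):
        # Only the position is observed: H = [1, 0].
        innovation = measured_pos - self.pos
        s = self.p00 + self.measure_var      # innovation variance
        k0 = self.p00 / s                    # Kalman gain for position
        k1 = self.p01 / s                    # Kalman gain for velocity
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        # Covariance P <- (I - K H) P.
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.pos


# Usage: mask a toy feature map, then smooth a tracker's center-x estimates.
fused = fuse_features([[1.0, 2.0], [3.0, 4.0]], [[1.0, 0.0], [0.5, 1.0]])

kf_x = ConstantVelocityKalman1D(pos=0.0)
for tracker_cx in range(1, 21):  # toy target moving one pixel per frame
    kf_x.predict()
    final_cx = kf_x.update(float(tracker_cx))
```

In this sketch the update step is where the fusion happens: the Kalman gain weights the motion-model prediction against the tracker's output according to their assumed variances, so a noisier tracker (larger `measure_var`) shifts the final position toward the motion model.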