Text recognition method and device, computer device and computer readable storage medium

The method trains a feature extraction model on unlabeled text image samples and combines DenseNet with a multi-head attention mechanism. This addresses the poor recognition performance of OCR models under font deformation and the high difficulty of training them, and makes text recognition more efficient.

CN115909336B · Active · Publication Date: 2026-06-16 · TENCENT TECHNOLOGY (SHENZHEN) CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: TENCENT TECHNOLOGY (SHENZHEN) CO LTD
Filing Date: 2021-08-17
Publication Date: 2026-06-16

AI Technical Summary

Technical Problem

Existing OCR models recognize text poorly when fonts are deformed across different scenarios, and obtaining labeled training samples requires substantial manual effort, which makes training difficult.

Method used

The feature extraction model is trained on unlabeled text image samples, which strengthens its training. A DenseNet neural network performs image feature extraction, and a multi-head attention mechanism performs attention feature extraction on the resulting image features. A reference sample index is calculated from the image attribute information of each sample, a predicted sample index is predicted from the attention features, and the model is trained on the pair.
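The attention step described above can be illustrated with a minimal sketch of multi-head self-attention over a sequence of image feature vectors. This is not the patent's implementation: the function names are hypothetical, the feature vectors stand in for the output of the DenseNet stage, and the learned query, key, value, and output projections of a real multi-head attention layer are omitted (treated as identity) to keep the example short.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(q, k, v):
    # Scaled dot-product attention for one head.
    # q, k, v are lists of equal-length vectors (one vector per position).
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out

def multi_head_self_attention(features, num_heads):
    # Assumption: identity projections. Each feature vector is split into
    # num_heads chunks, each chunk attends over the sequence independently,
    # and the per-head outputs are concatenated back together.
    dim = len(features[0])
    assert dim % num_heads == 0, "feature dimension must divide evenly into heads"
    head_dim = dim // num_heads
    heads = []
    for h in range(num_heads):
        chunk = [f[h * head_dim:(h + 1) * head_dim] for f in features]
        heads.append(attention_head(chunk, chunk, chunk))
    return [sum((heads[h][t] for h in range(num_heads)), [])
            for t in range(len(features))]
```

Each output vector is a weighted average of the input vectors at all positions, which is how the attention features come to carry context from the rest of the text image.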

Benefits of technology

It improves the recognition ability of OCR models in different scenarios, reduces the training difficulty, reduces the dependence on labeled samples, and enhances the generalization ability of feature extraction models.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN115909336B_ABST

Abstract

Embodiments of the present application disclose a text recognition method and device, a computer device, and a computer readable storage medium. The method comprises: obtaining a text image sample; performing image index calculation according to image attribute information of the text image sample, and determining a reference sample index based on the calculation result; performing image feature extraction on the text image sample by using a feature extraction model to obtain image feature information; performing attention feature extraction on the image feature information by using the feature extraction model to obtain attention feature information that attends to context information; predicting a predicted sample index based on the attention feature information; and training the feature extraction model according to the predicted sample index and the corresponding reference sample index, so that the trained feature extraction model can extract the attention feature information of a to-be-recognized text image for image text recognition. Because the reference sample index is computed from the image itself, the method can train the feature extraction model on a large number of unlabeled text image samples and can enhance the training effect of the feature extraction model.
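The training signal described in the abstract can be sketched as follows. The abstract does not say which image index is calculated, so the index used here (mean absolute horizontal difference between adjacent pixels) is purely an assumption for illustration, as are all function names. The `pooled` vector stands in for pooled attention features; in the described method the loss would also update the feature extraction model itself, whereas this sketch only updates a small linear prediction head.

```python
def reference_index(image):
    # Hypothetical image index computed from pixel values alone (no labels):
    # mean absolute difference between horizontally adjacent pixels.
    diffs = [abs(row[c + 1] - row[c])
             for row in image for c in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

def predict_index(pooled, weights, bias):
    # Linear head mapping pooled attention features to a predicted index.
    return sum(w * x for w, x in zip(weights, pooled)) + bias

def train_step(pooled, target, weights, bias, lr=0.1):
    # One gradient-descent step on the squared error between the predicted
    # sample index and the reference sample index.
    err = predict_index(pooled, weights, bias) - target
    new_weights = [w - lr * 2 * err * x for w, x in zip(weights, pooled)]
    new_bias = bias - lr * 2 * err
    return new_weights, new_bias, err ** 2

# Usage: an unlabeled toy "image" supplies its own training target.
image = [[0.0, 1.0, 0.0],
         [1.0, 1.0, 1.0]]
target = reference_index(image)
pooled = [0.5, 0.25]            # placeholder for pooled attention features
weights, bias = [0.0, 0.0], 0.0
losses = []
for _ in range(50):
    weights, bias, loss = train_step(pooled, target, weights, bias)
    losses.append(loss)
```

The point of the design is that the target comes from the image attributes rather than from a human annotation, which is what allows unlabeled samples to be used.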