Training method, apparatus, device, and medium for a text detection model

CN116189205B · Active · Publication Date: 2026-06-19 · ZHEJIANG E COMMERCE BANK CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: ZHEJIANG E COMMERCE BANK CO LTD
Filing Date: 2023-02-28
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing text detection models rely on a large number of manually labeled samples for supervised training, resulting in high costs.

Method used

A self-supervised training method is adopted: text images (foreground) are mapped onto non-text images (background), the resulting synthetic images are labeled automatically, and a Siamese network structure is used for feature extraction and loss calculation to construct the text detection model.
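
The synthesis-and-labeling step can be illustrated with a minimal sketch. It assumes images are plain 2D lists of pixel values and that the mapping position is chosen at random; the function name `paste_foreground` and the `(x, y, width, height)` box format are hypothetical and are not taken from the patent.

```python
import random


def paste_foreground(background, foreground, rng=None):
    """Paste a text (foreground) image onto a copy of a non-text (background) image.

    Both images are 2D lists of pixel values (an assumption for this sketch).
    Returns the synthetic image and the pasted region as (x, y, width, height),
    which serves as the automatically generated label.
    """
    rng = rng or random.Random()
    bg_h, bg_w = len(background), len(background[0])
    fg_h, fg_w = len(foreground), len(foreground[0])
    if fg_h > bg_h or fg_w > bg_w:
        raise ValueError("foreground must fit inside background")

    # Choose a top-left corner so the foreground stays fully inside the background.
    x = rng.randint(0, bg_w - fg_w)
    y = rng.randint(0, bg_h - fg_h)

    # Copy the background rows so the input image is left unchanged.
    synthetic = [row[:] for row in background]
    for dy in range(fg_h):
        synthetic[y + dy][x:x + fg_w] = foreground[dy]

    return synthetic, (x, y, fg_w, fg_h)


if __name__ == "__main__":
    background = [[0] * 8 for _ in range(6)]   # stand-in for a non-text image
    foreground = [[1] * 3 for _ in range(2)]   # stand-in for a text image
    image, box = paste_foreground(background, foreground, random.Random(0))
    print("label box:", box)
```

Because the label is simply the position where the foreground was placed, no manual annotation is needed for the synthetic image.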

Benefits of technology

This enables the training of a text detection model without manually labeled samples, significantly reducing modeling costs.


Figure CN116189205B_ABST

Abstract

This specification discloses a training method, apparatus, device, and medium for a text detection model. A foreground image is mapped onto a background image to obtain a synthetic image, and the mapping position is labeled on the synthetic image; the foreground image is a text image and the background image is a non-text image. A sample image set is constructed from selected synthetic images, where the labeled mapping position on each sample image represents the location of Chinese text information. Based on the labeled locations of the Chinese text information, a first neural network extracts features from a first sample image to obtain a first feature vector, and a second neural network extracts features from a second sample image to obtain a second feature vector. The first and second neural networks are coupled with the parameters of the text detection model. A training loss is determined from whether the first and second sample images belong to the same sample category and from the similarity between the first and second feature vectors, and this loss is used to adjust the parameters of the text detection model.
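
The abstract does not give a formula for the training loss, so the following is only one possible illustration of a loss that depends on the sample-category relationship and on feature similarity. It assumes cosine similarity as the similarity measure and a margin-based contrastive form; `pair_loss`, `cosine_similarity`, and the `margin` parameter are hypothetical names chosen for this sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def pair_loss(first_vec, second_vec, same_category, margin=0.5):
    """Contrastive-style loss for one pair of sample images (assumed form).

    Same category: penalize low similarity.
    Different category: penalize similarity above the margin.
    """
    sim = cosine_similarity(first_vec, second_vec)
    if same_category:
        return 1.0 - sim
    return max(0.0, sim - margin)


if __name__ == "__main__":
    # Feature vectors as they might come out of the two network branches.
    first = [0.9, 0.1, 0.0]
    second = [0.8, 0.2, 0.1]
    print("same category:", pair_loss(first, second, same_category=True))
    print("different category:", pair_loss(first, second, same_category=False))
```

Under this assumed form, minimizing the loss pulls feature vectors of same-category pairs together and pushes different-category pairs apart, which is the behavior the abstract describes at a high level.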