Character detection model pre-training method and device

A text detection pre-training technique, applied in the field of model training, that addresses the problem that text lines cannot be judged from character spacing alone, which degrades text detection. It yields accurate text line positions and line granularity and avoids row/column ambiguity.

Active Publication Date: 2022-05-13
ALIBABA (CHINA) CO LTD
Cites: 9 | Cited by: 1

AI Technical Summary

Problems solved by technology

However, extracting text from complex layouts often requires understanding the semantics of the text content (especially in Chinese scenes involving row/column ambiguity, widely spaced characters, etc.), so the distance between characters cannot simply be used as the basis for judging text lines, which in turn degrades text detection.
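
The limitation described above can be illustrated with a minimal sketch. The heuristic below is not from the patent; it is a hypothetical, purely distance-based rule (the names `grid` and `guess_line_direction` are assumptions) that chains characters along whichever axis has the smaller gap. It shows how such a rule becomes undecided or wrong when spacing alone does not reveal the reading direction.

```python
import math

def grid(rows, cols, h_gap, v_gap):
    """Character centers on a regular grid (hypothetical test layout)."""
    return [(c * h_gap, r * v_gap) for r in range(rows) for c in range(cols)]

def guess_line_direction(centers):
    """Naive rule: characters chain along the axis with the smaller gap."""
    # Smallest horizontal gap between characters sharing the same y.
    h_gap = min(
        (abs(x2 - x1) for x1, y1 in centers for x2, y2 in centers
         if y1 == y2 and x1 != x2),
        default=math.inf,
    )
    # Smallest vertical gap between characters sharing the same x.
    v_gap = min(
        (abs(y2 - y1) for x1, y1 in centers for x2, y2 in centers
         if x1 == x2 and y1 != y2),
        default=math.inf,
    )
    if h_gap < v_gap:
        return "rows"
    if v_gap < h_gap:
        return "columns"
    return "ambiguous"

# Equal spacing in both directions: distance alone cannot decide.
print(guess_line_direction(grid(2, 3, 10, 10)))
# Widely spaced horizontal text: the rule picks the vertical axis,
# even if the content is meant to be read row by row.
print(guess_line_direction(grid(2, 3, 40, 30)))
```

In both cases the geometry is identical for row-wise and column-wise readings, which is why the source argues that semantic knowledge of the text content is needed.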

Examples

Embodiment Construction

[0027] In the following description, numerous specific details are set forth in order to provide a thorough understanding of the specification. However, this specification can be implemented in many other ways different from those described here, and those skilled in the art can make similar extensions without violating the connotation of this specification, so this specification is not limited by the specific implementations disclosed below.

[0028] Terms used in one or more embodiments of this specification are for the purpose of describing specific embodiments only, and are not intended to limit one or more embodiments of this specification. As used in one or more embodiments of this specification and the appended claims, the singular forms "a", "an", and "the" are also intended to include the plural forms unless the context clearly dictates otherwise. It should also be understood that the term "and/or" used in one or more embodiments of the present specification refers t...

Abstract

The embodiment of the invention provides a method and device for pre-training a text detection model. The method comprises: inputting a text sample into a text encoder to obtain text features, and inputting an image sample into an image encoder to obtain image features, the text sample being extracted from the image sample; determining, according to a data dictionary and the image features, whether the image sample contains a text sample, to obtain a text containment result, the data dictionary comprising the text sample; determining the correspondence between the text sample and the image sample according to the text features and the image features, to obtain an image-text correspondence result; predicting the masked text sample according to the text features and the image features, to obtain a text prediction result; and adjusting the parameters of the image encoder according to the text containment result, the image-text correspondence result, and the text prediction result, to obtain a pre-trained text detection model. Because the visual representation then carries semantic knowledge, problems such as row/column ambiguity caused by insufficient semantic knowledge are avoided.
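
The three training signals named in the abstract can be sketched as a combined objective. This is a minimal sketch under stated assumptions, not the patent's implementation: the abstract does not specify the loss functions, so the multi-label binary cross-entropy for containment, the softmax contrastive loss for image-text correspondence, the cross-entropy for masked text prediction, and the weighted sum are all assumptions, as are the function names. The encoders are omitted; the inputs are plain lists of scores.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def log_softmax(logits):
    """Log-probabilities from raw scores, using the max-shift trick."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]

def containment_loss(logits, targets, eps=1e-12):
    """Assumed multi-label BCE: one logit per data-dictionary entry,
    target 1 if the image sample contains that text sample, else 0."""
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return total / len(logits)

def correspondence_loss(similarity):
    """Assumed contrastive loss: similarity[i][j] scores image i against
    text j, and matching image-text pairs sit on the diagonal."""
    n = len(similarity)
    return -sum(log_softmax(similarity[i])[i] for i in range(n)) / n

def masked_text_loss(vocab_logits, true_index):
    """Assumed cross-entropy for predicting one masked text token."""
    return -log_softmax(vocab_logits)[true_index]

def pretraining_loss(contain, correspond, masked, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three task losses (weights are hypothetical)."""
    return weights[0] * contain + weights[1] * correspond + weights[2] * masked

# Toy scores standing in for encoder outputs.
contain = containment_loss([2.0, -1.0, 0.5], [1, 0, 1])
correspond = correspondence_loss([[3.0, 0.2], [0.1, 2.5]])
masked = masked_text_loss([0.3, 1.7, -0.4, 0.0], true_index=1)
print(pretraining_loss(contain, correspond, masked))
```

In an actual training loop, the gradient of this combined loss would be used to adjust the image encoder's parameters, which is the step the abstract describes as producing the pre-trained text detection model.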

Description

Technical field

[0001] The embodiments of this specification relate to the technical field of model training, and in particular to a method for pre-training a text detection model.

Background

[0002] With the rapid development of personal consumer electronics (digital cameras, mobile phones, etc.), Optical Character Recognition (OCR) technology, which extracts text information from images (card pictures, street view videos, etc.), has been widely used. With the advent of deep learning, OCR has moved from simple character recognition of scanned documents to handling complex text scenarios, such as complex text layouts, artistic fonts, table text, and even handwritten formulas, across a wide range of settings. Pan-OCR technology is usually divided into three stages: text detection, text recognition, and format understanding. In the text detection stage, it is necessary to train the text detection model based on deep learning and training ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06V30/40, G06V30/18, G06F40/242, G06F40/30
CPC: G06F40/242, G06F40/30
Inventor: 宋思博, 万建强, 杨志博, 姚聪
Owner: ALIBABA (CHINA) CO LTD