
A training method and device for a captcha recognizer based on self-supervised learning

A captcha recognizer technology, applied in the field of captcha recognition, that addresses problems such as high labor costs, difficulty, and poor recognizer robustness, and achieves the effect of improving recognition performance and reducing the number

Active Publication Date: 2021-06-18
ALIPAY (HANGZHOU) INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

However, these methods no longer work, because text captchas have learned from earlier failures and introduced more complex security features.
Some deep learning-based methods have made significant progress in character recognition accuracy, but they require collecting and manually labeling a large number of samples, which incurs high labor costs.
In addition, recognizers trained for a specific captcha scheme are not robust enough to be applied directly to other captcha schemes.



Examples


Embodiment approach

[0139] According to a specific implementation manner, the third prediction loss is determined by the following formula:

[0140] $$\mathrm{Loss} = L_{Rec} + L_{Exc} + L_{Reg} \tag{4}$$

[0141] In one example, the reconstruction similarity loss $L_{Rec}$ can be calculated as a mean squared error.

[0142] In one example, when determining the independence loss $L_{Exc}$, the proportion threshold T is set to 0.5, so that the background image and the character image are as independent of each other as possible. In this case, the independence loss can be written as:

[0143] $$L_{Exc} = \sum_{x} |t(x) - 0.5| \tag{5}$$

[0144] In one example, the sparsity loss $L_{Reg}$ is determined by formula (6) below, so that the character image is as sparse as possible within the whole image, which better matches how characters are distributed in a captcha image.

[0145] $$L_{Reg} = \sum_{x} |t(x)| \tag{6}$$
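
As an illustration only, formulas (4) to (6) can be sketched in plain Python. The function names are hypothetical, and two things are assumptions because this excerpt does not define them: t(x) is treated as a per-pixel map given as a flat list of floats, and the mean squared error is taken between the original image and its reconstruction.

```python
def reconstruction_loss(original, reconstructed):
    """L_Rec: mean squared error between two equally sized flat pixel lists.

    Assumption: the loss compares the original image with its reconstruction.
    """
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - q) ** 2 for p, q in zip(original, reconstructed)]
    return sum(squared) / len(squared)


def independence_loss(t, threshold=0.5):
    """L_Exc as written in formula (5): sum over pixels of |t(x) - T|, T = 0.5."""
    return sum(abs(value - threshold) for value in t)


def sparsity_loss(t):
    """L_Reg as written in formula (6): sum over pixels of |t(x)|."""
    return sum(abs(value) for value in t)


def total_loss(original, reconstructed, t):
    """Formula (4): unweighted sum of the three terms."""
    return (
        reconstruction_loss(original, reconstructed)
        + independence_loss(t)
        + sparsity_loss(t)
    )
```

The three terms are summed without weights because formula (4) shows none; any weighting would be a further assumption.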

[0146] Further, according to another specific embodiment, the independence loss L can also...



Abstract

Embodiments of the present invention provide a computer-executed training method and device for a verification code recognizer, wherein the verification code recognizer includes a feature extractor and a classifier. The feature extractor is trained in a self-supervised manner. The training process first obtains an unlabeled captcha image and divides it into multiple tiles. The feature extractor extracts features from each tile to obtain an encoding vector for each tile. A continuous tile sequence is selected from the tiles; a regression network determines a hidden vector from the encoding vectors of the preceding tiles in the sequence, and prediction vectors for the subsequent tiles in the sequence are determined from the hidden vector. A prediction loss is then determined from the encoding vectors and prediction vectors of the subsequent tiles, and the feature extractor and the regression network are trained on this prediction loss. After the feature extractor is trained, the classifier is trained in a supervised manner based on the feature extractor.
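
The data-preparation steps of the abstract (tiling the image, then choosing a continuous tile sequence and splitting it into preceding and subsequent tiles) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the helper names are hypothetical, the tiles are assumed to be non-overlapping and ordered row by row, and the feature extractor, regression network, and prediction loss are deliberately left out because the abstract does not specify their form.

```python
import random


def split_into_tiles(image, tile_h, tile_w):
    """Split a 2-D image (list of rows) into non-overlapping tiles, row by row.

    Assumption: tiles do not overlap; leftover rows or columns are dropped.
    """
    rows, cols = len(image), len(image[0])
    tiles = []
    for top in range(0, rows - tile_h + 1, tile_h):
        for left in range(0, cols - tile_w + 1, tile_w):
            tile = [row[left:left + tile_w] for row in image[top:top + tile_h]]
            tiles.append(tile)
    return tiles


def pick_sequence(tiles, length, n_context, rng):
    """Pick a contiguous run of `length` tiles and split it in two.

    Returns (preceding_tiles, subsequent_tiles): the first `n_context` tiles
    would feed the regression network, the rest are the prediction targets.
    """
    if not 0 < n_context < length <= len(tiles):
        raise ValueError("need 0 < n_context < length <= number of tiles")
    start = rng.randrange(len(tiles) - length + 1)
    sequence = tiles[start:start + length]
    return sequence[:n_context], sequence[n_context:]


# Usage: a toy 2x6 "image" cut into three 2x2 tiles.
image = [[r * 6 + c for c in range(6)] for r in range(2)]
tiles = split_into_tiles(image, tile_h=2, tile_w=2)
preceding, subsequent = pick_sequence(tiles, length=3, n_context=2,
                                      rng=random.Random(0))
```

In a full pipeline, each tile in `preceding` and `subsequent` would be passed through the feature extractor to obtain its encoding vector before the prediction loss is computed.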

Description

Technical field

[0001] One or more embodiments of this specification relate to the fields of machine learning and data security, and in particular, to a verification code recognition method and device using machine learning and data security.

Background technique

[0002] Captchas were first proposed in 2003 to distinguish humans from automated computer programs. Captcha is a test that is difficult for computers to solve, but easy for humans. With the development of the Internet, captchas have been widely used in web applications to protect security and prevent data theft and password cracking. Although many alternatives to text-based captchas have been proposed, text-based captchas are still the authentication mechanism of choice for many websites. Thus, a successful attack on a captcha scheme will wreak havoc on a website.

[0003] Captcha images usually consist of three parts, foreground layer, character layer and background layer. The foreground and background layers...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F21/36, G06K9/46, G06K9/62, G06N3/04, G06N3/08
CPC: G06F21/36, G06N3/08, G06V10/50, G06V10/464, G06N3/045, G06F18/24, G06F18/214
Inventor: 熊涛
Owner: ALIPAY (HANGZHOU) INFORMATION TECH CO LTD