Artificial synthesizing method of training samples and verification code recognizing method based on samples

A technique for artificially synthesizing training samples, applied to training-sample synthesis and verification code recognition. It addresses the problems of homogeneous training samples, inability to adapt to varied verification code text, and impracticality in real applications, and it achieves diverse samples while reducing the labor and cost of sample collection.

Inactive Publication Date: 2018-06-12
厦门美亚商鼎信息科技有限公司

AI Technical Summary

Problems solved by technology

Traditional verification code cracking, which detects and segments the characters and then recognizes them one by one, can only handle verification codes with clear backgrounds and relatively regular, simple text; it cannot adapt to verification codes with complex backgrounds.
There is also a CNN+RNN machine-learning approach to verification code recognition, but training samples synthesized with a verification code generator lack variety, and collecting samples manually is labor-intensive, so the approach is hard to apply in real projects.

Examples

Embodiment

[0037] Referring to Figure 1, the method for artificially synthesizing training samples produces training samples for machine learning, so that verification code recognition can be performed with machine learning. It includes the following steps:

[0038] S1. Randomly select the character categories, the number of characters, and the combination of characters to generate a verification code lexicon.

[0039] The character categories in step S1 include digits, letters, mathematical symbols, and Chinese characters; one or more categories may be selected. For example, if the categories are only digits and letters (0-9 and a-z), the corresponding lexicon contains free combinations such as 031a2, b2431, IZ, E0.... The number of characters is set randomly as needed, and then the characters themselves are chosen randomly, so as to generate as many lexicon entries as possible, covering a variety of verification code formats.
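
Step S1 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function name `build_lexicon`, the `CATEGORIES` table, the length bounds, and the tiny symbol and Chinese-character sets are all assumptions made for illustration.

```python
import random
import string

# Hypothetical category table. The text names the four categories but does
# not list their contents, so these alphabets are stand-ins.
CATEGORIES = {
    "digits": string.digits,
    "letters": string.ascii_lowercase,
    "math": "+-*/=",
    "chinese": "一二三四五六七八九十",
}


def build_lexicon(categories, n_entries, min_len=2, max_len=6, seed=None):
    """Return n_entries distinct random strings drawn from the chosen categories."""
    rng = random.Random(seed)
    alphabet = "".join(CATEGORIES[name] for name in categories)

    # Guard against asking for more distinct entries than can exist.
    capacity = sum(len(alphabet) ** k for k in range(min_len, max_len + 1))
    if n_entries > capacity:
        raise ValueError("n_entries exceeds the number of possible combinations")

    lexicon = set()
    while len(lexicon) < n_entries:
        # First pick the number of characters at random, then the characters.
        length = rng.randint(min_len, max_len)
        lexicon.add("".join(rng.choice(alphabet) for _ in range(length)))
    return sorted(lexicon)


if __name__ == "__main__":
    for entry in build_lexicon(["digits", "letters"], 5, seed=0):
        print(entry)
```

Passing several category names widens the alphabet, which matches the "one or more categories" wording; fixing `seed` only makes the sketch reproducible and is not something the text requires.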

[0040] S2...

Abstract

The invention discloses a method for artificially synthesizing training samples and a verification code recognition method based on those samples. The synthesis method comprises the following steps: S1, generating a verification code lexicon; S2, generating a background image gallery; S3, screening rectangular background blocks; S4, collecting a set of fonts; S5, randomly matching lexicon entries with background blocks; and S6, taking the background blocks on which the entries have been written as samples. The verification code recognition method based on the training samples comprises the following steps: S1, feature extraction; S2, sequence calibration; and S3, correction of the recognition results. Because the training samples are synthesized artificially, only the character categories to be recognized need to be set in order to generate samples of the corresponding categories. Adding a small number of samples from actual websites then gives good results, so the samples are diversified while the labor and cost of collecting samples are greatly reduced.
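
The way steps S3 to S6 fit together can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not say how blocks are screened, so the sketch assumes a uniformly random crop position; the names `Sample`, `random_block`, and `plan_samples` are hypothetical; and the actual drawing of text onto the block (S6) is left to whatever imaging library is used.

```python
import random
from dataclasses import dataclass


@dataclass
class Sample:
    text: str        # label taken from the lexicon (S1)
    background: str  # identifier of the source background image (S2)
    block: tuple     # (left, top, width, height) of the rectangular block (S3)
    font: str        # font chosen from the collected font set (S4)


def random_block(bg_size, block_size, rng):
    """Pick a block of block_size that lies fully inside a background of bg_size."""
    bg_w, bg_h = bg_size
    w, h = block_size
    if w > bg_w or h > bg_h:
        raise ValueError("block is larger than the background")
    left = rng.randint(0, bg_w - w)
    top = rng.randint(0, bg_h - h)
    return (left, top, w, h)


def plan_samples(lexicon, backgrounds, fonts, block_size, n, seed=None):
    """Randomly pair lexicon entries with background blocks and fonts (S5).

    backgrounds maps a background identifier to its (width, height).
    Rendering each plan into an image (S6) is intentionally omitted.
    """
    rng = random.Random(seed)
    names = sorted(backgrounds)
    plans = []
    for _ in range(n):
        name = rng.choice(names)
        plans.append(
            Sample(
                text=rng.choice(lexicon),
                background=name,
                block=random_block(backgrounds[name], block_size, rng),
                font=rng.choice(fonts),
            )
        )
    return plans
```

Each `Sample` carries its own label in `text`, which reflects why synthesized samples need no manual annotation: the string written onto the block is known at generation time.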

Description

Technical field

[0001] The invention relates to the technical field of the Internet, in particular to a method for artificially synthesizing training samples and a verification code recognition method based on the samples.

Background art

[0002] In general, image-based text recognition includes optical character recognition (OCR) of scanned text and recognition of CAPTCHAs (Completely Automated Public Turing Test to Tell Computers and Humans Apart), which are widely used in website registration verification. By comparison, OCR of scanned text is the easiest and CAPTCHA recognition is the hardest.

[0003] Traditional verification code cracking mainly detects the characters, segments them, and finally recognizes the individual characters. This method can only recognize traditional verification codes with clear backgrounds and relatively regular, simple characters, but cannot adapt to verification co...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/32, G06K9/72
CPC: G06V20/63, G06V10/768, G06V30/10
Inventors: 叶炳坤, 王志永, 郭建辉, 林文东, 郑旭
Owner: 厦门美亚商鼎信息科技有限公司