A Method for Collecting Labeled Datasets Using Improved Captcha

The invention combines labeled data set collection with verification code (CAPTCHA) technology in the computer field. It addresses the low efficiency of data set generation and the failure of existing verification codes, so that labeled data can be collected with high efficiency.

Active Publication Date: 2022-05-13
WUHAN FENJIN INTELLIGENT MACHINE CO LTD

AI Technical Summary

Problems solved by technology

[0007] The present invention proposes a method for collecting labeled data sets by using improved verification codes, which solves, or at least partially solves, the technical problem of low data set generation efficiency in prior-art methods.


Embodiment Construction

[0050] The present invention aims to solve two problems: current verification codes can be cracked by machine vision and therefore fail, and labeled data sets are currently generated inefficiently, which leaves too few categories covered.

[0051] The main idea of the present invention is as follows:

[0052] A method is provided for collecting labeled data sets using improved verification codes. First, a large number of objects that are not included in existing open source data sets, or uncommon and diverse features of some objects, are collected and labeled. These images undergo image enhancement (displacement, rotation, brightness and scaling) and are merged into a multi-feature data set. A new verification code method is then built on this data set, in which these objects or features are marked with rectangles. The method determines whether the label entered by the user is within the acceptable range; if it is correct, save the image and the label entered by the u...
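The enhancement step can be illustrated with a minimal sketch. It is not the patent's implementation: the tiny grayscale-grid image type, the `(x1, y1, x2, y2)` box convention with exclusive ends, and the function names `shift`, `adjust_brightness`, `upscale` and `augment` are all assumptions made for illustration. Rotation is left out to keep the example short. The point shown is that each geometric transform must be applied to the rectangle label as well as to the pixels.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale pixels 0..255, row-major (assumed format)
Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2), ends exclusive (assumed convention)


def shift(image: Image, box: Box, dx: int, dy: int) -> Tuple[Image, Box]:
    """Displacement: move pixels by (dx, dy), pad with 0, move the box too."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and 0 <= ny < h:
                out[ny][nx] = image[y][x]
    x1, y1, x2, y2 = box
    # The box is not clipped to the image here; a real pipeline would clip it.
    return out, (x1 + dx, y1 + dy, x2 + dx, y2 + dy)


def adjust_brightness(image: Image, factor: float) -> Image:
    """Brightness: scale every pixel and clamp to 0..255. The box is unchanged."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def upscale(image: Image, box: Box, k: int) -> Tuple[Image, Box]:
    """Scaling: nearest-neighbour enlargement by an integer factor k."""
    h, w = len(image), len(image[0])
    out = [[image[y // k][x // k] for x in range(w * k)] for y in range(h * k)]
    x1, y1, x2, y2 = box
    return out, (x1 * k, y1 * k, x2 * k, y2 * k)


def augment(image: Image, box: Box) -> List[Tuple[Image, Box]]:
    """Merge the original and its enhanced variants into one labeled list."""
    return [
        (image, box),
        shift(image, box, 1, 0),
        (adjust_brightness(image, 1.5), box),
        upscale(image, box, 2),
    ]
```

Calling `augment(image, box)` on one labeled image returns four labeled samples, which is one simple way to read "merge into a multi-feature data set".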

Abstract

The invention discloses a method for collecting labeled data sets by using improved verification codes. First, a large number of objects that are not included in existing open source data sets, or uncommon and diverse features of some objects, are collected and labeled. These images undergo image enhancement (displacement, rotation, brightness and scaling) and are merged into a multi-feature data set. A new verification code method is then built on this data set, in which these objects or features are marked with rectangles. The method determines whether the label entered by the user is within the acceptable range. If it is correct, the image and the label entered by the user are saved as part of the data set; otherwise the verification fails and the test is repeated until it succeeds. The invention collects data sets with very high efficiency and at low cost, and the resulting data sets are of high quality.
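The verify, save and retest flow described in the abstract can be sketched as follows. The abstract does not say how "within the acceptable range" is measured, so this sketch assumes an intersection-over-union (IoU) comparison against a reference rectangle with a hypothetical threshold of 0.7. The names `Box`, `iou`, `is_acceptable` and `run_challenge` are illustrative and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Box:
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        # An inverted box (no overlap) has zero area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    inter = Box(max(a.x1, b.x1), max(a.y1, b.y1),
                min(a.x2, b.x2), min(a.y2, b.y2)).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def is_acceptable(user: Box, reference: Box, threshold: float = 0.7) -> bool:
    """Assumed acceptance rule: the user's rectangle overlaps the reference enough."""
    return iou(user, reference) >= threshold


def run_challenge(image_id: str, reference: Box, attempts: Iterable[Box],
                  dataset: List[Tuple[str, Box]]) -> bool:
    """Retest until an attempt is accepted; save the accepted label to the data set."""
    for user_box in attempts:
        if is_acceptable(user_box, reference):
            dataset.append((image_id, user_box))
            return True  # verification passed
    return False  # attempts ran out without a pass
```

For example, with a reference of `Box(10, 10, 50, 50)`, a first attempt of `Box(0, 0, 20, 20)` is rejected and a second attempt of `Box(12, 9, 48, 52)` is accepted and stored. A production system would probably present a fresh image on each retest rather than reuse the same one.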

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a method for collecting labeled data sets by using improved verification codes.

Background

[0002] Verification code (CAPTCHA) technology is widely used on the Internet. It is an automated public Turing test that judges whether a user is human, protecting websites from programs that submit large amounts of information and bring servers down. In supervised deep learning, large data sets with corresponding annotations are extremely important to the trained model. However, data sets can only be labeled manually, one item at a time, which is time-consuming, labor-intensive and costly. As a result, current data sets fall far short of the needs of neural network training.

[0003] In the process of implementing the present invention, the inventor of the present a...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04L9/40
CPC: H04L63/0838
Inventor: 王淑青, 张子言, 刘逸凡, 庆毅辉, 王晨曦, 兰天泽, 张鹏飞, 黄剑锋, 王年涛, 顿伟超, 张子蓬
Owner: WUHAN FENJIN INTELLIGENT MACHINE CO LTD