Independent character determination method, device, equipment, storage medium and program product
By using a pre-trained convolutional neural network combined with fully connected layers and normalized exponential function layers, the problem of character adhesion in CAPTCHA images was solved, improving the accuracy and reliability of recognition.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- INDUSTRIAL AND COMMERCIAL BANK OF CHINA
- Filing Date
- 2023-09-08
- Publication Date
- 2026-06-16
AI Technical Summary
When dealing with the phenomenon of characters sticking together in CAPTCHA images, existing technologies are prone to confusion and misjudgment by recognition systems, leading to a decrease in the reliability and accuracy of CAPTCHA image recognition.
A pre-trained convolutional neural network, combined with fully connected layers and normalized exponential function layers, is used to recognize CAPTCHA images. The fully connected layers convert image features into target feature vectors, and the normalized exponential function layers convert the target feature vectors into independent characters, thus solving the problem of character concatenation.
It improves the accuracy and reliability of CAPTCHA image recognition by effectively handling character adhesion.
Smart Images

Figure CN117197818B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of information security technology, and in particular to a method, apparatus, device, storage medium, and program product for determining independent characters.
Background Technology
[0002] With the continuous development of the financial sector, banks typically use CAPTCHA images to enhance security when performing user authentication and transaction confirmation. CAPTCHA images are a human-computer interaction security measure that requires users to enter a set of characters or numbers on a website or mobile application to prove they are genuine users and not automated programs (such as bots). Generally, the characters in bank CAPTCHA images may include numbers, uppercase letters, and / or lowercase letters.
[0003] In existing technologies, CAPTCHA image recognition employs image processing and pattern recognition methods, aiming to convert visual information into computer-processable data for automatic recognition. The process includes image preprocessing, feature extraction, and classification. First, the CAPTCHA image undergoes preprocessing to highlight character outlines and remove interference. Second, key information such as character boundaries, shapes, and textures are extracted from the processed image to describe character features. Finally, a trained classifier uses these features to classify the characters and make recognition decisions.
[0004] However, traditional technologies struggle to handle connected (adhered) characters in CAPTCHA images: recognition systems easily confuse, misjudge, or fail to correctly separate such characters, which reduces the reliability and accuracy of CAPTCHA image recognition.
Summary of the Invention
[0005] Therefore, it is necessary to provide an independent character determination method, apparatus, device, storage medium, and program product that can improve the reliability and accuracy of CAPTCHA image recognition in response to the above-mentioned technical problems.
[0006] Firstly, this application provides a method for determining an independent character, the method comprising:
[0007] Obtain the CAPTCHA image; the CAPTCHA image includes concatenated characters.
[0008] A pre-trained convolutional neural network is used to recognize CAPTCHA images and obtain multiple independent characters;
[0009] The convolutional neural network includes fully connected layers and normalized exponential function layers; the fully connected layers are used to convert the image features of the CAPTCHA image into target feature vectors; the normalized exponential function layers are used to convert the target feature vectors into individual characters.
[0010] In one embodiment, the convolutional neural network further includes a feature extraction layer; using the pre-trained convolutional neural network to recognize the CAPTCHA image and obtain multiple independent characters includes:
[0011] The verification code image is input into the feature extraction layer to obtain image features;
[0012] Image features are input into a fully connected layer to obtain the target feature vector;
[0013] The target feature vector is input into the normalized exponential function layer to obtain multiple independent characters.
[0014] In one embodiment, image features are input into a fully connected layer to obtain a target feature vector, including:
[0015] Image features are input into a fully connected layer to flatten the image features, resulting in candidate feature vectors; the candidate feature vectors include multiple first sub-vectors.
[0016] The first sub-vector in the candidate feature vector is processed in the neurons of the fully connected layer to obtain the target feature vector; the target feature vector includes multiple second sub-vectors.
[0017] In one embodiment, the target feature vector is input into a normalized exponential function layer to obtain multiple independent characters, including:
[0018] The target feature vector is input into the normalized exponential function layer. The normalized exponential function layer performs exponentialization and normalization processing on each second sub-vector in the target feature vector to determine the probability distribution of each second sub-vector.
[0019] For each second subvector, the independent character is determined based on the character corresponding to the maximum probability value in the probability distribution.
[0020] In one embodiment, the feature extraction layer includes a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, and a third pooling layer. The first convolutional layer, the second convolutional layer, and the third convolutional layer are used for feature extraction; the first pooling layer, the second pooling layer, and the third pooling layer are used to obtain key features of the CAPTCHA image.
[0021] In one embodiment, before using a pre-trained convolutional neural network to recognize multiple independent characters in a CAPTCHA image, the method further includes:
[0022] Perform preprocessing operations on the CAPTCHA image to determine the processed CAPTCHA image;
[0023] Accordingly, a pre-trained convolutional neural network is used to recognize the processed CAPTCHA image to obtain multiple independent characters.
[0024] In one embodiment, a preprocessing operation is performed on the verification code image to determine the processed verification code image, including:
[0025] The verification code image is converted to grayscale to determine the grayscale image;
[0026] Binarize the grayscale image to determine the corresponding processed CAPTCHA image.
[0027] In one embodiment, the verification code image is converted to grayscale to determine the grayscale image, including:
[0028] The grayscale image is obtained by calculating the grayscale value of each pixel in the CAPTCHA image using a weighted average method.
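The weighted average method can be sketched as follows. The source does not state which weights are used; the values below are the commonly used ITU-R BT.601 luma coefficients and are an assumption, as are the function name `to_grayscale` and the representation of an image as rows of (r, g, b) tuples.

```python
# Assumed weights: ITU-R BT.601 luma coefficients (not specified in the source).
R_WEIGHT, G_WEIGHT, B_WEIGHT = 0.299, 0.587, 0.114


def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples into rows of 0-255 gray values."""
    gray = []
    for row in rgb_image:
        gray.append([
            int(round(R_WEIGHT * r + G_WEIGHT * g + B_WEIGHT * b))
            for (r, g, b) in row
        ])
    return gray


# Tiny 2x2 example image: white, black, pure red, pure blue.
image = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 0, 255)],
]
gray_image = to_grayscale(image)
```

Green receives the largest weight because the eye is most sensitive to it; any other weight triple that sums to 1 would fit the same description.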
[0029] In one embodiment, the grayscale image is binarized to determine the processed verification code image corresponding to the grayscale image, including:
[0030] The optimal threshold for a grayscale image is calculated using a global thresholding method, and the corresponding processed CAPTCHA image is determined using the optimal threshold.
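A minimal sketch of this step, assuming Otsu's method as the global thresholding method (the source names no specific algorithm). The function names and the convention that pixels above the threshold become white are illustrative assumptions.

```python
def otsu_threshold(gray_image):
    """Return the gray level (0-255) that maximizes between-class variance."""
    histogram = [0] * 256
    for row in gray_image:
        for value in row:
            histogram[value] += 1
    total = sum(histogram)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    background_count, background_sum = 0, 0
    for level in range(256):
        background_count += histogram[level]
        background_sum += level * histogram[level]
        if background_count == 0:
            continue  # no pixels at or below this level yet
        foreground_count = total - background_count
        if foreground_count == 0:
            break  # every pixel is already in the background class
        mean_bg = background_sum / background_count
        mean_fg = (total_sum - background_sum) / foreground_count
        variance = background_count * foreground_count * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, level
    return best_threshold


def binarize(gray_image, threshold):
    """Pixels above the threshold become 255 (white); the rest become 0 (black)."""
    return [[255 if v > threshold else 0 for v in row] for row in gray_image]


gray = [
    [20, 25, 200, 210],
    [22, 30, 205, 220],
]
threshold = otsu_threshold(gray)
binary = binarize(gray, threshold)
```

A single threshold for the whole image is what makes the method "global"; it works best when character and background intensities form two well-separated groups, as in the example.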
[0031] In one embodiment, the training method for the convolutional neural network includes:
[0032] Acquire multiple training data sets; each training data set includes sample images and the sample characters corresponding to the sample images.
[0033] Each training data set is input into an initial neural network to determine the predicted characters corresponding to that training data set.
[0034] Based on the predicted characters and the sample characters of each training data set, the initial neural network is trained to obtain the convolutional neural network.
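The training procedure can be illustrated with a deliberately small stand-in. The sketch below trains a single linear layer followed by softmax, not the full convolutional network, and it assumes a cross-entropy loss with plain stochastic gradient descent; the source specifies neither the loss nor the optimizer. The character set, feature vectors, and hyperparameters are hypothetical.

```python
import math


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(samples, num_classes, epochs=200, learning_rate=0.5):
    """samples: list of (feature_vector, class_index). Returns weights[class][feature]."""
    num_features = len(samples[0][0])
    weights = [[0.0] * num_features for _ in range(num_classes)]
    for _ in range(epochs):
        for features, label in samples:
            logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
            probs = softmax(logits)
            for c in range(num_classes):
                # Gradient of cross-entropy w.r.t. logit c is p_c - 1[c == label].
                error = probs[c] - (1.0 if c == label else 0.0)
                for j in range(num_features):
                    weights[c][j] -= learning_rate * error * features[j]
    return weights


def predict(weights, features):
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return max(range(len(logits)), key=logits.__getitem__)


CHARSET = "AB7"  # hypothetical three-character alphabet
samples = [      # hypothetical (feature vector, sample character index) pairs
    ([1.0, 0.0, 0.0], 0),
    ([0.0, 1.0, 0.0], 1),
    ([0.0, 0.0, 1.0], 2),
    ([0.9, 0.1, 0.0], 0),
    ([0.1, 0.8, 0.1], 1),
]
weights = train(samples, num_classes=len(CHARSET))
predicted = [CHARSET[predict(weights, f)] for f, _ in samples]
```

The loop mirrors the described steps: predict characters for each training sample, compare them with the sample characters, and adjust the parameters to reduce the mismatch.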
[0035] Secondly, this application also provides an independent character determination device, the device comprising:
[0036] The acquisition module is used to acquire the verification code image; the verification code image includes connected characters.
[0037] The recognition module is used to recognize the CAPTCHA image using a pre-trained convolutional neural network to obtain multiple independent characters;
[0038] The convolutional neural network includes fully connected layers and normalized exponential function layers; the fully connected layers are used to convert the image features of the CAPTCHA image into target feature vectors; the normalized exponential function layers are used to convert the target feature vectors into individual characters.
[0039] Thirdly, this application also provides a computer device, which includes a memory and a processor. The memory stores a computer program, and the processor executes the computer program to perform the following steps:
[0040] Obtain the CAPTCHA image; the CAPTCHA image includes concatenated characters.
[0041] A pre-trained convolutional neural network is used to recognize CAPTCHA images and obtain multiple independent characters;
[0042] The convolutional neural network includes fully connected layers and normalized exponential function layers; the fully connected layers are used to convert the image features of the CAPTCHA image into target feature vectors; the normalized exponential function layers are used to convert the target feature vectors into individual characters.
[0043] Fourthly, this application also provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, performs the following steps:
[0044] Obtain the CAPTCHA image; the CAPTCHA image includes concatenated characters.
[0045] A pre-trained convolutional neural network is used to recognize CAPTCHA images and obtain multiple independent characters;
[0046] The convolutional neural network includes fully connected layers and normalized exponential function layers; the fully connected layers are used to convert the image features of the CAPTCHA image into target feature vectors; the normalized exponential function layers are used to convert the target feature vectors into individual characters.
[0047] Fifthly, this application also provides a computer program product, which includes a computer program that, when executed by a processor, performs the following steps:
[0048] Obtain the CAPTCHA image; the CAPTCHA image includes concatenated characters.
[0049] A pre-trained convolutional neural network is used to recognize CAPTCHA images and obtain multiple independent characters;
[0050] The convolutional neural network includes fully connected layers and normalized exponential function layers; the fully connected layers are used to convert the image features of the CAPTCHA image into target feature vectors; the normalized exponential function layers are used to convert the target feature vectors into individual characters.
[0051] The aforementioned method, apparatus, device, storage medium, and program product for determining independent characters first acquire a CAPTCHA image, then use a pre-trained convolutional neural network (CNN) to recognize the CAPTCHA image, thereby obtaining multiple independent characters. The CNN includes fully connected layers and normalized exponential function layers. The fully connected layers convert the image features of the CAPTCHA image into target feature vectors; the normalized exponential function layers convert the target feature vectors into independent characters. In this method, the fully connected layers and normalized exponential function layers of the CNN can transform the complex features in the CAPTCHA image into target feature vectors that are easier to process and understand, and then convert them into character probability distributions, thereby effectively solving the problems of character concatenation and recognition, and improving the accuracy and reliability of CAPTCHA recognition.
Attached Figure Description
[0052] Figure 1 is an internal structural diagram of a computer device in one embodiment;
[0053] Figure 2 is a flowchart illustrating an independent character determination method in one embodiment;
[0054] Figure 3 is a flowchart illustrating the method for determining independent characters in another embodiment;
[0055] Figure 4 is a flowchart illustrating the method for determining independent characters in another embodiment;
[0056] Figure 5 is a flowchart illustrating the method for determining independent characters in another embodiment;
[0057] Figure 6 is a flowchart illustrating the method for determining independent characters in another embodiment;
[0058] Figure 7 is a flowchart illustrating the method for determining independent characters in another embodiment;
[0059] Figure 8 is a structural block diagram of an independent character determination device in one embodiment.
Detailed Implementation
[0060] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.
[0061] In one embodiment, a computer device is provided, which may be a terminal, and its internal structure may be as shown in Figure 1. The computer device includes a processor, memory, input / output interfaces, a communication interface, a display unit, and an input device. The processor, memory, and input / output interfaces are connected via a system bus, and the communication interface, display unit, and input device are also connected to the system bus via the input / output interfaces. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The input / output interfaces are used for exchanging information between the processor and external devices. The communication interface is used for wired or wireless communication with external terminals; wireless communication can be achieved through Wi-Fi, mobile cellular networks, NFC (Near Field Communication), or other technologies. When executed by the processor, the computer program implements an independent character determination method. The display unit is used to form a visible image and can be a display screen, a projection device, or a virtual reality imaging device. The display screen can be an LCD screen or an e-ink screen. The input device of the computer device can be a touch layer covering the display screen, or buttons, trackballs, or touchpads set on the casing of the computer device, or external keyboards, touchpads, or mice, etc.
[0062] Those skilled in the art will understand that the structure shown in Figure 1 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.
[0063] In one embodiment, as shown in Figure 2, a method for determining independent characters is provided. Taking the application of the method to the computer device in Figure 1 as an example, it includes the following steps:
[0064] S202, Obtain the verification code image; the verification code image includes connected characters.
[0065] A CAPTCHA image is an image used to distinguish between humans and computers, typically used to verify user identity or prevent malicious automated operations. CAPTCHA images usually contain characters such as uppercase letters, lowercase letters, and numbers. Users need to perform corresponding actions based on the content of the image, such as entering characters, to complete the verification process.
[0066] Connected (adhered) characters are two or more characters in a CAPTCHA image that touch or overlap, creating a visual effect of adhesion or merging. While this design makes CAPTCHA images harder for automated programs to recognize and crack, it also makes them harder for computer devices to recognize.
[0067] In this embodiment, the computer device can acquire a verification code image, which includes characters that are joined together. Optionally, the computer device can acquire the verification code image using a camera, scanner, or by downloading it from the network. If the verification code image is not on the local device, the computer may need to transmit the image data over the network, for example, by downloading the verification code image from a server or website. The verification code image data is then loaded into memory, ready for subsequent processing and analysis.
[0068] S204, use a pre-trained convolutional neural network to recognize the CAPTCHA image and obtain multiple independent characters. The convolutional neural network includes a fully connected layer and a normalized exponential function layer. The fully connected layer is used to convert the image features of the CAPTCHA image into a target feature vector. The normalized exponential function layer is used to convert the target feature vector into independent characters.
[0069] In this embodiment of the application, a computer device can use a pre-trained convolutional neural network to recognize CAPTCHA images. During the CAPTCHA image recognition process, the pre-trained convolutional neural network can identify multiple independent characters in the CAPTCHA image.
[0070] Optionally, firstly, the fully connected layer of the convolutional neural network transforms the image features of the CAPTCHA image to generate a target feature vector. In this process, the fully connected layer weighs and combines local information and various features in the image to create a comprehensive target feature vector. This vector represents the integration of features from various parts of the CAPTCHA image, effectively capturing the importance of different positions and regions. Next, through a normalized exponential function layer, the target feature vector is converted into a probability distribution of characters. Using this probability distribution, the likelihood of each character's position can be accurately inferred, thus precisely parsing multiple independent characters. Converting the target feature vector into a probability distribution of characters through the normalized exponential function layer is a crucial step in the CAPTCHA recognition process. This process helps to parse connected characters and accurately identify multiple independent characters.
[0071] In the aforementioned method for determining independent characters, a CAPTCHA image is first acquired, and then a pre-trained convolutional neural network (CNN) is used to recognize the CAPTCHA image, yielding multiple independent characters. The CNN includes fully connected layers and normalized exponential function layers. The fully connected layers convert the image features of the CAPTCHA image into target feature vectors, while the normalized exponential function layers convert the target feature vectors into independent characters. This method, by combining a pre-trained CNN with fully connected and normalized exponential function layers, better captures the contextual information between characters, effectively handling the issue of overlapping characters. Furthermore, the fully connected layers convert image features into target feature vectors, which helps transform input image information into a more representative and advanced feature representation. By capturing richer image features, the network can more accurately distinguish between overlapping characters, thereby improving the reliability of CAPTCHA image recognition. The normalized exponential function layers convert the target feature vectors into probability distributions for independent characters. This allows the network to provide a probability value for each character position; by calculating the probability of each character position, the location of overlapping characters can be accurately determined, thus improving the accuracy of CAPTCHA image recognition. The fully connected layers and normalized exponential function layers of convolutional neural networks can transform the complex features in CAPTCHA images into target feature vectors that are easier to process and understand. 
These vectors are then converted into character probability distributions, effectively solving the problems of character adhesion and recognition, and improving the accuracy and reliability of CAPTCHA recognition.
[0072] The above embodiments mention that a pre-trained convolutional neural network can be used to recognize the CAPTCHA image to obtain multiple independent characters. In fact, the convolutional neural network also includes a feature extraction layer. Based on this, the following embodiments will describe in detail the specific process of using a pre-trained convolutional neural network to recognize the CAPTCHA image to obtain multiple independent characters.
[0073] In one embodiment, another independent character determination method is provided. Based on the above embodiments, as shown in Figure 3, S204 above may include:
[0074] S302, input the verification code image into the feature extraction layer to obtain image features.
[0075] In this embodiment, the CAPTCHA image is input into the feature extraction layer. The feature extraction layer can use a series of convolution kernels to perform convolution operations on the input CAPTCHA image to capture different features in the CAPTCHA image. Each convolution kernel can detect a certain local feature in the image, such as edges, textures, etc. Through the convolution operation, a series of feature maps can be obtained, and each feature map corresponds to a feature detected by a convolution kernel.
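The convolution operation described above can be sketched for a single kernel as follows. The kernel values, the image, and the choice of no padding with stride 1 are hypothetical; the source does not give kernel sizes or parameters.

```python
def convolve2d(image, kernel):
    """Valid (no padding), stride-1 cross-correlation, as used in CNN convolutional layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 3x3 kernel that responds to dark-to-bright vertical edges.
VERTICAL_EDGE = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

# Binarized toy image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
feature_map = convolve2d(image, VERTICAL_EDGE)
```

The resulting feature map is zero over the uniform dark region and positive where the window covers the edge, which is the sense in which one kernel detects one local feature. A convolutional layer applies many such kernels, whose values are learned during training rather than fixed by hand.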
[0076] S304: Input the image features into the fully connected layer to obtain the target feature vector.
[0077] In this embodiment, after the feature extraction layer completes the feature extraction of the CAPTCHA image, the image features are input to the fully connected layer. The fully connected layer integrates and transforms the image features to generate a target feature vector. This target feature vector integrates the high-level representations of various image features and can better represent the features of the CAPTCHA image.
[0078] S306, the target feature vector is input into the normalized exponential function layer to obtain multiple independent characters.
[0079] In this embodiment, the target feature vector is input into a normalized exponential function layer. Through this layer, the target feature vector is transformed into a probability distribution of characters. Using this probability distribution, the likelihood of each character position can be accurately inferred, thereby precisely parsing multiple independent characters.
[0080] In the above embodiments, the feature extraction layer in the convolutional neural network helps capture local features in the image, such as edges and textures, which help distinguish different parts of connected characters. The fully connected layer transforms these image features into higher-level representations, thereby better capturing the contextual information between characters and further aiding in the differentiation of connected characters. Furthermore, the transformation of image features by the fully connected layer converts the input image information into a more representative and higher-level feature representation. This helps the network more accurately distinguish the differences between connected characters, thereby improving the reliability of CAPTCHA image recognition. The fully connected layer maps image features to target feature vectors, enabling the network to better understand the relationships between characters, further enhancing the reliability of recognition. Simultaneously, the normalized exponential function layer transforms the target feature vector into a probability distribution for independent characters. This approach allows the network to provide a probability value for each character position, thereby better capturing the uncertainty of characters. By calculating the probability of each character position, the possible positions of connected characters can be accurately determined, thereby improving the accuracy of CAPTCHA image recognition. This probability distribution provides finer-grained information, enabling the network to make more accurate classification decisions. 
In summary, this method, through the design of a feature extraction layer, a fully connected layer, and a normalized exponential function layer, fully utilizes the information of image features, thereby effectively solving the problem of difficult recognition of overlapping characters in traditional methods and improving the reliability and accuracy of CAPTCHA image recognition.
[0081] The above embodiments mentioned that image features can be input into a fully connected layer to obtain a target feature vector; the following embodiments will describe in detail the specific process of inputting image features into a fully connected layer to obtain a target feature vector.
[0082] In one embodiment, another independent character determination method is provided. Based on the above embodiments, as shown in Figure 4, S304 above may include:
[0083] S402, the image features are input into the fully connected layer to flatten the image features and obtain candidate feature vectors; the candidate feature vectors include multiple first sub-vectors.
[0084] S404, the first sub-vector in the candidate feature vector is processed in the corresponding neuron of the fully connected layer to obtain the target feature vector; the target feature vector includes multiple second sub-vectors.
[0085] In this embodiment, the image features processed by the feature extraction layer are input into a fully connected layer. In the fully connected layer, the image features are flattened, transforming them from a two-dimensional feature map shape into a one-dimensional vector shape, resulting in a candidate feature vector. This candidate feature vector includes information about the image features. The first sub-vector is then input into the neurons of the fully connected layer for processing. Each neuron is connected to each element of the first sub-vector, and through weighted combination and transformation, feature transformation is performed on the first sub-vector. After processing by the neurons of the fully connected layer, the first sub-vector is transformed into a new vector, namely the target feature vector. This target feature vector can be seen as a high-level representation of the first sub-vector, capturing the features and relationships within the first sub-vector.
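One possible reading of S402 and S404 is sketched below. It assumes the candidate feature vector is split into equal-length first sub-vectors and that each is passed through the same fully connected weights to give one second sub-vector; the source does not specify how the sub-vectors are delimited or whether weights are shared, so both choices, along with all names and numbers, are assumptions.

```python
def flatten(feature_maps):
    """Flatten a list of 2-D feature maps into one 1-D candidate feature vector."""
    return [value for fmap in feature_maps for row in fmap for value in row]


def split(vector, parts):
    """Split a vector into `parts` equal-length sub-vectors."""
    size = len(vector) // parts
    return [vector[k * size:(k + 1) * size] for k in range(parts)]


def dense(inputs, weights, biases):
    """Fully connected layer: every output neuron sees every input element."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]


# Two hypothetical 2x2 feature maps from the feature extraction layer.
feature_maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
candidate = flatten(feature_maps)
first_sub_vectors = split(candidate, parts=2)

# Hypothetical weights: 3 output neurons, each connected to all 4 inputs.
WEIGHTS = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.25, 0.25, 0.25, 0.25],
]
BIASES = [0.0, 0.0, 0.0]
second_sub_vectors = [dense(sub, WEIGHTS, BIASES) for sub in first_sub_vectors]
```

Under this reading, the list of second sub-vectors is the target feature vector, with one score per candidate character in each sub-vector.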
[0086] In the above embodiments, by flattening image features into candidate feature vectors and then processing the first sub-vectors in each candidate feature vector, the fully connected layer can handle the sticking phenomenon between characters more finely. Each first sub-vector, through weighted combination and transformation in the fully connected layer, helps capture the local relationships and features between characters, thus effectively handling the sticking phenomenon. After converting image features into candidate feature vectors, the fully connected layer processes each first sub-vector, transforming it into a higher-level representation. This helps the network more accurately distinguish between sticking characters, further improving the reliability of CAPTCHA image recognition. By processing each sub-vector individually, the network can better understand the local features of characters, thereby enhancing the reliability of recognition. By inputting each first sub-vector in the candidate feature vectors into the fully connected layer neurons for processing, the target feature vector corresponding to each sub-vector can be obtained. These target feature vectors are transformed into independent characters through a normalized exponential function layer, providing finer-grained information, enabling the network to make classification decisions more accurately, thereby improving the accuracy of CAPTCHA image recognition. Processing each sub-vector individually helps to accurately capture the features of sticking characters, further enhancing recognition accuracy. In summary, this process, by flattening image features into candidate feature vectors and processing each first sub-vector individually, fully utilizes the information in the image features, thus effectively solving the problem of difficult recognition of connected characters in traditional methods and improving the reliability and accuracy of CAPTCHA image recognition. 
In particular, by processing each sub-vector separately, the network can better understand the local relationships and features of characters, processing them with more detailed classification information, thus exhibiting a significant advantage in handling connected characters.
[0087] The above embodiments mentioned that the target feature vector can be input into the normalized exponential function layer to obtain multiple independent characters; the following embodiments will describe in detail the specific process of inputting the target feature vector into the normalized exponential function layer to obtain multiple independent characters.
[0088] In one embodiment, another independent character determination method is provided. Based on the above embodiments, as shown in Figure 5, S306 above may include:
[0089] S502, the target feature vector is input into the normalized exponential function layer. The normalized exponential function layer performs exponentialization and normalization processing on each second sub-vector in the target feature vector to determine the probability distribution of each second sub-vector.
[0090] S504, for each second sub-vector, determine the independent character based on the character corresponding to the maximum probability value in the probability distribution.
[0091] In this embodiment, the target feature vector obtained after processing by the fully connected layer is input into the normalized exponential function layer. The normalized exponential function layer first exponentiates each element of each second sub-vector in the target feature vector, and then normalizes the result: each exponentiated element is divided by the sum of all exponentiated elements of the same sub-vector, so that the elements sum to 1 and form a probability distribution. The normalized vector represents the probability distribution of each second sub-vector, that is, the probability that the corresponding character position holds each candidate character. For each second sub-vector, the independent character is determined by finding the position with the maximum probability value. For example, if the probability distribution of a second sub-vector is [0.1, 0.5, 0.2, 0.2, 0.0, ..., 0.0], the maximum probability value is at the second position, indicating that the network considers the character corresponding to the second position to be the most probable one, and that character is therefore determined as the independent character.
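The exponentiation, normalization, and maximum-probability selection of S502 and S504 can be sketched as follows. The digit-only character set and the score values are hypothetical; the source states that real CAPTCHA images may also contain uppercase and lowercase letters.

```python
import math

CHARSET = "0123456789"  # hypothetical character set, one entry per class


def softmax(scores):
    """Normalized exponential function: exponentiate, then divide by the sum."""
    peak = max(scores)  # subtracting the max avoids overflow; the result is unchanged
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode(second_sub_vectors, charset=CHARSET):
    """Pick, for each second sub-vector, the character with the highest probability."""
    chars = []
    for scores in second_sub_vectors:
        probs = softmax(scores)
        best = max(range(len(probs)), key=probs.__getitem__)
        chars.append(charset[best])
    return "".join(chars)


# Hypothetical target feature vector with two second sub-vectors.
target = [
    [0.1, 2.0, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.5],
]
text = decode(target)
```

Because each sub-vector is normalized separately, every character position gets its own probability distribution, and the characters are read out position by position without first cutting the image into single-character regions.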
[0092] In the above embodiments, the normalization processing of the normalized exponential function layer maps the value at each position of a second sub-vector to a probability, so that the values of all positions form a probability distribution and are weighed against one another. Thus, even if a character position is affected by character overlap, it will still have a certain weight in the probability distribution and will not be completely ignored. This property helps improve the reliability of CAPTCHA image recognition and reduces recognition errors caused by character overlap. Through the processing of the normalized exponential function layer, each second sub-vector is mapped to a probability distribution, where the character corresponding to the position with the highest probability is considered the most likely character. This approach allows the network to make more refined judgments on each position, thereby accurately identifying each character. Therefore, by selecting the character corresponding to the maximum probability value in the probability distribution, the accuracy of CAPTCHA image recognition can be improved. In summary, the processing of the normalized exponential function layer can effectively address the character overlap phenomenon, improving the reliability and accuracy of CAPTCHA image recognition. By converting the target feature vector into a character probability distribution and determining independent characters based on the maximum probability value, the convolutional neural network can fully utilize the information of image features, thereby achieving better results in recognizing CAPTCHA images containing overlapped characters.
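The exponentiation, normalization, and maximum-probability selection of S502 and S504 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of this application: the character set `CHARSET`, the function names `softmax` and `decode`, and the example scores are hypothetical, and subtracting the largest score before exponentiation is a common numerical-stability step that the description does not mention.

```python
import math

# Assumed character set: digits, uppercase letters, lowercase letters (62 classes).
CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def softmax(scores):
    """Exponentiate each score and divide by the sum so the result sums to 1."""
    peak = max(scores)  # stability shift; it does not change the probabilities
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode(sub_vectors, charset=CHARSET):
    """For each second sub-vector, pick the character with the highest probability."""
    chars = []
    for scores in sub_vectors:
        probs = softmax(scores)
        best = max(range(len(probs)), key=probs.__getitem__)
        chars.append(charset[best])
    return "".join(chars)
```

For instance, with the four-character set "ABCD", `decode([[0.0, 2.0, 0.5, 0.5], [3.0, 0.0, 0.0, 0.0]], "ABCD")` selects the second position for the first sub-vector and the first position for the second sub-vector.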
[0093] In one embodiment, another independent character determination method is provided. Based on the above embodiment, the feature extraction layer includes a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, and a third pooling layer. The first convolutional layer, the second convolutional layer, and the third convolutional layer are used for feature extraction; the first pooling layer, the second pooling layer, and the third pooling layer are used to obtain key features of the CAPTCHA image.
[0094] In this embodiment, the feature extraction layer consists of a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, and a third pooling layer, each with different functions: (1) First convolutional layer: This layer transforms the original CAPTCHA image into a set of feature maps through a series of convolutional operations. Each convolutional kernel slides on the image to extract local features of the image and capture low-level features such as edges and textures in the CAPTCHA image. (2) First pooling layer: Pooling operations help reduce the size of the image while retaining important feature information. By performing operations such as max pooling on the output of the first convolutional layer, the main features in the image are obtained, improving the computational efficiency of subsequent layers. (3) Second convolutional layer: Similar to the first convolutional layer, the second convolutional layer further extracts high-level features, further abstracts and combines the features of the previous layer, and captures more complex patterns and shapes. (4) Second pooling layer: Continues to perform pooling operations on the feature maps to further reduce the size of the image, reduce the amount of computation, and retain important feature information. (5) Third convolutional layer: Through deeper convolutional operations, the third convolutional layer can capture more abstract and high-level features, further refining the features of the previous layers for higher-level pattern recognition. (6) Third pooling layer: The last pooling operation further refines the feature map, preparing it for the subsequent fully connected layers.
[0095] In the above embodiments, the feature extraction layer aims to extract the most representative features from the CAPTCHA image, which will be helpful for subsequent classification and recognition tasks. Through layer-by-layer convolution and pooling operations, the feature extraction layer can gradually transform image data into more abstract and meaningful feature representations, providing a better foundation for CAPTCHA image recognition tasks such as recognizing characters that appear to be stuck together.
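The alternation of convolution and pooling described above can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: a single-channel image, one 3x3 kernel per stage, random kernel values in place of trained ones, a ReLU after each convolution, and 2x2 max pooling. The description itself does not fix kernel sizes, channel counts, or activation functions.

```python
import random


def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation) followed by an assumed ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = sum(image[i + a][j + b] * kernel[a][b]
                      for a in range(kh) for b in range(kw))
            row.append(max(acc, 0.0))
        out.append(row)
    return out


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [[max(fmap[i * size + a][j * size + b]
                 for a in range(size) for b in range(size))
             for j in range(out_w)]
            for i in range(out_h)]


def extract_features(image, kernels):
    """Apply convolution then pooling once per kernel (three kernels give three stages)."""
    fmap = image
    for kernel in kernels:
        fmap = max_pool(conv2d(fmap, kernel))
    return fmap


rng = random.Random(0)
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)] for _ in range(3)]
image = [[rng.random() for _ in range(30)] for _ in range(30)]
features = extract_features(image, kernels)  # 30x30 -> 14x14 -> 6x6 -> 2x2
```

Each stage shrinks the feature map (here 30x30 to 14x14 to 6x6 to 2x2), which is the size reduction the pooling layers are described as providing.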
[0096] The above embodiments mention that a pre-trained convolutional neural network can be used to recognize CAPTCHA images and obtain multiple independent characters. In fact, before inputting the CAPTCHA image into the convolutional neural network, the CAPTCHA image needs to be preprocessed. The following embodiments will describe this process in detail.
[0097] In one embodiment, another independent character determination method is provided. Based on the above embodiment, before S204, the method may include: performing a preprocessing operation on the verification code image to determine the processed verification code image.
[0098] In this embodiment, before inputting the CAPTCHA image into the convolutional neural network, the CAPTCHA image can be preprocessed to improve recognition accuracy. Optionally, filtering techniques can be used to remove noise from the CAPTCHA image to preserve the clear outlines of the characters. Noise removal can improve the quality of the CAPTCHA image and reduce interference with character recognition. CAPTCHA images can be uniformly adjusted to the same size to ensure that the images input into the convolutional neural network have consistent dimensions. Image enhancement techniques such as contrast enhancement and brightness adjustment can also be applied to enhance the visual features of the image, helping to better capture character details.
[0099] In the above embodiments, using filtering techniques to remove noise can preserve the clear outline of characters, which helps to segment and recognize characters more accurately. This can improve the clarity of character edges, thereby reducing interference with subsequent recognition processes. Preprocessing steps such as noise removal and image adjustment can significantly improve the quality of CAPTCHA images, making characters more prominent and thus easier for the model to recognize. Clear images can provide more feature information, helping the network to learn and infer better. Adjusting images to the same size ensures that the images input to the neural network have consistent dimensions, making the network easier to process. This helps reduce unnecessary variations and improves the network's generalization ability. Techniques such as contrast enhancement and brightness adjustment can enhance the visual features of the image, making characters more prominent. This helps the network to better capture character details and improve recognition accuracy. In summary, the preprocessing process of CAPTCHA images can enhance image quality and highlight character features without changing the original character information, thereby providing more useful input for convolutional neural networks. These preprocessing steps can improve the accuracy of CAPTCHA image recognition, enhance the model's robustness to problems such as stuck characters and noise, and thus improve overall recognition performance.
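Two of the optional preprocessing operations mentioned above, noise filtering and contrast enhancement, can be sketched as follows. The description does not name a specific filter or enhancement method, so the 3x3 median filter and the linear contrast stretch used here are assumptions, as are the function names.

```python
from statistics import median


def median_filter(img):
    """3x3 median filter; border pixels use only the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [img[y][x]
                      for y in range(max(0, i - 1), min(h, i + 2))
                      for x in range(max(0, j - 1), min(w, j + 2))]
            row.append(median(window))
        out.append(row)
    return out


def stretch_contrast(img):
    """Linearly rescale gray values so that they span 0..255."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        return [row[:] for row in img]  # flat image: nothing to stretch
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in img]
```

A median filter is chosen for this sketch because it suppresses isolated noise pixels while leaving character edges comparatively sharp, which matches the stated aim of preserving clear character outlines.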
[0100] The above embodiments mentioned performing preprocessing operations on the CAPTCHA image to determine the processed CAPTCHA image. The following embodiments will describe in detail the specific process of performing preprocessing operations on the CAPTCHA image to determine the processed CAPTCHA image.
[0101] In one embodiment, another independent character determination method is provided. Based on the above embodiments, as shown in Figure 6, the above performing of a preprocessing operation on the CAPTCHA image to determine the processed CAPTCHA image may include:
[0102] S602, perform grayscale processing on the verification code image to determine the grayscale image.
[0103] In this embodiment, for each pixel of the color CAPTCHA image, the average value of its red, green, and blue channels is calculated, and this average value is used as the grayscale value. This results in a grayscale image where each pixel has only one grayscale value. Converting a CAPTCHA image to grayscale reduces processing complexity while preserving the main information in the image. Each pixel value in a grayscale image represents the image's brightness, typically within the range of 0 to 255.
[0104] S604, perform binarization processing on the grayscale image to determine the processed verification code image corresponding to the grayscale image.
[0105] In this embodiment, after grayscale processing, the grayscale image can be binarized. Binarization divides the pixel values in the grayscale image into two categories, typically black and white. This helps to highlight the outline and features of characters, facilitating subsequent character recognition.
[0106] In the above embodiments, grayscale processing converts color information into grayscale values, and further binarization simplifies the image to only black and white pixel values, reducing information redundancy and making processing more efficient. Binarization separates characters from the background, thus better highlighting the shape and edges of characters and helping to improve the accuracy of subsequent character recognition. The image after grayscale and binarization processing is clearer and simpler, providing better input conditions, thereby helping to improve the accuracy and reliability of CAPTCHA recognition.
[0107] The above embodiments mentioned that the CAPTCHA image can be converted to grayscale to determine a grayscale image. The following embodiments will describe in detail the specific process of converting the CAPTCHA image to grayscale to determine a grayscale image.
[0108] In one embodiment, another independent character determination method is provided. Based on the above embodiment, S602 may include:
[0109] The grayscale image is obtained by calculating the grayscale value of each pixel in the CAPTCHA image using a weighted average method.
[0110] In this embodiment, the color values of the red, green, and blue channels of each pixel are weighted and averaged to obtain a single grayscale value. Optionally, assume that the color of a pixel is (R, G, B), where R, G, and B represent the color values of the red, green, and blue channels, respectively. The grayscale value Y is calculated according to the weighted average formula: Y = 0.299 * R + 0.587 * G + 0.114 * B. The weight values 0.299, 0.587, and 0.114 in this formula are determined based on the sensitivity of the human eye to the different color channels, so that the grayscale image better reflects the human eye's perception of the image. By substituting the RGB values of each pixel into this formula, the corresponding grayscale value Y can be obtained. Finally, the RGB values of all pixels are converted into their corresponding grayscale values, thereby generating a grayscale image.
[0111] In the above embodiments, the weighted average method helps to reduce the dimensionality of the image, making subsequent processing more efficient, while also preserving the main features of the image for various image analysis and processing tasks.
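The weighted average conversion can be sketched as below. The function name and the image representation (rows of (R, G, B) tuples with channel values in 0..255) are assumptions made for illustration. The default weights are the widely used ITU-R BT.601 luma coefficients (0.299, 0.587, 0.114), which sum to 1; they are passed as a parameter so that other weights can be substituted.

```python
def to_grayscale(rgb_image, weights=(0.299, 0.587, 0.114)):
    """Convert rows of (R, G, B) tuples into rows of integer gray values in 0..255."""
    wr, wg, wb = weights
    return [[min(255, round(wr * r + wg * g + wb * b)) for (r, g, b) in row]
            for row in rgb_image]
```

With these weights a pure green pixel maps to a much brighter gray than a pure blue pixel, reflecting the larger weight given to the green channel.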
[0112] The above embodiments mentioned that grayscale images can be binarized to determine the corresponding processed CAPTCHA image. The following embodiments will describe in detail the specific process of binarizing grayscale images to determine the corresponding processed CAPTCHA image.
[0113] In one embodiment, another independent character determination method is provided. Based on the above embodiment, S604 may include: calculating the optimal threshold of the grayscale image using a global thresholding method, and using the optimal threshold to determine the processed verification code image corresponding to the grayscale image.
[0114] In this embodiment, a global thresholding method is used to calculate the optimal threshold for the grayscale image, and the optimal threshold is used to determine the processed CAPTCHA image corresponding to the grayscale image. By using the global thresholding method to calculate the optimal threshold for the grayscale image, the image can be divided into two regions: foreground (character part) and background, thereby determining the processed CAPTCHA image corresponding to the grayscale image.
[0115] The basic idea of global thresholding is to find a threshold such that pixels below the threshold are considered background, and pixels above the threshold are considered foreground. In the context of a CAPTCHA image, the threshold should be chosen to separate the characters from the background, allowing the characters to be displayed more clearly.
[0116] Optionally, the process can be:
[0117] (1) Assume an initial threshold T, and mark pixels with gray values greater than T as the target (foreground) and pixels with gray values not greater than T as the background.
[0118] (2) Traverse each pixel of the image, calculate the sum H1 of the gray values of the pixels whose gray value is greater than T, and record the number N1 of such pixels.
[0119] (3) Calculate the average gray value of the target part M1 = H1 / N1.
[0120] (4) Calculate the average gray value of the background part M2 = (sum of total pixel gray values - H1) / (total number of pixels - N1).
[0121] (5) Calculate the mean of M1 and M2: Mean = (M1 + M2) / 2.
[0122] The average value calculated in step (5) is used as the new threshold T. Then, steps (2) to (5) are repeated until the new threshold T no longer changes significantly, i.e., convergence is achieved. This threshold is the optimal threshold found by the global thresholding method and is used for image binarization.
[0123] (6) Using the determined optimal threshold, the grayscale image is converted into a binary image, i.e., the foreground pixels are black and the background pixels are white.
[0124] In the above embodiments, by selecting an appropriate threshold, the shape of the characters can be displayed more clearly, providing better input conditions for subsequent character recognition, thereby improving the accuracy and reliability of CAPTCHA recognition.
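Steps (1) to (6) above can be sketched as follows. The starting threshold (the mean gray level), the convergence tolerance `eps`, the iteration cap, and the handling of the case where one class is empty are assumptions, since the description leaves them open.

```python
def iterative_threshold(gray, eps=0.5, max_iter=100):
    """Repeat steps (2) to (5) until the threshold changes by less than eps."""
    pixels = [v for row in gray for v in row]
    total, count = sum(pixels), len(pixels)
    t = total / count  # assumed starting value: the mean gray level
    for _ in range(max_iter):
        fg = [v for v in pixels if v > t]
        h1, n1 = sum(fg), len(fg)
        if n1 == 0 or n1 == count:
            break  # one class is empty; keep the current threshold
        m1 = h1 / n1                      # mean gray value of the target part
        m2 = (total - h1) / (count - n1)  # mean gray value of the background part
        new_t = (m1 + m2) / 2
        converged = abs(new_t - t) < eps
        t = new_t
        if converged:
            break
    return t


def binarize(gray, t):
    """Step (6): foreground (gray > t) becomes black (0), background white (255)."""
    return [[0 if v > t else 255 for v in row] for row in gray]


gray = [[10, 20, 200], [15, 210, 220]]
t = iterative_threshold(gray)
binary = binarize(gray, t)
```

For this small example the two class means are 210 and 15, so the threshold settles at 112.5 and the three bright pixels are marked as foreground.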
[0125] The above embodiments mentioned that a pre-trained convolutional neural network can be used to recognize CAPTCHA images. The following embodiments will describe in detail the training process of the convolutional neural network.
[0126] In one embodiment, another independent character determination method is provided. Based on the above embodiments, as shown in Figure 7, the training method of the convolutional neural network may include:
[0127] S702, acquire multiple training data; each training data includes sample images and sample characters corresponding to the sample images;
[0128] S704, Input each training data into the initial neural network to determine the predicted character corresponding to each training data;
[0129] S706, Based on the predicted characters and corresponding sample characters of each training data, train the initial neural network to determine the convolutional neural network.
[0130] In this embodiment, a large amount of training data is first collected. Each training data set includes a sample image and a corresponding sample character. These sample images can be CAPTCHA images, and the sample characters are the characters displayed in the images, such as concatenated "A", "B", "C", etc. The sample image of each training data set is input into a neural network. The neural network structure includes convolutional layers, pooling layers, and fully connected layers, which extract and transform features from the images, ultimately generating prediction results. Through the forward propagation process of the neural network, each training data set is processed to obtain the corresponding predicted character. The predicted character is compared with the sample character (reference character), and a loss function is calculated. The loss function measures the difference between the network's prediction and the actual sample, i.e., the accuracy of the prediction. Using the backpropagation algorithm, the loss is propagated back from the output layer to the network, and the contribution of each parameter to the loss is calculated. Then, the weights and parameters of the network are updated according to gradient descent or other optimization algorithms. Through multiple iterations of training, the network parameters are continuously optimized to better adapt to the training data and improve the recognition ability of CAPTCHA characters. After multiple iterations of training, an adjusted convolutional neural network is obtained.
[0131] In the above embodiments, by continuously adjusting the parameters of the neural network, the neural network can learn from the training data and extract useful features, thereby achieving the goal of accurately recognizing CAPTCHA images.
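The training cycle described above (forward propagation, loss calculation, backpropagation, and gradient descent) can be illustrated on a deliberately small stand-in model. This sketch assumes a single linear layer followed by a softmax and trained with cross-entropy loss, not the full convolutional network of this application. The toy samples, learning rate, and epoch count are hypothetical.

```python
import math
import random


def softmax(z):
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def train(samples, n_classes, epochs=200, lr=0.5, seed=0):
    """samples: list of (features, class_index). Returns weights, biases, loss history."""
    rng = random.Random(seed)
    n_feat = len(samples[0][0])
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_feat)] for _ in range(n_classes)]
    b = [0.0] * n_classes
    history = []
    for _ in range(epochs):
        loss = 0.0
        for x, y in samples:
            # Forward propagation: class scores, then probabilities.
            logits = [sum(wi * xi for wi, xi in zip(w[k], x)) + b[k]
                      for k in range(n_classes)]
            p = softmax(logits)
            loss += -math.log(p[y])  # cross-entropy against the sample character
            # Backpropagation: the gradient w.r.t. each score is p[k] - onehot(y)[k].
            for k in range(n_classes):
                g = p[k] - (1.0 if k == y else 0.0)
                for i in range(n_feat):
                    w[k][i] -= lr * g * x[i]  # gradient descent update
                b[k] -= lr * g
        history.append(loss / len(samples))
    return w, b, history


def predict(w, b, x):
    logits = [sum(wi * xi for wi, xi in zip(wk, x)) + bk for wk, bk in zip(w, b)]
    return max(range(len(logits)), key=logits.__getitem__)


# Toy data standing in for image features and their character labels.
samples = [([1.0, 0.0, 0.0, 1.0], 0), ([0.0, 1.0, 1.0, 0.0], 1), ([1.0, 1.0, 0.0, 0.0], 2)]
w, b, history = train(samples, n_classes=3)
```

The average loss recorded in `history` falls over the epochs as the parameters adapt to the training data, which is the behaviour the iterative training described above relies on.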
[0132] The following detailed embodiment illustrates the process of the independent character determination method in this application. Based on the above embodiment, the implementation process of this method may include the following:
[0133] S1, acquire multiple training data; each training data includes sample images and sample characters corresponding to the sample images;
[0134] S2, input each training data into the initial neural network to determine the predicted character corresponding to each training data;
[0135] S3, based on the predicted characters and corresponding sample characters of each training data, train the initial neural network to determine the convolutional neural network; wherein, the convolutional neural network includes a fully connected layer and a normalized exponential function layer; the fully connected layer is used to convert the image features of the verification code image into the target feature vector; the normalized exponential function layer is used to convert the target feature vector into independent characters;
[0136] S4, Obtain the verification code image; the verification code image includes connected characters;
[0137] S5. The grayscale value of each pixel in the verification code image is calculated using the weighted average method to obtain the grayscale image.
[0138] S6. The optimal threshold for the grayscale image is calculated using the global thresholding method. The grayscale image is then processed using the optimal threshold to obtain the processed verification code image.
[0139] S7, input the processed CAPTCHA image into the feature extraction layer to obtain image features; the feature extraction layer includes a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, and a third pooling layer; the first convolutional layer, the second convolutional layer, and the third convolutional layer are used for feature extraction; the first pooling layer, the second pooling layer, and the third pooling layer are used to obtain key features of the CAPTCHA image;
[0140] S8, input the image features into the fully connected layer to flatten the image features and obtain candidate feature vectors; the candidate feature vectors include multiple first sub-vectors;
[0141] S9. The first sub-vector in the candidate feature vector is processed in the neurons of the fully connected layer to obtain the target feature vector; the target feature vector includes multiple second sub-vectors.
[0142] S10, the target feature vector is input to the normalized exponential function layer. The normalized exponential function layer performs exponentialization and normalization processing on each second sub-vector in the target feature vector to determine the probability distribution of each second sub-vector.
[0143] S11, for each second sub-vector, determine the independent character based on the character corresponding to the maximum probability value in the probability distribution.
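Steps S8 and S9 can be sketched as a flattening step followed by a fully connected layer whose output is cut into sub-vectors. The sketch assumes that the target feature vector holds one equal-length second sub-vector per character position; the function names, weights, and sizes are hypothetical, and the handling of the first sub-vectors is simplified to a single weighted sum per neuron.

```python
def flatten(feature_maps):
    """S8: turn a list of 2-D feature maps into one flat candidate feature vector."""
    return [v for fmap in feature_maps for row in fmap for v in row]


def dense(vector, weights, biases):
    """S9: each neuron computes a weighted sum of its inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def split(vector, n_positions):
    """Cut the target feature vector into equal-length sub-vectors, one per position."""
    size = len(vector) // n_positions
    return [vector[i * size:(i + 1) * size] for i in range(n_positions)]
```

Each sub-vector returned by `split` would then be passed through the normalized exponential function and reduced to a single character, as in S10 and S11.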
[0144] It should be understood that although the steps in the flowcharts of the above embodiments are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the above embodiments may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.
[0145] Based on the same inventive concept, this application also provides an independent character determination device for implementing the aforementioned independent character determination method. The solution provided by this device is similar to the implementation described in the above method; therefore, the specific limitations in one or more independent character determination device embodiments provided below can be found in the limitations of the independent character determination method described above, and will not be repeated here.
[0146] In one embodiment, as shown in Figure 8, an independent character determination device is provided, including: an acquisition module 11 and a recognition module 12, wherein:
[0147] The acquisition module 11 is used to acquire the verification code image; the verification code image includes connected characters.
[0148] The recognition module 12 is used to recognize the CAPTCHA image using a pre-trained convolutional neural network to obtain multiple independent characters;
[0149] The convolutional neural network includes fully connected layers and normalized exponential function layers; the fully connected layers are used to convert the image features of the CAPTCHA image into target feature vectors; the normalized exponential function layers are used to convert the target feature vectors into individual characters.
[0150] In another embodiment, a different independent character determination device is provided. Based on the above embodiments, the recognition module 12 may include:
[0151] The first acquisition unit is used to input the verification code image into the feature extraction layer to obtain image features;
[0152] The second acquisition unit is used to input image features into the fully connected layer to obtain the target feature vector;
[0153] The third acquisition unit is used to input the target feature vector into the normalized exponential function layer to obtain multiple independent characters.
[0154] In another embodiment, a different independent character determination device is provided. Based on the above embodiments, the second acquisition unit may include:
[0155] The first acquisition subunit is used to input image features into the fully connected layer to flatten the image features and obtain candidate feature vectors; the candidate feature vectors include multiple first subvectors.
[0156] The second acquisition subunit is used to process the first sub-vector in the candidate feature vector in the neurons of the fully connected layer to obtain the target feature vector; the target feature vector includes multiple second sub-vectors.
[0157] In another embodiment, a different independent character determination device is provided. Based on the above embodiments, the third acquisition unit may include:
[0158] The first determining subunit is used to input the target feature vector into the normalized exponential function layer. The normalized exponential function layer performs exponentialization and normalization processing on each second subvector in the target feature vector to determine the probability distribution of each second subvector.
[0159] The second determining subunit is used to determine the independent character for each second subvector based on the character corresponding to the maximum probability value in the probability distribution.
[0160] In another embodiment, another independent character determination device is provided. Based on the above embodiments, the feature extraction layer includes a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, and a third pooling layer. The first convolutional layer, the second convolutional layer, and the third convolutional layer are used for feature extraction. The first pooling layer, the second pooling layer, and the third pooling layer are used to obtain key features of the CAPTCHA image.
[0161] In another embodiment, a different independent character determination device is provided, which, based on the above embodiments, may further include:
[0162] The preprocessing module is used to perform preprocessing operations on the CAPTCHA image and determine the processed CAPTCHA image.
[0163] In another embodiment, a different independent character determination device is provided. Based on the above embodiments, the preprocessing module may further include:
[0164] The grayscale image determination unit is used to perform grayscale processing on the verification code image and determine the grayscale image.
[0165] The processed CAPTCHA image determination unit is used to perform binarization processing on the grayscale image and determine the processed CAPTCHA image corresponding to the grayscale image.
[0166] In another embodiment, another independent character determination device is provided. Based on the above embodiments, the grayscale image determination unit may further include:
[0167] The third determining subunit is used to calculate the grayscale value of each pixel in the verification code image using a weighted average method to obtain a grayscale image.
[0168] In another embodiment, another independent character determination device is provided. Based on the above embodiments, the processed verification code image determination unit may further include:
[0169] The fourth determination subunit is used to calculate the optimal threshold of the grayscale image using the global thresholding method, and to determine the processed CAPTCHA image corresponding to the grayscale image using the optimal threshold.
[0170] In another embodiment, a different independent character determination device is provided. Based on the above embodiments, the device may further include a training module, which may include:
[0171] The fourth acquisition unit is used to acquire multiple training data; each training data includes a sample image and the sample character corresponding to the sample image;
[0172] The fifth determining subunit is used to input each training data into the initial neural network and determine the predicted character corresponding to each training data.
[0173] The sixth determining subunit is used to train the initial neural network based on the predicted characters and corresponding sample characters of each training data, and to determine the convolutional neural network.
[0174] Each module in the aforementioned independent character determination device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of a computer device as software, so that the processor can call and execute the operations corresponding to each module.
[0175] In one embodiment, a computer device is provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to perform the following steps:
[0176] Obtain the CAPTCHA image; the CAPTCHA image includes connected characters.
[0177] A pre-trained convolutional neural network is used to recognize CAPTCHA images and obtain multiple independent characters;
[0178] The convolutional neural network includes fully connected layers and normalized exponential function layers; the fully connected layers are used to convert the image features of the CAPTCHA image into target feature vectors; the normalized exponential function layers are used to convert the target feature vectors into individual characters.
[0179] In one embodiment, the convolutional neural network further includes a feature extraction layer, and the processor, when executing a computer program, also performs the following steps:
[0180] The verification code image is input into the feature extraction layer to obtain image features;
[0181] Image features are input into a fully connected layer to obtain the target feature vector;
[0182] The target feature vector is input into the normalized exponential function layer to obtain multiple independent characters.
[0183] In one embodiment, the processor, when executing a computer program, also performs the following steps:
[0184] Image features are input into a fully connected layer to flatten the image features, resulting in candidate feature vectors; the candidate feature vectors include multiple first sub-vectors.
[0185] The first sub-vector in the candidate feature vector is processed in the neurons of the fully connected layer to obtain the target feature vector; the target feature vector includes multiple second sub-vectors.
[0186] In one embodiment, the processor, when executing a computer program, also performs the following steps:
[0187] The target feature vector is input into the normalized exponential function layer. The normalized exponential function layer performs exponentialization and normalization processing on each second sub-vector in the target feature vector to determine the probability distribution of each second sub-vector.
[0188] For each second subvector, the independent character is determined based on the character corresponding to the maximum probability value in the probability distribution.
[0189] In one embodiment, the feature extraction layer includes a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, and a third pooling layer. The first convolutional layer, the second convolutional layer, and the third convolutional layer are used for feature extraction. The first pooling layer, the second pooling layer, and the third pooling layer are used to obtain key features of the CAPTCHA image.
[0190] In one embodiment, the processor, when executing a computer program, also performs the following steps:
[0191] Perform preprocessing operations on the CAPTCHA image to determine the processed CAPTCHA image.
[0192] In one embodiment, the processor, when executing a computer program, also performs the following steps:
[0193] The verification code image is converted to grayscale to determine the grayscale image;
[0194] Binarize the grayscale image to determine the corresponding processed CAPTCHA image.
[0195] In one embodiment, the processor, when executing a computer program, also performs the following steps:
[0196] The grayscale image is obtained by calculating the grayscale value of each pixel in the CAPTCHA image using a weighted average method.
[0197] In one embodiment, the processor, when executing a computer program, also performs the following steps:
[0198] The optimal threshold for a grayscale image is calculated using a global thresholding method, and the corresponding processed CAPTCHA image is determined using the optimal threshold.
[0199] In one embodiment, the processor, when executing a computer program, also performs the following steps:
[0200] Acquire multiple pieces of training data; each piece of training data includes a sample image and the sample characters corresponding to the sample image.
[0201] Each piece of training data is input into the initial neural network to determine the predicted characters corresponding to that piece of training data.
[0202] Based on the predicted characters and the corresponding sample characters of each piece of training data, the initial neural network is trained to obtain the convolutional neural network.
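The application does not say how predicted and sample characters are compared during training. A common choice, assumed here purely for illustration, is the cross-entropy between each position's predicted distribution and the index of the correct character, averaged over positions. The names `log_sum_exp` and `captcha_loss` and all numbers are hypothetical.

```python
import math

def log_sum_exp(scores):
    """Numerically stable log(sum(exp(score))) over one sub-vector."""
    peak = max(scores)
    return peak + math.log(sum(math.exp(s - peak) for s in scores))

def captcha_loss(sub_vectors, target_indices):
    """Average cross-entropy over the character positions of one sample."""
    losses = [
        log_sum_exp(scores) - scores[target]
        for scores, target in zip(sub_vectors, target_indices)
    ]
    return sum(losses) / len(losses)

# Two character positions, three classes each (made-up scores).
targets = [0, 2]
confident = [[5.0, 0.0, 0.0], [0.0, 0.0, 5.0]]  # high score on the right class
mistaken = [[0.0, 5.0, 0.0], [5.0, 0.0, 0.0]]   # high score on a wrong class
print(captcha_loss(confident, targets) < captcha_loss(mistaken, targets))  # prints True
```

Training would then adjust the network weights to reduce this value over all training data, for example by gradient descent.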
[0203] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, the computer program performing the following steps when executed by a processor:
[0204] Obtain the CAPTCHA image; the CAPTCHA image includes connected characters.
[0205] The CAPTCHA image is recognized using a pre-trained convolutional neural network to obtain multiple independent characters;
[0206] The convolutional neural network includes a fully connected layer and a normalized exponential function layer; the fully connected layer is used to convert the image features of the CAPTCHA image into a target feature vector; the normalized exponential function layer is used to convert the target feature vector into the independent characters.
[0207] In one embodiment, the convolutional neural network further includes a feature extraction layer, and the computer program, when executed by a processor, also performs the following steps:
[0208] The CAPTCHA image is input into the feature extraction layer to obtain image features;
[0209] Image features are input into a fully connected layer to obtain the target feature vector;
[0210] The target feature vector is input into the normalized exponential function layer to obtain multiple independent characters.
[0211] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0212] The image features are input into the fully connected layer and flattened to obtain a candidate feature vector; the candidate feature vector includes multiple first sub-vectors.
[0213] The first sub-vectors in the candidate feature vector are processed by the corresponding neurons of the fully connected layer to obtain the target feature vector; the target feature vector includes multiple second sub-vectors.
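The flattening and weighted-combination steps can be sketched as follows. For brevity the example treats the whole candidate feature vector as one input and omits any activation function; the function names, the weights, and the 1x2x2 feature map are all assumptions made for illustration.

```python
def flatten(feature_maps):
    """Flatten channels x rows x columns into one candidate feature vector."""
    return [value for channel in feature_maps for row in channel for value in row]

def fully_connected(vector, weights, biases):
    """Each neuron (one row of weights) is connected to every input element."""
    return [
        sum(w * x for w, x in zip(row, vector)) + bias
        for row, bias in zip(weights, biases)
    ]

# One channel with a 2x2 feature map (made-up values).
feature_maps = [[[1.0, 2.0], [3.0, 4.0]]]
candidate = flatten(feature_maps)

# Two neurons, each with four weights and one bias (made-up values).
weights = [[1.0, 0.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
biases = [0.0, 1.0]

print(candidate)                                     # prints [1.0, 2.0, 3.0, 4.0]
print(fully_connected(candidate, weights, biases))   # prints [1.0, 6.0]
```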
[0214] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0215] The target feature vector is input into the normalized exponential function layer, which exponentiates and normalizes each second sub-vector in the target feature vector to determine the probability distribution of each second sub-vector.
[0216] For each second sub-vector, the independent character is determined as the character corresponding to the maximum probability value in its probability distribution.
[0217] In one embodiment, the feature extraction layer includes a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, and a third pooling layer. The first convolutional layer, the second convolutional layer, and the third convolutional layer are used for feature extraction. The first pooling layer, the second pooling layer, and the third pooling layer are used to obtain key features of the CAPTCHA image.
[0218] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0219] Perform preprocessing operations on the CAPTCHA image to determine the processed CAPTCHA image.
[0220] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0221] The CAPTCHA image is converted to grayscale to obtain a grayscale image;
[0222] The grayscale image is binarized to obtain the corresponding processed CAPTCHA image.
[0223] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0224] The grayscale image is obtained by calculating the grayscale value of each pixel in the CAPTCHA image using a weighted average method.
[0225] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0226] The optimal threshold for a grayscale image is calculated using a global thresholding method, and the corresponding processed CAPTCHA image is determined using the optimal threshold.
[0227] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0228] Acquire multiple pieces of training data; each piece of training data includes a sample image and the sample characters corresponding to the sample image.
[0229] Each piece of training data is input into the initial neural network to determine the predicted characters corresponding to that piece of training data.
[0230] Based on the predicted characters and the corresponding sample characters of each piece of training data, the initial neural network is trained to obtain the convolutional neural network.
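Before sample characters can be compared with predictions, they are typically mapped to class indices. The sketch below assumes an alphabet of digits, uppercase letters and lowercase letters (62 classes), in line with the character types mentioned in the background section; the ordering of the alphabet and the helper names are assumptions.

```python
import string

# Assumed alphabet: digits, then uppercase, then lowercase (62 classes).
CHARSET = string.digits + string.ascii_uppercase + string.ascii_lowercase

def encode_label(text, charset=CHARSET):
    """Map each sample character to its class index."""
    return [charset.index(ch) for ch in text]

def decode_label(indices, charset=CHARSET):
    """Map class indices back to characters."""
    return "".join(charset[i] for i in indices)

label = encode_label("aB3")
print(len(CHARSET), label, decode_label(label))  # prints 62 [36, 11, 3] aB3
```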
[0231] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, performs the following steps:
[0232] Obtain the CAPTCHA image; the CAPTCHA image includes connected characters.
[0233] The CAPTCHA image is recognized using a pre-trained convolutional neural network to obtain multiple independent characters;
[0234] The convolutional neural network includes a fully connected layer and a normalized exponential function layer; the fully connected layer is used to convert the image features of the CAPTCHA image into a target feature vector; the normalized exponential function layer is used to convert the target feature vector into the independent characters.
[0235] In one embodiment, the convolutional neural network further includes a feature extraction layer, and the computer program, when executed by a processor, also performs the following steps:
[0236] The CAPTCHA image is input into the feature extraction layer to obtain image features;
[0237] Image features are input into a fully connected layer to obtain the target feature vector;
[0238] The target feature vector is input into the normalized exponential function layer to obtain multiple independent characters.
[0239] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0240] The image features are input into the fully connected layer and flattened to obtain a candidate feature vector; the candidate feature vector includes multiple first sub-vectors.
[0241] The first sub-vectors in the candidate feature vector are processed by the neurons of the fully connected layer to obtain the target feature vector; the target feature vector includes multiple second sub-vectors.
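One plausible layout for the second sub-vectors, assumed here for illustration only, is a target feature vector of length (number of character positions) x (number of classes), stored as contiguous equal-length blocks with one block per position. The helper `split_into_sub_vectors` is hypothetical.

```python
def split_into_sub_vectors(target_vector, num_positions):
    """Split a flat target feature vector into one sub-vector per character position."""
    if len(target_vector) % num_positions != 0:
        raise ValueError("vector length must be a multiple of the number of positions")
    size = len(target_vector) // num_positions
    return [target_vector[i * size:(i + 1) * size] for i in range(num_positions)]

# Two positions with four classes each (made-up values).
target_vector = [0, 1, 2, 3, 4, 5, 6, 7]
print(split_into_sub_vectors(target_vector, 2))  # prints [[0, 1, 2, 3], [4, 5, 6, 7]]
```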
[0242] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0243] The target feature vector is input into the normalized exponential function layer, which exponentiates and normalizes each second sub-vector in the target feature vector to determine the probability distribution of each second sub-vector.
[0244] For each second sub-vector, the independent character is determined as the character corresponding to the maximum probability value in its probability distribution.
[0245] In one embodiment, the feature extraction layer includes a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, and a third pooling layer. The first convolutional layer, the second convolutional layer, and the third convolutional layer are used for feature extraction. The first pooling layer, the second pooling layer, and the third pooling layer are used to obtain key features of the CAPTCHA image.
[0246] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0247] Perform preprocessing operations on the CAPTCHA image to determine the processed CAPTCHA image.
[0248] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0249] The CAPTCHA image is converted to grayscale to obtain a grayscale image;
[0250] The grayscale image is binarized to obtain the corresponding processed CAPTCHA image.
[0251] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0252] The grayscale image is obtained by calculating the grayscale value of each pixel in the CAPTCHA image using a weighted average method.
[0253] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0254] The optimal threshold for a grayscale image is calculated using a global thresholding method, and the corresponding processed CAPTCHA image is determined using the optimal threshold.
[0255] In one embodiment, when the computer program is executed by a processor, it also performs the following steps:
[0256] Acquire multiple pieces of training data; each piece of training data includes a sample image and the sample characters corresponding to the sample image.
[0257] Each piece of training data is input into the initial neural network to determine the predicted characters corresponding to that piece of training data.
[0258] Based on the predicted characters and the corresponding sample characters of each piece of training data, the initial neural network is trained to obtain the convolutional neural network.
[0259] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases involved in the embodiments provided in this application may include at least one of a relational database and a non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be, without limitation, general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc.
[0260] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.
[0261] The above embodiments are merely illustrative of several implementation methods of this application, and their descriptions are relatively specific and detailed. However, they should not be construed as limiting the scope of this application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.
Claims
1. A method for determining an independent character, characterized in that the method includes: obtaining a CAPTCHA image, the CAPTCHA image including connected characters; and recognizing the CAPTCHA image using a pre-trained convolutional neural network to obtain multiple independent characters; wherein the convolutional neural network includes a fully connected layer and a normalized exponential function layer; the fully connected layer is used to convert the image features of the CAPTCHA image into a target feature vector; the normalized exponential function layer is used to convert the target feature vector into the independent characters; the convolutional neural network further includes a feature extraction layer, and the step of recognizing the CAPTCHA image using the pre-trained convolutional neural network to obtain multiple independent characters includes: inputting the CAPTCHA image into the feature extraction layer to obtain the image features; inputting the image features into the fully connected layer to flatten the image features, obtaining a candidate feature vector, the candidate feature vector including multiple first sub-vectors; and inputting the first sub-vector into the neurons of the fully connected layer for processing, wherein each neuron is connected to each element in the first sub-vector, and the first sub-vector is transformed through weighted combination and transformation to obtain the target feature vector, the target feature vector including multiple second sub-vectors and being used to obtain the multiple independent characters;
the method further includes: inputting the target feature vector into the normalized exponential function layer, and performing exponentiation and normalization on each of the second sub-vectors in the target feature vector through the normalized exponential function layer to determine the probability distribution of each second sub-vector; and for each second sub-vector, determining the independent character based on the character corresponding to the maximum probability value in the probability distribution, wherein each position in the probability distribution corresponds to the probability of one character, and the independent character for each second sub-vector is determined by finding the position with the highest probability value.
2. The method according to claim 1, characterized in that the step of using a pre-trained convolutional neural network to recognize the CAPTCHA image to obtain multiple independent characters includes: inputting the target feature vector into the normalized exponential function layer to obtain the multiple independent characters.
3. The method according to claim 2, characterized in that the feature extraction layer includes a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, and a third pooling layer; the first convolutional layer, the second convolutional layer, and the third convolutional layer are used for feature extraction; the first pooling layer, the second pooling layer, and the third pooling layer are used to obtain the key features of the CAPTCHA image.
4. The method according to claim 1, characterized in that, before using a pre-trained convolutional neural network to recognize the CAPTCHA image to obtain multiple independent characters, the method further includes: preprocessing the CAPTCHA image to determine a processed CAPTCHA image; accordingly, the pre-trained convolutional neural network is used to recognize the processed CAPTCHA image to obtain the multiple independent characters.
5. The method according to claim 4, characterized in that the step of preprocessing the CAPTCHA image to determine the processed CAPTCHA image includes: converting the CAPTCHA image to grayscale to determine a grayscale image; and binarizing the grayscale image to determine the processed CAPTCHA image corresponding to the grayscale image.
6. The method according to claim 5, characterized in that the step of preprocessing the CAPTCHA image to determine the processed CAPTCHA image includes: calculating the grayscale value of each pixel in the CAPTCHA image using a weighted average method to obtain the grayscale image.
7. The method according to claim 5, characterized in that the step of binarizing the grayscale image to determine the processed CAPTCHA image corresponding to the grayscale image includes: calculating the optimal threshold for the grayscale image using a global thresholding method, and determining the processed CAPTCHA image corresponding to the grayscale image using the optimal threshold.
8. The method according to claim 1, characterized in that the training method for the convolutional neural network includes: acquiring multiple pieces of training data, each piece of training data including a sample image and a sample character corresponding to the sample image; inputting each piece of training data into an initial neural network to determine the predicted character corresponding to each piece of training data; and training the initial neural network based on the predicted character and the corresponding sample character of each piece of training data to determine the convolutional neural network.
9. An independent character determination device, characterized in that the device includes: an acquisition module, used to acquire a CAPTCHA image, the CAPTCHA image including connected characters; and a recognition module, used to recognize the CAPTCHA image using a pre-trained convolutional neural network to obtain multiple independent characters; wherein the convolutional neural network includes a fully connected layer and a normalized exponential function layer; the fully connected layer is used to convert the image features of the CAPTCHA image into a target feature vector; the normalized exponential function layer is used to convert the target feature vector into the independent characters; the convolutional neural network further includes a feature extraction layer, and the device is further configured to: input the CAPTCHA image into the feature extraction layer to obtain the image features; input the image features into the fully connected layer to flatten the image features, obtaining a candidate feature vector, the candidate feature vector including multiple first sub-vectors; and input the first sub-vector into the neurons of the fully connected layer for processing, wherein each neuron is connected to each element in the first sub-vector, and the first sub-vector is transformed through weighted combination and transformation to obtain the target feature vector, the target feature vector including multiple second sub-vectors and being used to obtain the multiple independent characters;
the device is further configured to: input the target feature vector into the normalized exponential function layer, and perform exponentiation and normalization on each of the second sub-vectors in the target feature vector through the normalized exponential function layer to determine the probability distribution of each of the second sub-vectors; and for each of the second sub-vectors, determine the independent character according to the character corresponding to the maximum probability value in the probability distribution, wherein each position in the probability distribution corresponds to the probability of one character, and the independent character for each second sub-vector is determined by finding the position with the maximum probability value.
10. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 8.
11. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 8.
12. A computer program product, comprising a computer program, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 8.