Error word detection method and homework correction method

The target character is converted into a standard template character by a font standardization model and compared with a correct reference character. Because the model is trained with self-supervised and then supervised learning, the approach addresses the insufficient training data of existing misspelling detection methods and achieves more accurate misspelling detection.

CN115984878B · Active · Publication Date: 2026-06-16 · IFLYTEK CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
IFLYTEK CO LTD
Filing Date
2022-12-26
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing misspelling detection methods suffer from poor training data richness during model training, and schemes based on whole character modeling and radical modeling have limited application scope, resulting in low misspelling detection accuracy.

Method used

A target image is acquired, and the target character in it is converted into a standard template character by a font standardization model trained with self-supervised and then supervised learning. The standard template character is then compared with a correct reference character that meets the similarity condition to determine the misspelling detection result.

Benefits of technology

It improves the accuracy of misspelling detection, simplifies the detection process, enhances operability and transferability, and can accurately detect any target character.

✦ Generated by Eureka AI based on patent content.

Abstract

The application provides a wrong word detection method and a homework correction method, and relates to the technical field of word processing. The wrong word detection method comprises the following steps: obtaining a target image, the target image containing a target word to be detected; determining a standard template word corresponding to the target word based on the target image, the word content of the standard template word being the same as that of the target word, and the standard template word being in a first font style; determining a first reference word meeting a preset similarity condition with the target word based on the target image, the first reference word being a correct word; and determining a wrong word detection result corresponding to the target word based on the first reference word and the standard template word. Through the scheme in the application, the process of wrong word detection can be simplified, and the accuracy of wrong word detection is ensured.

Description

Technical Field

[0001] This application relates to the field of word processing technology, specifically to a method for detecting misspelled words and a method for correcting homework.

Background Technology

[0002] Literacy is the foundation of reading and writing and an important task in Chinese language teaching. However, students inevitably make spelling mistakes during the process of learning and writing characters. Related misspelling detection methods typically utilize classification models based on whole-character modeling or decoding models based on radical modeling to detect target characters. However, the application scope of these two approaches is limited, and there is also the problem of poor richness in the training data collected when training the corresponding models.

Summary of the Invention

[0003] To address the aforementioned technical problems, this application is proposed. Embodiments of this application provide a method for detecting misspellings and a method for correcting homework.

[0004] In a first aspect, one embodiment of this application provides a method for detecting misspelled characters. The method includes: acquiring a target image, the target image containing a target character to be detected; determining a standard template character corresponding to the target character based on the target image, the content of the standard template character being the same as the content of the target character, and the standard template character being a first font style; determining a first reference character that meets preset similarity conditions with the target character based on the target image, the first reference character being a correct character; and determining a misspelled character detection result corresponding to the target character based on the first reference character and the standard template character.

[0005] In conjunction with the first aspect, in some implementations of the first aspect, determining the misspelled character detection result corresponding to the target character based on the first reference character and the standard template character includes: if the font style of the first reference character differs from that of the standard template character, performing font style conversion on the first reference character to obtain a second reference character corresponding to the first reference character, the second reference character being in the first font style; and if the character feature data corresponding to the second reference character differs from the character feature data corresponding to the standard template character, determining that the misspelled character detection result corresponding to the target character is that the target character is a misspelled character.

[0006] In conjunction with the first aspect, in some implementations of the first aspect, determining the misspelled character detection result corresponding to the target character based on the first reference character and the standard template character includes: if the first reference character and the standard template character have the same font style, and the character feature data corresponding to the first reference character is the same as the character feature data corresponding to the standard template character, determining that the misspelled character detection result corresponding to the target character is that the target character is a correct character.

[0007] In conjunction with the first aspect, in some implementations of the first aspect, the standard template character corresponding to the target character is determined based on the target image, including: processing the target image using a font standardization model to obtain the standard template character corresponding to the target character.

[0008] In conjunction with the first aspect, in some implementations of the first aspect, the training method of the font standardization model includes: constructing a network model to be trained; performing self-supervised training on the network model to be trained to obtain a pre-trained font standardization model; and performing supervised training on the pre-trained font standardization model to obtain a font standardization model.

[0009] In conjunction with the first aspect, in some implementations of the first aspect, self-supervised training is performed on the network model to be trained to obtain a pre-trained font standardization model, including: acquiring a first training dataset, which includes character sample images; segmenting the character sample images to obtain multiple character image blocks corresponding to the character sample images; performing a masking operation on some of the character image blocks in the multiple character image blocks; and performing self-supervised training on the network model to be trained based on the character image blocks after the masking operation and the character image blocks without the masking operation to obtain a pre-trained font standardization model.

[0010] In conjunction with the first aspect, in some implementations of the first aspect, supervised training is performed on the pre-trained font standardization model to obtain the font standardization model, including: obtaining a second training dataset, which includes a training sample group, which includes handwritten character sample images and standard template characters corresponding to the handwritten character sample images; and supervised training is performed on the pre-trained font standardization model based on the second training dataset to obtain the font standardization model.

[0011] In a second aspect, one embodiment of this application provides a homework correction method, which includes: acquiring a homework image to be corrected, the homework image including handwritten characters to be corrected; and using the misspelling detection method described in the first aspect to determine the misspelling detection result corresponding to the handwritten characters to be corrected.

[0012] In a third aspect, one embodiment of this application provides a misspelling detection device, which includes: an acquisition module for acquiring a target image, the target image containing a target character to be detected; a first determination module for determining a standard template character corresponding to the target character based on the target image, the content of the standard template character being the same as the content of the target character, and the standard template character being in a first font style; a second determination module for determining a first reference character that meets preset similarity conditions with the target character based on the target image, the first reference character being a correct character; and a third determination module for determining the misspelling detection result corresponding to the target character based on the first reference character and the standard template character.

[0013] In a fourth aspect, one embodiment of this application provides a homework correction device, which includes: an acquisition module for acquiring a homework image to be corrected, the homework image including handwritten characters to be corrected; and a determination module for determining the misspelling detection result corresponding to the handwritten characters to be corrected using the misspelling detection method described in the first aspect.

[0014] In a fifth aspect, one embodiment of this application provides a computer-readable storage medium storing a computer program for performing the methods described in the first and second aspects.

[0015] In a sixth aspect, one embodiment of this application provides an electronic device, the electronic device comprising: a processor; a memory for storing processor-executable instructions; the processor being configured to perform the methods described in the first and second aspects.

[0016] The misspelling detection method provided in this application has the following beneficial effects:

[0017] First, this application converts the target character into a corresponding standard template character. This standardizes the font style of the target character before detection, avoiding low accuracy in error detection caused by potential handwriting irregularities or inconsistent font styles. In other words, this application improves error detection accuracy by converting the target character into a corresponding standard template character. Second, this application matches the target character with a first reference character that meets similarity criteria, and this first reference character is correct. Then, by comparing the first reference character with the standard template character, the error detection result of the target character is determined. This solution is simpler, more operable, and more transferable, enabling detection of any target character.

Attached Figure Description

[0018] The above and other objects, features, and advantages of this application will become more apparent from the more detailed description of the embodiments of this application in conjunction with the accompanying drawings. The drawings are provided to further illustrate the embodiments of this application and form part of the specification. They are used together with the embodiments of this application to explain this application and do not constitute a limitation thereof. In the drawings, the same reference numerals generally represent the same components or steps.

[0019] Figure 1 is a schematic diagram of miswritten characters provided by an exemplary embodiment of this application.

[0020] Figure 2 is a schematic diagram of a scenario applicable to an embodiment of this application.

[0021] Figure 3 is a schematic flowchart of the misspelling detection method provided by an exemplary embodiment of this application.

[0022] Figure 4 is a schematic flowchart of determining the misspelling detection result provided by an exemplary embodiment of this application.

[0023] Figure 5 is a schematic flowchart of determining the standard template character provided by an exemplary embodiment of this application.

[0024] Figure 6 is a schematic diagram of the pre-training process provided by an exemplary embodiment of this application.

[0025] Figure 7 is a schematic diagram of the supervised training process provided by an exemplary embodiment of this application.

[0026] Figure 8 is a schematic flowchart of the homework correction method provided by an exemplary embodiment of this application.

[0027] Figure 9 is a schematic structural diagram of the misspelling detection device provided by an exemplary embodiment of this application.

[0028] Figure 10 is a schematic structural diagram of the homework correction device provided by an exemplary embodiment of this application.

[0029] Figure 11 is a schematic structural diagram of the electronic device provided by an embodiment of this application.

Detailed Implementation Manners

[0030] Next, the technical solutions in the embodiments of the present application will be clearly and completely described in conjunction with the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are only a part of the embodiments of the present application, rather than all the embodiments. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present application.

[0031] Application Overview

[0032] Literacy is the foundation of reading and writing and an important task in Chinese teaching. While learning to read and write, students inevitably write wrong characters, and detecting them is of great significance for intelligent homework correction. Wrong characters usually fall into two categories. The first category is miswritten characters, which are characters that do not exist in the dictionary. Figure 1 is a schematic diagram of miswritten characters provided by an exemplary embodiment of the present application. As shown in Figure 1, the characters labeled a, b, c, and d are all correct characters. a' is the miswritten form of the character "low" labeled a; b' is the miswritten form of the character "cold" labeled b; c' is the miswritten form of the character "recall" labeled c; d' is the miswritten form of the character "grab" labeled d. The second category is substituted characters (别字), where one character is written as another character and both characters exist in the dictionary. The present application focuses on detecting miswritten characters formed by miswriting a radical of a Chinese character.

[0033] Relevant misspelling detection schemes include classification schemes based on whole-character modeling and decoding schemes based on radical modeling. Classification schemes based on whole-character modeling treat each type of misspelling as a category and add it to the dictionary of correct characters, then train a classification model to directly classify each Chinese character into its category. This method is suitable for situations where the types of misspellings are known and the number of categories is small, but an unavoidable drawback is that each misspelling requires a sufficient number of training samples. In other words, the whole-character modeling scheme treats each type of misspelling as a category, resulting in a large number of categories that impacts model efficiency. Furthermore, it requires collecting a large amount of training data to train the classification model, and retraining the model is necessary when adding new misspelling categories, leading to high costs.

[0034] The decoding scheme based on radical modeling is a sequence-to-sequence recognition scheme consisting of two parts: an encoder and a decoder. The encoder extracts features from an image containing a text line, and the decoder decodes these features into a character sequence. Because different Chinese characters share the same set of radicals, this scheme requires less training data than whole-character modeling schemes. However, new types of misspelled characters still require defining new radicals and collecting corresponding training data.

[0035] In view of this, this application proposes a novel solution, transforming the problem of identifying misspelled characters into a problem of comparing standard template characters. The specific process is as follows: First, a target image is acquired, containing the target character to be detected. Next, the target character is converted into a standard template character. Then, a first reference character that meets the similarity criteria and is correct is determined. The similarity between the first reference character and the standard template character is then compared to determine the misspelled character detection result corresponding to the target character. Before detecting the target character, this application standardizes the font style of the target character, avoiding the problem of low misspelled character detection accuracy caused by possible non-standard writing or inconsistent fonts. In other words, this application improves the accuracy of misspelled character detection by converting the target character into a corresponding standard template character. Furthermore, this solution is simpler, more operable, and more transferable, enabling the detection of any target character.

[0036] Exemplary scenario

[0037] The misspelling detection method proposed in this application can be executed by an electronic device, which can be a terminal, such as a smartphone, tablet computer, desktop computer, etc.; or the electronic device can also be a server, such as an independent physical server, a server cluster composed of multiple servers, or a cloud server capable of cloud computing.

[0038] Figure 2 is a schematic diagram of a scenario applicable to an embodiment of the present application. The scenario includes an image acquisition device 21 and a server 22, which are communicatively connected.

[0039] Exemplarily, the image acquisition device 21 can be a mobile phone, a tablet computer, a camera, or other devices with image acquisition functions. Specifically, the image acquisition device 21 captures the student's answered homework to obtain a target image containing the target character, and sends the target image to the server 22. The server 22 determines the standard template character corresponding to the target character and the first reference character corresponding to the target character, compares the similarity between the first reference character and the standard template character, determines the misspelling detection result corresponding to the target character according to the similarity, and sends the misspelling detection result to the terminal.

[0040] In another applicable scenario, if the complexity of the target character to be detected is low or the number of target characters to be detected is small, the image acquisition device 21 can directly send the target image to the terminal for misspelling detection of the target character. Alternatively, the image acquisition device 21 sends the target characters with a small number and low complexity to the terminal for misspelling detection according to the complexity and number of the target characters to be detected, and sends the target characters with a large number and high complexity to the server 22 for misspelling detection.
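The split between terminal and server described above can be sketched as a simple routing rule. This is only an illustration under stated assumptions: the application gives no thresholds or complexity measure, so `choose_executor`, `max_count`, `max_complexity`, and the integer `complexity` field are all hypothetical.

```python
from typing import List, NamedTuple


class Target(NamedTuple):
    char_id: str
    complexity: int  # hypothetical measure, e.g. an estimated stroke count


def choose_executor(
    targets: List[Target], max_count: int = 20, max_complexity: int = 10
) -> str:
    """Return "terminal" for small, simple batches and "server" otherwise.

    Both thresholds are illustrative defaults, not values from the application.
    """
    few = len(targets) <= max_count
    simple = all(t.complexity <= max_complexity for t in targets)
    return "terminal" if few and simple else "server"


small_batch = [Target("t1", 5), Target("t2", 8)]
large_batch = [Target(f"t{i}", 15) for i in range(50)]
small_route = choose_executor(small_batch)
large_route = choose_executor(large_batch)
```

Here a batch goes to the terminal only when it is both small and simple; other policies (for example, routing each character individually) would fit the description equally well.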

[0041] Exemplary methods

[0042] Figure 3 is a schematic flowchart of a misspelling detection method provided by an exemplary embodiment of the present application. As shown in Figure 3, the misspelling detection method provided by the embodiment of the present application includes the following steps.

[0043] Step S310: Obtain a target image.

[0044] The target image contains the target character to be detected. The number of target characters can be one or multiple. The embodiment of the present application does not limit the number of target characters. In addition, the target character can be a handwritten character or a character in other font styles. For example, the target character is a character in the Chinese regular script style. The embodiment of the present application does not limit the specific font style of the target character.

[0045] Step S320: Based on the target image, determine the standard template character corresponding to the target character.

[0046] The character content of the standard template character is the same as that of the target character, and the standard template character is in the first font style. That is, the target character is converted into the standard template character, only the font style of the target character is changed, and the content of the target character is not changed. Exemplarily, the target character is a handwritten character and the standard template character is a regular script character, that is, the handwritten character is converted into a regular script character.

[0047] Step S330: Based on the target image, determine the first reference character that meets the preset similarity conditions with the target character.

[0048] The first reference character is the correct character. For example, the similarity condition refers to the highest similarity, that is, determining the first reference character with the highest similarity to the target character. For example, the target image is input into a trained classification model to obtain the correct character, i.e., the first reference character, which has the highest similarity to the target character to be detected in the target image.
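The "highest similarity" condition in step S330 amounts to a top-1 selection over candidate correct characters. The following is a minimal sketch, assuming a classifier has already produced one similarity score per dictionary character; the function name `pick_first_reference` and the example scores are hypothetical.

```python
from typing import Mapping


def pick_first_reference(scores: Mapping[str, float]) -> str:
    """Return the correct character whose similarity score is highest.

    `scores` maps each correct (dictionary) character to its similarity with
    the target character; producing these scores is outside this sketch.
    """
    if not scores:
        raise ValueError("scores must contain at least one candidate")
    # Ranking the keys by their score implements the top-1 condition.
    return max(scores, key=scores.__getitem__)


# Made-up scores for a handwritten character that resembles "low".
example_scores = {"low": 0.91, "cold": 0.06, "recall": 0.03}
first_reference = pick_first_reference(example_scores)
```

Note that the selected candidate is always a dictionary character, so the first reference character is correct even when the target character is miswritten.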

[0049] Step S340: Based on the first reference character and the standard template character, determine the error detection result corresponding to the target character.

[0050] For example, a similarity comparison model is used to extract feature data of the first reference character and the standard template character. Based on the feature data of the first reference character and the feature data of the standard template character, the error detection result corresponding to the target character is determined.

[0051] In this embodiment, the target character is converted into a corresponding standard template character. That is, before detection, the font style of the target character is standardized, avoiding the problem of low error detection accuracy caused by possible non-standard writing or inconsistent fonts. In other words, this application improves the accuracy of error detection by converting the target character into a corresponding standard template character. Secondly, this application matches the target character with a first reference character that meets similarity conditions, and the first reference character is a correct character (e.g., a character existing in a Chinese character dictionary). Then, by comparing the first reference character and the standard template character, the error detection result of the target character is determined. This solution is simpler, more operable, and more transferable, and can detect any target character.
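Steps S310 to S340 can be read as a small pipeline of three pluggable components. The sketch below is an illustration only: the three callables stand in for the font standardization model, the reference classifier, and the similarity comparison model, and all names (`detect_misspelling`, `DetectionResult`, the toy stand-ins) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DetectionResult:
    first_reference: str  # correct character most similar to the target
    is_misspelled: bool   # True when the target is judged to be miswritten


def detect_misspelling(
    target_image: object,
    standardize: Callable[[object], object],
    find_reference: Callable[[object], str],
    same_character: Callable[[str, object], bool],
) -> DetectionResult:
    # S320: convert the target character into its standard template character.
    template = standardize(target_image)
    # S330: find the correct character that best matches the target.
    reference = find_reference(target_image)
    # S340: compare the reference with the template to reach a verdict.
    return DetectionResult(reference, not same_character(reference, template))


# Toy stand-ins: an "image" is a dict, a template is the stroke tuple written.
CORRECT_STROKES = {"low": ("a", "b", "c")}


def toy_standardize(image):
    return image["strokes"]


def toy_find_reference(image):
    return image["looks_like"]


def toy_same_character(reference, template):
    return CORRECT_STROKES[reference] == template


miswritten = {"looks_like": "low", "strokes": ("a", "b")}
well_written = {"looks_like": "low", "strokes": ("a", "b", "c")}
bad_result = detect_misspelling(
    miswritten, toy_standardize, toy_find_reference, toy_same_character
)
good_result = detect_misspelling(
    well_written, toy_standardize, toy_find_reference, toy_same_character
)
```

Because the verdict comes from comparing two glyphs rather than from a fixed list of known misspellings, new kinds of miswritten characters need no new categories.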

[0052] Figure 4 is a schematic flowchart of determining the misspelling detection result provided by an exemplary embodiment of this application. The embodiment shown in Figure 4 extends the embodiment shown in Figure 3. The following focuses on the differences between the embodiments shown in Figure 4 and Figure 3; the similarities are not repeated here.

[0053] As shown in Figure 4, in the embodiment of this application, determining the misspelling detection result corresponding to the target character based on the first reference character and the standard template character includes the following steps.

[0054] Step S410: Determine whether the font style of the first reference character is the same as that of the standard template character.

[0055] For example, if the standard template character is in KaiTi and the first reference character is in SongTi, then the font styles of the first reference character and the standard template character are different.

[0056] Exemplarily, in the actual application process, if the judgment result of step S410 is that the font style of the first reference character is the same as that of the standard template character, then step S450 is executed; if the judgment result of step S410 is that the font style of the first reference character is different from that of the standard template character, then step S420 is executed.

[0057] Step S420: Perform font style conversion on the first reference character to obtain a second reference character corresponding to the first reference character. The second reference character is in the first font style.

[0058] Exemplarily, use a style conversion model to perform font style conversion on the first reference character to obtain a second reference character corresponding to the first reference character. Alternatively, use a font style conversion algorithm to perform font style conversion on the first reference character. The embodiments of the present application do not limit the specific font style conversion method.

[0059] Continuing with the foregoing example, if the first font style is regular script, then the first reference character is also correspondingly converted into regular script to obtain the second reference character. Compared with the first reference character, the content of the second reference character remains unchanged, only the font style is changed.

[0060] Based on step S420, execute step S430: If the character feature data corresponding to the second reference character is the same as the character feature data corresponding to the standard template character, then determine that the misspelling detection result corresponding to the target character is that the target character is a correct character.

[0061] In another implementation, if the character feature data corresponding to the second reference character is the same as the character feature data corresponding to the standard template character, it can further be determined whether the target character corresponding to the second reference character is a character in the target task. If so, the misspelling detection result corresponding to the target character is determined to be that the target character is a correct character. Exemplarily, the target task is a dictation task.

[0062] Alternatively, based on step S420, execute step S440: If the character feature data corresponding to the second reference character is different from the character feature data corresponding to the standard template character, then determine that the misspelling detection result corresponding to the target character is that the target character is a misspelling.

[0063] For example, the second reference character and the standard template character are fed into a classification network whose objective is to determine whether the two inputs are the same character. The second reference character and the standard template character are rendered as images, which can be preprocessed into 64×64 grayscale images and then input into four Res Blocks in the classification network to extract features. Each Res Block includes convolutional layers, normalization layers, and activation function layers, and each Res Block is followed by a downsampling layer that downsamples its input. The last of the four Res Blocks is connected to two fully connected layers, and the output is either 0 or 1: 0 indicates that the second reference character and the standard template character are different, and 1 indicates that they are the same. That is, when the classification network outputs 0, the target character is a misspelled character; when it outputs 1, the target character is a correct character.

[0064] Step S450: If the character feature data corresponding to the first reference character is the same as the character feature data corresponding to the standard template character, determine that the misspelling detection result corresponding to the target character is that the target character is a correct character.

[0065] For example, following the method described in steps S430 and S440, the first reference character and the standard template character are input into the classification network. If the classification network outputs 0 based on the feature data corresponding to the first reference character and the feature data corresponding to the standard template character, the target character is determined to be an incorrect character. If the classification network outputs 1, the target character is determined to be a correct character.

[0066] In this embodiment, when the font styles of the first reference character and the standard template character are different, the font style of the first reference character is converted to be consistent with the font style of the standard template character, and then the second reference character and the standard template character are compared to determine whether they are consistent, thereby determining whether the target character is a correct character. Through the solution in this embodiment, the error detection result corresponding to the target character can be determined more accurately.
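The branching in steps S410 to S450 can be sketched as follows. This is a simplified illustration: the `Glyph` type, `judge_target`, and the toy feature table are hypothetical, and plain equality of feature tuples stands in for the classification network that outputs 0 or 1.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Glyph:
    content: str               # which character the glyph depicts
    font_style: str            # e.g. "KaiTi" or "SongTi"
    features: Tuple[int, ...]  # stand-in for extracted character feature data


def judge_target(
    first_reference: Glyph,
    template: Glyph,
    convert_style: Callable[[Glyph, str], Glyph],
) -> str:
    # S410: check whether the two glyphs share a font style.
    if first_reference.font_style != template.font_style:
        # S420: convert the reference into the template's font style.
        reference = convert_style(first_reference, template.font_style)
    else:
        reference = first_reference
    # S430 / S440 / S450: same features mean correct, otherwise misspelled.
    return "correct" if reference.features == template.features else "misspelled"


# Toy feature table keyed by (content, font style); the values are made up.
FEATURES = {("low", "KaiTi"): (1, 0, 1), ("low", "SongTi"): (1, 1, 1)}


def toy_convert(glyph: Glyph, style: str) -> Glyph:
    return Glyph(glyph.content, style, FEATURES[(glyph.content, style)])


reference = Glyph("low", "SongTi", FEATURES[("low", "SongTi")])
good_template = Glyph("low", "KaiTi", (1, 0, 1))
bad_template = Glyph("low", "KaiTi", (1, 0, 0))  # differs from the correct glyph
verdict_good = judge_target(reference, good_template, toy_convert)
verdict_bad = judge_target(reference, bad_template, toy_convert)
```

Converting the reference first matters: without it, the SongTi features `(1, 1, 1)` would differ from the KaiTi template even for a correctly written character.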

[0067] In some embodiments, determining the standard template character corresponding to the target character based on the target image includes: processing the target image using a font standardization model to obtain the standard template character corresponding to the target character.

[0068] The training method of the font standardization model is described below by way of example with reference to Figure 5. As shown in Figure 5, the training method of the font standardization model includes the following steps.

[0069] Step S510: Construct the network model to be trained.

[0070] For example, the network model to be trained uses a Vision Transformer architecture. The encoder consists of 12 Transformer blocks, and the decoder consists of 4 Transformer blocks.

[0071] Step S520: Perform self-supervised training on the network model to be trained to obtain a pre-trained font standardization model.

[0072] Specifically, step S520 includes: acquiring a first training dataset, which includes character sample images; cutting the character sample images into blocks to obtain multiple character image blocks corresponding to the character sample images; performing a masking operation on some of the character image blocks in the multiple character image blocks; and performing self-supervised training on the network model to be trained based on the character image blocks after the masking operation and the character image blocks without the masking operation to obtain a pre-trained font standardization model.

[0073] Figure 6 is a schematic diagram of a pre-training process provided in an exemplary embodiment of this application. As shown in Figure 6, the character sample image "solution" in the first training dataset is scaled to 64×64 and then divided into 256 (16×16) non-overlapping character image blocks, each 4×4 in size. One quarter of these 256 character image blocks are randomly selected and retained; the unselected blocks are masked. Only the retained, unmasked character image blocks are fed into the encoder for feature extraction. The decoder's input consists of the features of the 256 character image blocks arranged in their original order: where a character image block was masked in the input image sample, a shared, learnable mask token is used; otherwise, the feature extracted by the encoder is used. The encoder and decoder are trained jointly with the goal of reconstructing the original input image sample. Furthermore, a large amount of unlabeled data (containing both correct and incorrect characters) can be used during pre-training, which greatly improves the model's generalization ability.
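The block cutting and masking arithmetic described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the fixed random seed, the string placeholders for encoder features, and the `"[MASK]"` placeholder for the learnable mask token are all hypothetical, and no network is trained here.

```python
import random
from typing import Any, List, Sequence

Image = List[List[int]]


def split_into_blocks(image: Image, block: int = 4) -> List[Image]:
    """Cut a square image into non-overlapping block x block tiles, row by row."""
    size = len(image)
    tiles = []
    for top in range(0, size, block):
        for left in range(0, size, block):
            tiles.append([row[left:left + block] for row in image[top:top + block]])
    return tiles


def choose_visible(num_blocks: int, keep_ratio: float = 0.25, seed: int = 0) -> List[int]:
    """Randomly pick the block positions that are retained (not masked)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_blocks), int(num_blocks * keep_ratio)))


def assemble_full_sequence(
    num_blocks: int, visible: Sequence[int], encoded: Sequence[Any], mask_token: Any
) -> List[Any]:
    """Place encoded features at visible positions and the mask token elsewhere."""
    by_position = dict(zip(visible, encoded))
    return [by_position.get(i, mask_token) for i in range(num_blocks)]


image = [[0] * 64 for _ in range(64)]        # a blank 64x64 stand-in image
blocks = split_into_blocks(image)            # 16 x 16 = 256 blocks of 4x4
visible = choose_visible(len(blocks))        # one quarter of the positions
encoded = [f"feature-{i}" for i in visible]  # placeholder for encoder output
sequence = assemble_full_sequence(len(blocks), visible, encoded, "[MASK]")
```

With these sizes, 64 blocks are retained and 192 positions carry the mask token, so the assembled sequence again has 256 entries in the original block order.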

[0074] Step S530: Perform supervised training on the pre-trained font standardization model to obtain the font standardization model.

[0075] Specifically, step S530 includes: obtaining a second training dataset, the second training dataset including a training sample group that includes handwritten character sample images and the standard template characters corresponding to the handwritten character sample images; and performing supervised training on the pre-trained font standardization model based on the second training dataset to obtain the font standardization model.

[0076] Figure 7 is a schematic diagram of a supervised training process provided in an exemplary embodiment of this application. As shown in Figure 7, this process uses only correctly written character sample images, because incorrect character images have no corresponding standard template characters. The pre-trained font standardization model serves as the initial model for this step, and supervised training is performed using handwritten Chinese character images paired with standard template character images. That is, the training objective is to take a handwritten Chinese character image as input and output the corresponding standard template character image (a grayscale image). In addition, no masking operation is applied to the input character sample images during this training step.
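The construction of the paired training data for this step can be sketched as below. The sample tuple layout, the function names, and the tiny 2×2 images are hypothetical. The text does not name a loss function for this step, so the mean squared pixel error used here is purely an assumption for illustration.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]
Sample = Tuple[Image, str, bool]  # (handwritten image, character label, correctly written?)


def build_training_pairs(
    samples: List[Sample], templates: Dict[str, Image]
) -> List[Tuple[Image, Image]]:
    """Keep only correctly written samples and pair each with its template image."""
    pairs = []
    for image, label, is_correct in samples:
        if is_correct and label in templates:
            pairs.append((image, templates[label]))
    return pairs


def pixel_mse(predicted: Image, target: Image) -> float:
    """Mean squared error over all pixels (an assumed loss, not stated in the text)."""
    total = 0.0
    count = 0
    for predicted_row, target_row in zip(predicted, target):
        for p, t in zip(predicted_row, target_row):
            total += (p - t) ** 2
            count += 1
    return total / count


# Toy 2x2 grayscale images standing in for real character images.
templates = {"永": [[0.0, 1.0], [1.0, 0.0]]}
samples = [
    ([[0.0, 0.0], [1.0, 0.0]], "永", True),   # correctly written: kept
    ([[1.0, 1.0], [1.0, 1.0]], "永", False),  # miswritten: no template, skipped
]
pairs = build_training_pairs(samples, templates)
loss = pixel_mse(pairs[0][0], pairs[0][1])
```

The filter reflects the constraint stated above: miswritten samples are dropped from this step because they have no standard template to serve as a target.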

[0077] By combining self-supervised and supervised training of the model in this embodiment, all kinds of misspelled characters can be recognized without expanding the dictionary or adding additional misspelling classes. Furthermore, the radical sequences of misspelled character sample images do not need to be annotated during model training, which significantly reduces data costs.

[0078] Figure 8 is a flowchart of a homework correction method provided in an exemplary embodiment of this application. As shown in Figure 8, the homework correction method provided in this embodiment of the application includes the following steps.

[0079] Step S810: Obtain an image of the homework to be corrected.

[0080] The image of the homework to be corrected includes the handwritten characters to be corrected. Specifically, the image of the homework to be corrected can be an image of a response to a word dictation or an image of a response to a word-combination exercise. This embodiment of the application does not limit the specific type of the image of the homework to be corrected.

[0081] Step S820: Determine the misspelling detection result corresponding to the handwritten characters to be corrected.

[0082] For example, the misspelling detection method described in any of the foregoing embodiments is used to determine the misspelling detection result corresponding to the handwritten characters to be corrected.

[0083] Step S830: Correct the handwritten characters to be corrected based on the misspelling detection result corresponding to the handwritten characters to be corrected.

[0084] The solution in this embodiment of the application can determine the misspelling detection result of the handwritten characters to be corrected more accurately and quickly, and thus accurately obtain the correction result for the handwritten characters to be corrected.
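Steps S820 and S830 can be sketched as a simple loop over the handwritten characters. The sketch assumes the characters have already been cropped from the homework image (segmentation is not described here); the `correct_homework` function, the report layout, the `"tick"`/`"cross"` marks, and the stub detector are hypothetical.

```python
from typing import Any, Callable, Dict, List


def correct_homework(
    character_images: List[Any], detect: Callable[[Any], str]
) -> List[Dict[str, Any]]:
    """Run misspelling detection on each character and attach a correction mark."""
    report = []
    for index, image in enumerate(character_images):
        result = detect(image)  # step S820: "correct" or "misspelled"
        mark = "tick" if result == "correct" else "cross"  # step S830 (assumed marking)
        report.append({"index": index, "result": result, "mark": mark})
    return report


def stub_detector(image: Any) -> str:
    """Stand-in detector: treats any image tagged "bad" as misspelled."""
    return "misspelled" if image == "bad" else "correct"


report = correct_homework(["good", "bad", "good"], stub_detector)
marks = [entry["mark"] for entry in report]
```

Any detector with the same call shape, such as one built on the misspelling detection method of the foregoing embodiments, could be passed in place of the stub.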

[0085] Exemplary device

[0086] The method embodiments of this application have been described in detail above with reference to Figures 3 to 8. The apparatus embodiments of this application are described in detail below with reference to Figures 9 and 10. It should be understood that the descriptions of the method embodiments correspond to the descriptions of the apparatus embodiments; therefore, any parts not described in detail can be found in the foregoing method embodiments.

[0087] Figure 9 is a schematic diagram of the misspelling detection device provided in an exemplary embodiment of this application. As shown in Figure 9, the misspelling detection device 90 provided in this embodiment of the application includes:

[0088] The acquisition module 910 is used to acquire a target image, which contains the target character to be detected.

[0089] The first determining module 920 is used to determine the standard template character corresponding to the target character based on the target image. The character content of the standard template character is the same as the character content of the target character, and the standard template character is in a first font style.

[0090] The second determining module 930 is used to determine, based on the target image, a first reference character that meets a preset similarity condition with the target character, wherein the first reference character is a correct character.

[0091] The third determining module 940 is used to determine the misspelling detection result corresponding to the target character based on the first reference character and the standard template character.
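The data flow between the four modules can be sketched as a small class. This is only one possible arrangement, not the structure of the device itself: the class name, the field names, and the string stand-ins for images are hypothetical, and each module is represented by an injected function.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class MisspellingDetectionDevice:
    """Hypothetical composition of modules 910, 920, 930 and 940."""
    acquire: Callable[[Any], Any]         # acquisition module 910
    to_template: Callable[[Any], Any]     # first determining module 920
    find_reference: Callable[[Any], Any]  # second determining module 930
    decide: Callable[[Any, Any], str]     # third determining module 940

    def run(self, source: Any) -> str:
        target_image = self.acquire(source)
        template = self.to_template(target_image)      # standard template character
        reference = self.find_reference(target_image)  # first reference character
        return self.decide(reference, template)


# Toy stand-ins: strings play the role of images and characters.
device = MisspellingDetectionDevice(
    acquire=lambda source: source.strip(),
    to_template=lambda image: image.upper(),
    find_reference=lambda image: "AB",
    decide=lambda reference, template: "correct" if reference == template else "misspelled",
)

first_result = device.run(" ab ")
second_result = device.run("ax")
```

Both determining modules 920 and 930 take the same target image as input, and module 940 only sees their two outputs, which mirrors the module descriptions above.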

[0092] In one embodiment of this application, the third determining module 940 is further configured to: if the font style of the first reference character is different from that of the standard template character, perform font style conversion on the first reference character to obtain a second reference character corresponding to the first reference character, wherein the second reference character is in the first font style; and if the character feature data corresponding to the second reference character is different from the character feature data corresponding to the standard template character, determine that the misspelling detection result corresponding to the target character is that the target character is a misspelled character.

[0093] In one embodiment of this application, the third determining module 940 is further configured to determine that the target character is a correct character if the first reference character has the same font style as the standard template character and the character feature data corresponding to the first reference character is the same as the character feature data corresponding to the standard template character.

[0094] In one embodiment of this application, the first determining module 920 is further configured to process the target image using a font standardization model to obtain a standard template character corresponding to the target character.

[0095] In one embodiment of this application, the first determining module 920 is further configured to: construct a network model to be trained; perform self-supervised training on the network model to be trained to obtain a pre-trained font standardization model; and perform supervised training on the pre-trained font standardization model to obtain a font standardization model.

[0096] In one embodiment of this application, the first determining module 920 is further configured to: acquire a first training dataset, the first training dataset including character sample images; slice the character sample images to obtain multiple character image blocks corresponding to the character sample images; perform masking operations on some of the character image blocks in the multiple character image blocks; and perform self-supervised training on the network model to be trained based on the character image blocks after the masking operation and the character image blocks without the masking operation to obtain a pre-trained font standardization model.

[0097] In one embodiment of this application, the first determining module 920 is further configured to: obtain a second training dataset, the second training dataset including a training sample group, the training sample group including handwritten character sample images and standard template characters corresponding to the handwritten character sample images; and perform supervised training on the pre-trained font standardization model based on the second training dataset to obtain a font standardization model.

[0098] Figure 10 is a schematic structural diagram of a homework correction device provided in an exemplary embodiment of this application. As shown in Figure 10, the homework correction device 100 provided in this embodiment of the application includes the following modules.

[0099] The acquisition module 1010 is used to acquire the image of the homework to be corrected, which includes the handwritten characters to be corrected.

[0100] The determination module 1020 is used to determine the misspelling detection result corresponding to the handwritten characters to be corrected.

[0101] An electronic device according to embodiments of this application is described below with reference to Figure 11. Figure 11 is a schematic structural diagram of an electronic device provided in an exemplary embodiment of this application.

[0102] As shown in Figure 11, the electronic device 110 includes one or more processors 1101 and a memory 1102.

[0103] The processor 1101 may be a central processing unit (CPU) or another form of processing unit with data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 110 to perform desired functions.

[0104] The memory 1102 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 1101 may execute the program instructions to implement the methods of the various embodiments of this application described above and/or other desired functions. Various contents, such as target images, target characters to be detected, standard template characters, and first reference characters, may also be stored in the computer-readable storage medium.

[0105] In one example, the electronic device 110 may also include an input device 1103 and an output device 1104, which are interconnected via a bus system and/or another form of connection mechanism (not shown).

[0106] The input device 1103 may include, for example, a keyboard, a mouse, etc.

[0107] The output device 1104 can output various information to the outside, including target images, target characters to be detected, standard template characters, and first reference characters. The output device 1104 may include, for example, a display, a speaker, a printer, and a communication network and its connected remote output devices, etc.

[0108] Of course, for the sake of simplicity, Figure 11 shows only some of the components of the electronic device 110 that are relevant to this application; components such as buses and input/output interfaces are omitted. In addition, the electronic device 110 may include any other suitable components depending on the specific application.

[0109] In addition to the methods and apparatus described above, embodiments of this application may also be computer program products, which include computer program instructions that, when executed by a processor, cause the processor to perform the steps of the methods according to the various embodiments of this application described above.

[0110] The computer program product can be written in any combination of one or more programming languages to perform the operations of the embodiments of this application. The programming languages include object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as C or similar languages. The program code can be executed entirely on the user's computing device, partially on the user's computing device, as a standalone software package, partially on the user's computing device and partially on a remote computing device, or entirely on a remote computing device or server.

[0111] Furthermore, embodiments of this application may also be computer-readable storage media storing computer program instructions that, when executed by a processor, cause the processor to perform the steps of the methods described above according to various embodiments of this application.

[0112] The computer-readable storage medium may be any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may, for example, include, but is not limited to, electrical, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any combination thereof. More specific examples of readable storage media (a non-exhaustive list) include: electrical connections having one or more wires, portable disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fibers, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof.

[0113] The basic principles of this application have been described above with reference to specific embodiments. However, it should be noted that the advantages, benefits, and effects mentioned in this application are merely examples and not limitations, and should not be considered as essential features of each embodiment of this application. Furthermore, the specific details disclosed above are for illustrative and facilitative purposes only, and are not limitations. These details do not limit the application to the necessity of employing the aforementioned specific details for implementation.

[0114] The block diagrams of components, apparatuses, devices, and systems involved in this application are merely illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown in the block diagrams. As those skilled in the art will recognize, these components, apparatuses, devices, and systems can be connected, arranged, and configured in any manner. Words such as “comprising,” “including,” and “having” are open-ended terms meaning “including but not limited to,” and are used interchangeably with that phrase. The terms “or” and “and” as used herein mean “and/or” and are used interchangeably with it unless the context clearly indicates otherwise. The term “such as” as used herein means “such as but not limited to” and is used interchangeably with it.

[0115] It should also be noted that in the apparatuses, devices, and methods of this application, the components or steps can be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalent solutions of this application.

[0116] The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use this application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined herein can be applied to other aspects without departing from the scope of this application. Therefore, this application is not intended to be limited to the aspects shown herein, but rather to be accorded the widest scope consistent with the principles and novel features disclosed herein.

[0117] The above description has been given for purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of this application to the forms disclosed herein. Although numerous exemplary aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, alterations, additions, and sub-combinations thereof.

Claims

1. A method for detecting misspelled characters, characterized in that it comprises:
acquiring a target image, wherein the target image contains a target character to be detected;
processing the target image using a font standardization model to convert the target character into a standard template character in a first font style, wherein the character content of the standard template character is the same as the character content of the target character;
determining, based on the target image, a first reference character that meets a preset similarity condition with the target character, the first reference character being a correct character;
determining whether the font styles of the first reference character and the standard template character are the same;
if the font style of the first reference character is different from that of the standard template character, performing font style conversion on the first reference character to obtain a second reference character corresponding to the first reference character, the second reference character being in the first font style, and inputting the second reference character and the standard template character into a classification network to obtain the misspelling detection result corresponding to the target character; or,
if the first reference character has the same font style as the standard template character, inputting the first reference character and the standard template character into the classification network to obtain the misspelling detection result corresponding to the target character.

2. The method according to claim 1, characterized in that inputting the second reference character and the standard template character into the classification network to obtain the misspelling detection result corresponding to the target character comprises:
inputting the second reference character and the standard template character into the classification network so that the classification network compares the character feature data corresponding to the second reference character with the character feature data corresponding to the standard template character;
if the character feature data corresponding to the second reference character is different from the character feature data corresponding to the standard template character, determining that the misspelling detection result corresponding to the target character is that the target character is a misspelled character; or,
if the character feature data corresponding to the second reference character is the same as the character feature data corresponding to the standard template character, determining that the misspelling detection result corresponding to the target character is that the target character is a correct character.

3. The method according to claim 1, characterized in that inputting the first reference character and the standard template character into the classification network to obtain the misspelling detection result corresponding to the target character comprises:
inputting the first reference character and the standard template character into the classification network so that the classification network compares the character feature data corresponding to the first reference character with the character feature data corresponding to the standard template character;
if the character feature data corresponding to the first reference character is the same as the character feature data corresponding to the standard template character, determining that the misspelling detection result corresponding to the target character is that the target character is a correct character; or,
if the character feature data corresponding to the first reference character is different from the character feature data corresponding to the standard template character, determining that the misspelling detection result corresponding to the target character is that the target character is a misspelled character.

4. The method according to claim 1, characterized in that the training method for the font standardization model comprises:
constructing a network model to be trained;
performing self-supervised training on the network model to be trained to obtain a pre-trained font standardization model; and
performing supervised training on the pre-trained font standardization model to obtain the font standardization model.

5. The method according to claim 4, characterized in that performing self-supervised training on the network model to be trained to obtain a pre-trained font standardization model comprises:
obtaining a first training dataset, the first training dataset including character sample images;
cutting the character sample images into blocks to obtain multiple character image blocks corresponding to the character sample images;
performing a masking operation on some of the multiple character image blocks; and
performing self-supervised training on the network model to be trained, based on the character image blocks after the masking operation and the character image blocks without the masking operation, to obtain the pre-trained font standardization model.

6. The method according to claim 4, characterized in that performing supervised training on the pre-trained font standardization model to obtain the font standardization model comprises:
obtaining a second training dataset, the second training dataset including a training sample group, the training sample group including handwritten character sample images and standard template characters corresponding to the handwritten character sample images; and
performing supervised training on the pre-trained font standardization model based on the second training dataset to obtain the font standardization model.

7. A homework correction method, characterized in that it comprises:
obtaining an image of the homework to be corrected, the image including handwritten characters to be corrected; and
determining, using the method for detecting misspelled characters according to any one of claims 1 to 6, the misspelling detection result corresponding to the handwritten characters to be corrected.

8. A computer-readable storage medium, characterized in that the storage medium stores a computer program for performing the method described in any one of claims 1 to 7.

9. An electronic device, characterized in that it comprises:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to perform the method described in any one of claims 1 to 7.