Text recognition method and device, storage medium and electronic equipment

By decomposing the character to be identified into a sequence of radicals and obtaining feature data, and combining it with an encoder-decoder model, the problem of low accuracy in misspelling recognition in existing technologies is solved, and efficient and accurate misspelling recognition is achieved.

CN116246278B · Active · Publication Date: 2026-06-26 · IFLYTEK CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: IFLYTEK CO LTD
Filing Date: 2022-12-16
Publication Date: 2026-06-26

IPC: G06V30/19

AI Tagging

Technology Topics

Text recognition Algorithm

AI Technical Summary

Technical Problem

Existing methods for identifying misspelled words mainly rely on stroke structure, resulting in low accuracy and an inability to accurately identify misspelled words in students' handwritten assignments.

Method used

By breaking down the character to be identified into a sequence of radicals, the feature data of each radical element and the feature data of the writing template are obtained. The features are extracted using an encoder-decoder model, and the misspellings are identified by combining the radical and stroke recognition sequences.

Benefits of technology

It improves the accuracy and speed of misspelling recognition, and can easily and accurately determine the recognition result of the character to be recognized, making it suitable for grading students' handwritten homework.

✦ Generated by Eureka AI based on patent content.

Abstract

The application provides a character recognition method and device, a storage medium and an electronic device, and relates to the technical field of character processing. The character recognition method comprises: disassembling a to-be-recognized character to obtain a component sequence of the to-be-recognized character, the component sequence comprising at least one component element, and the at least one component element being combined to form the to-be-recognized character; if each of the at least one component element corresponds to a writing template, determining feature data of each of the at least one component element; determining feature data of the writing template corresponding to each of the at least one component element; and determining a wrong character recognition result corresponding to the to-be-recognized character based on the feature data of each of the at least one component element and the feature data of the writing template corresponding to each of the at least one component element. Through the scheme in the application, not only wrong character recognition can be performed, but also different character and correct character recognition can be performed, and the recognition accuracy of the to-be-recognized character is effectively improved based on the feature data of the writing template.

Description

Technical Field

[0001] This application relates to the field of text processing technology, specifically to a text recognition method, apparatus, storage medium, and electronic device.

Background Technology

[0002] During their school years, students often have a lot of handwritten assignments. After completing their handwritten assignments, students compare them with the standard answers to determine if the characters they wrote are correct. However, this method can generally only identify misspelled characters, not incorrect ones.

[0003] In related methods for identifying misspelled characters, the method involves obtaining the stroke structure of the characters written by the student to identify misspelled characters. However, the stroke structure is relatively simple, which leads to low accuracy in identifying misspelled characters, and further to low accuracy in judging misspelled characters.

Summary of the Invention

[0004] To address the aforementioned technical problems, this application is proposed. Embodiments of this application provide a character recognition method, apparatus, storage medium, and electronic device.

[0005] In a first aspect, one embodiment of this application provides a character recognition method, comprising: disassembling a character to be recognized to obtain a radical sequence of the character to be recognized, the radical sequence including at least one radical element, the at least one radical element being combined to form the character to be recognized; if each of the at least one radical element has a corresponding writing template, then determining the feature data of each of the at least one radical element; determining the feature data of the writing template corresponding to each of the at least one radical element; and determining the misspelling recognition result corresponding to the character to be recognized based on the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element.

[0006] In conjunction with the first aspect, in some implementations of the first aspect, the misspelling recognition result of the character to be recognized is determined based on the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element, including: if the feature data of each radical element is consistent with the feature data of the writing template corresponding to the radical element, then the stroke recognition sequence corresponding to each of the at least one radical element is obtained; and the misspelling recognition result of the character to be recognized is determined based on the stroke recognition sequence corresponding to each of the at least one radical element.

[0007] In conjunction with the first aspect, in some implementations of the first aspect, determining the misspelling recognition result of the character to be recognized based on the stroke recognition sequence corresponding to each of at least one radical element includes: obtaining the standard stroke sequence corresponding to each of at least one radical element; if the stroke recognition sequence corresponding to each radical element is consistent with the standard stroke sequence corresponding to the radical element, then obtaining M dictated characters, where M is a positive integer; and determining the misspelling recognition result of the character to be recognized based on the M dictated characters.

[0008] In conjunction with the first aspect, in some implementations of the first aspect, based on M dictated characters, the misspelling recognition result corresponding to the character to be recognized is determined, including: if the character to be recognized is the same as one of the M dictated characters, then the character to be recognized is determined to be a correct character recognition result; if the character to be recognized is different from all of the M dictated characters, then the character to be recognized is determined to be an incorrect character recognition result.

[0009] In conjunction with the first aspect, in some implementations of the first aspect, determining the misspelling recognition result of the character to be recognized based on the stroke recognition sequence corresponding to each of the at least one radical element further includes: determining whether there is a radical element whose stroke recognition sequence is inconsistent with the standard stroke sequence of the radical element; if there is a radical element whose stroke recognition sequence is inconsistent with the standard stroke sequence, then the character to be recognized is determined to be a misspelling recognition result.

[0010] In conjunction with the first aspect, in some implementations of the first aspect, determining the misspelling recognition result of the character to be recognized based on the feature data of each of at least one radical element and the feature data of the writing template corresponding to each of at least one radical element further includes: determining whether there are any radical elements whose feature data and the feature data of the writing template are inconsistent; if there are any radical elements whose feature data and the feature data of the writing template are inconsistent, then the character to be recognized is determined to be a misspelling recognition result.

[0011] In conjunction with the first aspect, in some implementations of the first aspect, the character to be identified is decomposed to obtain the radical sequence of the character to be identified, including: using an encoder-decoder model to decompose the character to be identified to obtain the radical sequence of the character to be identified, wherein the encoder-decoder model contains an attention mechanism that can extract features.

[0012] Secondly, one embodiment of this application provides a character recognition device, comprising: a first determining module, configured to disassemble a character to be recognized to obtain a radical sequence of the character to be recognized, the radical sequence including at least one radical element, the at least one radical element being combined to form the character to be recognized; a second determining module, configured to determine the feature data of each of the at least one radical element if each of the at least one radical element corresponds to a writing template; a third determining module, configured to determine the feature data of the writing template corresponding to each of the at least one radical element; and a fourth determining module, configured to determine the misspelling recognition result corresponding to the character to be recognized based on the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element.

[0013] Thirdly, one embodiment of this application provides a computer-readable storage medium storing a computer program for performing the character recognition method described in the first aspect.

[0014] Fourthly, one embodiment of this application provides an electronic device, the electronic device comprising: a processor; a memory for storing processor-executable instructions; the processor being configured to perform the character recognition method described in the first aspect.

[0015] The character recognition method provided in this application has the following beneficial effects.

[0016] First, radical sequences, compared to stroke sequences, more accurately represent the structural features of a character. Therefore, determining the recognition result based on the radical sequence of the character to be identified can improve the recognition accuracy. Furthermore, radical sequences are less complex than stroke sequences, further reducing the computational load when determining the recognition result, thus increasing the recognition speed.

[0017] Secondly, after obtaining the radical sequence and confirming that each radical element in the sequence has a corresponding writing template, this application determines the feature data of the radical elements and the feature data of the writing templates. Based on these two sets of feature data, the misspelling recognition result of the character to be recognized is determined. That is, the writing template serves as the reference standard for recognizing misspelled characters, incorrect characters, and correct characters, and the similarity between the two sets of feature data is compared. This method is highly operable and can simply and accurately determine the various recognition results of the character to be recognized.

Attached Figure Description

[0018] The above and other objects, features, and advantages of this application will become more apparent from the more detailed description of the embodiments of this application in conjunction with the accompanying drawings. The drawings are provided to further illustrate the embodiments of this application and form part of the specification. They are used together with the embodiments of this application to explain this application and do not constitute a limitation thereof. In the drawings, the same reference numerals generally represent the same components or steps.

[0019] Figure 1 is a schematic diagram of the implementation environment of the character recognition method provided in an embodiment of this application.

[0020] Figure 2 is a schematic diagram of an application scenario provided in an exemplary embodiment of this application.

[0021] Figure 3 is a flowchart of a text recognition method provided in an exemplary embodiment of this application.

[0022] Figure 4 is a flowchart of the process of determining the recognition result corresponding to the character to be recognized, provided in another exemplary embodiment of this application.

[0023] Figure 5 is a schematic diagram of the modeling of the radical sequence and the stroke recognition sequence provided in an exemplary embodiment of this application.

[0024] Figure 6 is a schematic structural diagram of a character recognition device provided in an exemplary embodiment of this application.

[0025] Figure 7 is a schematic structural diagram of an electronic device provided in an exemplary embodiment of this application.

Detailed Implementation

[0026] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0027] Exemplary scenario

[0028] Figure 1 is a schematic diagram of the implementation environment of a text recognition method provided in an embodiment of this application. As shown in Figure 1, this implementation environment includes a user terminal 11 and a server 12, which are communicatively connected. The user terminal 11 has an application installed to preprocess the characters to be recognized. The server 12 can be a standalone physical server, a server cluster consisting of multiple servers, or a cloud server capable of cloud computing. Furthermore, the server can be considered a server for a specific business (character recognition or financial transactions). In addition, the server can be a physical machine or a virtual machine, and there can be one or more servers. This application embodiment does not limit the type or number of servers.

[0029] Specifically, the user terminal can first acquire an image containing the character to be recognized and use a classic object detection scheme to detect the character in the image. Then, using a preprocessing application installed on the user terminal 11, the character to be recognized is decomposed to obtain the radical sequence of the character. Further, to improve the processing speed of the character to be recognized, the user terminal 11 sends the radical sequence of the character to be recognized to the server 12. The server 12, based on the received radical sequence of the character to be recognized, searches the relevant database to see if each radical element in the radical sequence has a writing template. If it does, it retrieves the writing template corresponding to the radical element and, through a pre-trained feature comparison model deployed on the server 12, first determines the feature data of the radical element and the feature data of the writing template, then calculates the similarity between the two feature data, and finally determines the recognition result corresponding to the character to be recognized based on the calculation result.

[0030] Figure 2 illustrates an application scenario provided by an exemplary embodiment of this application, specifically a dictation homework correction scenario. Students write the words and phrases dictated by the teacher on their homework paper. After collecting a student's homework paper, the teacher uses user terminal 11 to photograph it. User terminal 11 detects the dictated characters on the homework paper and parses the radical sequence of each character. User terminal 11 then sends the radical sequence of each character to be recognized to server 12, and server 12 uses the method described for the Figure 1 embodiment to calculate the similarity between the feature data of each radical element of the character to be recognized and the feature data of the writing template. Based on the similarity, the recognition result of the character to be recognized is determined. Finally, after all the dictated characters to be recognized on the homework paper have been corrected, the recognition result corresponding to each student's answer is sent to the user terminal 11.

[0031] Exemplary methods

[0032] Figure 3 is a schematic flowchart of a text recognition method provided in an exemplary embodiment of this application. As shown in Figure 3, the text recognition method provided by the embodiment of the present application includes the following steps.

[0033] Step S310: Decompose the character to be recognized to obtain the radical sequence of the character to be recognized.

[0034] The radical sequence includes at least one radical element, and the at least one radical element combines to form the character to be recognized. The radical sequence also includes the position information of each radical element in the character to be recognized; that is, each radical element in the sequence carries its position information as an attribute.

[0035] Specifically, a radical is the basic structural unit of a Chinese character form and is a component of a compound character, where a compound character is a Chinese character composed of two or more single characters. Therefore, the character to be recognized can be decomposed according to the configuration of its form to obtain its radical sequence. For example, the independent parts at the top, bottom, left, and right of a character are used as radical elements. Exemplarily, for the character to be recognized "Jiang", the radical elements are the three dots of water and "Gong", and the radical sequence includes the three dots of water with its position information in "Jiang", and "Gong" with its position information in "Jiang". For the character to be recognized "Ba", the radical elements are the hand radical and "Ba", and the radical sequence includes the hand radical with its position information in the character, and the component "Ba" with its position information in the character.
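A radical sequence of this kind can be pictured as a list of radical elements that each carry a position attribute. The following is a minimal Python sketch, not the decomposition method of this application (which may use an encoder-decoder model). The `RadicalElement` class, the hand-written `DECOMPOSITION_TABLE`, and the "left" / "right" position tags are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RadicalElement:
    name: str      # label of the radical element
    position: str  # hypothetical position tag inside the character


# Hand-written stand-in for a learned decomposition model (assumption).
DECOMPOSITION_TABLE: Dict[str, List[RadicalElement]] = {
    "Jiang": [
        RadicalElement("three dots of water", "left"),
        RadicalElement("Gong", "right"),
    ],
    "Ba": [
        RadicalElement("hand radical", "left"),
        RadicalElement("Ba", "right"),
    ],
}


def radical_sequence(character: str) -> List[RadicalElement]:
    """Return the radical sequence of a character; raises KeyError if unknown."""
    return DECOMPOSITION_TABLE[character]


sequence = radical_sequence("Jiang")
print([(r.name, r.position) for r in sequence])
```

Keeping the position next to the radical label means later steps can compare both the shape and the placement of each element.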

[0036] Step S320: If at least one radical element each corresponds to a writing template, determine the characteristic data of each of the at least one radical element.

[0037] The writing template refers to the standard reference template corresponding to the radical element, and the writing template can be any font. Exemplarily, the writing template is Song typeface, regular script, etc.

[0038] Before determining the characteristic data of each of the at least one radical element, it further includes: obtaining the set of all radical elements included in the Chinese characters in the Chinese character library. If the radical element of the character to be recognized is not within the set of radical elements, it is considered that the recognition result of the character to be recognized is a misspelled character recognition result. If the radical elements of the character to be recognized are all within the set of radical elements, it is considered that each of the at least one radical element has its corresponding writing template, and further, determine the characteristic data of each of the at least one radical element.

[0039] In another implementation manner, if there is a radical element without a corresponding writing template, it is considered that the character to be recognized corresponding to this radical element is a misspelled character recognition result.
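The template check in step S320 amounts to a set-membership test. The sketch below is illustrative only: `missing_templates` and the contents of `KNOWN_RADICALS` are hypothetical, standing in for the set of radical elements collected from a Chinese character library.

```python
from typing import Iterable, List, Set

# Hypothetical set of radical elements gathered from a character library.
KNOWN_RADICALS: Set[str] = {"three dots of water", "Gong", "hand radical", "Ba"}


def missing_templates(radicals: Iterable[str], known: Set[str]) -> List[str]:
    """Return the radical elements that have no corresponding writing template."""
    return [radical for radical in radicals if radical not in known]


def precheck(radicals: Iterable[str], known: Set[str]) -> str:
    """Flag the character as misspelled if any radical lacks a template."""
    if missing_templates(radicals, known):
        return "misspelled"
    return "continue to feature comparison"


print(precheck(["three dots of water", "Gong"], KNOWN_RADICALS))
print(precheck(["Gong", "unknown shape"], KNOWN_RADICALS))
```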

[0040] Step S330: Determine the characteristic data of the writing templates corresponding to each of the at least one radical element.

[0041] For example, feature data of the writing template corresponding to at least one radical element can be determined by a feature extraction model, or feature data of the writing template corresponding to at least one radical element can be determined by a feature extraction algorithm.

[0042] Step S340: Based on the feature data of at least one radical element and the feature data of the writing template corresponding to at least one radical element, determine the misspelling recognition result of the character to be recognized.

[0043] Specifically, the misspelling recognition results include misspelling recognition results, incorrect character recognition results, and correct character recognition results. The similarity between the feature data corresponding to the radical element and the feature data corresponding to the writing template can be compared, and based on the similarity, the recognition result corresponding to the character to be recognized is determined.

[0044] For example, the feature data is a feature vector. The similarity between the radical element and the writing template is determined by calculating the cosine distance between the feature vector corresponding to the radical element and the feature vector corresponding to the writing template.
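As a rough illustration of this comparison, the sketch below computes the cosine similarity of two feature vectors in plain Python. It assumes that the "cosine distance" mentioned above is used as a similarity score (larger means more alike); the function name and the sample vectors are made up for the example.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_u * norm_v)


radical_features = [1.0, 2.0, 3.0]    # hypothetical feature vector of a radical element
template_features = [2.0, 4.0, 6.0]   # hypothetical feature vector of its writing template
print(cosine_similarity(radical_features, template_features))
```

Because cosine similarity ignores vector length, two feature vectors that point in the same direction score close to 1 even when their magnitudes differ.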

[0045] In this embodiment, firstly, radical sequences, compared to stroke sequences, more accurately represent the structural features of a character. Therefore, determining the recognition result based on the radical sequence of the character to be recognized can improve the recognition accuracy. Furthermore, radical sequences have lower complexity than stroke sequences, further reducing the computational load when determining the recognition result, thus improving recognition speed. Secondly, after obtaining the radical sequence, this application, after evaluating that each radical element in the sequence has a corresponding writing template, further determines the feature data of the radical elements and the feature data of the writing template, and determines the misspelling recognition result of the character to be recognized based on the feature data of the radical elements and the feature data of the writing template. That is, using the reference template as the standard for recognizing misspelled characters, incorrect characters, and correct characters, and comparing the similarity of their feature data, this method is highly operable and can simply and accurately determine various recognition results of the character to be recognized.

[0046] Figure 4 is a flowchart of the process of determining the recognition result corresponding to the character to be recognized, provided in another exemplary embodiment of this application. The embodiment shown in Figure 4 extends the embodiment shown in Figure 3. The following description focuses on the differences between the two embodiments; the similarities are not repeated here.

[0047] As shown in Figure 4, in the embodiments of the present application, determining the misspelling recognition result corresponding to the character to be recognized, based on the feature data of each radical element and the feature data of the writing template corresponding to each radical element, includes the following steps.

[0048] Step S410, determine whether the feature data of each radical element is consistent with the feature data of the writing template corresponding to the radical element.

[0049] Specifically, the method in the embodiment shown in Figure 3 is followed to determine the consistency between the feature data of each radical element and the feature data of the writing template.

[0050] Exemplarily, in the actual application process, if the judgment result of step S410 is negative, that is, there is inconsistency between the feature data of the radical element and the feature data of the writing template, then execute step S420; if the judgment result of step S410 is positive, that is, the feature data of each radical element is consistent with the feature data of the writing template corresponding to the radical element, then execute step S430 and step S440.

[0051] Specifically, an equivalent similarity condition is preset. If the similarity value between the feature data of the radical element and the feature data of the writing template meets the equivalent similarity condition, the two are considered consistent. If the similarity value does not meet the equivalent similarity condition, the two are considered inconsistent.

[0052] Exemplarily, the writing template and the radical element can be input into the comparison model. The comparison model extracts the feature data of the writing template and the feature data of the radical element. Further, the feature data of the writing template and the feature data of the radical element are input into the fully connected layer in the comparison model, and a similarity value between 0 and 1 output by the comparison model is obtained.
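The final scoring stage described here can be sketched as a single fully connected unit followed by a sigmoid, which keeps the output between 0 and 1. This is only an illustration under stated assumptions: the function name, the concatenation of the two feature vectors, and the weight and bias values are hypothetical and untrained, and the feature extraction stage of the comparison model is omitted.

```python
import math
from typing import Sequence


def fully_connected_similarity(
    radical_features: Sequence[float],
    template_features: Sequence[float],
    weights: Sequence[float],
    bias: float,
) -> float:
    """Map two feature vectors to a score in (0, 1) with one dense unit."""
    joined = list(radical_features) + list(template_features)
    if len(joined) != len(weights):
        raise ValueError("one weight is needed per input feature")
    logit = sum(w * x for w, x in zip(weights, joined)) + bias
    # Numerically stable sigmoid for both signs of the logit.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    exp_logit = math.exp(logit)
    return exp_logit / (1.0 + exp_logit)


# Hypothetical, untrained parameters chosen only for the example.
score = fully_connected_similarity(
    [0.2, 0.8], [0.3, 0.7], weights=[0.5, -0.25, 0.5, -0.25], bias=0.1
)
print(score)
```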

[0053] Exemplarily, the equivalent similarity condition is that the cosine distance (used here as a similarity measure, where a larger value means greater similarity) between the feature data of the radical element and the feature data of the writing template is greater than 0.7. For example, if the actually calculated value is equal to 0.65, which does not exceed 0.7, then the feature data of the radical element and the feature data of the writing template are considered inconsistent; if the value exceeds 0.7, the two are considered consistent.
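The judgment in step S410 can be sketched as a threshold test applied to every radical element. The function name and the sample scores below are hypothetical; the default threshold of 0.7 is the example value of the equivalent similarity condition stated above.

```python
from typing import Dict


def radicals_consistent(similarities: Dict[str, float], threshold: float = 0.7) -> bool:
    """True only if every radical element's similarity exceeds the threshold."""
    return all(score > threshold for score in similarities.values())


# Hypothetical similarity scores between radical elements and their templates.
well_written = {"three dots of water": 0.92, "Gong": 0.88}
badly_written = {"three dots of water": 0.92, "Gong": 0.40}

print(radicals_consistent(well_written))   # proceed to steps S430 and S440
print(radicals_consistent(badly_written))  # proceed to step S420
```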

[0054] Step S420, determine that the recognition result corresponding to the character to be recognized is a misspelling recognition result.

[0055] Specifically, a misspelled character is one whose written form is itself wrong. For example, if the three horizontal strokes of the Chinese character "丰" are written as four horizontal strokes, the written character does not exist, that is, it is a misspelled character.

[0056] Further, the characteristic data of the radical elements can reflect the writing structure characteristics of the radical elements as a whole in a low-dimensional space or a high-dimensional space. If the characteristic data of the radical elements is inconsistent with the characteristic data of the writing template, it is considered that the recognition result of the to-be-recognized character is a misspelled character recognition result.

[0057] Step S430, obtain the stroke recognition sequence corresponding to each of at least one radical element.

[0058] Exemplarily, the strokes can be horizontal, vertical, left-falling, right-falling, hooked vertical, vertical hook with a horizontal stroke, horizontal fold with a hook, etc. Similarly, the stroke recognition sequence includes strokes and the position data of the strokes in the entire radical element. For example, for the radical element "日", its corresponding stroke sequence includes vertical, horizontal fold, horizontal, horizontal, and the position information of "vertical, horizontal fold, horizontal, horizontal" in "日". Through the position information of each stroke in the radical element "日", the relative position between every two strokes can be determined.
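One way to picture a stroke recognition sequence is as an ordered list of strokes, each with position data, from which the relative position of any two strokes follows by subtraction. In this sketch the `Stroke` class, the (x, y) representation of position data, and the coordinate values for "日" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Stroke:
    name: str
    position: Tuple[float, float]  # hypothetical (x, y) centre, normalised to [0, 1]


def relative_offset(first: Stroke, second: Stroke) -> Tuple[float, float]:
    """Offset of the second stroke relative to the first."""
    return (
        second.position[0] - first.position[0],
        second.position[1] - first.position[1],
    )


# Stroke order for "日" as given in the text; coordinates are invented.
RI_STROKES: List[Stroke] = [
    Stroke("vertical", (0.25, 0.5)),
    Stroke("horizontal fold", (0.5, 0.25)),
    Stroke("horizontal", (0.5, 0.5)),
    Stroke("horizontal", (0.5, 0.75)),
]

print(relative_offset(RI_STROKES[2], RI_STROKES[3]))
```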

[0059] Step S440, determine the recognition result corresponding to the to-be-recognized character based on the stroke recognition sequence corresponding to each of at least one radical element.

[0060] Specifically, first obtain the standard stroke sequence corresponding to each of the at least one radical element. If any of the at least one radical element has a stroke recognition sequence that is inconsistent with its standard stroke sequence, determine that the recognition result corresponding to the to-be-recognized character is a misspelled character recognition result.

[0061] Further, the standard stroke sequence refers to the correct stroke writing template corresponding to the stroke recognition sequence. The inconsistency between the stroke recognition sequence and the standard stroke sequence means that there is at least one stroke in the stroke recognition sequence that is inconsistent with the stroke in the standard stroke sequence, and / or there is at least one position information of a stroke in the stroke recognition sequence that is inconsistent with the position information of the corresponding stroke in the standard stroke sequence.

[0062] Continuing with the example in step S430, for the radical element "日", suppose the position data of the third stroke "horizontal" in the stroke recognition sequence of "日" is a, and the position data of the third stroke "horizontal" in the standard stroke sequence corresponding to "日" is b. If the difference between the position data a and the position data b is greater than the preset threshold, the stroke recognition sequence of the radical element "日" of the to-be-recognized character is considered inconsistent with the standard stroke sequence. It is then determined that the to-be-recognized character containing the radical element "日" is a misspelled character recognition result.
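The stroke-level comparison can be sketched as follows. The sketch treats position data as hypothetical (x, y) pairs and measures the difference between a and b as Euclidean distance; the `max_offset` value of 0.1 stands in for the preset threshold, whose actual value is not given. Treating a different number of strokes as an inconsistency is also an assumption of this sketch.

```python
import math
from typing import List, Tuple

StrokeRecord = Tuple[str, Tuple[float, float]]  # (stroke name, hypothetical (x, y) position)


def strokes_consistent(
    recognized: List[StrokeRecord],
    standard: List[StrokeRecord],
    max_offset: float = 0.1,  # stand-in for the preset threshold (assumption)
) -> bool:
    """Compare stroke types and stroke positions against the standard sequence."""
    if len(recognized) != len(standard):
        return False
    for (name, pos), (std_name, std_pos) in zip(recognized, standard):
        if name != std_name:
            return False  # a stroke differs from the standard stroke
        if math.dist(pos, std_pos) > max_offset:
            return False  # a stroke is displaced beyond the threshold
    return True


standard_ri = [
    ("vertical", (0.25, 0.5)),
    ("horizontal fold", (0.5, 0.25)),
    ("horizontal", (0.5, 0.5)),
    ("horizontal", (0.5, 0.75)),
]
# Third stroke written far too low: position a differs from b by 0.2.
written_ri = list(standard_ri)
written_ri[2] = ("horizontal", (0.5, 0.7))

print(strokes_consistent(standard_ri, standard_ri))
print(strokes_consistent(written_ri, standard_ri))
```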

[0063] In another possible implementation, if in at least one radical element, the stroke recognition sequence corresponding to each radical element is consistent with the standard stroke sequence corresponding to the radical element, obtain M dictation characters, where M is a positive integer.

[0064] Specifically, the stroke recognition sequence corresponding to a radical element being consistent with the standard stroke sequence corresponding to the radical element means that all the strokes present in the stroke recognition sequence are consistent with the strokes in the standard stroke sequence, and moreover, the position information of each stroke in the stroke recognition sequence is consistent with the position information of the corresponding stroke in the standard stroke sequence.

[0065] Further, if it is determined that the stroke recognition sequence corresponding to each radical element is consistent with the standard stroke sequence corresponding to the radical element, then M dictation words are obtained. The M dictation words may be read out either by a person or by the system.

[0066] If the word to be recognized is the same as one of the M dictation words, then it is determined that the word to be recognized is the correct recognition result; if the word to be recognized is different from all of the M dictation words, then it is determined that the word to be recognized is the misspelled word recognition result.

[0067] That is, in the case where the stroke recognition sequence corresponding to each radical element is consistent with the standard stroke sequence corresponding to the radical element, it can be further determined whether the word to be recognized is one of the dictation words. A wrongly used character (别字) is a character that is not itself written incorrectly but is misused in a word or a sentence, for example, "bodyguard" is written as "prostitute guard".

[0068] Exemplarily, the radical sequence and the stroke recognition sequence of the word to be recognized determined previously can be compared with the radical sequence and the stroke recognition sequence of the dictation words to determine whether the word to be recognized is the same as one of the M dictation words. Alternatively, the deep feature data of the word to be recognized and the deep feature data corresponding to each of the M dictation words can be directly extracted, and the deep feature data of the two can be compared to determine whether the word to be recognized is the dictation word.

[0069] Exemplarily, the M dictation words include "decline", "weekend", "angle", "sit waiting for death", etc. If the word to be recognized is "seat", and the stroke recognition sequence corresponding to each radical element in "seat" is consistent with the standard stroke sequence corresponding to that radical element, but "seat" does not appear among the dictation words, then the word to be recognized, "seat", is a misspelled word. On the other hand, if the word to be recognized is "sit", it can be determined that "sit" is the correct word.
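
Exemplarily, the rule in [0066] can be sketched as a simple membership test. The function name `matches_dictation` is hypothetical. The sketch compares whole entries only; matching a single written character against its position inside a multi-character dictation word, as in the example above, would need an additional alignment step that is not shown here.

```python
from typing import Sequence


def matches_dictation(candidate: str, dictation_words: Sequence[str]) -> bool:
    """Return True if the candidate equals one of the M dictation words."""
    if len(dictation_words) < 1:
        raise ValueError("M must be a positive integer")
    return candidate in set(dictation_words)


# Placeholder entries taken from the translated example.
dictation_words = ["decline", "weekend", "angle"]

weekend_ok = matches_dictation("weekend", dictation_words)
seat_ok = matches_dictation("seat", dictation_words)
```

Here `weekend_ok` is true and `seat_ok` is false, so "seat" would not be accepted even though each of its strokes is written correctly.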

[0070] In the embodiments of the present application, a preliminary judgment on the recognition result of the word to be recognized is achieved through the radical sequence. Based on the radical sequence, a deeper judgment on the recognition result of the word to be recognized is achieved through the stroke recognition sequence. Through the dual judgment of the radical sequence and the stroke recognition sequence, on the one hand, the recognition result of the word to be recognized can be accurately and efficiently determined. On the other hand, by comparing the word to be recognized with the M dictation words, misspelled characters, wrongly used characters (别字), and correct characters can all be distinguished in the dictation scenario.
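
Exemplarily, the order of the judgments summarized in [0070] can be sketched as a cascade. This is a sketch under stated assumptions, not the implementation of the application: the per-radical comparison results are taken as precomputed booleans, and the names `Result` and `dual_judgment` are hypothetical. The label `WRONGLY_USED` is a naming choice made here for a character that passes the radical and stroke checks but is not among the dictation words.

```python
from enum import Enum
from typing import Sequence


class Result(Enum):
    MISSPELLED = "misspelled character"
    WRONGLY_USED = "wrongly used character"
    CORRECT = "correct character"


def dual_judgment(radical_ok: Sequence[bool],
                  stroke_ok: Sequence[bool],
                  candidate: str,
                  dictation_words: Sequence[str]) -> Result:
    """Apply the radical check, then the stroke check, then the dictation check."""
    if not radical_ok:
        raise ValueError("the radical sequence needs at least one radical element")
    # Level 1: every radical element must match its writing template.
    if not all(radical_ok):
        return Result.MISSPELLED
    # Level 2: every stroke recognition sequence must match the standard one.
    if not all(stroke_ok):
        return Result.MISSPELLED
    # Dictation check: the candidate must be one of the M dictation words.
    if candidate in dictation_words:
        return Result.CORRECT
    return Result.WRONGLY_USED


words = ["decline", "weekend", "angle"]
r1 = dual_judgment([True, False], [], "angle", words)
r2 = dual_judgment([True, True], [True, False], "angle", words)
r3 = dual_judgment([True, True], [True, True], "angle", words)
r4 = dual_judgment([True, True], [True, True], "seat", words)
```

The stroke-level results are only consulted once every radical element has passed the first level, which mirrors the order of the judgments described above.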

[0071] In an embodiment of the present application, the character to be recognized is disassembled to obtain the radical sequence of the character to be recognized, including: using an encoder-decoder model to disassemble the character to be recognized to obtain the radical sequence of the character to be recognized, and the encoder-decoder model includes an attention mechanism capable of extracting features.

[0072] Specifically, first, an encoder-decoder model including an attention mechanism is trained. This encoder-decoder model can hierarchically model the radical sequence of the character to be recognized. In addition, it can also hierarchically model the stroke sequence included in the radical elements in the radical sequence.
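
Exemplarily, one decoding step of an attention mechanism can be sketched as follows. The application does not specify the form of attention used, so dot-product attention is an assumption here, and the function names, vector sizes, and values are all hypothetical. The sketch shows only how attention weights select encoder features for the radical being decoded; it is not a trained model.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: Sequence[float],
           encoder_states: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Return (weights, context) for one decoding step.

    Each weight is the softmax of the dot product between the decoder
    query and one encoder state; the context vector is the weighted sum
    of the encoder states.
    """
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


# Three made-up encoder feature vectors (e.g. three image regions).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([2.0, 0.5], encoder_states)
```

The region whose features align best with the decoder query receives the largest weight, which is how the decoder can focus on one radical element at a time.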

[0073] Exemplarily, an image containing the character to be recognized, or the character to be recognized itself, is input into the encoder-decoder model. First, the model decodes the radicals of the character to be recognized, which is the first level. On the basis of the first level, when it is determined that the feature data of the radical element and the feature data of the writing template are consistent, the model goes on to decode the stroke recognition sequence of the radical element, which is the second level. In the training stage of the encoder-decoder model, in order to improve the accuracy of the encoder-decoder model, the training dataset includes not only correctly written Chinese characters but also incorrectly written Chinese characters.

[0074] In an embodiment of the present application, using the encoder-decoder model, the radical sequence and the stroke recognition sequence are respectively extracted at the first level and the second level. Using the results extracted at these two levels, the structural features of the character to be recognized are accurately and comprehensively characterized in different dimensions. And through the joint training of correct writing samples and incorrect writing samples, the robustness of the encoder-decoder model is improved.

[0075] Figure 5 shows a schematic diagram of modeling the radical sequence and the stroke recognition sequence provided by an exemplary embodiment of the present application. As shown in Figure 5, for the character to be recognized "夫", first, an image containing the character to be recognized "夫" or the character "夫" itself is input into the encoder-decoder model. The encoder-decoder model uses the attention mechanism to decode the radical sequence of "夫", including the radical element "二" and the radical element "人". For each of the two radical elements, the feature data of the radical element and the feature data corresponding to the writing template of that radical element are input into a binary classifier to determine whether the feature data of the radical element of the character to be recognized and the feature data corresponding to the writing template are consistent.

[0076] Exemplarily, the binary classifier outputs 0 or 1 based on the feature data of the radical elements in the input and the feature data of the writing templates of the radical elements. Here, 0 indicates that the radical element and the corresponding writing template are inconsistent; 1 indicates that the radical element and the corresponding writing template are consistent.
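
Exemplarily, the 0 or 1 output can be sketched with a thresholded similarity score. The application does not disclose the classifier itself; cosine similarity with a fixed threshold is an assumption used here as a stand-in for a trained binary classifier, and the function names, feature vectors, and the threshold value are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def radical_matches_template(radical_features: Sequence[float],
                             template_features: Sequence[float],
                             threshold: float = 0.9) -> int:
    """Return 1 if the radical is judged consistent with its template, else 0."""
    similarity = cosine_similarity(radical_features, template_features)
    return 1 if similarity >= threshold else 0


same = radical_matches_template([1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
different = radical_matches_template([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

In this sketch `same` is 1 and `different` is 0, matching the meaning of the two outputs described above.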

[0077] Further, when the feature data of both of the above radical elements are consistent with the feature data of their writing templates, the stroke recognition sequence of the radical element "二" is decoded as "horizontal, horizontal" together with the position information corresponding to each of these strokes, and the stroke recognition sequence of the radical element "人" is decoded as "left-falling stroke, right-falling stroke" together with the position information corresponding to each of these strokes.
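
Exemplarily, the two-level result for "夫" can be held in a nested structure such as the one below. The type names and the `(x, y)` form of the position information are hypothetical, and the coordinate values are placeholders; the application does not state how position information is encoded.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DecodedStroke:
    name: str
    position: Tuple[float, float]  # hypothetical (x, y) placeholder


@dataclass
class DecodedRadical:
    element: str
    strokes: List[DecodedStroke] = field(default_factory=list)


@dataclass
class DecodedCharacter:
    character: str
    radicals: List[DecodedRadical]

    def radical_sequence(self) -> List[str]:
        """First level: the radical elements in decoding order."""
        return [r.element for r in self.radicals]

    def stroke_sequence(self, element: str) -> List[str]:
        """Second level: the stroke names of one radical element."""
        for radical in self.radicals:
            if radical.element == element:
                return [s.name for s in radical.strokes]
        raise KeyError(element)


fu = DecodedCharacter("夫", [
    DecodedRadical("二", [DecodedStroke("horizontal", (0.5, 0.3)),
                          DecodedStroke("horizontal", (0.5, 0.5))]),
    DecodedRadical("人", [DecodedStroke("left-falling stroke", (0.4, 0.6)),
                          DecodedStroke("right-falling stroke", (0.6, 0.7))]),
])
```

Keeping the strokes nested under their radical element preserves the first-level and second-level split used in the decoding.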

[0078] The embodiments of the text recognition method of the present application have been described in detail above in conjunction with Figures 3 to 5. Below, the embodiments of the text recognition device of the present application will be described in detail in conjunction with Figure 6. It should be understood that the description of the embodiments of the text recognition method corresponds to the description of the embodiments of the text recognition device. Therefore, for the parts not described in detail, reference can be made to the previous method embodiments.

[0079] Figure 6 shows a schematic structural diagram of a text recognition device provided by an exemplary embodiment of the present application. As shown in Figure 6, the text recognition device provided by the embodiments of the present application includes:

[0080] A first determination module 610, configured to disassemble the word to be recognized to obtain a radical sequence of the word to be recognized, where the radical sequence includes at least one radical element, and at least one radical element combines to form the word to be recognized;

[0081] A second determination module 620, configured to determine the feature data of at least one radical element if at least one radical element each corresponds to a writing template;

[0082] A third determination module 630, configured to determine the feature data of the writing templates corresponding to at least one radical element respectively;

[0083] A fourth determination module 640, configured to determine the misspelling recognition result corresponding to the word to be recognized based on the feature data of at least one radical element and the feature data of the writing templates corresponding to at least one radical element respectively.

[0084] In an embodiment of the present application, the fourth determination module 640 is further configured to, if in at least one radical element, the feature data of each radical element is consistent with the feature data of the writing template corresponding to the radical element, obtain the stroke recognition sequence corresponding to at least one radical element respectively; and determine the misspelling recognition result corresponding to the word to be recognized based on the stroke recognition sequence corresponding to at least one radical element respectively.

[0085] In one embodiment of this application, the fourth determining module 640 is further configured to: obtain the standard stroke sequence corresponding to each of at least one radical element; if the stroke recognition sequence corresponding to each radical element is consistent with the standard stroke sequence corresponding to the radical element, then obtain M dictated characters, where M is a positive integer; and determine the misspelling recognition result corresponding to the character to be recognized based on the M dictated characters.

[0086] In one embodiment of this application, the fourth determining module 640 is further configured to: if the character to be identified is the same as one of the M dictated characters, then determine that the character to be identified is the correct character recognition result; if the character to be identified is not the same as any of the M dictated characters, then determine that the character to be identified is the incorrect character recognition result.

[0087] In one embodiment of this application, the fourth determining module 640 is further configured to determine whether there is a radical element whose stroke recognition sequence is inconsistent with the standard stroke sequence of the radical element; if there is a radical element whose stroke recognition sequence is inconsistent with the standard stroke sequence, then the character to be recognized is determined to be a misspelling recognition result.

[0088] In one embodiment of this application, the fourth determining module 640 is further configured to determine whether there are any radical elements whose feature data is inconsistent with the feature data of the writing template of the radical element; if there are any radical elements whose feature data is inconsistent with the feature data of the writing template, then the character to be identified is determined to be a misspelling recognition result.

[0089] In one embodiment of this application, the fourth determining module 640 is further configured to decompose the character to be identified using an encoder-decoder model to obtain the radical sequence of the character to be identified. The encoder-decoder model includes an attention mechanism that can extract features.

[0090] Below, an electronic device according to embodiments of the present application is described with reference to Figure 7. Figure 7 is a structural schematic of an electronic device provided in an exemplary embodiment of this application.

[0091] As shown in Figure 7, the electronic device 70 includes one or more processors 701 and memory 702.

[0092] The processor 701 may be a central processing unit (CPU) or other form of processing unit with data processing capabilities and / or instruction execution capabilities, and may control other components in the electronic device 70 to perform desired functions.

[0093] The memory 702 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and / or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and / or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 701 may execute the program instructions to implement the character recognition methods of the various embodiments of this application described above and / or other desired functions. The computer-readable storage medium may also store various contents such as the character to be recognized, radical sequences, feature data corresponding to radical elements, feature data corresponding to the writing template, and recognition results.

[0094] In one example, the electronic device 70 may also include an input device 703 and an output device 704, which are interconnected via a bus system and / or other forms of connection mechanism (not shown).

[0095] The input device 703 may include, for example, a keyboard, a mouse, etc.

[0096] The output device 704 can output various information to the outside, including the character to be recognized, the radical sequence, the feature data corresponding to the radical elements, the feature data corresponding to the writing template, the recognition result, etc. The output device 704 may include, for example, a display, a speaker, a printer, and a communication network and its connected remote output devices, etc.

[0097] Of course, for the sake of simplicity, Figure 7 shows only some of the components of the electronic device 70 that are relevant to this application; components such as buses, input / output interfaces, etc., are omitted. In addition, the electronic device 70 may include any other suitable components depending on the specific application.

[0098] In addition to the methods and devices described above, embodiments of this application may also be computer program products, which include computer program instructions that, when executed by a processor, cause the processor to perform the steps in the character recognition methods according to various embodiments of this application described above.

[0099] The computer program product can be written in any combination of one or more programming languages to perform the operations of the embodiments of this application. The programming languages include object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as C or similar languages. The program code can be executed entirely on the user's computing device, partially on the user's computing device, as a standalone software package, partially on the user's computing device and partially on a remote computing device, or entirely on a remote computing device or server.

[0100] Furthermore, embodiments of this application may also be computer-readable storage media storing computer program instructions thereon, which, when executed by a processor, cause the processor to perform the steps in the character recognition methods according to various embodiments of this application described above.

[0101] The computer-readable storage medium may be any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may, for example, include, but is not limited to, electrical, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any combination thereof. More specific examples of readable storage media (a non-exhaustive list) include: electrical connections having one or more wires, portable disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fibers, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof.

[0102] The basic principles of this application have been described above with reference to specific embodiments. However, it should be noted that the advantages, benefits, and effects mentioned in this application are merely examples and not limitations, and should not be considered as essential features of each embodiment of this application. Furthermore, the specific details disclosed above are for illustrative and facilitative purposes only, and are not limitations. These details do not limit the application to the necessity of employing the aforementioned specific details for implementation.

[0103] The block diagrams of components, apparatuses, devices, and systems involved in this application are merely illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown in the block diagrams. As those skilled in the art will recognize, these components, apparatuses, devices, and systems can be connected, arranged, and configured in any manner. Words such as “comprising,” “including,” “having,” etc., are open-ended terms meaning “including but not limited to,” and are used interchangeably with them. The terms “or” and “and” as used herein refer to the terms “and / or,” and are used interchangeably with them unless the context clearly indicates otherwise. The term “such as” as used herein refers to the phrase “such as but not limited to,” and is used interchangeably with it.

[0104] It should also be noted that in the apparatus, equipment, and methods of this application, the components or steps can be disassembled and / or recombined. These disassemblies and / or recombinations should be considered as equivalent solutions of this application.

[0105] The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use this application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined herein can be applied to other aspects without departing from the scope of this application. Therefore, this application is not intended to be limited to the aspects shown herein, but rather to be accorded the widest scope consistent with the principles and novel features disclosed herein.

[0106] The above description has been given for purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of this application to the forms disclosed herein. Although numerous exemplary aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, alterations, additions, and sub-combinations thereof.

Claims

1. A character recognition method, characterized by comprising: decomposing a character to be identified to obtain a radical sequence of the character to be identified, wherein the radical sequence includes at least one radical element, and the at least one radical element is combined to form the character to be identified; if each of the at least one radical element has a corresponding writing template, determining the feature data of each of the at least one radical element, wherein, if all the radical elements of the character to be identified are within a radical element set, it is determined that each of the at least one radical element has a corresponding writing template, the radical element set being the set of all radical elements included in the Chinese character library; determining the feature data of the writing template corresponding to each of the at least one radical element; and determining the misspelling recognition result of the character to be identified based on the similarity between the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element.

2. The character recognition method according to claim 1, characterized in that determining the misspelling recognition result of the character to be recognized based on the similarity between the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element includes: if the feature data of each radical element is consistent with the feature data of the writing template corresponding to the radical element, obtaining the stroke recognition sequence corresponding to each of the at least one radical element; and determining the misspelling recognition result of the character to be recognized based on the stroke recognition sequence corresponding to each of the at least one radical element.

3. The character recognition method according to claim 2, characterized in that determining the misspelling recognition result of the character to be recognized based on the stroke recognition sequence corresponding to each of the at least one radical element includes: obtaining the standard stroke sequence corresponding to each of the at least one radical element; if, in the at least one radical element, the stroke recognition sequence corresponding to each radical element is consistent with the standard stroke sequence corresponding to the radical element, obtaining M dictated characters, where M is a positive integer; and determining the misspelling recognition result corresponding to the character to be recognized based on the M dictated characters.

4. The character recognition method according to claim 3, characterized in that determining the misspelling recognition result corresponding to the character to be recognized based on the M dictated characters includes: if the character to be identified is the same as one of the M dictated characters, determining that the character to be identified is the correct character recognition result; and if the character to be identified is different from all M dictated characters, determining that the character to be identified is a misspelling result.

5. The character recognition method according to claim 3, characterized in that determining the misspelling recognition result of the character to be recognized based on the stroke recognition sequence corresponding to each of the at least one radical element further includes: determining whether, among the at least one radical element, there exists a radical element whose stroke recognition sequence is inconsistent with the standard stroke sequence of the radical element; and if, among the at least one radical element, there exists a radical element whose stroke recognition sequence is inconsistent with the standard stroke sequence, determining that the character to be recognized is a misspelled character.

6. The character recognition method according to claim 2, characterized in that determining the misspelling recognition result of the character to be recognized based on the similarity between the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element further includes: determining whether, among the at least one radical element, there exists a radical element whose feature data is inconsistent with the feature data of the writing template of the radical element; and if, among the at least one radical element, there exists a radical element whose feature data is inconsistent with the feature data of the writing template, determining that the character to be identified is a misspelled character.

7. The character recognition method according to any one of claims 1 to 6, characterized in that decomposing the character to be identified to obtain the radical sequence of the character to be identified includes: decomposing the character to be identified using an encoder-decoder model to obtain the radical sequence of the character to be identified, wherein the encoder-decoder model includes an attention mechanism that can extract features.

8. A character recognition device, characterized by comprising: a first determining module, used to decompose a character to be identified to obtain a radical sequence of the character to be identified, wherein the radical sequence includes at least one radical element, and the at least one radical element is combined to form the character to be identified; a second determining module, used to determine the feature data of each of the at least one radical element if each of the at least one radical element has a corresponding writing template, wherein, if all the radical elements of the character to be identified are within a radical element set, it is determined that each of the at least one radical element has a corresponding writing template, the radical element set being the set of all radical elements included in the Chinese character library; a third determining module, used to determine the feature data of the writing template corresponding to each of the at least one radical element; and a fourth determining module, used to determine the misspelling recognition result of the character to be recognized based on the similarity between the feature data of each of the at least one radical element and the feature data of the writing template corresponding to each of the at least one radical element.

9. A computer-readable storage medium, characterized in that the storage medium stores a computer program for executing the character recognition method according to any one of claims 1 to 7.

10. An electronic device, characterized by comprising: a processor; and a memory used to store instructions executable by the processor; wherein the processor is used to execute the character recognition method according to any one of claims 1 to 7.

Citation Information

Patent Citations

Character recognition method and device, equipment and storage medium
CN114332871A
Method and system for detecting and correcting Chinese characters and computing equipment
CN114387603A
