Training data acquisition method, model training method, and English correction method and device
By using a large model to score and correct English texts, high-quality corrected texts are selected as training data. This solves the problems of low efficiency and unstable quality in training corpus generation, and achieves more efficient and higher-quality training data acquisition, thereby improving the model's error correction capability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- NEW ORIENTAL EDUCATION & TECH GRP CO LTD
- Filing Date
- 2026-02-27
- Publication Date
- 2026-06-12
AI Technical Summary
In existing technologies, the generation efficiency of training corpora is low and the quality is difficult to guarantee, mainly due to the inconsistent quality of manual annotation.
By using a large model to correct errors in English text, an error correction score is determined, and high-quality corrected texts are selected as training data based on the score. Training data is automatically acquired to improve quality.
This improves the efficiency and quality of training data acquisition, thereby enhancing the model's error correction performance.
Figure CN122197868A_ABST
Description
Technical Field
[0001] This disclosure relates to the field of machine learning technology, specifically to a method for acquiring training data, a method for training models, a method for correcting English errors, and an apparatus.
Background Technology
[0002] With the continuous development of artificial intelligence technology, large-scale models are being used more and more widely in daily life. For example, in the field of text correction, in order to achieve automatic text correction, related technologies usually rely on a large amount of training data to train the model, enabling the model to identify and correct various grammatical, spelling, and semantic errors.
[0003] However, training corpora in related technologies are generally obtained through manual annotation, resulting in relatively low efficiency in training corpus generation. Moreover, due to differences in the professional level and / or comprehension ability of different annotators, the annotation quality varies, making it difficult to effectively guarantee the overall quality of the training corpora.
Summary of the Invention
[0004] The purpose of this disclosure is to provide a method for acquiring training data, a method for training models, a method and apparatus for correcting English errors, in order to solve the aforementioned technical problems.
[0005] To achieve the above objectives, the first aspect of this disclosure provides a method for acquiring training data, the method comprising:
[0006] Acquiring a first English text to be corrected; correcting the first English text using a large model to obtain a first corrected text, and determining an error correction score of the first corrected text based on the first corrected text and the first English text; and determining, based on the error correction score of the first corrected text, training data for training a target model, wherein the target model is used to correct erroneous English text.
[0007] A second aspect of this disclosure provides a model training method, the model training method comprising: acquiring training data, wherein the training data is obtained by the training data acquisition method described in the first aspect; and training an initial error correction model based on the training data to obtain an English error correction model for correcting erroneous English text.
[0008] A third aspect of this disclosure provides an English error correction method, the English error correction method comprising: determining a second English text to be corrected; and inputting the second English text into an English error correction model to obtain a corrected text for the second English text, wherein the English error correction model includes a first sub-model for correcting grammatical errors in English text, and the first sub-model is trained by the model training method described in the second aspect.
[0009] A fourth aspect of this disclosure provides a training data acquisition apparatus, the training data acquisition apparatus comprising: The first acquisition module is used to acquire the first English text to be corrected. The first processing module is used to correct the first English text using a large model to obtain a first corrected text, and to determine the error correction score of the first corrected text based on the first corrected text and the first English text. The first determining module is used to determine training data for training the target model based on the error correction score of the first error-corrected text, wherein the target model is used to correct erroneous English text.
[0010] A fifth aspect of this disclosure provides a model training apparatus, the model training apparatus comprising: The second acquisition module is used to acquire training data, which is obtained through the training data acquisition device described in the fourth aspect. The training module is used to train the initial error correction model based on the training data to obtain an English error correction model for correcting erroneous English text.
[0011] A sixth aspect of this disclosure provides an English grammar correction device, the English grammar correction device comprising: a second determination module used to determine the second English text to be corrected; and a second processing module used to input the second English text into the English error correction model to obtain the corrected text for the second English text. The English error correction model includes a first sub-model for correcting grammatical errors in the English text. The first sub-model is trained by the model training apparatus described in the fifth aspect.
[0012] The seventh aspect of this disclosure provides a non-transitory computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the methods described in the first, second, or third aspects.
[0013] The eighth aspect of this disclosure provides an electronic device, characterized in that it comprises: A memory on which computer programs are stored; A processor for executing the computer program in the memory to implement the steps of the method described in the first, second, or third aspect.
[0014] The ninth aspect of this disclosure provides a computer program product, including a computer program, characterized in that, when executed by a processor, the computer program implements the steps of the method described in the first aspect, the second aspect, or the third aspect.
[0015] The above technical solution allows for the acquisition of the first English text to be corrected, followed by error correction using a large model to obtain the first corrected text. Based on the first corrected text and the first English text, an error correction score is determined for the first corrected text. This score then helps determine the training data for training the target model. This approach enables automatic acquisition of training data, significantly improving efficiency compared to manual annotation methods used in related technologies. Furthermore, determining training data through error correction scores allows for the selection of texts with better correction performance as training data, thus improving the quality of the training data and ultimately enhancing the performance of the model trained on it.
[0016] Other features and advantages of this disclosure will be described in detail in the following detailed description section.
Attached Figure Description
[0017] The accompanying drawings are provided to further illustrate the present disclosure and form part of the specification. They are used together with the following detailed description to explain the present disclosure, but do not constitute a limitation thereof. In the drawings:
Figure 1 is a flowchart illustrating a training data acquisition method according to an exemplary embodiment of the present disclosure;
Figure 2 is a flowchart illustrating another training data acquisition method according to an exemplary embodiment of the present disclosure;
Figure 3 is a flowchart illustrating a model training method according to an exemplary embodiment of the present disclosure;
Figure 4 is a flowchart illustrating another model training method according to an exemplary embodiment of the present disclosure;
Figure 5 is a flowchart illustrating another model training method according to an exemplary embodiment of the present disclosure;
Figure 6 is a flowchart illustrating an English error correction method according to an exemplary embodiment of the present disclosure;
Figure 7 is a flowchart illustrating another English error correction method according to an exemplary embodiment of the present disclosure;
Figure 8 is a block diagram illustrating a training data acquisition apparatus according to an exemplary embodiment of the present disclosure;
Figure 9 is a block diagram illustrating a model training apparatus according to an exemplary embodiment of the present disclosure;
Figure 10 is a block diagram illustrating an English error correction device according to an exemplary embodiment of the present disclosure;
Figure 11 is a schematic diagram of the structure of an electronic device according to an exemplary embodiment of the present disclosure;
Figure 12 is a schematic diagram of the structure of another electronic device according to an exemplary embodiment of the present disclosure.
Detailed Implementation
[0018] Embodiments of this disclosure will now be described in more detail with reference to the accompanying drawings. While some embodiments of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided to provide a more thorough and complete understanding of this disclosure. It should be understood that the accompanying drawings and embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.
[0019] It should be understood that the steps described in the method embodiments of this disclosure may be performed in different orders and / or in parallel. Furthermore, the method embodiments may include additional steps and / or omit the steps shown. The scope of this disclosure is not limited in this respect.
[0020] The term "comprising" and its variations as used herein are open-ended inclusions, meaning "including but not limited to". The term "based on" means "at least partially based on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Definitions of other terms will be given in the description below.
[0021] It should be noted that the concepts of "first" and "second" mentioned in this disclosure are used only to distinguish different devices, modules or units, and are not used to limit the order of functions performed by these devices, modules or units or their interdependencies.
[0022] It should be noted that the terms "a" and "a plurality of" used in this disclosure are illustrative rather than restrictive, and those skilled in the art should understand that, unless otherwise expressly indicated in the context, they should be understood as "one or more".
[0023] The names of messages or information exchanged between multiple devices in the embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
[0024] As mentioned in the background section, training corpora in related technologies are generally obtained through manual annotation, resulting in relatively low efficiency in training corpus generation. Moreover, due to differences in the professional level and / or comprehension ability of different annotators, the annotation quality varies, making it difficult to effectively guarantee the overall quality of the training corpus.
[0025] In view of this, the present disclosure provides a training data acquisition method, a model training method, an English error correction method and apparatus to solve the above-mentioned technical problems.
[0026] The embodiments of this disclosure will be further explained below with reference to the accompanying drawings.
[0027] Figure 1 is a flowchart illustrating a training data acquisition method according to an exemplary embodiment of the present disclosure. Referring to Figure 1, the training data acquisition method may include: S101: Acquire the first English text to be corrected.
[0028] The first English text to be corrected can be a sentence, a paragraph, or an article, or of course, other forms. This disclosure does not impose any restrictions on this.
[0029] For example, the English essay text to be corrected can be obtained, then the English essay text can be segmented into sentences to obtain multiple initial English texts, from which candidate English texts with grammatical errors can be selected. Afterwards, the candidate English texts can be preprocessed to obtain the first English text to be corrected. Preprocessing can include removing invalid characters from the candidate English texts (e.g., URL (Uniform Resource Locator) format codes and / or HTML (HyperText Markup Language) format codes), formatting the number of spaces between sentences, formatting line breaks, and formatting emoji symbols, etc.
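As a non-limiting illustration of the sentence segmentation and preprocessing described above, a minimal Python sketch might look as follows. The regular expressions and the function names `split_sentences` and `preprocess` are assumptions introduced here, not part of the disclosure; the selection of candidate texts with grammatical errors and the handling of emoji symbols are not shown.

```python
import html
import re
from typing import List

# Hypothetical patterns; real essay data may need broader rules.
URL_RE = re.compile(r"https?://\S+")
TAG_RE = re.compile(r"<[^>]+>")
SENTENCE_END_RE = re.compile(r"(?<=[.!?])\s+")


def split_sentences(essay: str) -> List[str]:
    """Naively split an essay after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in SENTENCE_END_RE.split(essay) if s.strip()]


def preprocess(text: str) -> str:
    """Strip URL and HTML residue, then normalize spaces and line breaks."""
    text = URL_RE.sub(" ", text)      # remove URLs
    text = TAG_RE.sub("", text)       # remove HTML tags
    text = html.unescape(text)        # decode entities such as &amp;
    text = re.sub(r"\s+", " ", text)  # collapse runs of spaces and line breaks
    return text.strip()
```

In this sketch each sentence returned by `split_sentences` would be cleaned with `preprocess` before being passed on as a first English text.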
[0030] S102: Correct the first English text using a large model to obtain the first corrected text, and determine the correction score of the first corrected text based on the first corrected text and the first English text.
[0031] For example, after obtaining the first corrected text, the first corrected text and the first English text can be input into the large model, so that the large model can determine the error correction score of the first corrected text based on at least one of the semantic similarity, text similarity and structural similarity between the first corrected text and the first English text.
[0032] S103: Based on the error correction score of the first error-corrected text, determine the training data to be used to train the target model, which is used to correct erroneous English text.
[0033] Understandably, the higher the error correction score of a corrected text, the more accurate and reasonable it is (i.e., the closer the corrected text is to the true correct content). Therefore, an error correction score threshold can be preset, and after the error correction score of the first corrected text is obtained, this score is compared with the threshold. If the error correction score of the first corrected text is greater than or equal to the threshold, the error correction quality of the first corrected text meets the preset standard, and the first corrected text can be used as the correct English text corresponding to the first English text. Thus, the first corrected text and the first English text can be used as a set of training data. If the error correction score of the first corrected text is less than the threshold, the error correction quality of the first corrected text does not meet the standard, and it cannot be used as the correct English text corresponding to the first English text. In that case, the first corrected text and its corresponding first English text can be discarded, or the first English text can be input into the large model again to regenerate the corrected text of the first English text.
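The threshold comparison of S103 can be sketched as follows. This is only an illustrative sketch: the callables `correct` and `score` stand in for calls to the large model and are hypothetical, the default threshold of 0.95 is taken from the example given later in this description, and `max_attempts` is an assumed limit on regeneration.

```python
from typing import Callable, Optional, Tuple


def select_training_pair(
    source: str,
    correct: Callable[[str], str],       # hypothetical: large model correction
    score: Callable[[str, str], float],  # hypothetical: large model scoring
    threshold: float = 0.95,
    max_attempts: int = 2,               # assumed cap on regeneration
) -> Optional[Tuple[str, str]]:
    """Return (source, corrected) once a correction reaches the threshold."""
    for _ in range(max_attempts):
        corrected = correct(source)
        if score(source, corrected) >= threshold:
            return (source, corrected)   # keep as one set of training data
    return None                          # discard the sample
```

Returning `None` corresponds to discarding the first corrected text together with its first English text.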
[0034] The above technical solution allows for the acquisition of the first English text to be corrected, followed by error correction using a large model to obtain the first corrected text. Based on the first corrected text and the first English text, an error correction score is determined for the first corrected text. This score then helps determine the training data for training the target model. This approach enables automatic acquisition of training data. Compared to manual annotation in related technologies, this solution not only improves the efficiency of training data acquisition but also enhances the quality of the training data. Furthermore, determining training data through error correction scores allows for the selection of corrected texts with better correction performance as training data, further improving the quality of the training data and ultimately enhancing the performance of the model trained on that data.
[0035] In one possible approach, determining the training data for training the target model based on the error correction score of the first corrected text may include: If the error correction score of the first corrected text is less than the error correction score threshold, the modification information for the first English text is determined, and the training data used to train the target model is determined based on the modification information, the error correction score of the first corrected text, and the first English text.
[0036] The information to be modified can be determined based on the actual situation, and this embodiment of the disclosure does not impose any restrictions on it. For example, the modification information may include the location of the modification, the content of the modification, or the number of modifications.
[0037] For example, the first English text is: "As we know Confucius is a people who is good at educating and thinking lived in Spring and Autumn." The first corrected text is: "As we know, Confucius is a person who is good at educating and thinking, and lived in the Spring and Autumn." The preset error correction score threshold is 0.95, and the error correction score of the first corrected text is 0.63.
[0038] Since 0.63 < 0.95, the modification information for the first English text can be determined. For example, the modification information could be: Modification 1: Add a comma between "know" and "Confucius"; Modification 2: Add the conjunction "and" between "thinking" and "lived"; Modification 3: Based on Modification 2, add a comma between "thinking" and "and".
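The disclosure does not specify how the modification information is derived. Purely as an assumed illustration, the location, content, and number of modifications could be obtained by a token-level comparison of the first English text and the corrected text, here using Python's standard `difflib` module. The function names and the dictionary layout are hypothetical.

```python
import difflib
import re
from typing import Any, Dict, List


def tokenize(text: str) -> List[str]:
    """Split into words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def modification_info(original: str, corrected: str) -> Dict[str, Any]:
    """List each changed token span with its position and content."""
    a, b = tokenize(original), tokenize(corrected)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    mods = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged span, not a modification
        mods.append({
            "position": i1,                # token index in the original
            "operation": tag,              # 'replace', 'insert' or 'delete'
            "before": " ".join(a[i1:i2]),
            "after": " ".join(b[j1:j2]),
        })
    return {"count": len(mods), "modifications": mods}
```

For instance, comparing "As we know Confucius" with "As we know, Confucius" yields a single insertion of a comma at token position 3.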
[0039] After determining the modification information for the first English text, the training data used to train the target model can be determined based on the modification information, the error correction score of the first corrected text, and the first English text.
[0040] For example, the modification information, the error correction score of the first corrected text, and the first English text can be input into the large model to obtain a second corrected text. Based on the second corrected text and the first English text, the error correction score of the second corrected text is determined. If the error correction score of the second corrected text is greater than or equal to the error correction score threshold, the error correction quality of the second corrected text meets the preset standard. The second corrected text can be used as the correct English text corresponding to the first English text, and thus the second corrected text and the first English text can be used as a set of training data. If the error correction score of the second corrected text is less than the error correction score threshold, the error correction quality of the second corrected text does not meet the standard, and it cannot be used as the correct English text corresponding to the first English text. In that case, the second corrected text and its corresponding first English text can be discarded. Alternatively, the first English text can be input into the large model again to regenerate the corrected text of the first English text.
[0041] It is worth noting that, in this embodiment, when the training data for training the target model is determined based on the error correction score of the corrected text, the modification information of the corrected text can be determined as soon as the error correction score is less than the error correction score threshold, and the training data is then determined based on the modification information, the error correction score of the corrected text, and the first English text. Alternatively, if the error correction score of the corrected text is less than the threshold, the large model can first be asked to regenerate the corrected text of the first English text, and the error correction score of the newly generated corrected text can be determined. If that score is also less than the threshold, the modification information of the corrected text is then determined, and the training data is determined based on the modification information, the error correction score of the corrected text, and the first English text, as shown in steps S201 to S207 of Figure 2. In this case, the modification information may include only the modification information of the last modification, or it may include the modification information of each modification. This embodiment of the disclosure does not impose any limitation on this.
[0042] Using the above method, even when the error correction score of the first corrected text is less than the error correction score threshold, the large model can determine the training data for training the target model based on the modification information, the error correction score of the first corrected text, and the first English text. Specifically, by inputting the modification information, the error correction score of the first corrected text, and the first English text into the large model, the large model can refer to the error correction score when generating new corrected text to understand the overall quality level of the current corrected text, and can refer to the modification information to better understand which parts need improvement, thereby generating higher-quality corrected text and thus obtaining higher-quality training data.
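One way to picture how the modification information, the error correction score, and the first English text are supplied to the large model together is a single prompt that carries all three. The prompt wording, the function names, and the `large_model` callable below are hypothetical and serve only as a sketch.

```python
from typing import Callable, List


def build_revision_prompt(source: str, previous_score: float,
                          modifications: List[str]) -> str:
    """Assemble one prompt from the text, the last score and the changes."""
    lines = [
        "Correct the grammatical errors in the English text below.",
        f"Original text: {source}",
        f"Score of the previous correction: {previous_score:.2f}",
        "Modifications made in the previous correction:",
    ]
    lines += [f"{i}. {m}" for i, m in enumerate(modifications, start=1)]
    lines.append("Return only the improved corrected text.")
    return "\n".join(lines)


def regenerate(large_model: Callable[[str], str], source: str,
               previous_score: float, modifications: List[str]) -> str:
    """Ask the (hypothetical) large model for a second corrected text."""
    return large_model(build_revision_prompt(source, previous_score, modifications))
```

The score gives the model a sense of overall quality, while the numbered modifications point to the parts that may need further work.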
[0043] In some possible ways, determining the training data used to train the target model based on the modification information, the error correction score of the first corrected text, and the first English text may include: inputting the modification information, the error correction score of the first corrected text, and the first English text into the large model to obtain the second corrected text; determining the error correction score of the second corrected text based on the second corrected text and the first English text; if the error correction score of the second corrected text is less than the error correction score threshold, determining a summary text that summarizes the historical error correction situation of the first English text; and determining training data for training the target model based on the summary text and the first English text.
[0044] The summary text can be determined according to the actual situation, and this embodiment of the disclosure does not impose any restrictions on it. For example, the summary text may include which modifications to the first English text were correct, which modifications were incorrect, and whether the modifications were reasonable, etc.
[0045] For example, the first English text is: "As we know Confucius is a people who is good at educating and thinking lived in Spring and Autumn." The second corrected text is: "As we know, Confucius is a person who is good at educating and thinking, and lived in the Spring and Autumn." The preset error correction score threshold is 0.95, and the error correction score of the second corrected text is 0.63.
[0046] Since 0.63 < 0.95, the summary text for the first English text can be determined. For example, the summary text could be: The modifications to the first English text include the following three parts: Modification 1: Add a comma between "know" and "Confucius"; Modification 2: Add the conjunction "and" between "thinking" and "lived"; Modification 3: Based on Modification 2, add a comma between "thinking" and "and".
[0047] These three modifications correspond to the phrase "As we know Confucius is a people who is good at educating and thinking" in the first English text. There appear to be no issues with these modifications; the revised text, "As we know, Confucius is a person who is good at educating and thinking," is grammatically correct. The key point to consider is the latter part of the first English text, "lived in Spring and Autumn." This could refer to living in spring or autumn, or to the Spring and Autumn Period in ancient Chinese history. Combined with "As we know Confucius is a people who is good at educating and thinking," it can be concluded that it refers to the Spring and Autumn Period. Therefore, the second corrected text contains an omission and should include the correction "Spring and Autumn period."
[0048] After determining the summary text for the first English text, the training data used to train the target model can be determined based on the summary text and the first English text.
[0049] For example, the summary text and the first English text can be input into a large model to obtain a third corrected text. Based on the third corrected text and the first English text, a correction score for the third corrected text is determined. If the correction score of the third corrected text is greater than or equal to a correction score threshold, it indicates that the correction quality of the third corrected text meets a preset standard. The third corrected text can be used as the correct English text corresponding to the first English text, and thus the third corrected text and the first English text can be used as a set of training data. If the correction score of the third corrected text is less than the correction score threshold, it indicates that the correction quality of the third corrected text does not meet the standard and cannot be used as the correct English text corresponding to the first English text. Therefore, the third corrected text and its corresponding first English text are discarded. Alternatively, the first English text can be input into the large model again to regenerate the corrected text of the first English text.
[0050] It is worth noting that, in this embodiment, the summary text for summarizing the historical error correction of the first English text can be determined as soon as the error correction score of a corrected text is less than the error correction score threshold, and the training data for training the target model is then determined based on the summary text and the first English text. Alternatively, if the error correction score of the corrected text is less than the error correction score threshold, the large model can first regenerate the corrected text based on the modification information and the first English text, and the error correction score of the newly generated corrected text can then be determined. If that score is still less than the error correction score threshold, the summary text is determined, and the training data for training the target model is determined based on the summary text and the first English text, as shown in steps S201 to S210 of Figure 2.
[0051] Using the above method, when the error correction score of the second corrected text is less than the error correction score threshold, a summary text for summarizing the historical error correction of the first English text can be determined. Based on the summary text and the first English text, training data for training the target model can then be determined. Specifically, the summary text and the first English text can be input into the large model, allowing it to refer to the summary text when generating new corrected texts to understand the error correction situation for the first English text, thus enabling targeted error correction and generating higher-quality corrected texts, ultimately resulting in higher-quality training data.
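The escalation described so far (a plain correction, then a correction guided by modification information, then a correction guided by a summary text) can be viewed as a sequence of stages that stops at the first corrected text reaching the threshold. The sketch below is an assumed arrangement rather than the flow of Figure 2 itself; `Attempt`, `Stage`, and `run_stages` are hypothetical names, and each stage is expected to wrap its own call to the large model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    corrected: str
    score: float


# A stage receives the source text and the history of earlier attempts.
Stage = Callable[[str, List[Attempt]], str]


def run_stages(
    source: str,
    stages: List[Stage],
    score: Callable[[str, str], float],  # hypothetical large model scoring
    threshold: float = 0.95,
) -> Tuple[Optional[Tuple[str, str]], List[Attempt]]:
    """Run stages in order; return the first passing pair and the history."""
    history: List[Attempt] = []
    for stage in stages:
        corrected = stage(source, history)
        s = score(source, corrected)
        history.append(Attempt(corrected, s))  # kept for later summaries
        if s >= threshold:
            return (source, corrected), history
    return None, history
```

Keeping the full history is a design choice made here so that a later stage, such as the one building the summary text, can see every earlier corrected text and its score.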
[0052] In some possible ways, the training data used to train the target model can be determined based on the summary text and the first English text, and may include: The summary text and the first English text are input into the large model to obtain the third corrected text. Based on the third corrected text and the first English text, the error correction score of the third corrected text is determined. If the error correction score of the third corrected text is less than the error correction score threshold, the fourth and fifth corrected texts are determined. The fourth corrected text is the one with the highest error correction score among the historical error correction texts for the first English text, and the fifth corrected text is the one with the lowest error correction score among the historical error correction texts for the first English text. Based on the first English text, the third corrected text, the fourth corrected text, and the fifth corrected text, the training data used to train the target model is determined.
[0053] For example, the first English text, the third corrected text, the fourth corrected text, and the fifth corrected text can be input into a large model to obtain a sixth corrected text. Based on the sixth corrected text and the first English text, a correction score for the sixth corrected text is determined. If the correction score of the sixth corrected text is greater than or equal to a correction score threshold, it indicates that the correction quality of the sixth corrected text meets a preset standard, and the sixth corrected text can be used as the correct English text corresponding to the first English text. Therefore, the sixth corrected text and the first English text can be used as a set of training data. If the correction score of the sixth corrected text is less than the correction score threshold, it indicates that the correction quality of the sixth corrected text does not meet the standard and cannot be used as the correct English text corresponding to the first English text. Therefore, the sixth corrected text and its corresponding first English text are discarded. Alternatively, the first English text can be input into the large model again to regenerate the corrected text of the first English text.
[0054] Understandably, the error correction score of the third corrected text could be either the lowest or the highest. If the error correction score of the third corrected text is the lowest, then the training data used to train the target model is determined based on the third corrected text, the fourth corrected text, and the first English text. If the error correction score of the third corrected text is the highest, then the training data used to train the target model is determined based on the third corrected text, the fifth corrected text, and the first English text.
[0055] By using the above method, when the error correction quality of the error correction text generated based on the summary text and the first English text is not up to standard, the fourth error correction text corresponding to the highest error correction score and the fifth error correction text corresponding to the lowest error correction score can be determined. Then, the large model can regenerate the error correction text based on the first English text, the third error correction text, the fourth error correction text, and the fifth error correction text. This allows the large model to refer to error correction texts of different quality levels, thereby better understanding the error correction direction, improving the error correction quality, and ultimately obtaining higher quality training data.
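Selecting the fourth and fifth corrected texts amounts to taking the highest-scoring and lowest-scoring entries of the correction history. A minimal sketch follows, assuming the history is kept as `(corrected_text, score)` tuples; this representation and the name `pick_references` are assumptions. It also covers the case noted above in which the current (third) corrected text is itself the highest-scoring or lowest-scoring one.

```python
from typing import Dict, List, Tuple

Scored = Tuple[str, float]  # (corrected text, error correction score)


def pick_references(history: List[Scored], current: Scored) -> Dict[str, str]:
    """Pick the best and worst historical corrections next to the current one."""
    best = max(history, key=lambda item: item[1])   # "fourth corrected text"
    worst = min(history, key=lambda item: item[1])  # "fifth corrected text"
    refs = {"current": current[0]}
    if best[0] != current[0]:
        refs["best"] = best[0]    # omitted when current is already the best
    if worst[0] != current[0]:
        refs["worst"] = worst[0]  # omitted when current is already the worst
    return refs
```

The returned texts, together with the first English text, would then be passed to the large model to produce the sixth corrected text.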
[0056] In one possible manner, determining the training data used to train the target model based on the first English text, the third corrected text, the fourth corrected text, and the fifth corrected text may include: determining a first text difference between the third corrected text and the fourth corrected text, and a second text difference between the third corrected text and the fifth corrected text; inputting the first text difference, the second text difference, and the first English text into the large model to obtain the sixth corrected text; determining the error correction score of the sixth corrected text based on the sixth corrected text and the first English text; and, if the error correction score of the sixth corrected text is greater than or equal to the error correction score threshold, using the sixth corrected text and the first English text as training data.
[0057] The textual differences may refer to semantic differences, sentence structure differences, and vocabulary differences, etc., and the embodiments disclosed herein do not impose any limitations on them.
[0058] Understandably, the error correction score of the sixth corrected text may still be less than the error correction score threshold. In this case, the fourth and fifth corrected texts can be determined again, and training data for training the target model can be determined based on the first English text, the fourth corrected text, the fifth corrected text, and the current corrected text. This is repeated until the error correction score of the generated corrected text is greater than or equal to the error correction score threshold, or the number of repetitions reaches a preset number, as shown in steps S201 to S213 of Figure 2.
[0059] In this embodiment, the fourth corrected text is the corrected text with the best correction quality for the first English text, and the fifth corrected text is the corrected text with the worst correction quality for the first English text. By determining the first text difference between the current corrected text and the fourth corrected text and the second text difference between the current corrected text and the fifth corrected text, and by having the large model regenerate the corrected text based on the first text difference, the second text difference, and the first English text, the large model can better understand the quality differences between corrected texts. When generating a new corrected text, it can therefore avoid the error patterns found in low-quality corrected texts while drawing on the correction methods found in high-quality corrected texts, further improving the quality of the corrected text.
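The regeneration loop described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: `generate` and `score` are hypothetical callables standing in for the large model and the error correction scoring step, and the text difference is approximated here by a word-level diff, which captures only vocabulary-level differences.

```python
import difflib
from typing import Callable, List, Optional, Tuple


def text_difference(current: str, reference: str) -> List[Tuple[str, str]]:
    """Word-level differences as (current_span, reference_span) pairs."""
    a, b = current.split(), reference.split()
    ops = difflib.SequenceMatcher(a=a, b=b, autojunk=False).get_opcodes()
    return [(" ".join(a[i1:i2]), " ".join(b[j1:j2]))
            for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def refine(source: str,
           current: str,
           history: List[Tuple[str, float]],
           generate: Callable[[str, list, list], str],
           score: Callable[[str, str], float],
           threshold: float,
           max_rounds: int = 3) -> Optional[Tuple[str, str]]:
    """Regenerate a corrected text until its score reaches the threshold.

    history holds (corrected_text, score) pairs for the same source text.
    Returns (source, corrected) as a training pair, or None if the preset
    number of rounds is exhausted.
    """
    for _ in range(max_rounds):
        best = max(history, key=lambda h: h[1])[0]    # highest-scoring text
        worst = min(history, key=lambda h: h[1])[0]   # lowest-scoring text
        diff_best = text_difference(current, best)
        diff_worst = text_difference(current, worst)
        candidate = generate(source, diff_best, diff_worst)
        candidate_score = score(candidate, source)
        history.append((candidate, candidate_score))
        if candidate_score >= threshold:
            return source, candidate
        current = candidate
    return None
```

Returning `None` corresponds to discarding the first English text once the preset number of repetitions is reached; a caller could equally restart generation from the first English text alone.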
[0060] Once the training data is obtained, it can be used to train the target model.
[0061] For example, if the target model includes a grammar error detection model for detecting grammatical errors in English text and a grammar error correction model for correcting grammatical errors in English text, then when training the target model based on training data, the training data can first be divided into first training data for training the grammar error detection model and second training data for training the grammar error correction model. The first training data includes a first English text and labels indicating whether grammatical errors exist in the first English text. The second training data includes the first English text and corresponding correction text. Then, the grammar error detection model can be trained based on the first training data, and the grammar error correction model can be trained based on the second training data. After both the grammar error detection model and the grammar error correction model are trained, the target model for correcting English grammar is obtained.
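As a rough illustration of the split described above, the following sketch derives both datasets from the same (first English text, corrected text) pairs. How the error label is produced is not specified in this disclosure; deriving it by comparing the source with its corrected text is an assumption made only for this example, as are the field names.

```python
from typing import Dict, List, Tuple


def split_training_data(pairs: List[Tuple[str, str]]) -> Tuple[List[Dict], List[Dict]]:
    """Build detection and correction datasets from (source, corrected) pairs."""
    # Assumption: a source that differs from its corrected text contains an error.
    detection = [{"text": src, "has_error": src != tgt} for src, tgt in pairs]
    correction = [{"text": src, "corrected": tgt} for src, tgt in pairs]
    return detection, correction


pairs = [("he go home", "He goes home."), ("I am fine.", "I am fine.")]
detection_data, correction_data = split_training_data(pairs)
```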
[0062] Based on the same technical concept, this disclosure also provides a model training method. As shown in Figure 3, the model training method may include: S301: Obtain training data, which is obtained through the training data acquisition method described above; S302: Train the initial error correction model based on the training data to obtain an English error correction model for correcting erroneous English text.
[0063] For example, in a K-12 (kindergarten to 12th grade) English essay grammar correction scenario, reference can be made to steps S401 to S404 shown in Figure 4: S401: Determine the base model; an open-source model that performs well on the K12 English essay grammar correction task can be used as the base model; S402: Load the training data into the base model; S403: Fine-tune the base model by supervised fine-tuning, monitoring the training metrics and adjusting the hyperparameters so as to improve the model fitting accuracy as much as possible; S404: After fine-tuning, perform a quality assessment test on the fine-tuned model to determine whether its performance on the K12 English essay grammar correction task meets the preset standard.
[0064] In this embodiment, since the training data is automatically acquired through a large model, compared with the acquisition of training data through manual annotation in related technologies, the efficiency of acquiring training data and the quality of training data can be improved, thereby improving the overall training speed and training effect of the model.
[0065] In some possible ways, to further optimize the training performance of the model at the training level, reinforcement learning can be used to train the initial error correction model based on the training data. Specifically, a reward model can be designed to reward or penalize the initial error correction model according to the quality of the corrected text, thereby guiding the initial error correction model to continuously optimize its output during training and ultimately generate higher-quality corrected text.
[0066] For example, as shown in Figure 5, the training may include: S501: Training data preparation; from the training data obtained through the above training data acquisition method, training data with relatively complete sentences (e.g., sentences without meaningless symbols) and relatively low error correction scores can be selected and used as a dedicated dataset for reinforcement learning fine-tuning; S502: Training framework selection; an open-source model that performs well on the K12 English essay grammar correction task is selected as the base model, and a reinforcement learning framework such as Verl or Unsloth is used as the fine-tuning framework; S503: Selection of the reinforcement learning method; S504: Design of the hyperparameters and the reward model; S505: Quality assessment; after reinforcement learning fine-tuning, the model is comprehensively evaluated to confirm that its performance on the K12 English essay grammar correction task is further improved.
[0067] Based on the above reinforcement learning training method, if the initial error correction model includes a grammar error correction model for correcting grammatical errors in English text, and the training data includes a sample text to be corrected and the corresponding correct English text, then training the initial error correction model based on the training data to obtain an English error correction model for correcting erroneous English text may include: inputting the sample text to be corrected into the grammar error correction model to obtain a predicted corrected text; determining, by a preset reward model, a reward score for the predicted corrected text based on the predicted corrected text and the correct English text; and determining a loss function value based on the reward score, the predicted corrected text, and the correct English text, and updating the parameters of the grammar error correction model based on the loss function value.
[0068] It is understood that the reward model can be an existing reward model in the related art, an improved version of such a reward model, or another model. This disclosure does not impose any restrictions on this.
[0069] During training, the grammar error correction model continuously generates predicted corrected texts, and the quality of these texts directly affects the final performance of the model. To optimize the performance and prediction accuracy of the grammar error correction model, the quality of the predicted corrected texts can be measured using multiple metrics and rewarded or penalized accordingly, thereby guiding the grammar error correction model towards greater accuracy. Therefore, in a possible implementation, determining the reward score for the predicted corrected text by the preset reward model based on the predicted corrected text and the correct English text may include: determining at least one of the following for the predicted corrected text: error correction evaluation information, semantic difference information between the predicted corrected text and the correct English text, the perplexity of the predicted corrected text, the complexity of the predicted corrected text, and the error correction words in the predicted corrected text, where the error correction evaluation information is determined based on the textual differences between the predicted corrected text and the correct English text; and determining the reward score for the predicted corrected text based on at least one of the error correction evaluation information, the semantic difference information, the perplexity, the complexity, and the error correction words.
[0070] In this embodiment, the error correction evaluation information characterizes the textual differences between the predicted corrected text and the correct English text, and includes precision and/or recall. The semantic difference information characterizes the semantic differences between the predicted corrected text and the correct English text, and may include semantic similarity and keyword proportion. Error correction words are the words obtained by correcting words in the sample English text. For example, if the sample English text is "how are you" and the predicted corrected text is "How are you", then "How" in the predicted corrected text is an error correction word.
[0071] Understandably, the error correction evaluation information reflects the accuracy and consistency between the predicted corrected text and the correct English text at the character or word level; the semantic difference information reflects their semantic similarity and contextual coherence; the perplexity reflects the naturalness and fluency of the predicted corrected text under a language model; the complexity reflects the grammatical and lexical difficulty of the predicted corrected text; and the error correction words reflect how key errors were corrected and how those corrections affect the overall meaning of the text. Therefore, determining the reward score for the predicted corrected text based on at least one of the error correction evaluation information, the semantic difference information, the perplexity, the complexity, and the error correction words allows the quality of the predicted corrected text to be evaluated more comprehensively and accurately. This provides more effective feedback signals for training the grammar error correction model, guiding the model to optimize across different dimensions, improving its error correction capability and accuracy in various text scenarios, and ultimately enhancing the overall performance and reliability of the model.
[0072] Among possible approaches, determining the reward score for the predicted corrected text based on at least one of the error correction evaluation information, the semantic difference information, the perplexity, the complexity, and the error correction words may include: determining a first reward score based on the error correction evaluation information; determining a second reward score based on the semantic difference information; determining a third reward score based on the perplexity of the predicted corrected text; determining a fourth reward score based on the complexity and the error correction words; and determining the reward score for the predicted corrected text based on the first, second, third, and fourth reward scores.
[0073] For example, after the first, second, third, and fourth reward scores are obtained, the reward score for the predicted corrected text is obtained by summing these four reward scores.
[0074] Understandably, different application scenarios place different requirements on model training performance, so the requirements for model error correction may differ. Furthermore, the error correction evaluation information, the semantic difference information, the perplexity, the complexity, and the error correction words generally affect the model's error correction results to different degrees. Therefore, when determining the reward score for the predicted corrected text based on the first, second, third, and fourth reward scores, different weights can be assigned to these four reward scores. This further improves the model's training performance and meets the application needs of the model in different scenarios.
[0075] In other words, among the possible methods, the reward score can be determined by the following formula:
$$R = w_1 R_1 + w_2 R_2 + w_3 R_3 + w_4 R_4$$
[0076] where $R$ denotes the reward score; $w_1$, $w_2$, $w_3$, and $w_4$ denote the weights; $R_1$ denotes the first reward score; $R_2$ denotes the second reward score; $R_3$ denotes the third reward score; and $R_4$ denotes the fourth reward score.
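A minimal sketch of this weighted combination is given below. The function name and the default weights of 1.0, which reduce the result to the plain sum of the four reward scores, are illustrative assumptions; this disclosure does not fix the weight values.

```python
from typing import Sequence


def total_reward(scores: Sequence[float],
                 weights: Sequence[float] = (1.0, 1.0, 1.0, 1.0)) -> float:
    """Weighted sum of the first to fourth reward scores."""
    if len(scores) != 4 or len(weights) != 4:
        raise ValueError("expected four reward scores and four weights")
    return sum(w * s for w, s in zip(weights, scores))


# With unit weights the result is the plain sum of the four scores.
plain = total_reward((0.8, 1.5, 0.2, 0.9))
weighted = total_reward((0.8, 1.5, 0.2, 0.9), weights=(0.4, 0.3, 0.2, 0.1))
```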
[0077] In possible ways, the error correction evaluation information includes precision and recall, and determining the first reward score based on this information may include: determining a first product between the precision and the first coefficient, and a first sum between the first product and the recall; determining a second product between the precision and the recall, and dividing the second product by the first sum to obtain a first value; and determining a second sum between the first preset value and the first coefficient, and using the product of the second sum and the first value as the first reward score.
[0078] For example, after the precision and recall are obtained, they can be substituted into the following formula to obtain the first reward score:
$$R_1 = (1 + \beta) \cdot \frac{\mathrm{Pre} \cdot \mathrm{Rec}}{\beta \cdot \mathrm{Pre} + \mathrm{Rec}}$$
[0079] where $R_1$ denotes the first reward score; 1 is the first preset value; $\mathrm{Pre}$ denotes the precision; $\mathrm{Rec}$ denotes the recall; and $\beta$ denotes the first coefficient, which controls the relative influence of precision and recall on $R_1$. The smaller $\beta$ is, the greater the influence of precision; the larger $\beta$ is, the greater the influence of recall. Its value can be determined according to the actual situation; for example, $\beta$ can be equal to 0.5.
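The computation can be sketched as below, under the reading that the first product is taken between the precision and the first coefficient; this is the reading under which a smaller coefficient gives precision more influence. Returning 0.0 when the denominator is zero is an assumption of this sketch, not something the disclosure specifies.

```python
def first_reward(precision: float, recall: float, beta: float = 0.5) -> float:
    """F-measure-style first reward score from precision and recall.

    beta is the first coefficient; 1 is the first preset value.
    """
    denominator = beta * precision + recall      # first sum
    if denominator == 0:
        return 0.0                               # assumption: no reward when both are zero
    first_value = precision * recall / denominator
    return (1 + beta) * first_value              # second sum times first value


# Precision 0.8, recall 0.6, coefficient 0.5: 1.5 * 0.48 / 1.0
example = first_reward(0.8, 0.6, beta=0.5)
```

As the coefficient approaches 0 the score approaches the precision, which matches the stated behavior that a smaller coefficient emphasizes precision.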
[0080] In some possible approaches, the semantic difference information includes the semantic similarity between the predicted corrected text and the correct English text, as well as the keyword proportion. The keyword proportion is the ratio of the number of keywords in the predicted corrected text to the number of keywords in the correct English text. Accordingly, determining the second reward score based on the semantic difference information may include: using the sum of the semantic similarity and the keyword proportion as the second reward score.
[0081] For example, after the semantic similarity and keyword proportion are obtained, the second reward score is obtained by substituting these values into the following formulas:
$$R_2 = S(\hat{y}, y) + K(\hat{y}, y)$$
[0082]
$$S(\hat{y}, y) = \operatorname{sim}(\mathbf{v}_{\hat{y}}, \mathbf{v}_{y})$$
[0083]
$$K(\hat{y}, y) = \frac{N_{\hat{y}}}{N_{y}}$$
[0084] where $R_2$ denotes the second reward score; $S$ denotes the semantic similarity; $K$ denotes the keyword proportion; $\hat{y}$ denotes the predicted corrected text; $y$ denotes the correct English text; $\mathbf{v}_{\hat{y}}$ denotes the vector representation of the predicted corrected text; $\mathbf{v}_{y}$ denotes the vector representation of the correct English text; $N_{y}$ denotes the number of keywords in the correct English text; and $N_{\hat{y}}$ denotes the number of keywords in the predicted corrected text.
[0085] It is understandable that the impact of semantic similarity and keyword proportion on the model's error correction performance can be the same or different in different application scenarios. To further improve the model's training performance in different application scenarios, when determining the second reward score based on semantic similarity and keyword proportion, different weights can be assigned to semantic similarity and keyword proportion, thereby further improving the model's training performance and meeting the application needs of the model in different application scenarios.
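[0086] In other words, among the possible methods, the second reward score can be determined by the following formula:
$$R_2 = \lambda_1 \cdot S + \lambda_2 \cdot K$$
[0087] where $S$ denotes the semantic similarity, $K$ denotes the keyword proportion, and $\lambda_1$ and $\lambda_2$ denote the weights.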
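The second reward score can be sketched as follows. Several details here are assumptions for illustration only: cosine similarity is used as the similarity measure over the two vector representations, the vectors are supplied by the caller (for example from a sentence embedding model), and keywords are counted by matching whitespace-separated tokens against a caller-supplied keyword set.

```python
import math
from typing import Sequence, Set


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vector representations."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def count_keywords(text: str, keywords: Set[str]) -> int:
    """Number of tokens in text that appear in the keyword set."""
    return sum(1 for tok in text.lower().split() if tok.strip(".,!?;:") in keywords)


def keyword_ratio(predicted: str, reference: str, keywords: Set[str]) -> float:
    """Keywords in the predicted corrected text over keywords in the correct text."""
    reference_count = count_keywords(reference, keywords)
    return count_keywords(predicted, keywords) / reference_count if reference_count else 0.0


def second_reward(similarity: float, ratio: float,
                  w_sim: float = 1.0, w_kw: float = 1.0) -> float:
    """Weighted sum of semantic similarity and keyword proportion."""
    return w_sim * similarity + w_kw * ratio
```

With both weights left at 1.0 the function reduces to the unweighted sum of semantic similarity and keyword proportion.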
[0088] Among possible approaches, determining the third reward score based on the perplexity of the predicted corrected text may include: determining a perplexity difference by subtracting the perplexity from the second preset value, and dividing the perplexity difference by a preset normalization coefficient to obtain the third reward score.
[0089] The second preset value may be the same as or different from the first preset value, and this embodiment does not impose any restrictions on this.
[0090] Perplexity is an important metric for evaluating the performance of a language model, reflecting the model's degree of uncertainty about a text. A lower perplexity for the predicted corrected text indicates that the language model finds the text more predictable and natural, while a higher perplexity indicates greater uncertainty.
[0091] After the perplexity of the predicted corrected text is obtained, it can be substituted into the following formula to obtain the third reward score:
$$R_3 = \frac{c_2 - \mathrm{PPL}}{Z}$$
[0092] where $R_3$ denotes the third reward score; $c_2$ denotes the second preset value; $\mathrm{PPL}$ denotes the perplexity of the predicted corrected text; and $Z$ denotes the normalization coefficient, used to scale the third reward score proportionally.
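A minimal sketch of the third reward score follows. The helper that derives perplexity from per-token log-probabilities uses the standard definition (the exponential of the negative mean log-probability); how the log-probabilities are obtained, and the default values chosen for the second preset value and the normalization coefficient, are assumptions of this example.

```python
import math
from typing import Sequence


def perplexity_from_logprobs(token_logprobs: Sequence[float]) -> float:
    """Perplexity as exp of the negative mean natural-log token probability."""
    if not token_logprobs:
        raise ValueError("at least one token log-probability is required")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def third_reward(perplexity: float, preset: float = 1.0, norm: float = 100.0) -> float:
    """(second preset value - perplexity) / normalization coefficient."""
    if norm == 0:
        raise ValueError("normalization coefficient must be non-zero")
    return (preset - perplexity) / norm


# Four tokens, each with probability 0.5, give a perplexity of about 2.
ppl = perplexity_from_logprobs([math.log(0.5)] * 4)
reward = third_reward(ppl, preset=10.0, norm=10.0)
```

Because the perplexity is subtracted, a more fluent prediction with lower perplexity receives a higher third reward score.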
[0093] In possible ways, the error correction words include first error correction words and/or second error correction words, where a first error correction word is an English word outside a preset English vocabulary list, and a second error correction word is a colloquial English word. Determining the fourth reward score based on the complexity and the error correction words may include: determining a first quantity of error correction words, and determining at least one of a second quantity of first error correction words and a third quantity of second error correction words; determining a first sub-reward score based on the first quantity, the second quantity, and the complexity, and/or determining a second sub-reward score based on the first quantity and the third quantity; and determining the fourth reward score based on the first sub-reward score and/or the second sub-reward score.
[0094] In this embodiment, the preset vocabulary list can be determined according to the actual situation, and this disclosure does not impose any restrictions on it. For example, the preset English vocabulary list can be the K12 vocabulary list.
[0095] As mentioned earlier, error correction words are the words obtained by correcting words in the sample English text. Therefore, a first error correction word is an error correction word that is not included in the preset English vocabulary list, and a second error correction word is an error correction word that is colloquial. For example, if the error correction words are "permissible," "Yeah," "exactly," "function," and "Wow," and "permissible" is not included in the preset English vocabulary list, then "permissible" is identified as a first error correction word. Since "Yeah" and "Wow" are colloquial words, they are identified as second error correction words.
[0096] After the first and second error correction words are determined, counting the error correction words, the first error correction words, and the second error correction words yields the first quantity, the second quantity, and the third quantity, respectively.
[0097] Once the first, second, and third quantities are obtained, the first sub-reward score can be determined based on the first quantity, the second quantity, and the complexity, and/or the second sub-reward score can be determined based on the first quantity and the third quantity. The fourth reward score can then be determined based on the first and/or second sub-reward scores.
[0098] Among possible methods, determining the first sub-reward score based on the first quantity, the second quantity, and the complexity can include: Determine the first difference obtained by subtracting the second quantity from the first quantity, and divide the first difference by the first quantity to obtain the second value; determine the second difference obtained by subtracting the second coefficient from the third preset value, and multiply the second difference by the complexity to obtain the third value; determine the sum between the second value and the third value as the first sub-reward score.
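[0099] In other words, after the first quantity, the second quantity, and the complexity are obtained, they can be substituted into the following formula to obtain the first sub-reward score:
$$R_{4,1} = \frac{N_1 - N_2}{N_1} + (c_3 - \gamma) \cdot C$$
[0100] where $R_{4,1}$ denotes the first sub-reward score; $N_1$ denotes the first quantity, that is, the number of error correction words; $N_2$ denotes the second quantity, that is, the number of first error correction words; $C$ denotes the complexity; $c_3$ denotes the third preset value; and $\gamma$ denotes the second coefficient.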
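[0101] In some possible ways, different weights can be assigned to the second and third values, so that the first sub-reward score is determined using the following formula:
$$R_{4,1} = \gamma \cdot \frac{N_1 - N_2}{N_1} + (c_3 - \gamma) \cdot C$$
[0102] where $\gamma$ denotes the coefficient, $N_1$ and $N_2$ denote the first and second quantities, $C$ denotes the complexity, and $c_3$ denotes the third preset value.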
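The first sub-reward score can be sketched as follows. This is an illustration under stated assumptions: the vocabulary lookup is a case-insensitive set membership test, the complexity is supplied by the caller as a number, the default values of the second coefficient and third preset value are arbitrary, the `weighted` flag applies the coefficient to the vocabulary term as one reading of the weighted variant, and the handling of an empty word list is a choice made for this sketch.

```python
from typing import Sequence, Set


def first_sub_reward(correction_words: Sequence[str],
                     vocabulary: Set[str],
                     complexity: float,
                     gamma: float = 0.5,
                     preset: float = 1.0,
                     weighted: bool = False) -> float:
    """Share of in-vocabulary error correction words plus a complexity term."""
    complexity_term = (preset - gamma) * complexity          # third value
    first_quantity = len(correction_words)
    if first_quantity == 0:
        return complexity_term                               # assumption: no words, no vocabulary term
    # Second quantity: error correction words outside the preset vocabulary list.
    second_quantity = sum(1 for w in correction_words if w.lower() not in vocabulary)
    in_vocab_share = (first_quantity - second_quantity) / first_quantity   # second value
    if weighted:
        in_vocab_share *= gamma
    return in_vocab_share + complexity_term


words = ["permissible", "Yeah", "exactly", "function", "Wow"]
vocab = {"yeah", "exactly", "function", "wow"}
score = first_sub_reward(words, vocab, complexity=0.4)
```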
[0103] Among possible methods, determining the second sub-reward score based on the first and third quantities may include: dividing the third quantity by the first quantity to obtain a fourth value, and negating the fourth value to obtain the second sub-reward score.
[0104] In other words, after the first and third quantities are obtained, they can be substituted into the following formula to obtain the second sub-reward score:
$$R_{4,2} = -\frac{N_3}{N_1}$$
[0105] where $R_{4,2}$ denotes the second sub-reward score; $N_3$ denotes the third quantity, that is, the number of second error correction words; and $N_1$ denotes the first quantity, that is, the number of error correction words.
[0106] In one possible approach, to reduce the gap in magnitude between the second sub-reward score and the other reward scores, a scaling factor can be set to control the size of the second sub-reward score. That is, after the first and third quantities are obtained, they can be substituted into the following formula to obtain the second sub-reward score:
$$R_{4,2} = -\eta \cdot \frac{N_3}{N_1}$$
[0107] where $\eta$ denotes the scaling factor, $N_1$ denotes the first quantity, and $N_3$ denotes the third quantity.
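A sketch of the second sub-reward score and its combination into the fourth reward score is given below. It assumes that "inverting" the fourth value means negating it, so that colloquial words act as a penalty; that colloquial words are recognized by a caller-supplied set; and that the fourth reward score is the sum of the two sub-reward scores. None of these choices is fixed by this disclosure.

```python
from typing import Sequence, Set


def second_sub_reward(correction_words: Sequence[str],
                      colloquial: Set[str],
                      scale: float = 1.0) -> float:
    """Negated, scaled share of colloquial words among the error correction words."""
    first_quantity = len(correction_words)
    if first_quantity == 0:
        return 0.0                                   # assumption: nothing to penalize
    third_quantity = sum(1 for w in correction_words if w.lower() in colloquial)
    return -scale * third_quantity / first_quantity


def fourth_reward(first_sub: float, second_sub: float) -> float:
    """Assumed combination: sum of the two sub-reward scores."""
    return first_sub + second_sub


words = ["permissible", "Yeah", "exactly", "function", "Wow"]
penalty = second_sub_reward(words, {"yeah", "wow"}, scale=0.5)
```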
[0108] After the first, second, third, and fourth reward scores are obtained using the above methods, they can be substituted into the reward score formula to obtain the final reward score. The reward score, the predicted corrected text, and the correct English text can then be input into the loss function to obtain the loss function value.
[0109] The loss function can be determined according to the actual situation, and this disclosure does not impose any restrictions on it.
[0110] Based on the same technical concept, this disclosure also provides an English error correction method. As shown in Figure 6, the English error correction method may include: S601: Determine the second English text to be corrected.
[0111] The second English text to be corrected can be a sentence, a paragraph, or an article, or of course, other forms. This embodiment of the disclosure does not impose any restrictions on this.
[0112] S602: Input the second English text into the English error correction model to obtain the corrected text for the second English text. The English error correction model includes a first sub-model for correcting grammatical errors in English text, and the first sub-model is trained using the model training method described above.
[0113] For example, if the second English text to be corrected is a paragraph or an article, it can first be segmented into multiple sentences. Then, any URLs and HTML formatting codes in each sentence are removed, and the spaces between sentences, line breaks, and emoji symbols are normalized to obtain the target sentences to be input into the English error correction model. The target sentences are then input into the English error correction model in sequence to obtain the corrected text for the second English text.
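The preprocessing described above can be sketched with the standard `re` module. The regular expressions, the punctuation-based sentence splitting, and the decision to drop emoji rather than reformat them are simplifying assumptions for this example; a production system would likely use a dedicated sentence segmenter.

```python
import re
from typing import List

URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"<[^>]+>")
# Rough emoji ranges; an assumption, not an exhaustive definition.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def split_sentences(text: str) -> List[str]:
    """Split after sentence-ending punctuation followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def clean_sentence(sentence: str) -> str:
    """Remove URLs, HTML tags, and emoji, then collapse whitespace."""
    sentence = URL_RE.sub("", sentence)
    sentence = TAG_RE.sub("", sentence)
    sentence = EMOJI_RE.sub("", sentence)
    return re.sub(r"\s+", " ", sentence).strip()


def preprocess(text: str) -> List[str]:
    """Segment first, then clean each sentence; drop sentences left empty."""
    cleaned = (clean_sentence(s) for s in split_sentences(text))
    return [s for s in cleaned if s]


targets = preprocess("I like <b>apples</b>.  See https://example.com/a now!\nShe go home. 😀")
```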
[0114] In some possible approaches, the English error correction model also includes a second sub-model for detecting grammatical errors in English text. Accordingly, inputting the second English text into the English error correction model to obtain the corrected text for the second English text may include: detecting grammatical errors in the second English text using the second sub-model to obtain a detection result; and, when the detection result indicates that the second English text contains grammatical errors, correcting the grammatical errors in the second English text using the first sub-model to obtain the corrected text for the second English text.
[0115] For example, reference can be made to steps S701 to S707 shown in Figure 7:
S701: Obtain the English essay text to be corrected;
S702: Perform sentence segmentation on the English essay text to be corrected to obtain multiple English sentences;
S703: Preprocess each English sentence to obtain a target English sentence; the preprocessing includes removing URLs and HTML formatting codes from the English sentence, and normalizing the spaces between sentences, line breaks, emoji symbols, and the like;
S704: Input each target English sentence into the second sub-model for grammatical error detection to obtain a detection result;
S705: If the detection result indicates that the target English sentence contains a grammatical error, input the target English sentence into the first sub-model, and correct the grammatical error in the target English sentence using the first sub-model to obtain the corrected text for the target English sentence;
S706: After the corrected text of each target English sentence is obtained, post-process each corrected text to obtain the target content to be output for the second English text. The post-processing includes: aligning each corrected text with its corresponding target English sentence to obtain the location, content, and error type of each modification; if the error type is a preset type (e.g., capitalization errors and sentence-ending punctuation errors), validating and correcting the errors of the preset type; and, after correction, formatting the output content to obtain the target content to be output for the second English text. The output formatting includes: (1) adding text descriptions of the modification location, the modification content, and the error type to make the error easier to understand, for example: [if] is changed to [If], and the type is [case error]; (2) combining each target English sentence with its corresponding corrected text.
[0116] S707: Output target content.
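The detect-then-correct flow of S704 to S707 can be sketched as follows. The `detect` and `correct` callables are hypothetical stand-ins for the second and first sub-models, the alignment uses a word-level diff from the standard `difflib` module, and the only error type recognized here is the case error from the example above; all of these are assumptions of this sketch.

```python
import difflib
from typing import Callable, Dict, List


def extract_edits(source: str, corrected: str) -> List[Dict]:
    """Align corrected text with its source and list each modification."""
    a, b = source.split(), corrected.split()
    edits = []
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old, new = " ".join(a[i1:i2]), " ".join(b[j1:j2])
        kind = "case error" if old and old.lower() == new.lower() else "other"
        edits.append({"position": i1, "old": old, "new": new, "type": kind})
    return edits


def describe(edit: Dict) -> str:
    """Human-readable description of one modification."""
    return f"[{edit['old']}] is changed to [{edit['new']}], and the type is [{edit['type']}]"


def correct_sentences(sentences: List[str],
                      detect: Callable[[str], bool],
                      correct: Callable[[str], str]) -> List[Dict]:
    """Run correction only on sentences flagged by the detector."""
    results = []
    for sentence in sentences:
        fixed = correct(sentence) if detect(sentence) else sentence
        results.append({"source": sentence, "corrected": fixed,
                        "edits": extract_edits(sentence, fixed)})
    return results
```

Sentences that the detector does not flag pass through unchanged, so they produce an empty edit list and cannot be miscorrected.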
[0117] The above technical solution allows for the initial detection of grammatical errors in the text to be corrected using a second sub-model. Only when the detection results indicate the presence of grammatical errors does the first sub-model perform grammatical correction. Compared to directly correcting grammatical errors in related technologies, this approach improves accuracy. Specifically, directly correcting grammatical errors might lead to incorrect modifications of otherwise correct text, especially when the text has a complex grammatical structure. In this solution, prior grammatical error detection identifies which parts actually have grammatical problems, thus avoiding miscorrection of correct parts and improving the accuracy of error correction.
[0118] Based on the same technical concept, embodiments of this disclosure also provide a training data acquisition device. As shown in Figure 8, the training data acquisition device 800 may include: a first acquisition module 801, used to acquire the first English text to be corrected; a first processing module 802, used to correct the first English text using a large model to obtain a first corrected text, and to determine the error correction score of the first corrected text based on the first corrected text and the first English text; and a first determining module 803, used to determine training data for training the target model based on the error correction score of the first corrected text, wherein the target model is used to correct erroneous English text.
[0119] The training data acquisition device 800 described above can acquire the first English text to be corrected, and then correct the first English text using a large model to obtain the first corrected text. Based on the first corrected text and the first English text, a correction score is determined for the first corrected text. Furthermore, training data for training the target model can be determined based on the correction score of the first corrected text. This allows for automatic acquisition of training data, which is more efficient than the manual annotation methods used in related technologies. In addition, determining training data through the correction score allows for the selection of corrected texts with better correction performance as training data, thereby improving the quality of the training data and ultimately enhancing the performance of the model trained on that data.
[0120] In one possible manner, the first determining module 803 may be used to determine modification information for the first English text when the error correction score of the first corrected text is less than the error correction score threshold, and to determine training data for training the target model based on the modification information, the error correction score of the first corrected text, and the first English text.
[0121] In one possible manner, the first determining module 803 may include: a first processing submodule, used to input the modification information, the error correction score of the first corrected text, and the first English text into the large model to obtain the second corrected text, and to determine the error correction score of the second corrected text based on the second corrected text and the first English text; and a first determining submodule, used to determine a summary text summarizing the historical corrections of the first English text when the error correction score of the second corrected text is less than the error correction score threshold, and to determine training data for training the target model based on the summary text and the first English text.
[0122] In some possible ways, the first determining submodule may include: a processing unit, used to input the summary text and the first English text into the large model to obtain the third corrected text, and to determine the error correction score of the third corrected text based on the third corrected text and the first English text; a first determining unit, configured to determine a fourth corrected text and a fifth corrected text when the error correction score of the third corrected text is less than the error correction score threshold, wherein the fourth corrected text is the corrected text with the highest error correction score among the historical corrected texts for the first English text, and the fifth corrected text is the corrected text with the lowest error correction score among the historical corrected texts for the first English text; and a second determining unit, used to determine training data for training the target model based on the first English text, the third corrected text, the fourth corrected text, and the fifth corrected text.
[0123] In some possible ways, the second determining unit may include: a first determining subunit, used to determine a first text difference between the third corrected text and the fourth corrected text, and to determine a second text difference between the third corrected text and the fifth corrected text; a processing subunit, used to input the first text difference, the second text difference, and the first English text into the large model to obtain a sixth corrected text, and to determine the error correction score of the sixth corrected text based on the sixth corrected text and the first English text; and a second determining subunit, used to take the sixth corrected text and the first English text as training data when the error correction score of the sixth corrected text is greater than or equal to the error correction score threshold.
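As a non-limiting illustration, the escalating correction rounds handled by the first determining module 803 and its sub-units might be sketched in Python as follows. Everything named here is an assumption made for illustration: `correct` stands in for the large model, `score` stands in for the error correction scoring, the prompt wording is invented, a word-level diff is used as a stand-in for the modification information and the text differences, and the summary text is a simple list of earlier scores. Accepting a corrected text as soon as its score reaches the threshold in an earlier round is also an assumption; the embodiments above state this explicitly only for the sixth corrected text.

```python
import difflib
from typing import Callable, List, Optional, Tuple


def text_difference(a: str, b: str) -> str:
    """Word-level additions and deletions turning a into b (assumed diff format)."""
    diff = difflib.ndiff(a.split(), b.split())
    return " ".join(tok for tok in diff if tok.startswith(("+ ", "- ")))


def acquire_training_pair(
    text: str,
    correct: Callable[[str], str],       # hypothetical large-model call: prompt -> corrected text
    score: Callable[[str, str], float],  # hypothetical scorer: (corrected, original) -> score
    threshold: float,
) -> Optional[Tuple[str, str]]:
    """Return (corrected text, original text) once a score reaches the threshold, else None."""
    history: List[Tuple[str, float]] = []  # historical corrected texts with their scores

    def attempt(prompt: str) -> Tuple[str, float]:
        corrected = correct(prompt)
        s = score(corrected, text)
        history.append((corrected, s))
        return corrected, s

    # Round 1: plain correction by the large model.
    first, s = attempt(f"Correct the errors in: {text}")
    if s >= threshold:
        return first, text

    # Round 2: feed back modification information and the previous score.
    modification = text_difference(text, first)
    second, s = attempt(
        f"Text: {text}\nPrevious edits: {modification}\nPrevious score: {s:.2f}"
    )
    if s >= threshold:
        return second, text

    # Round 3: feed back a summary of the historical error correction.
    summary = "; ".join(f"attempt {i + 1} scored {sc:.2f}" for i, (_, sc) in enumerate(history))
    third, s = attempt(f"Text: {text}\nSummary of earlier attempts: {summary}")
    if s >= threshold:
        return third, text

    # Round 4: contrast the third text with the best and worst historical texts.
    fourth = max(history, key=lambda h: h[1])[0]  # highest-scoring historical text
    fifth = min(history, key=lambda h: h[1])[0]   # lowest-scoring historical text
    first_diff = text_difference(third, fourth)
    second_diff = text_difference(third, fifth)
    sixth, s = attempt(
        f"Text: {text}\nDiff to best attempt: {first_diff}\nDiff to worst attempt: {second_diff}"
    )
    if s >= threshold:
        return sixth, text
    return None  # no corrected text qualified; the sample is not used
```

In this sketch a sample that never reaches the threshold is simply dropped, which is one possible policy among several.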
[0124] Regarding the training data acquisition device 800 in the above embodiments, the specific methods by which each module performs operations have been described in detail in the embodiments related to the method, and will not be elaborated here.
[0125] Based on the same technical concept, this disclosure also provides a model training device. As shown in Figure 9, the model training device 900 may include: a second acquisition module 901, used to acquire training data, wherein the training data is obtained through the training data acquisition device 800; and a training module 902, used to train the initial error correction model based on the training data to obtain an English error correction model for correcting erroneous English text.
[0126] In one possible manner, the initial error correction model includes a grammar error correction model for correcting grammatical errors in English text, and the training data includes sample corrected text and the correct English text corresponding to the sample corrected text. Accordingly, the training module 902 may include: a second processing submodule, used to input the sample corrected text into the grammar error correction model to obtain predicted corrected text; a second determining submodule, used to determine, through a preset reward model, a reward score for the predicted corrected text based on the predicted corrected text and the correct English text; and a third processing submodule, used to determine a loss function value based on the reward score, the predicted corrected text, and the correct English text, and to update the parameters of the grammar error correction model based on the loss function value.
[0127] In some possible ways, the second determining submodule may include: a third determining unit, used to determine the error correction evaluation information of the predicted corrected text, including at least one of the semantic difference information between the predicted corrected text and the correct English text, the perplexity of the predicted corrected text, the complexity of the predicted corrected text, and the error correction words of the predicted corrected text, wherein the error correction evaluation information is determined based on the textual differences between the predicted corrected text and the correct English text; and a fourth determining unit, used to determine a reward score for the predicted corrected text based on at least one of the error correction evaluation information, the semantic difference information, the perplexity, the complexity, and the error correction words.
[0128] In some possible ways, the fourth determining unit may include: a third determining subunit, used to determine a first reward score based on the error correction evaluation information; a fourth determining subunit, used to determine a second reward score based on the semantic difference information; a fifth determining subunit, used to determine a third reward score based on the perplexity of the predicted corrected text; a sixth determining subunit, used to determine a fourth reward score based on the complexity and the error correction words; and a seventh determining subunit, used to determine the reward score for the predicted corrected text based on the first reward score, the second reward score, the third reward score, and the fourth reward score.
[0129] In some possible embodiments, the error correction evaluation information includes precision and recall, and accordingly, the third determining subunit may include: a first determining component, used to determine a first product between the recall and a first coefficient, and to determine a first sum between the first product and the recall; a second determining component, used to determine a second product between the precision and the recall, and to divide the second product by the first sum to obtain a first value; and a third determining component, used to determine a second sum between a first preset value and the first coefficient, and to use the product of the second sum and the first value as the first reward score.
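By way of example only, the arithmetic performed by the first, second, and third determining components could be written as the following Python sketch. It follows the wording of the paragraph above literally. The function name, the default values of the first coefficient and the first preset value, and the handling of a zero denominator are assumptions for illustration and are not specified in this disclosure.

```python
def first_reward_score(
    precision: float,
    recall: float,
    first_coefficient: float = 0.25,  # assumed value, not given in the disclosure
    first_preset_value: float = 1.0,  # assumed value, not given in the disclosure
) -> float:
    """First reward score computed from precision and recall, step by step."""
    first_product = recall * first_coefficient   # recall x first coefficient
    first_sum = first_product + recall           # first product + recall
    if first_sum == 0:
        return 0.0                               # assumed convention to avoid dividing by zero
    second_product = precision * recall          # precision x recall
    first_value = second_product / first_sum     # second product / first sum
    second_sum = first_preset_value + first_coefficient
    return second_sum * first_value              # (preset value + coefficient) x first value
```

Keeping each intermediate quantity in its own variable makes it straightforward to compare the code against the named components.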
[0130] In one possible manner, the semantic difference information includes the semantic similarity between the predicted corrected text and the correct English text, as well as a keyword ratio. The keyword ratio characterizes the ratio of the number of keywords in the predicted corrected text to the number of keywords in the correct English text. Accordingly, the fourth determining subunit can use the sum of the semantic similarity and the keyword ratio as the second reward score.
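A minimal Python sketch of the second reward score is given below for illustration. How keywords are identified and how the semantic similarity is obtained are not specified in this paragraph, so the sketch assumes a caller-supplied keyword set and a precomputed similarity value. The function names and the simple word tokenization are likewise assumptions.

```python
import re
from typing import Set


def keyword_ratio(predicted: str, reference: str, keywords: Set[str]) -> float:
    """Keywords found in the predicted text divided by keywords found in the reference."""
    def count(text: str) -> int:
        words = re.findall(r"[a-z']+", text.lower())  # assumed tokenization
        return sum(1 for w in words if w in keywords)

    reference_count = count(reference)
    if reference_count == 0:
        return 0.0  # assumed convention when the reference contains no keywords
    return count(predicted) / reference_count


def second_reward_score(
    semantic_similarity: float,  # assumed to be computed elsewhere, e.g. by an embedding model
    predicted: str,
    reference: str,
    keywords: Set[str],
) -> float:
    """Sum of the semantic similarity and the keyword ratio."""
    return semantic_similarity + keyword_ratio(predicted, reference, keywords)
```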
[0131] In one possible manner, the fifth determining subunit may be used to determine a perplexity difference obtained by subtracting the perplexity from a second preset value, and to divide the perplexity difference by a preset normalization coefficient to obtain the third reward score.
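For illustration, the third reward score might be computed as in the following Python sketch. The default second preset value and normalization coefficient are assumed numbers. The helper that derives perplexity from per-token log-probabilities uses the common definition (the exponential of the negative mean log-probability), which is an assumption here because this disclosure does not state how the perplexity is obtained.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity as exp of the negative mean natural-log probability (assumed definition)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def third_reward_score(
    ppl: float,
    second_preset_value: float = 100.0,        # assumed value
    normalization_coefficient: float = 100.0,  # assumed value
) -> float:
    """(second preset value - perplexity) / normalization coefficient."""
    perplexity_difference = second_preset_value - ppl
    return perplexity_difference / normalization_coefficient
```

Under these assumed defaults, a lower perplexity yields a higher score, and a perplexity above the second preset value yields a negative score.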
[0132] In a possible manner, the error correction words include first error correction words and / or second error correction words, wherein the first error correction words denote English words located outside a preset English vocabulary, and the second error correction words denote colloquial English words. Accordingly, the sixth determining subunit may include: a fourth determining component, used to determine a first quantity of the error correction words, and to determine at least one of a second quantity of the first error correction words and a third quantity of the second error correction words; a fifth determining component, used to determine a first sub-reward score based on the first quantity, the second quantity, and the complexity, and / or to determine a second sub-reward score based on the first quantity and the third quantity; and a sixth determining component, used to determine the fourth reward score based on the first sub-reward score and / or the second sub-reward score.
[0133] In some possible ways, the fifth determining component may include: a first determining sub-component, used to determine a first difference obtained by subtracting the second quantity from the first quantity, and to divide the first difference by the first quantity to obtain a second value; a second determining sub-component, used to determine a second difference obtained by subtracting a second coefficient from a third preset value, and to multiply the second difference by the complexity to obtain a third value; and a third determining sub-component, used to determine the sum of the second value and the third value as the first sub-reward score.
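The computation of the first sub-reward score described above can be sketched in Python as follows, purely as an example. The default third preset value and second coefficient are assumed numbers, and returning a second value of 1.0 when there are no error correction words at all is an assumed convention for the case where the division would otherwise be undefined.

```python
def first_sub_reward_score(
    first_quantity: int,    # number of error correction words
    second_quantity: int,   # number of those words outside the preset vocabulary
    complexity: float,
    third_preset_value: float = 1.0,  # assumed value
    second_coefficient: float = 0.5,  # assumed value
) -> float:
    """(first - second) / first + (third preset value - second coefficient) * complexity."""
    if first_quantity == 0:
        second_value = 1.0  # assumed: no error correction words, so none are out of vocabulary
    else:
        first_difference = first_quantity - second_quantity
        second_value = first_difference / first_quantity
    second_difference = third_preset_value - second_coefficient
    third_value = second_difference * complexity
    return second_value + third_value
```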
[0134] In one possible manner, the fifth determining component can be used to invert the fourth value obtained by dividing the third quantity by the first quantity to obtain the second sub-reward score.
[0135] Regarding the model training device 900 in the above embodiments, the specific methods by which each module performs operations have been described in detail in the embodiments related to the method, and will not be elaborated here.
[0136] Based on the same technical concept, this disclosure also provides an English error correction device. As shown in Figure 10, the English error correction device 1000 includes: a second determination module 1001, used to determine the second English text to be corrected; and a second processing module 1002, used to input the second English text into an English error correction model to obtain corrected text for the second English text, wherein the English error correction model includes a first sub-model for correcting grammatical errors in English text, and the first sub-model is trained by the model training device 900 described above.
[0137] In one possible manner, the English error correction model further includes a second sub-model for detecting grammatical errors in English text, and correspondingly, the second processing module 1002 may include: a detection submodule, used to perform grammatical error detection on the second English text through the second sub-model to obtain a detection result; and an error correction submodule, used to correct the grammatical errors in the second English text through the first sub-model when the detection result indicates that there are grammatical errors in the second English text, so as to obtain the corrected text for the second English text.
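As an illustrative sketch only, the detect-then-correct flow of the second processing module 1002 might look like the following Python code. The two callables are hypothetical stand-ins for the second sub-model (detection) and the first sub-model (correction). Returning the input unchanged when no grammatical error is detected is an assumption, since the paragraph above only describes the case in which an error is detected.

```python
from typing import Callable


def correct_english_text(
    text: str,
    detect_errors: Callable[[str], bool],  # hypothetical second sub-model: True if errors exist
    correct_errors: Callable[[str], str],  # hypothetical first sub-model: returns corrected text
) -> str:
    """Run detection first and invoke the correction sub-model only when needed."""
    if detect_errors(text):
        return correct_errors(text)
    return text  # assumed behavior when no grammatical error is detected


def toy_detector(text: str) -> bool:
    """Toy stand-in for the detection sub-model, for demonstration only."""
    return "he go " in text


def toy_corrector(text: str) -> str:
    """Toy stand-in for the correction sub-model, for demonstration only."""
    return text.replace("he go ", "he goes ")
```

For example, `correct_english_text("he go home", toy_detector, toy_corrector)` returns `"he goes home"`, while a text the toy detector does not flag is returned as is.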
[0138] Regarding the English error correction device 1000 in the above embodiments, the specific methods by which each module performs its operations have been described in detail in the embodiments related to the method, and will not be elaborated here.
[0139] Based on the same technical concept, embodiments of this disclosure also provide a non-transitory computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the steps of a training data acquisition method, a model training method, or an English error correction method.
[0140] Based on the same technical concept, this disclosure also provides a computer program product, including a computer program that, when executed by a processor, implements the steps of a training data acquisition method, a model training method, or an English error correction method.
[0141] Based on the same technical concept, this disclosure also provides an electronic device, including: a memory on which a computer program is stored; and a processor, used to execute the computer program stored in the memory to implement the steps of the training data acquisition method, the model training method, or the English error correction method.
[0142] Figure 11 is a block diagram illustrating an electronic device according to an exemplary embodiment. As shown in Figure 11, the electronic device 1100 may include a processor 1101 and a memory 1102. The electronic device 1100 may also include one or more of a multimedia component 1103, an input / output (I / O) interface 1104, and a communication component 1105.
[0143] The processor 1101 controls the overall operation of the electronic device 1100 to complete all or part of the steps in the training data acquisition method, model training method, or English error correction method described above. The memory 1102 stores various types of data to support the operation of the electronic device 1100. This data may include, for example, instructions for any application or method operating on the electronic device 1100, and application-related data such as contact data, sent and received messages, pictures, audio, video, etc. The memory 1102 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk. The multimedia component 1103 may include a screen and an audio component. The screen may be, for example, a touchscreen, and the audio component is used to output and / or input audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signals may be further stored in the memory 1102 or transmitted via the communication component 1105. The audio component also includes at least one speaker for outputting audio signals. The I / O interface 1104 provides an interface between the processor 1101 and other interface modules, such as a keyboard, mouse, buttons, etc. These buttons may be virtual or physical buttons. The communication component 1105 is used for wired or wireless communication between the electronic device 1100 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, 4G, NB-IoT, eMTC, or other 5G technologies, or a combination of one or more of them, which is not limited here. Accordingly, the communication component 1105 may include a Wi-Fi module, a Bluetooth module, an NFC module, etc.
[0144] In an exemplary embodiment, the electronic device 1100 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components to perform the training data acquisition method, model training method, or English error correction method described above.
[0145] In another exemplary embodiment, a computer-readable storage medium including program instructions is also provided. When executed by a processor, these program instructions implement the steps of the training data acquisition method, model training method, or English error correction method described above. For example, the computer-readable storage medium may be the memory 1102 including the program instructions described above. These program instructions may be executed by the processor 1101 of the electronic device 1100 to complete the training data acquisition method, model training method, or English error correction method described above.
[0146] Figure 12 is a block diagram illustrating an electronic device 1200 according to an exemplary embodiment. For example, the electronic device 1200 may be provided as a server. Referring to Figure 12, the electronic device 1200 includes one or more processors 1222 and a memory 1232 for storing computer programs executable by the processor 1222. The computer program stored in the memory 1232 may include one or more modules, each corresponding to a set of instructions. Furthermore, the processor 1222 may be configured to execute the computer program to perform the aforementioned training data acquisition method, model training method, or English error correction method.
[0147] Additionally, the electronic device 1200 may also include a power supply component 1226 and a communication component 1250. The power supply component 1226 can be configured to perform power management of the electronic device 1200, and the communication component 1250 can be configured to enable communication of the electronic device 1200, such as wired or wireless communication. Furthermore, the electronic device 1200 may also include an input / output (I / O) interface 1258. The electronic device 1200 can operate on an operating system stored in the memory 1232.
[0148] In another exemplary embodiment, a computer-readable storage medium including program instructions is also provided. When executed by a processor, these program instructions implement the steps of the training data acquisition method, model training method, or English error correction method described above. For example, the non-transitory computer-readable storage medium may be the memory 1232 including the program instructions described above. These program instructions may be executed by the processor 1222 of the electronic device 1200 to complete the training data acquisition method, model training method, or English error correction method described above.
[0149] In another exemplary embodiment, a computer program product is also provided, the computer program product comprising a computer program executable by a programmable device, the computer program having a code portion for performing the training data acquisition method, model training method, or English error correction method described above when executed by the programmable device.
[0150] The preferred embodiments of this disclosure have been described in detail above with reference to the accompanying drawings. However, this disclosure is not limited to the specific details of the above embodiments. Within the scope of the technical concept of this disclosure, various simple modifications can be made to the technical solutions of this disclosure, and these simple modifications all fall within the protection scope of this disclosure.
[0151] It should also be noted that the various specific technical features described in the above specific embodiments can be combined in any suitable manner without contradiction. In order to avoid unnecessary repetition, this disclosure will not describe the various possible combinations separately.
[0152] Furthermore, the various embodiments of this disclosure can be combined in any way; as long as such combinations do not depart from the spirit of this disclosure, they should also be regarded as content disclosed herein.
Claims
1. A method for acquiring training data, characterized in that, The training data acquisition method includes: Acquire the first English text to be corrected; The first English text is corrected using a large model to obtain the first corrected text, and the error correction score of the first corrected text is determined based on the first corrected text and the first English text. Based on the error correction score of the first corrected text, training data for training the target model is determined, and the target model is used to correct erroneous English text.
2. The training data acquisition method according to claim 1, characterized in that, The step of determining the training data for training the target model based on the error correction score of the first corrected text includes: If the error correction score of the first corrected text is less than the error correction score threshold, modification information for the first English text is determined, and training data for training the target model is determined based on the modification information, the error correction score of the first corrected text, and the first English text.
3. The training data acquisition method according to claim 2, characterized in that, The step of determining training data for training the target model based on the modification information, the error correction score of the first corrected text, and the first English text includes: The modification information, the error correction score of the first corrected text, and the first English text are input into the large model to obtain the second corrected text, and the error correction score of the second corrected text is determined based on the second corrected text and the first English text. If the error correction score of the second corrected text is less than the error correction score threshold, a summary text is determined to summarize the historical error correction situation of the first English text, and training data for training the target model is determined based on the summary text and the first English text.
4. The training data acquisition method according to claim 3, characterized in that, The step of determining the training data for training the target model based on the summary text and the first English text includes: The summary text and the first English text are input into the large model to obtain the third corrected text, and the error correction score of the third corrected text is determined based on the third corrected text and the first English text. If the error correction score of the third corrected text is less than the error correction score threshold, a fourth corrected text and a fifth corrected text are determined. The fourth corrected text is the corrected text with the highest corresponding error correction score among the historical corrected texts of the first English text, and the fifth corrected text is the corrected text with the lowest corresponding error correction score among the historical corrected texts of the first English text. Based on the first English text, the third corrected text, the fourth corrected text, and the fifth corrected text, training data for training the target model is determined.
5. The training data acquisition method according to claim 4, characterized in that, The step of determining training data for training the target model based on the first English text, the third corrected text, the fourth corrected text, and the fifth corrected text includes: Determine a first text difference between the third corrected text and the fourth corrected text, and determine a second text difference between the third corrected text and the fifth corrected text; The first text difference, the second text difference, and the first English text are input into the large model to obtain the sixth corrected text, and the error correction score of the sixth corrected text is determined based on the sixth corrected text and the first English text. If the error correction score of the sixth corrected text is greater than or equal to the error correction score threshold, the sixth corrected text and the first English text are used as training data.
6. A model training method, characterized in that, The model training method includes: Acquire training data, wherein the training data is obtained by the training data acquisition method according to any one of claims 1-5; The initial error correction model is trained based on the training data to obtain an English error correction model for correcting erroneous English text.
7. The model training method according to claim 6, characterized in that, The initial error correction model includes a grammar error correction model for correcting grammatical errors in English text. The training data includes sample corrected text and the corresponding correct English text. Training the initial error correction model based on the training data to obtain an English error correction model for correcting erroneous English text includes: The sample corrected text is input into the grammar error correction model to obtain the predicted corrected text; The reward score for the predicted corrected text is determined based on the predicted corrected text and the correct English text using a preset reward model. Based on the reward score, the predicted corrected text, and the correct English text, a loss function value is determined, and the parameters of the grammar error correction model are updated based on the loss function value.
8. The model training method according to claim 7, characterized in that, The step of determining the reward score for the predicted corrected text based on the predicted corrected text and the correct English text using a preset reward model includes: Determine the error correction evaluation information of the predicted corrected text, including at least one of the semantic difference information between the predicted corrected text and the correct English text, the perplexity of the predicted corrected text, the complexity of the predicted corrected text, and the error correction words of the predicted corrected text, wherein the error correction evaluation information is determined based on the textual differences between the predicted corrected text and the correct English text. The reward score for the predicted corrected text is determined based on at least one of the error correction evaluation information, the semantic difference information, the perplexity, the complexity, and the error correction words.
9. The model training method according to claim 8, characterized in that, The step of determining a reward score for the predicted corrected text based on at least one of the error correction evaluation information, the semantic difference information, the perplexity, the complexity, and the error correction words includes: Based on the error correction evaluation information, the first reward score is determined; The second reward score is determined based on the semantic difference information. The third reward score is determined based on the perplexity of the predicted corrected text; The fourth reward score is determined based on the complexity and the error correction words. The reward score for the predicted corrected text is determined based on the first reward score, the second reward score, the third reward score, and the fourth reward score.
10. The model training method according to claim 9, characterized in that, The error correction evaluation information includes precision and recall. Determining the first reward score based on the error correction evaluation information includes: Determine a first product between the recall and a first coefficient, and determine a first sum between the first product and the recall; Determine a second product between the precision and the recall, and divide the second product by the first sum to obtain a first value; Determine a second sum between a first preset value and the first coefficient, and use the product of the second sum and the first value as the first reward score.
11. The model training method according to claim 9, characterized in that, The semantic difference information includes the semantic similarity between the predicted corrected text and the correct English text, as well as the keyword ratio. The keyword ratio represents the ratio of the number of keywords in the predicted corrected text to the number of keywords in the correct English text. Determining the second reward score based on the semantic difference information includes: The sum of the semantic similarity and the keyword ratio is used as the second reward score.
12. The model training method according to claim 9, characterized in that, The step of determining the third reward score based on the perplexity of the predicted corrected text includes: The perplexity difference obtained by subtracting the perplexity from the second preset value is determined, and the perplexity difference is divided by the preset normalization coefficient to obtain the third reward score.
13. The model training method according to claim 9, characterized in that, The error correction words include first error correction words and / or second error correction words. The first error correction words are used to represent English words located outside the preset English vocabulary list, and the second error correction words are used to represent colloquial English words. The step of determining the fourth reward score based on the complexity and the error correction words includes: Determine a first quantity of the error correction words, and determine at least one of a second quantity of the first error correction words and a third quantity of the second error correction words; A first sub-reward score is determined based on the first quantity, the second quantity, and the complexity, and / or a second sub-reward score is determined based on the first quantity and the third quantity; The fourth reward score is determined based on the first sub-reward score and / or the second sub-reward score.
14. The model training method according to claim 13, characterized in that, The step of determining the first sub-reward score based on the first quantity, the second quantity, and the complexity includes: Determine the first difference obtained by subtracting the second quantity from the first quantity, and divide the first difference by the first quantity to obtain the second value; Determine the second difference obtained by subtracting the second coefficient from the third preset value, and multiply the second difference by the complexity to obtain the third value; The sum of the second value and the third value is determined as the first sub-reward score.
15. The model training method according to claim 13, characterized in that, The step of determining the second sub-reward score based on the first quantity and the third quantity includes: The fourth value obtained by dividing the third quantity by the first quantity is inverted to obtain the second sub-reward score.
16. An English error correction method, characterized in that, The English error correction method includes: Identify the second English text that needs correction; The second English text is input into the English error correction model to obtain the error-corrected text for the second English text. The English error correction model includes a first sub-model for correcting grammatical errors in the English text. The first sub-model is trained by the model training method as described in any one of claims 6-15.
17. The English error correction method according to claim 16, characterized in that, The English error correction model further includes a second sub-model for detecting grammatical errors in English text. The step of inputting the second English text into the English error correction model to obtain the corrected text includes: The second sub-model is used to detect grammatical errors in the second English text, and the detection result is obtained. When the detection result indicates that there is a grammatical error in the second English text, the first sub-model is used to correct the grammatical error in the second English text to obtain the corrected text for the second English text.
18. A training data acquisition device, characterized in that, The training data acquisition device includes: The first acquisition module is used to acquire the first English text to be corrected. The first processing module is used to correct the first English text using a large model to obtain a first corrected text, and to determine the error correction score of the first corrected text based on the first corrected text and the first English text. The first determining module is used to determine training data for training the target model based on the error correction score of the first corrected text, wherein the target model is used to correct erroneous English text.
19. A model training device, characterized in that, The model training device includes: The second acquisition module is used to acquire training data, which is obtained by the training data acquisition device according to claim 18. The training module is used to train the initial error correction model based on the training data to obtain an English error correction model for correcting erroneous English text.
20. An English error correction device, characterized in that, The English error correction device includes: The second determination module is used to determine the second English text to be corrected. The second processing module is used to input the second English text into the English error correction model to obtain the error-corrected text for the second English text, wherein the English error correction model includes a first sub-model for correcting grammatical errors in the English text, and the first sub-model is trained by the model training device according to claim 19.
21. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, When executed by a processor, the computer program performs the steps of the method described in any one of claims 1-17.
22. An electronic device, characterized in that, The electronic device includes: A memory on which computer programs are stored; A processor for executing the computer program in the memory to implement the steps of the method according to any one of claims 1-17.
23. A computer program product, comprising a computer program, characterized in that, When executed by a processor, the computer program implements the steps of the method according to any one of claims 1-17.