Translation model training method, translation method, device, medium and electronic equipment

Training the translation model on sample texts and sample thought chains for classical Chinese allusions and idioms addresses the distortion that arises when such expressions are translated, yielding more accurate and natural translations and improving translation quality and user satisfaction.

CN122242538A · Pending · Publication Date: 2026-06-19 · NEW ORIENTAL EDUCATION & TECH GRP CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
NEW ORIENTAL EDUCATION & TECH GRP CO LTD
Filing Date
2026-03-30
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing translation systems suffer from semantic failures when handling classical allusions and idioms, leading to a decline in translation quality and accuracy. In particular, the end-to-end Transformer translation method still has shortcomings in handling classical allusions and idioms.

Method used

The translation model is trained by acquiring multiple sets of training samples, including sample texts to be translated, sample thought chains, and sample translations. Different training methods are used for allusions and words with and without corresponding translations to construct the target translation model.

Benefits of technology

It improves the accuracy of the translation model, enhances the translation quality of classical Chinese words and phrases, provides more accurate and natural translation results, meets users' demand for high-quality translation, reduces the need for manual intervention, and adapts to large-scale translation needs.



Abstract

This disclosure relates to a translation model training method, translation method, apparatus, medium, and electronic device. The translation model training method includes: acquiring multiple sets of training samples, including sample text to be translated, sample thought chains of the sample text to be translated, and sample translations; the sample text to be translated includes sample allusions and phrases; using the sample text to be translated as input parameters to the model, and the sample thought chains and sample translations as output parameters to train the translation model; and determining the translation model that meets the training termination condition as the target translation model. This improves the accuracy of the target translation model, thereby enhancing the translation quality of texts containing allusions and phrases, providing more accurate and natural translation results, meeting users' needs for high-quality translation, and improving user satisfaction. Furthermore, it can automatically handle complex cultural elements, reduce the need for manual intervention, improve translation quality, and meet the needs of large-scale translation.

Description

Technical Field

[0001] This disclosure relates to the field of computer technology, and more specifically, to a translation model training method, translation method, apparatus, medium, and electronic equipment.

Background Technology

[0002] With globalization, the demand for cross-cultural communication is growing, and the need to accurately convey cultural connotations is becoming increasingly urgent. However, existing translation systems are insufficient in terms of cultural adaptability, leading to a decline in user experience. This is especially true when dealing with cultural elements such as classical allusions and idioms. Because of their historical origins and extended meanings, literal translation of classical allusions and idioms can lead to a failure in semantic communication, severely impacting the quality and accuracy of the translation—in other words, resulting in translation distortion.

[0003] Currently, various methods have been tried to address the problem of inaccurate translation of classical Chinese allusions and idioms, including human translation, rule-based matching, and end-to-end Transformer-based translation. However, human translation suffers from low efficiency, high cost, and difficulty in scaling. Rule-based matching lacks flexibility and cannot dynamically identify and handle complex cultural elements. While end-to-end Transformer-based translation has certain advantages in semantic understanding, it still has shortcomings in processing classical Chinese allusions and idioms.

Summary of the Invention

[0004] The purpose of this disclosure is to provide a translation model training method, translation method, apparatus, medium, and electronic device.

[0005] To achieve the above objectives, the first aspect of this disclosure provides a translation model training method, the method comprising: obtaining multiple sets of training samples, the training samples including sample text to be translated, sample thought chains of the sample text to be translated, and sample translations generated based on the sample thought chains, where the sample text to be translated includes sample allusions and words; training the translation model by using the sample text to be translated as the model input parameters and the sample thought chain and the sample translation as the model output parameters; and determining the translation model that meets the training termination condition as the target translation model.

[0006] A second aspect of this disclosure provides a translation method, the method comprising: obtaining the text to be translated, which includes at least one target allusion word; and obtaining the target translation of the text to be translated based on the text to be translated and the target translation model, where the target translation model is obtained according to the translation model training method described in the first aspect of this disclosure.

[0007] A third aspect of this disclosure provides a translation model training apparatus, the translation model training apparatus comprising: a first acquisition module, used to acquire multiple sets of training samples, the training samples including sample text to be translated, sample thought chains of the sample text to be translated, and sample translations generated based on the sample thought chains, where the sample text to be translated includes sample allusions and words; a first training module, used to train the translation model by taking the sample text to be translated as the model input parameters and the sample thought chain and the sample translation as the model output parameters; and a first determination module, used to determine the translation model that meets the training termination condition as the target translation model.

[0008] A fourth aspect of this disclosure provides a translation apparatus, the translation apparatus comprising: a second acquisition module, used to acquire the text to be translated, wherein the text to be translated includes at least one target allusion word; and a second determining module, used to obtain the target translation of the text to be translated based on the text to be translated and the target translation model, where the target translation model is obtained according to the translation model training method described in the first aspect of this disclosure.

[0009] The fifth aspect of this disclosure provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the translation model training method described in the first aspect of this disclosure, and / or implements the steps of the translation method described in the second aspect of this disclosure.

[0010] A sixth aspect of this disclosure provides an electronic device, comprising: a memory on which a computer program is stored; and a processor configured to execute the computer program in the memory to implement the steps of the translation model training method of the first aspect of this disclosure, and / or to implement the steps of the translation method of the second aspect of this disclosure.

[0011] The above technical solution trains the translation model by using sample text containing allusions and related terms as input parameters, and sample thought chains and translated texts as output parameters. This improves the accuracy of the target translation model, enhancing the translation quality of texts containing allusions and related terms during subsequent translations. This results in more accurate and natural translations, meeting users' demands for high-quality translations and increasing user satisfaction. Furthermore, it automates the processing of complex cultural elements, reducing the need for manual intervention, improving translation quality, and meeting the demands of large-scale translation.

[0012] Other features and advantages of this disclosure will be described in detail in the following detailed description section.

Attached Figure Description

[0013] The accompanying drawings are provided to further illustrate the present disclosure and form part of the specification. They are used together with the following detailed description to explain the present disclosure, but do not constitute a limitation thereof. In the drawings: Figure 1 is a flowchart illustrating a translation model training method according to an exemplary embodiment.

[0014] Figure 2 is a schematic diagram illustrating a method for training a first model based on a first type of training samples, according to an exemplary embodiment.

[0015] Figure 3 is a schematic diagram illustrating a method for training a second model based on a second type of training samples, according to an exemplary embodiment.

[0016] Figure 4 is a flowchart illustrating a translation method according to an exemplary embodiment.

[0017] Figure 5 is a block diagram illustrating a translation model training apparatus according to an exemplary embodiment.

[0018] Figure 6 is a block diagram illustrating a translation apparatus according to an exemplary embodiment.

[0019] Figure 7 is a block diagram illustrating an electronic device according to an exemplary embodiment.

Detailed Implementation

[0020] The specific embodiments of this disclosure will be described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are for illustration and explanation only and are not intended to limit this disclosure.

[0021] To improve translation quality, meet globalization demands, and enhance product competitiveness in the international market, a new translation solution is needed to address the translation distortion caused by cultural omissions during the translation process, particularly the distortion of cultural elements such as classical allusions and phrases.

[0022] In view of this, this disclosure provides a translation model training method, translation method, apparatus, medium, and electronic device. By using sample text to be translated, including sample allusions and phrases, as input parameters to the model, and using sample thought chains and sample translations of the sample text as output parameters, the translation model is trained to obtain a target translation model. This improves the accuracy of the target translation model, enhancing the translation quality of texts containing allusions and phrases when translating based on the target model, providing more accurate and natural translation results, meeting users' demands for high-quality translation, and increasing user satisfaction. Furthermore, it can automatically handle complex cultural elements, reduce the need for manual intervention, improve translation quality, and meet the needs of large-scale translation.

[0023] Figure 1 is a flowchart illustrating a translation model training method according to an exemplary embodiment. As shown in Figure 1, the translation model training method may include the following steps.

[0024] In step S11, multiple sets of training samples are obtained. The training samples include the sample text to be translated, the sample thought chain of the sample text to be translated, and the sample translation generated based on the sample thought chain. The sample text to be translated includes sample allusions and words.

[0025] The Chain of Thought (CoT) refers to the thought process involved in reasoning, specifically the thought process of translating the sample text into a translation. Furthermore, the translation can be in any language different from the sample text. For example, assuming the sample text is Chinese, the translation could be in English, French, Japanese, etc.

[0026] In step S12, the sample text to be translated is used as the model input parameter, and the sample thought chain and sample translation are used as the model output parameters to train the translation model.
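One way to picture the input and output sides of a training sample is sketched below. This is a minimal illustration, not the format used in the disclosure: the class name `TrainingSample`, the function `to_input_output`, and the `<think>` delimiters that separate the thought chain from the translation are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TrainingSample:
    source_text: str    # sample text to be translated (model input)
    thought_chain: str  # sample thought chain (first part of model output)
    translation: str    # sample translation (second part of model output)


def to_input_output(sample: TrainingSample) -> Tuple[str, str]:
    """Return (model input, model output) for one training sample.

    The <think> delimiters are a hypothetical convention for this sketch.
    """
    target = f"<think>\n{sample.thought_chain}\n</think>\n{sample.translation}"
    return sample.source_text, target
```

The model is then trained to produce the whole target string, so the thought chain is generated before the translation that depends on it.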

[0027] The purpose of sample thought chains is to enable models to simulate the thought processes of human translators, thereby taking into account linguistic nuances and cultural context during translation, and improving the quality and naturalness of the translation. This approach helps models better understand and handle the complexities of language, especially when dealing with culturally specific texts containing allusions and idioms. For example, sample thought chains are used to teach models how to translate a sample text into a sample translation.

[0028] In step S13, the translation model that meets the training termination condition is determined as the target translation model.

[0029] Training termination conditions may include the number of training epochs reaching a preset number, and / or the model's error being less than or equal to a preset error. When these training termination conditions are met, the model at that point can be designated as the target translation model.
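A termination check of this kind can be sketched as follows. The function name `should_stop` and the default thresholds are hypothetical, and the sketch reads "and / or" as "stop as soon as either condition holds", which is only one of the readings the text allows.

```python
def should_stop(epoch: int, error: float,
                max_epochs: int = 10, max_error: float = 0.01) -> bool:
    """Return True when either termination condition is met.

    max_epochs and max_error are illustrative preset values, not values
    taken from the disclosure.
    """
    return epoch >= max_epochs or error <= max_error
```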

[0030] By employing the aforementioned technical solution, the translation model is trained using sample text containing allusions and idioms as input parameters, and sample thought chains and translated texts as output parameters. This results in a target translation model. The accuracy of the target translation model is improved, enhancing the translation quality of texts containing allusions and idioms in subsequent translations based on it. This provides more accurate and natural translation results, meeting users' demands for high-quality translations and increasing user satisfaction. Furthermore, it can automatically handle complex cultural elements, reducing the need for manual intervention, improving translation quality, and meeting the demands of large-scale translation.

[0031] Some classical allusions and phrases have corresponding translations while others do not. To improve the accuracy of the translation model, different training methods are therefore used to train different models: one for allusions and phrases that have corresponding translations and one for those that do not.

[0032] The training process of the first target translation model is described below. The first target translation model is used to translate texts containing allusions and words that do not have corresponding translations.

[0033] In one embodiment, obtaining multiple sets of training samples may include performing the following steps for each training sample: obtaining the sample text to be translated and identifying the sample allusion words it includes; if the preset allusion knowledge base does not contain a translation corresponding to the sample allusion word, determining the sample text to be translated as the first sample text to be translated, determining the sample allusion word as the first sample allusion word, and determining the first semantic information corresponding to the first sample allusion word and the first semantic information translation corresponding to the first semantic information, where the first semantic information includes the literal meaning and / or the extended meaning, and the allusion knowledge base includes at least multiple allusion words and translations of some of those allusion words; determining the first sample thought chain of the first sample text to be translated, and translating the first sample text to be translated based on the first sample thought chain and the first semantic information translation to obtain the first sample translation; and forming the first type of training samples from the first sample text to be translated, the first sample thought chain, and the first sample translation, the multiple sets of training samples including the first type of training samples.

[0034] First, obtain the sample text to be translated and identify the sample allusions and related terms included in the sample text. For example, an allusion knowledge base can be constructed by collecting all allusions and related terms; ideally, this knowledge base should include all allusions and related terms. Then, sample allusions and related terms are identified from the sample text to be translated based on the allusion knowledge base using rule-based matching. For example, sample allusions and related terms can be identified from the sample text to be translated based on string matching using the allusion knowledge base.

[0035] Furthermore, when collecting allusions and terms, corresponding translations can also be collected. It should be understood that not all allusions and terms have corresponding translations; therefore, the constructed allusion knowledge base may only contain translations for some allusions and terms. If a translation of the identified sample allusion and term is found in the allusion knowledge base, it is determined that a corresponding translation exists in the allusion knowledge base; if no translation is found, it is determined that no corresponding translation exists in the allusion knowledge base.
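The string-matching identification and the knowledge-base lookup described above might look like the following sketch. The knowledge-base contents, the stored translation, and the function names are illustrative assumptions; a term mapped to `None` stands for an allusion word with no stored translation.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical knowledge base: allusion term -> stored translation, or None.
ALLUSION_KB: Dict[str, Optional[str]] = {
    "刻舟求剑": None,                 # no translation stored (illustrative)
    "画蛇添足": "gilding the lily",   # translation stored (illustrative)
}


def find_allusions(text: str, kb: Dict[str, Optional[str]]) -> List[str]:
    """Return the knowledge-base terms that occur in the text (substring match)."""
    return [term for term in kb if term in text]


def split_by_translation(
    terms: List[str], kb: Dict[str, Optional[str]]
) -> Tuple[List[str], List[str]]:
    """Split terms into (without stored translation, with stored translation)."""
    without = [t for t in terms if kb.get(t) is None]
    with_translation = [t for t in terms if kb.get(t) is not None]
    return without, with_translation
```

Texts whose allusion words fall in the first list would be routed to the first type of training samples; plain substring matching is the simplest form of the rule-based matching mentioned above and would miss variant spellings.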

[0036] Next, if it is determined that the allusion knowledge base contains no translation corresponding to the sample allusion word, the sample text to be translated can be determined as the first sample text to be translated, the sample allusion word can be determined as the first sample allusion word, and the first semantic information corresponding to the first sample allusion word and the first semantic information translation corresponding to that semantic information can be determined.

[0037] For example, suppose the sample text to be translated is "He always does things the old way and doesn't know how to be flexible, which is like trying to find a sword by marking the boat." The first semantic information identified for the allusion includes both literal and extended meanings. The literal meaning is: to carve marks on a boat to find a lost sword. The extended meaning is: a metaphor for being rigidly bound by rules, inflexible, and failing to adjust methods according to actual circumstances. Then, the first semantic information translation is determined based on the first semantic information. For example, existing translation techniques can be used to translate the literal meaning into a literal translation and the extended meaning into an extended-meaning translation. For instance, the literal translation is: cutting marks on the boat to seek the sword, and the extended-meaning translation is: refusing to adapt to changes; sticking rigidly to outdated methods. Thus, the first semantic information translation is obtained.

[0038] It should be understood that only the literal meaning or only the extended meaning may be identified instead, and this disclosure does not limit this.

[0039] Next, the first sample thought chain of the first sample text to be translated is determined, and the sample text to be translated is translated based on the first sample thought chain and the first semantic information translation.

[0040] The process for constructing the first sample thought chain is as follows: 1. Analyze the allusions and idioms in the first sample text to be translated. This mainly involves identifying the allusions and their sources, and extracting the first semantic information. The first semantic information includes both literal and extended meanings.

[0041] 2. Match the equivalent expression in the target language. Translate the first semantic information into the corresponding target-language translation.

[0042] 3. Analyze the structure of the first sample text to be translated and proceed with step-by-step translation. Begin the step-by-step translation by considering the context of the allusions and words.

[0043] 4. Combine the results of each step of the translation process into a final translation.
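The four steps above can be sketched as a small assembly routine. This is an illustration only: the class `AllusionAnalysis`, the function `build_thought_chain`, and the textual layout of the resulting chain are assumptions, and the analysis fields are taken as already prepared rather than computed here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AllusionAnalysis:
    term: str                  # allusion word identified in the text
    origin: str                # source of the allusion
    literal: str               # literal meaning
    extended: str              # extended meaning
    literal_translation: str   # target-language rendering of the literal meaning
    extended_translation: str  # target-language rendering of the extended meaning


def build_thought_chain(
    analysis: AllusionAnalysis,
    segments: List[Tuple[str, str, str]],  # (source segment, structure, translation)
    final_translation: str,
) -> str:
    """Assemble steps 1 to 4 into a single thought-chain string."""
    lines = [
        f"Step 1 (analyze allusion): {analysis.term}; origin: {analysis.origin}; "
        f"literal meaning: {analysis.literal}; extended meaning: {analysis.extended}",
        f"Step 2 (match target-language expressions): literal: "
        f"{analysis.literal_translation}; extended: {analysis.extended_translation}",
        "Step 3 (analyze structure and translate step by step):",
    ]
    for segment, structure, translation in segments:
        lines.append(f"  {segment} | structure: {structure} | translation: {translation}")
    lines.append(f"Step 4 (combine): {final_translation}")
    return "\n".join(lines)
```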

[0044] For example, taking the first sample text to be translated as "He always does things the old way and doesn't know how to be flexible, which is like trying to find a sword by marking the boat", the first sample thought chain is constructed according to the above steps.

[0045] First, we analyze the allusions and words in the first sample text to be translated.

[0046] The idiom is "carving a mark on a boat to find a lost sword." It originates from the chapter "Observing the Present" in the *Lüshi Chunqiu* (Master Lü's Spring and Autumn Annals), which tells the story of a man from the state of Chu who lost his sword in the water while traveling by boat. He made a mark on the side of the boat and then went into the water to look for the sword after the boat stopped, but ultimately failed to find it. The first semantic information includes the literal meaning and the extended meaning. The literal meaning is: making a mark on the boat to find the lost sword. The extended meaning is: to be rigid in one's ways, not to be flexible, and not to adjust one's methods according to the actual situation.

[0047] Next, we match the equivalent expression in the target language. Assuming the target language is English, the literal translation is: cutting marks on the boat to seek the sword, and the extended-meaning translation is: refusing to adapt to changes; sticking rigidly to outdated methods.

[0048] After that, analyze the structure of the first sample text to be translated and translate it step by step. For the sentence "He always does things in the old way", the structure is: subject (He) + frequency adverb (always) + adverbial of manner (in the old way) + predicate (does things); the corresponding translation is: He always does things using old methods. For the sentence "without knowing how to adapt", the structure is: negative phrase (without knowing) + verb (adapt, that is, flexibly adjust the method); the corresponding translation is: without knowing how to adapt / without being willing to adapt. For the sentence "He is truly like someone who cuts marks on the boat to seek the sword", the structure is: emphatic sentence pattern (truly) + allusion idiom (cut marks on the boat to seek the sword, which needs to be translated in combination with its extended meaning); the corresponding translation is: He is truly like someone who cuts marks on the boat to seek the sword, or a more natural equivalent expression: He is truly rigid and refuses to adapt, just like the idiom of "cutting marks on the boat to seek the sword".

[0049] Finally, combine them into a complete translated text: He always does things using old methods without knowing how to adapt, truly like someone who cuts marks on the boat to seek the sword. That is, the first sample translation is "He always does things using old methods without knowing how to adapt, truly like someone who cuts marks on the boat to seek the sword".

[0050] The first sample thought chain is: First, analyze the allusion idiom in the first sample text to be translated. The allusion idiom in this sentence is "cut marks on the boat to seek the sword". Its origin: the chapter "Observing the Present" in the 吕氏春秋 (Master Lü's Spring and Autumn Annals) tells the story of a man from the State of Chu whose sword fell into the water while he was on a boat. He made a mark on the side of the boat and waited for the boat to stop before diving into the water to look for the sword according to the mark, but he ultimately failed to retrieve it. Literal meaning: making marks on the boat to search for the lost sword. Extended meaning: metaphorically referring to being rigidly adherent to established methods, not understanding how to adapt, and not adjusting methods according to the actual situation. Secondly, the equivalent expressions that match the first semantic information are: for the literal meaning, the translation is "cutting marks on the boat to seek the sword"; for the extended meaning, the translation is "refusing to adapt to changes; sticking rigidly to outdated methods". Thirdly, analyze the structure of the first sample text to be translated and translate it step by step: "He always does things in the old way", structure: subject (He) + frequency adverb (always) + adverbial of manner (in the old way) + predicate (does things), translation: He always does things using old methods. "Without knowing how to adapt", structure: negative phrase (without knowing) + verb (adapt, which means to flexibly adjust methods), translation: without knowing how to adapt / without being willing to adapt. "He is truly like someone who cuts marks on the boat to seek the sword", structure: emphatic sentence pattern (truly) + allusion idiom (cutting marks on the boat to seek the sword, which needs to be translated in combination with the extended meaning), translation: He is truly like someone who cuts marks on the boat to seek the sword. Or a more natural equivalent expression: He is truly rigid and refuses to adapt, just like the idiom of "cutting marks on the boat to seek the sword". Finally, the translated parts are combined, and the final translation of this sentence is: He always does things using old methods without knowing how to adapt, truly like someone who cuts marks on the boat to seek the sword.

[0051] In this embodiment, the first sample thought chain includes the first sample allusion word, the first semantic information corresponding to the first sample allusion word, the first semantic information translation corresponding to the first semantic information, the step-by-step translation result of the first sample text to be translated, and the first sample translation. In addition, the first sample thought chain also includes the source or origin of the sample allusion word to facilitate a better understanding of the sample allusion word.

[0052] After determining the first sample thought chain and the first sample translation of the first sample text to be translated in the manner described above, the first type of training samples are formed based on the first sample text to be translated, the first sample thought chain and the first sample translation.

[0053] By employing the aforementioned technical solution, texts containing allusions and idioms without corresponding translations can be identified and extracted to generate the first type of training samples. Subsequently, a translation model can be trained based on the first type of training samples. The trained model will then be able to process texts containing such allusions and idioms without corresponding translations. This expands the capabilities of the translation model, enabling it to more accurately translate texts containing culturally specific expressions and allusions, thereby improving the overall quality and adaptability of the translation.

[0054] After obtaining the first type of training samples, the first translation model can be trained based on them to obtain the first target translation model. In one embodiment, training the translation model by using the sample text to be translated as the model input parameters and the sample thought chain and sample translation as the model output parameters may include: inputting the first sample text to be translated from the first type of training samples into the first translation model to obtain a first thought chain and a first translation generated by the first translation model, the first translation being generated based on the first thought chain; determining the loss value of the first translation model based on the first translation and the first sample translation; and adjusting the parameters of the first translation model based on that loss value. The step of determining the translation model that meets the training termination condition as the target translation model includes: determining the first translation model that meets the training termination condition as the first target translation model, the target translation model including the first target translation model.

[0055] In this disclosure, training can be performed based on GRPO (Group Relative Policy Optimization).

[0056] Figure 2 is a schematic diagram illustrating the training of a first model based on a first type of training samples, according to an exemplary embodiment. As shown in Figure 2, assume that a language model is first initialized as the first model, denoted as the initial policy model. For example, Qwen-7B-Instruct can be used as the policy model; the hyperparameters of the first model are set, and an initial reference policy is designated. The first type of training samples is then sampled to obtain the training samples used in the current training round, and the training samples used in each training round may be different. As shown in Figure 2, the first type of training samples can be denoted as a distribution P1(Q). In each training round, a first sample text to be translated, x1, is sampled from the distribution P1(Q). For example, in the first training round, the first sample text to be translated x1 is input into the initial policy model, which generates multiple first translations, each corresponding to a first thought chain. Then, based on the first translations and the first sample translation, the loss value of the first translation model is determined, and the parameters of the first translation model are adjusted according to the loss value.

[0057] In one approach, there are multiple first translations. Determining the loss value of the first translation model based on the first translation and the first sample translation may include: for each of the first translations, determining a reward for the first translation based on the first sample text to be translated, the first sample translation, and the first translation, and determining an advantage value for the first translation based on the rewards of the plurality of first translations; and determining the loss value of the first translation model based on the advantage value of each of the first translations.
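The text does not spell out how an advantage value is derived from the group of rewards. A minimal sketch, assuming the group-normalized form commonly used in GRPO-style training (each reward minus the group mean, divided by the group standard deviation), is shown below; the function name `group_advantages` and the stabilizing constant `eps` are assumptions of this example.

```python
from statistics import mean, pstdev
from typing import List


def group_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against the other samples for the same source text.

    Assumed form: (reward - group mean) / (group std + eps). The disclosure
    only states that the advantage depends on the rewards of all first
    translations; this particular formula is an assumption.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]
```

Under this form, a translation that scores above the group average gets a positive advantage and one below it gets a negative advantage, so no separate value model is needed to judge the samples.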

[0058] In this approach, the current policy model can be used to sample multiple first translations, and a reward is calculated for each first translation. It should be understood that the model is updated each time it is trained using a training sample; the current policy model here therefore refers to the policy model whose parameters have not yet been updated in the current round, that is, the model as it stands at the start of this round of updates.

[0059] The specific implementation method for determining the reward of the first translation based on the first sample text to be translated, the first sample translation, and the first translation can be as follows: determining a first reward based on the first sample text to be translated, the first sample translation, and the first translation; determining a second reward based on the cultural background knowledge related to the first translation and the first sample allusions and words; determining a third reward based on the first translation and other first translations of the first sample text to be translated; determining a fourth reward based on the first translation; and determining the reward of the first translation based on the first reward, the second reward, the third reward, and the fourth reward.

[0060] For example, the first reward can be a semantic fidelity reward $R_{sem}$, the second reward a cultural background matching reward $R_{cul}$, the third reward a self-consistency reward $R_{con}$, and the fourth reward a fluency reward $R_{flu}$. Here $x_1$ denotes the first sample text to be translated and $y$ denotes the first translation. Referring to Figure 2, the reward of the first translation is calculated as a weighted sum of the semantic fidelity reward, the cultural background matching reward, the self-consistency reward, and the fluency reward:

$$R(x_1, y) = \lambda_1 R_{sem}(x_1, y) + \lambda_2 R_{cul}(y) + \lambda_3 R_{con}(y) + \lambda_4 R_{flu}(y)$$

[0061] Here $R(x_1, y)$ is the reward of the first translation, and $\lambda_1$, $\lambda_2$, $\lambda_3$, and $\lambda_4$ are the weights of the semantic fidelity reward, the cultural background matching reward, the self-consistency reward, and the fluency reward, respectively. The value of each weight can be determined empirically or according to different users' translation needs; this disclosure does not impose any limitation on this.
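
The weighted summation described above can be sketched as follows. This is only an illustration: the class name `RewardWeights`, the function name `combine_rewards`, and the default weight values are hypothetical, since the disclosure leaves the weights to be chosen empirically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; the disclosure does not fix their values."""
    semantic: float = 0.4
    cultural: float = 0.2
    consistency: float = 0.2
    fluency: float = 0.2


def combine_rewards(semantic: float, cultural: float, consistency: float,
                    fluency: float, w: RewardWeights = RewardWeights()) -> float:
    """Weighted sum of the four component rewards of one first translation."""
    return (w.semantic * semantic
            + w.cultural * cultural
            + w.consistency * consistency
            + w.fluency * fluency)
```

Changing the weights shifts the emphasis of training, for example toward cultural matching for literary texts.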

[0062] Optionally, determining the first reward based on the first sample text to be translated, the first sample translation, and the first translation may include: determining a first similarity between the first sample text to be translated and the first sample translation; determining a second similarity between the first sample text to be translated and the first translation; determining the bilingual evaluation understudy (BLEU) score of the first translation against the first sample translation; and determining the first reward based on the first similarity, the second similarity, and the BLEU score.

[0063] For example, the first similarity between the first sample text to be translated and the first sample translation can be determined as the cosine similarity between their embedding representations. The first similarity can be denoted $s_1 = \cos(E(x_1), E(y_1^{*}))$, where $E(x_1)$ is the embedding representation of the first sample text to be translated and $E(y_1^{*})$ is the embedding representation of the first sample translation.

[0064] Likewise, the second similarity between the first sample text to be translated and the first translation can be determined as the cosine similarity between their embedding representations. The second similarity can be denoted $s_2 = \cos(E(x_1), E(y))$, where $E(y)$ is the embedding representation of the first translation.
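
A minimal sketch of the cosine similarity used for both similarities is given below. It assumes the embedding representations are already available as equal-length numeric vectors; how they are produced (which embedding model) is not specified here, and the function name `cosine_similarity` is illustrative.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)
```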

[0065] The BLEU score of the first translation against the first sample translation can be obtained according to the formula

$$BLEU_1 = BP_1 \cdot \exp\left(\sum_{n=1}^{N_1} w_n \log p_n\right)$$

where $BP_1$ is the brevity penalty, used to penalize translations that are too short. It is calculated as follows:

$$BP_1 = \begin{cases} 1, & c_1 > r_1 \\ \exp\left(1 - r_1 / c_1\right), & c_1 \le r_1 \end{cases}$$

[0066] Here $c_1$ is the length of the first translation and $r_1$ is the length of the first sample translation; $p_n$ is the n-gram precision, representing the proportion of n-grams in the first translation that match the first sample translation; and $w_n$ is the weight, usually taken as $1/N_1$, where $N_1$ is the largest n-gram order (usually 4).
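
The BLEU computation described above (a brevity penalty multiplied by the exponentiated weighted sum of log n-gram precisions) can be sketched for a single reference as follows. The sketch assumes pre-tokenized input, uniform weights, clipped n-gram counts, and no smoothing, so any order with zero matches yields a score of 0; these choices and the function names are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def brevity_penalty(c: int, r: int) -> float:
    """1 when the candidate is longer than the reference, else exp(1 - r/c)."""
    if c == 0:
        return 0.0
    return 1.0 if c > r else math.exp(1.0 - r / c)


def bleu(candidate: List[str], reference: List[str], max_n: int = 4) -> float:
    """Single-reference BLEU with uniform weights 1/max_n and no smoothing."""
    log_sum = 0.0
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        matched = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or matched == 0:
            return 0.0  # an empty or unmatched order zeroes the score
        log_sum += math.log(matched / total) / max_n
    return brevity_penalty(len(candidate), len(reference)) * math.exp(log_sum)
```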

[0067] After the first similarity, the second similarity, and the BLEU score have been determined in the manner described above, the first reward is determined from these three quantities.

[0068] Optionally, determining the second reward based on the first translation and the cultural background knowledge related to the first sample allusion may include: taking the similarity between the first translation and that cultural background knowledge as the second reward.

[0069] For example, the cultural background matching reward, i.e., the second reward, can be determined using the following formula:

$$R_{cul}(y) = \cos(E(y), E(D))$$

[0070] Here $D$ denotes the collection of documents containing the cultural background knowledge of the idiom or allusion, $E(y)$ is the embedding representation of the first translation, $E(D)$ is the embedding representation of that document collection, and $\cos(\cdot,\cdot)$ denotes cosine similarity.

[0071] Optionally, determining the third reward of the first translation based on the first translation and other first translations of the first sample text to be translated includes: taking the average similarity between the first translation and other first translations as the third reward of the first translation.

[0072] For example, the self-consistency reward, i.e., the third reward, can be determined using the following formula:

$$R_{con}(y_i) = \frac{1}{G-1} \sum_{j=1,\ j \ne i}^{G} \cos(E(y_i), E(y_j))$$

[0073] Here $G$ is the number of first translations, $y_i$ is the i-th first translation, $y_j$ is the j-th first translation, and $E(\cdot)$ is the embedding representation.
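
The averaging over the other first translations can be sketched as below. The sketch assumes each translation has already been embedded as a numeric vector and that the average is taken over the G-1 other translations; the function names are illustrative.

```python
import math
from typing import List, Sequence


def _cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def self_consistency_rewards(embeddings: List[Sequence[float]]) -> List[float]:
    """For each translation, the mean cosine similarity to all the others."""
    g = len(embeddings)
    if g < 2:
        return [0.0] * g  # no other translation to compare against
    rewards = []
    for i in range(g):
        sims = [_cosine(embeddings[i], embeddings[j]) for j in range(g) if j != i]
        rewards.append(sum(sims) / (g - 1))
    return rewards
```

A translation that agrees with most of the group scores high, while an outlier scores low.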

[0074] Optionally, determining a fourth reward for the first translation based on the first translation includes: determining the perplexity of the first translation as the fourth reward for the first translation.

[0075] For example, the fluency reward of the first translation, i.e., the fourth reward, can be determined from the perplexity of the first translation.

[0076] The perplexity can be obtained based on language models such as GPT-2 and LLaMA.
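
As an illustration, perplexity can be computed from the per-token log-probabilities that a language model assigns to the translation. The sketch below assumes those natural-log probabilities are already available from whatever scoring model is used. The mapping from perplexity to a reward (here the reciprocal, so that lower perplexity gives a higher reward) is an assumption, not something stated in the text.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """exp of the negative mean log-probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def fluency_reward(token_logprobs: Sequence[float]) -> float:
    """Assumed mapping: reciprocal perplexity, in (0, 1] for log-probs <= 0."""
    return 1.0 / perplexity(token_logprobs)
```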

[0077] After obtaining the first, second, third, and fourth rewards in the manner described above, and obtaining the reward for the first translation based on the first, second, third, and fourth rewards, an advantage estimation is performed, that is, the advantage value of the first translation is further determined.

[0078] Optionally, the advantage value of the first translation is determined based on the rewards of multiple first translations, including: Determine the difference between the reward for the first translation and the average reward for multiple first translations; The ratio of the difference to the standard deviation of the rewards of multiple first translations is determined as the advantage value of the first translation.

[0079] For example, the advantage value of each first translation can be determined using the following formula:

$$A_i = \frac{r_i - \mathrm{mean}(\{r_1, r_2, \dots, r_G\})}{\mathrm{std}(\{r_1, r_2, \dots, r_G\})}$$

[0080] Here $A_i$ is the advantage value of the i-th first translation, $r_i$ is the reward of the i-th first translation, $\mathrm{mean}(\cdot)$ is the average reward of the multiple first translations, and $\mathrm{std}(\cdot)$ is the standard deviation of the rewards of the multiple first translations.
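
The group-relative normalization described above can be sketched as follows. Two details are assumptions of this sketch rather than of the text: the population standard deviation is used, and a small constant `eps` is added to the denominator so that a group with identical rewards does not cause a division by zero.

```python
import statistics
from typing import List, Sequence


def group_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """(reward - group mean) / group standard deviation, per translation."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; an assumption
    return [(r - mean) / (std + eps) for r in rewards]
```

Translations that score above the group average receive a positive advantage and are reinforced; those below average receive a negative one.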

[0081] After the advantage value of each first translation has been calculated as described above, the loss value of the first translation model is determined based on these advantage values. For example, the policy gradient loss can be calculated. The policy gradient loss consists of two parts: a PPO clipped objective, which limits the difference between the new and old policies, and a KL divergence regularization term, which prevents the policy from deviating too far from the reference policy.

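[0082] For example, the loss value of the first translation model can be determined using the following formula:

$$L_1(\theta) = -\frac{1}{G} \sum_{i=1}^{G} \min\left(\rho_i A_i,\ \mathrm{clip}(\rho_i, 1-\epsilon, 1+\epsilon)\, A_i\right) + \beta\, D_{KL}(\pi_\theta \,\|\, \pi_{ref}), \qquad \rho_i = \frac{\pi_\theta(y_i \mid x_1)}{\pi_{\theta_{old}}(y_i \mid x_1)}$$

[0083] Here $\pi_\theta(y_i \mid x_1)$ is the probability that the current policy model $\pi_\theta$ generates $y_i$ when the first sample text to be translated, $x_1$, is input; $\pi_{\theta_{old}}(y_i \mid x_1)$ is the probability that the old policy model $\pi_{\theta_{old}}$ generates $y_i$ for the same input; $A_i$ is the advantage value of the first translation $y_i$; $\mathrm{clip}(\cdot)$ restricts the ratio $\rho_i$ to a range (e.g., $[1-\epsilon, 1+\epsilon]$) to avoid exploding or vanishing gradients; and $D_{KL}(\pi_\theta \,\|\, \pi_{ref})$ is the Kullback-Leibler divergence between the policy model $\pi_\theta$ and the reference policy model $\pi_{ref}$, weighted by the coefficient $\beta$.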
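
A scalar sketch of a loss with the two parts named above (a clipped probability-ratio term and a KL penalty toward the reference policy) is given below. It works on sequence-level log-probabilities and takes the KL divergence as a precomputed number; the default values of `epsilon` and `beta` and all function names are assumptions, and a real implementation would operate on tensors with automatic differentiation.

```python
import math
from typing import Sequence


def clip(x: float, low: float, high: float) -> float:
    return max(low, min(high, x))


def clipped_policy_loss(logp_new: Sequence[float], logp_old: Sequence[float],
                        advantages: Sequence[float], kl_to_ref: float,
                        epsilon: float = 0.2, beta: float = 0.04) -> float:
    """Negative mean clipped surrogate plus beta times the KL penalty."""
    surrogate = 0.0
    for new, old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(new - old)  # pi_theta(y|x) / pi_theta_old(y|x)
        clipped = clip(ratio, 1.0 - epsilon, 1.0 + epsilon)
        surrogate += min(ratio * adv, clipped * adv)
    return -surrogate / len(advantages) + beta * kl_to_ref
```

With a positive advantage, a ratio above `1 + epsilon` earns no extra credit, which is what keeps a single update from moving the policy too far.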

[0084] After the loss value of the first model is obtained, the current model is updated. As shown in Figure 2, once the loss value has been determined, the gradient of the loss is computed and used to update the parameters $\theta$ of the policy model $\pi_\theta$. Convergence is then checked. First, it is checked whether the maximum number of training steps has been reached; if so, training ends and the first target translation model is obtained. If not, it is checked whether the model performance has reached the target level, i.e., whether the loss value of the first model is less than a preset value. If the loss value is less than the preset value, the target level has been reached and training ends, yielding the first target translation model. Otherwise, the target level has not been reached: the model with updated parameters is taken as the starting model for the next round, i.e., the old model $\pi_{\theta_{old}}$ is updated, and the process returns to the sampling phase.
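
The termination logic described above can be sketched as a simple loop. Here `run_round` is a hypothetical callable standing in for one full round (sample, score, compute the loss, update the parameters) that returns the loss of that round; the function name and signature are illustrative only.

```python
from typing import Callable, Tuple


def train_until_done(run_round: Callable[[], float], max_steps: int,
                     loss_threshold: float) -> Tuple[int, float]:
    """Repeat rounds until the step budget is used up or the loss is small enough."""
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    steps_done, loss = 0, float("inf")
    while steps_done < max_steps:      # stop when the maximum step count is reached
        loss = run_round()             # one sample/score/update round
        steps_done += 1
        if loss < loss_threshold:      # target level reached
            break
    return steps_done, loss
```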

[0085] Thus, the first target translation model is obtained for translating texts containing allusions and idioms that have no corresponding translation. This improves the translation quality of sentences containing allusions and idioms, providing more accurate and natural translations, meeting users' demands for high-quality translations, and enhancing user satisfaction and loyalty.

[0086] The training process of the second target translation model is described below. The second target translation model is used to translate texts containing allusions and words that have corresponding translations.

[0087] In one embodiment, obtaining multiple sets of training samples may include performing the following steps for each training sample: obtaining the sample text to be translated and identifying the sample allusions and words it contains; if a sample allusion translation corresponding to the sample allusion exists in a preset allusion knowledge base, determining the sample text to be translated as the second sample text to be translated, the sample allusion as the second sample allusion, and the corresponding sample allusion translation as the second sample allusion translation, where the allusion knowledge base includes at least multiple allusion words and the allusion translations of some of them; determining the second sample thought chain of the second sample text to be translated, and translating the second sample text to be translated according to the second sample thought chain and the second sample allusion translation to obtain the second sample translation; and forming the second type of training samples from the second sample text to be translated, the second sample thought chain, and the second sample translation. The multiple sets of training samples include the second type of training samples.

[0088] The specific methods for obtaining the sample text to be translated and identifying the sample allusions and words included in the sample text have been described above and will not be repeated here.

[0089] When it is determined that there is a sample allusion translation in the allusion knowledge base that corresponds to the sample allusion, the sample text to be translated is determined as the second sample text to be translated, the sample allusion is determined as the second sample allusion, and the sample allusion translation corresponding to the second sample allusion is directly determined as the second sample allusion translation.

[0090] Furthermore, to improve the accuracy of the second sample allusion translation, determining the sample allusion translation corresponding to the second sample allusion as the second sample allusion translation may include: when the extended meaning of that sample allusion translation matches the extended meaning of the sample allusion, determining that sample allusion translation as the second sample allusion translation.

[0091] For example, it can be determined whether the extended meaning of the second sample allusion matches or is consistent with the extended meaning of the corresponding sample allusion translation. If they match or are consistent, the corresponding sample allusion translation is then determined as the second sample allusion translation. This improves the accuracy of determining the second sample allusion translation.
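
The lookup and the extended-meaning check can be sketched as follows. The `AllusionEntry` structure, the function name, and the default `meanings_match` comparison (exact string equality) are hypothetical; in practice the comparison of extended meanings would presumably be semantic rather than literal.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class AllusionEntry:
    """Hypothetical knowledge-base record for one allusion."""
    extended_meaning: str
    translation: Optional[str] = None            # absent for some allusions
    translation_extended_meaning: Optional[str] = None


def lookup_allusion_translation(
        allusion: str,
        knowledge_base: Dict[str, AllusionEntry],
        meanings_match: Callable[[str, str], bool] = lambda a, b: a == b,
) -> Optional[str]:
    """Return the stored translation only if it exists and its extended
    meaning matches that of the allusion; otherwise return None."""
    entry = knowledge_base.get(allusion)
    if entry is None or entry.translation is None:
        return None
    if entry.translation_extended_meaning is None:
        return None
    if not meanings_match(entry.extended_meaning, entry.translation_extended_meaning):
        return None
    return entry.translation
```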

[0092] The second sample thought chain is constructed as follows. 1. Analyze the allusion in the second sample text to be translated (two-way analysis). Identify the allusion in the second sample text to be translated and retrieve the corresponding English allusion from the preset allusion knowledge base. The analysis determines the origin, literal meaning, and extended meaning of the allusion, and likewise the origin, literal meaning, and extended meaning of the corresponding sample allusion translation. If the two extended meanings are consistent, the sample allusion translation corresponding to the second sample allusion is determined as the correct translation of the second sample allusion.

[0093] 2. Match the equivalent allusion in the target language (direct replacement). Abandon the literal translation of the second sample allusion and use the corresponding English allusion directly, because it is a conventional expression in English and conveys the same meaning without additional explanation.

[0094] 3. Analyze the structure of the second sample text to be translated and translate it step by step, using the corresponding translations from the allusion knowledge base.

[0095] 4. Combine the results of each translation step into the final translation.

[0096] The following takes the second sample text to be translated, "He always does useless work; it's simply like playing the lute to a cow," as an example to construct the second sample thought chain.

[0097] First, we analyze the allusions and words in the second sample text to be translated.

[0098] The Chinese allusion is "play the lute to a cow" (对牛弹琴). Source: "On Doubts", which tells the story of a musician who played elegant music to a cow that remained unmoved; the phrase later came to mean wasting words on someone who does not understand. Literal meaning: playing a lute piece to a cow. Extended meaning: wasting effort on someone who does not listen to reason or lacks the ability to understand. English equivalent allusion: "cast pearls before swine", from the Bible (Matthew 7:6: "Do not give dogs what is sacred; do not throw your pearls to pigs. If you do, they may trample them under their feet, and turn and tear you to pieces."). Literal meaning: throwing pearls in front of pigs. Extended meaning: offering something precious or profound, in vain, to someone who cannot appreciate or understand it. The extended meanings of the Chinese and English allusions are the same; only the metaphorical images differ.

[0099] Next, match the equivalent allusion in the target language (direct replacement). Abandon the literal translation of the Chinese allusion "play the lute to a cow", and directly use the English equivalent allusion "cast pearls before swine", because it is a well-established expression in English and can convey the same meaning without additional explanation.

[0100] After that, analyze the structure of the second sample text to be translated and translate it step by step. For the clause "He always does useless work", the structure is subject + frequency adverb + verb-object phrase ("do useless work", i.e., "do futile things"), and the translation is "He always does futile things". For the clause "It's simply like playing the lute to a cow", the structure is an emphatic sentence pattern + allusion (which needs to be replaced with the English equivalent allusion), and the translation is "He is simply [cast pearls before swine]", with the English allusion embedded directly in place of the Chinese metaphor.

[0101] Finally, combine them into a complete translation. The final translation, which is the second sample translation, is "He always does futile things - he is simply casting pearls before swine".

[0102] At this point, the second sample thought chain has been obtained, together with the second sample translation of the second sample text to be translated that is derived from it.

[0103] After the second sample thought chain and the second sample translation of the second sample text to be translated have been determined in the above way, a second type of training sample is formed from the second sample text to be translated, the second sample thought chain, and the second sample translation.
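
One way to picture such a training sample is as a record holding the source text, the ordered thought-chain steps, and the final translation. The sketch below is illustrative only: the `TrainingSample` structure, the step wording, and the joining of clause translations with " - " (chosen to mirror the worked example) are assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TrainingSample:
    source_text: str
    thought_chain: List[str]
    translation: str


def build_second_type_sample(source_text: str, allusion_analysis: str,
                             equivalent_allusion: str,
                             clause_translations: List[Tuple[str, str]]) -> TrainingSample:
    """Assemble the four thought-chain stages into one training sample."""
    chain = [
        f"Analyze allusion: {allusion_analysis}",
        f"Match equivalent allusion: {equivalent_allusion}",
    ]
    chain += [f"Translate clause: {src} -> {tgt}" for src, tgt in clause_translations]
    translation = " - ".join(tgt for _, tgt in clause_translations)
    chain.append(f"Combine: {translation}")
    return TrainingSample(source_text, chain, translation)
```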

[0104] By employing the aforementioned technical solution, texts to be translated that contain allusions and idioms with existing corresponding translations can be identified and extracted to generate the second type of training samples. The translation model can then be trained on the second type of training samples, so that the trained model is able to process texts containing such allusions and idioms. In this way, the capabilities of the translation model are expanded, enabling it to translate texts containing culturally specific expressions and allusions more accurately, thereby improving the overall quality and adaptability of the translation.

[0105] After the second type of training samples is obtained, the second model can be trained on them to obtain the second target translation model. In one embodiment, training the translation model by using the sample text to be translated as the model input parameter and the sample thought chain and sample translation as the model output parameters may include: inputting the second sample text to be translated from the second type of training samples into the second translation model to obtain a second thought chain and a second translation generated by the second translation model, the second translation being generated based on the second thought chain; determining the loss value of the second translation model based on the second translation and the second sample translation; and adjusting the parameters of the second translation model based on that loss value. The step of determining the translation model that meets the training termination condition as the target translation model includes: determining the second translation model that meets the training termination condition as the second target translation model, where the target translation model includes the second target translation model.

[0106] In this disclosure, training can be performed based on GRPO (Group Relative Policy Optimization).

[0107] Figure 3 is a schematic diagram illustrating the training of a second model based on the second type of training samples, according to an exemplary embodiment. As shown in Figure 3, a language model is first initialized as the second model and denoted as the initial policy model $\pi_\theta$. For example, Qwen-7B-Instruct can be used as the policy model, the hyperparameters of the second model can be set to $\epsilon$ and $\beta$, and the initial reference policy is $\pi_{ref}$. The second type of training samples is then sampled to obtain the training samples used in the current training round; the samples used may differ from round to round. As shown in Figure 3, the second type of training samples can be denoted as a distribution $P_2(Q)$, and in each training round a second sample text to be translated, $x_2$, is sampled from $P_2(Q)$. In each training round, the $x_2$ sampled in that round is input into the policy model, which generates multiple second translations, each with a corresponding second thought chain. Then, based on the second translations and the second sample translation, the loss value of the second translation model is determined, and the parameters of the second translation model are adjusted according to the loss value.

[0108] In one approach, there are multiple second translations. The loss value of the second translation model is determined based on the second translation and the second sample translation. This can include: For each second translation, a reward for the second translation is determined based on the second sample text to be translated, the second sample translation, and the second translation; and an advantage value for the second translation is determined based on the rewards of the plurality of second translations. The loss value of the second translation model is determined based on the advantage value of each second translation.

[0109] In this approach, the current policy model $\pi_{\theta_{old}}$ can be used to sample multiple second translations, denoted $\{y_1, y_2, \dots, y_G\}$, and a reward is calculated for each second translation. It should be understood that the model is updated each time it is trained on a training sample; the current policy model $\pi_{\theta_{old}}$ here therefore refers to the policy model whose parameters have not yet been updated in this round. In other words, the model used for sampling in this update process is called $\pi_{\theta_{old}}$.

[0110] Determining the reward of the second translation based on the second sample text to be translated, the second sample translation, and the second translation can be implemented as follows: determining a fifth reward based on the second sample text to be translated, the second sample translation, and the second translation; determining a sixth reward of the second translation based on the second translation and the other second translations of the second sample text to be translated; determining an allusion correspondence reward based on the second translation and the second sample allusion translation; and determining the reward of the second translation based on the fifth reward, the sixth reward, and the allusion correspondence reward.

[0111] For example, the fifth reward can be a semantic fidelity reward $R_{sem}$, the sixth reward a self-consistency reward $R_{con}$, and the allusion correspondence reward can be denoted $R_{allu}$. Here $x_2$ denotes the second sample text to be translated and $y$ denotes the second translation. Referring to Figure 3, the reward of the second translation is calculated as a weighted sum of the semantic fidelity reward, the self-consistency reward, and the allusion correspondence reward:

$$R(x_2, y) = \mu_1 R_{sem}(x_2, y) + \mu_2 R_{con}(y) + \mu_3 R_{allu}(y)$$

[0112] Here $R(x_2, y)$ is the reward of the second translation, and $\mu_1$, $\mu_2$, and $\mu_3$ are the weights of the semantic fidelity reward, the self-consistency reward, and the allusion correspondence reward, respectively. The value of each weight can be determined empirically or according to different users' translation needs; this disclosure does not impose any limitation on this.

[0113] Optionally, determining the fifth reward based on the second sample text to be translated, the second sample translation, and the second translation may include: determining a third similarity between the second sample text to be translated and the second sample translation; determining a fourth similarity between the second sample text to be translated and the second translation; determining the bilingual evaluation understudy (BLEU) score of the second translation against the second sample translation; determining the semantic role alignment between the second sample text to be translated and the second translation; determining the probability that the second translation contains the second sample allusion translation; and determining the fifth reward based on the third similarity, the fourth similarity, the BLEU score, the semantic role alignment, and the probability.

[0114] For example, the third similarity between the second sample text to be translated and the second sample translation can be determined as the cosine similarity between their embedding representations. The third similarity can be denoted $s_3 = \cos(E(x_2), E(y_2^{*}))$, where $E(x_2)$ is the embedding representation of the second sample text to be translated and $E(y_2^{*})$ is the embedding representation of the second sample translation.

[0115] Likewise, the fourth similarity between the second sample text to be translated and the second translation can be determined as the cosine similarity between their embedding representations. The fourth similarity can be denoted $s_4 = \cos(E(x_2), E(y))$, where $E(y)$ is the embedding representation of the second translation.

[0116] The BLEU score of the second translation against the second sample translation can be obtained according to the formula

$$BLEU_2 = BP_2 \cdot \exp\left(\sum_{n=1}^{N_2} w_n \log p_n\right)$$

where $BP_2$ is the brevity penalty, used to penalize translations that are too short. It is calculated as follows:

$$BP_2 = \begin{cases} 1, & c_2 > r_2 \\ \exp\left(1 - r_2 / c_2\right), & c_2 \le r_2 \end{cases}$$

[0117] Here $c_2$ is the length of the second translation and $r_2$ is the length of the second sample translation; $p_n$ is the n-gram precision, representing the proportion of n-grams in the second translation that match the second sample translation; and $w_n$ is the weight, usually taken as $1/N_2$, where $N_2$ is the largest n-gram order (usually 4).

[0118] Semantic role alignment can be denoted $SRA$ and takes values from 0 to 1. It is calculated as follows. First, a cross-language SRL tool (such as BabelNet) is used to extract the predicates and arguments (Agent, Patient, etc.) of the second sample text to be translated and of the second translation. Second, the role matching ratio is calculated; for example, if the agent is "government" in the second sample text to be translated and the agent is also "government" in the second translation, that semantic role is matched between the two.
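
The role matching ratio can be sketched as the fraction of source-side roles whose filler also appears in the same role on the target side. The sketch assumes the roles have already been extracted by some SRL tool and that fillers have been normalized to a common language so they can be compared directly; both the input format and the function name are hypothetical.

```python
from typing import Dict


def role_alignment(source_roles: Dict[str, str], target_roles: Dict[str, str]) -> float:
    """Fraction of source roles (e.g. agent, patient) matched in the target."""
    if not source_roles:
        return 0.0  # convention chosen here when no roles were extracted
    matched = sum(1 for role, filler in source_roles.items()
                  if target_roles.get(role) == filler)
    return matched / len(source_roles)
```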

[0119] The probability that the second translation contains the second sample allusion translation can be called semantic entailment, with values in [0, 1]. It is calculated by using an NLI model (such as XLM-RoBERTa) to infer the probability that the second translation entails the English allusion ($Idiom_{en}$) corresponding to the idiom.

[0120] After the third similarity, the fourth similarity, the BLEU score, the semantic role alignment, and the semantic entailment have been determined in the manner described above, the fifth reward is determined from these five quantities.

[0121] Optionally, determining the sixth reward of the second translation based on the second translation and other second translations of the second sample text to be translated may include: determining the average similarity between the second translation and the other second translations as the sixth reward of the second translation.

[0122] For example, the self-consistency reward, i.e., the sixth reward, of the second translation can be determined using the following formula:

$$R_{con}(y_i) = \frac{1}{G-1} \sum_{j=1,\ j \ne i}^{G} \cos(E(y_i), E(y_j))$$

[0123] Here $G$ is the number of second translations, $y_i$ is the i-th second translation, $y_j$ is the j-th second translation, $E(y_i)$ is the embedding representation of the i-th second translation, and $E(y_j)$ is the embedding representation of the j-th second translation.

[0124] It should be understood that the number of second translations may be the same as or different from the number of first translations, and this disclosure does not limit this.

[0125] Optionally, determining the allusion correspondence reward based on the second translation and the second sample allusion translation includes: if the second translation includes the second sample allusion translation, then the allusion correspondence reward is determined to be 1; if the second translation does not include the second sample allusion translation, then the allusion correspondence reward is determined based on the similarity between the second sample allusion translation and the second translation, as well as the fluency of the second translation.

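[0126] For example, the allusion correspondence reward can be determined using the following formula:

$$R_{allu}(y) = \begin{cases} 1, & \text{if } y \text{ includes the second sample allusion translation } t \\ \alpha \cdot Flu(y) + (1-\alpha) \cdot \cos(E(y), E(t)), & \text{otherwise} \end{cases}$$

[0127] Here $\alpha$ is the weight balancing the naturalness of the language against the similarity to the definition, with a default value of 0.5; $Flu(y)$ indicates the fluency of the second translation; $E(y)$ is the embedding representation of the second translation; and $E(t)$ is the embedding representation of the second sample allusion translation.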
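
The two-branch rule can be sketched as below. The inclusion test is implemented as a case-insensitive substring check, and the fallback is a weighted combination of a fluency score and a similarity score with a single weight `alpha` (default 0.5); both choices are assumptions of this sketch, and the fluency and similarity values are taken as precomputed inputs.

```python
def allusion_correspondence_reward(translation: str, allusion_translation: str,
                                   fluency: float, similarity: float,
                                   alpha: float = 0.5) -> float:
    """1.0 if the stored allusion translation appears in the output;
    otherwise a blend of fluency and embedding similarity."""
    if allusion_translation.lower() in translation.lower():
        return 1.0
    return alpha * fluency + (1.0 - alpha) * similarity
```

A plain substring check misses inflected forms: "casting pearls before swine" does not contain "cast pearls before swine", so such an output would be scored by the fallback branch unless the check is made morphology-aware.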

[0128] After determining the fifth reward, sixth reward, and allusion correspondence reward for each second translation using the above method, the reward for the second translation is determined based on the formula described above. Finally, the advantage value for each second translation is determined based on the rewards for all second translations.

[0129] For example, the advantage value of each second translation can be determined using the following formula:

$$A_i = \frac{r_i - \mathrm{mean}(\{r_1, r_2, \dots, r_G\})}{\mathrm{std}(\{r_1, r_2, \dots, r_G\})}$$

[0130] Here $A_i$ is the advantage value of the i-th second translation, $r_i$ is the reward of the i-th second translation, $\mathrm{mean}(\cdot)$ is the average reward of the multiple second translations, and $\mathrm{std}(\cdot)$ is the standard deviation of the rewards of the multiple second translations.

[0131] After the advantage value of each second translation has been calculated as described above, the loss value of the second translation model is determined based on these advantage values. For example, the policy gradient loss can be calculated. The policy gradient loss consists of two parts: a PPO clipped objective, which limits the difference between the new and old policies, and a KL divergence regularization term, which prevents the policy from deviating too far from the reference policy.

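[0132] For example, the loss value of the second translation model can be determined using the following formula:

$$L_2(\theta) = -\frac{1}{G} \sum_{i=1}^{G} \min\left(\rho_i A_i,\ \mathrm{clip}(\rho_i, 1-\epsilon, 1+\epsilon)\, A_i\right) + \beta\, D_{KL}(\pi_\theta \,\|\, \pi_{ref}), \qquad \rho_i = \frac{\pi_\theta(y_i \mid x_2)}{\pi_{\theta_{old}}(y_i \mid x_2)}$$

[0133] Here $L_2(\theta)$ is the loss value of the second translation model; $\pi_\theta(y_i \mid x_2)$ is the probability that the current policy model $\pi_\theta$ generates $y_i$ when the second sample text to be translated, $x_2$, is input; $\pi_{\theta_{old}}(y_i \mid x_2)$ is the probability that the old policy model $\pi_{\theta_{old}}$ generates $y_i$ for the same input; $A_i$ is the advantage value of the second translation $y_i$; $\mathrm{clip}(\cdot)$ restricts the ratio $\rho_i$ to a range (e.g., $[1-\epsilon, 1+\epsilon]$) to avoid exploding or vanishing gradients; $D_{KL}(\pi_\theta \,\|\, \pi_{ref})$ is the Kullback-Leibler divergence between the policy model $\pi_\theta$ and the reference policy model $\pi_{ref}$; and $\epsilon$ and $\beta$ are hyperparameters of the model.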

[0134] After the loss value of the second model is obtained, the current model is updated. As shown in Figure 3, once the loss value has been determined, the gradient of the loss is computed and used to update the parameters $\theta$ of the policy model $\pi_\theta$. Convergence is then checked. First, it is checked whether the maximum number of training steps has been reached; if so, training ends and the second target translation model is obtained. If not, it is checked whether the model performance has reached the target level, i.e., whether the loss value of the second model is less than a preset value. If the loss value is less than the preset value, the target level has been reached and training ends, yielding the second target translation model. Otherwise, the target level has not been reached: the model with updated parameters is taken as the starting model for the next round, i.e., the old model $\pi_{\theta_{old}}$ is updated, and the process returns to the sampling phase.

[0135] Thus, the second target translation model is obtained for translating texts containing allusions and idioms that have corresponding translations. This improves the translation quality of sentences containing allusions and idioms, providing more accurate and natural translation results, meeting users' needs for high-quality translations, and enhancing user satisfaction and loyalty.

[0136] By employing the above technical solution, a first target translation model is trained to translate texts containing allusions that do not have corresponding translations, and a second target translation model is trained to translate texts containing allusions that do have corresponding translations. In this way, different target translation models can be used for different types of texts to be translated, further improving the accuracy of the translation.

[0137] It should be understood that translation models for different languages can be trained in the above manner. For example, when the sample text to be translated is Chinese and the language of the translation is English, the target translation model obtained above is a model that translates Chinese into English. When the sample text to be translated is Korean and the language of the translation is Chinese, the target translation model obtained above is a model that translates Korean into Chinese.

[0138] This disclosure also provides a translation method. Figure 4 is a flowchart illustrating a translation method according to an exemplary embodiment. As shown in Figure 4, the translation method may include the following steps.

[0139] In step S41, the text to be translated is obtained, which includes at least one target allusion word.

[0140] The language of the text to be translated is the same as that of the sample text to be translated.

[0141] In step S42, the target translation of the text to be translated is obtained based on the text to be translated and the target translation model.

[0142] The target translation model is obtained based on the translation model training method provided in this disclosure.

[0143] By adopting the above technical solutions, the translation quality of texts containing allusions and classical Chinese phrases is improved, providing more accurate and natural translation results, meeting users' demands for high-quality translation, and enhancing user satisfaction. Furthermore, it can automate the processing of complex cultural elements, reducing the need for manual intervention, improving translation quality, and meeting the needs of large-scale translation.

[0144] In one embodiment, the target translation model includes a first target translation model and a second target translation model, and obtaining the target translation of the text to be translated based on the text to be translated and the target translation model includes: if the preset allusion knowledge base contains no translation for any of the target allusion words, inputting the text to be translated into the first target translation model to obtain the target translation of the text to be translated; if the preset allusion knowledge base contains a translation for each target allusion word, inputting the text to be translated into the second target translation model to obtain the target translation of the text to be translated; and if there are multiple target allusions, including both allusions with corresponding translations and allusions without corresponding translations, dividing the text to be translated into a first type of sub-text and a second type of sub-text, where the first type of sub-text includes only allusions without corresponding translations and the second type of sub-text includes only allusions with corresponding translations, inputting the first type of sub-text into the first target translation model to obtain the translation of the first type of sub-text, inputting the second type of sub-text into the second target translation model to obtain the translation of the second type of sub-text, and obtaining the target translation of the text to be translated based on the translations of the first type of sub-text and the second type of sub-text.

[0145] The first target translation model is used to translate texts containing allusion words without corresponding translations, while the second target translation model is used to translate texts containing allusion words with corresponding translations.

[0146] In one implementation, if the preset allusion knowledge base contains no translation for any of the target allusions, the text to be translated is input into the first target translation model. That is, when none of the target allusions included in the text to be translated has a corresponding translation, the first target translation model is used to translate the text, and inputting the text to be translated into the first target translation model yields the target translation of the text.

[0147] In another implementation, if the preset allusion knowledge base contains a translation for each target allusion, the text to be translated is input into the second target translation model. That is, when all target allusions in the text to be translated have corresponding translations, the second target translation model is used to translate the text, and inputting the text to be translated into the second target translation model yields the target translation of the text.

[0148] In another embodiment, if there are multiple target allusions, and these include both allusions with corresponding translations and allusions without corresponding translations, the text to be translated is first divided into a first type of sub-text and a second type of sub-text. The first type of sub-text includes only allusions without corresponding translations, and the second type of sub-text includes only allusions with corresponding translations. Each type may include multiple sub-texts.

[0149] Next, the first type of sub-text is input into the first target translation model to obtain its translation, and the second type of sub-text is input into the second target translation model to obtain its translation. Finally, based on the translations of the first and second types of sub-texts, the target translation of the text to be translated is obtained. For example, the translations of the first and second types of sub-texts can be sorted according to the sentence structure of the text to be translated to obtain the target translation.
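As a non-limiting illustration, the three routing cases described above can be sketched in Python. All names here (`has_translation`, `translate`, `split_fn`, and the dictionary form of the knowledge base) are hypothetical. The disclosure does not specify how the text is divided, so the splitter is supplied by the caller and is assumed to yield sub-texts that do not mix the two kinds of allusions; sending a sub-text that contains no allusion to the second model is likewise an assumption.

```python
def has_translation(allusion, knowledge_base):
    """True if the knowledge base holds a translation for the allusion."""
    return knowledge_base.get(allusion) is not None


def translate(text, allusions, knowledge_base, first_model, second_model, split_fn):
    """Route the text to the first model, the second model, or both."""
    flags = [has_translation(a, knowledge_base) for a in allusions]
    if not any(flags):
        # No target allusion has a corresponding translation.
        return first_model(text)
    if all(flags):
        # Every target allusion has a corresponding translation.
        return second_model(text)
    # Mixed case: divide the text and route each sub-text separately.
    parts = []
    for segment in split_fn(text):
        present = [a for a in allusions if a in segment]
        use_first = any(not has_translation(a, knowledge_base) for a in present)
        model = first_model if use_first else second_model
        parts.append(model(segment))
    # Sub-text translations are joined in the order of the original text.
    return " ".join(parts)
```

Keeping the sub-texts in their original order is one simple way to realize the ordering by sentence structure mentioned above; a production system may need a more careful recombination step.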

[0150] This further improves translation efficiency and expands the scope of its application.

[0151] Based on the same inventive concept, this disclosure provides a translation model training device. Figure 5 is a block diagram illustrating a translation model training device according to an exemplary embodiment. As shown in Figure 5, the translation model training device 500 may include: a first acquisition module 501, used to acquire multiple sets of training samples, where the training samples include sample text to be translated, sample thought chains of the sample text to be translated, and sample translations generated based on the sample thought chains, and the sample text to be translated includes sample allusions and words; a first training module 502, used to train the translation model by taking the sample text to be translated as the model input parameters and the sample thought chain and the sample translation as the model output parameters; and a first determining module 503, used to determine the translation model that meets the training termination condition as the target translation model.

[0152] Optionally, the first acquisition module 501 is configured to: For each training sample, obtain the sample text to be translated and identify the sample allusions and words included in the sample text to be translated; If the preset allusion knowledge base does not contain a translation of the sample allusion word corresponding to the sample allusion word, then the sample text to be translated is determined as the first sample text to be translated, the sample allusion word is determined as the first sample allusion word, and the first semantic information corresponding to the first sample allusion word and the translation of the first semantic information corresponding to the first semantic information are determined. The first semantic information includes the literal meaning and / or the extended meaning. The allusion knowledge base includes at least multiple allusion words and translations of some allusion words corresponding to the allusion words. Determine the first sample thought chain of the first sample text to be translated, and translate the sample text to be translated based on the first sample thought chain and the first semantic information translation to obtain the first sample translation; The first type of training samples is composed of the first sample text to be translated, the first sample thought chain, and the first sample translation, and the multiple sets of training samples include the first type of training samples.

[0153] Optionally, the first sample thought chain includes the first sample allusion words, the first semantic information corresponding to the first sample allusion words, the first semantic information translation corresponding to the first semantic information, the step-by-step translation result of the first sample text to be translated, and the first sample translation.

[0154] Optionally, the first training module 502 is further configured to: Input the first sample text to be translated from the first type of training samples into the first translation model to obtain the first thought chain and the first translation generated by the first translation model. The first translation is generated based on the first thought chain. Based on the first translation and the first sample translation, determine the loss value of the first translation model; Adjust the parameters of the first translation model based on the loss value of the first translation model; The first determining module 503 is used for: The first translation model that meets the training termination condition is determined as the first target translation model, and the target translation model includes the first target translation model.

[0155] Optionally, there may be multiple first translations, and the first training module 502 is further configured to: for each of the first translations, determine a reward for the first translation based on the first sample text to be translated, the first sample translation, and the first translation; determine an advantage value for the first translation based on the rewards of the multiple first translations; and determine the loss value of the first translation model based on the advantage value of each of the first translations.

[0156] Optionally, the first training module 502 is further configured to: The first reward is determined based on the first sample text to be translated, the first sample translation, and the first translation. The second reward is determined based on the cultural background knowledge related to the first translation and the first sample allusions and words; Based on the first translation and other first translations of the first sample text to be translated, determine the third reward for the first translation; The fourth reward for the first translation is determined based on the first translation; The reward for the first translation is determined based on the first reward, the second reward, the third reward, and the fourth reward.

[0157] Optionally, the first training module 502 is further configured to: determine a first similarity between the first sample text to be translated and the first sample translation; determine a second similarity between the first sample text to be translated and the first translation; determine the bilingual evaluation understudy (BLEU) score of the first translation against the first sample translation; and determine the first reward based on the first similarity, the second similarity, and the BLEU score; and / or determine the similarity between the first translation and the cultural background knowledge related to the first sample allusions and words as the second reward; and / or take the average similarity between the first translation and the other first translations as the third reward for the first translation; and / or determine the perplexity of the first translation as the fourth reward for the first translation.
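As a non-limiting illustration, two of the quantities named above can be sketched in Python: the average similarity to the other candidate translations (third reward) and the perplexity of a translation (fourth reward). The function names, the caller-supplied `similarity_fn`, and the use of per-token log-probabilities from some language model are assumptions; the disclosure does not state how similarity or perplexity is computed, nor how the four rewards are weighted when combined.

```python
import math


def average_similarity(candidate, others, similarity_fn):
    """Mean similarity between one translation and the other candidates."""
    return sum(similarity_fn(candidate, o) for o in others) / len(others)


def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities.

    Uses the standard definition exp(-mean log-probability). Lower values
    indicate more fluent text under the scoring language model.
    """
    return math.exp(-sum(token_logprobs) / len(token_logprobs))
```

Because lower perplexity indicates more fluent output, a practical reward would likely use a decreasing function of this value; that mapping is left open here because the source does not specify it.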

[0158] Optionally, the first training module 502 is further configured to: Determine the difference between the reward for the first translation and the average reward for multiple first translations; The ratio of the difference to the standard deviation of the rewards of multiple first translations is determined as the advantage value of the first translation.
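As a non-limiting illustration, the advantage value described above (the difference from the group mean divided by the group standard deviation) can be sketched in Python. The function name `group_advantages` is hypothetical. The use of the population standard deviation and the small constant `eps`, which guards against division by zero when all rewards are equal, are assumptions not stated in this disclosure.

```python
import statistics


def group_advantages(rewards, eps=1e-8):
    """Normalize each reward against the group of sampled translations."""
    mean = statistics.fmean(rewards)
    # Population standard deviation over the group (assumption).
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]
```

A translation whose reward is above the group average receives a positive advantage value, and one below the average receives a negative value.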

[0159] Optionally, the first acquisition module 501 is further configured to: For each training sample, obtain the sample text to be translated and identify the sample allusions and words included in the sample text to be translated; If a sample allusion translation exists in the preset allusion knowledge base that corresponds to the sample allusion, then the sample text to be translated is determined as the second sample text to be translated, the sample allusion is determined as the second sample allusion, and the sample allusion translation corresponding to the second sample allusion is determined as the second sample allusion translation. The allusion knowledge base includes at least multiple allusion words and some allusion translations. The second sample thought chain of the second sample text to be translated is determined, and the second sample text to be translated is translated according to the second sample thought chain and the translation of the second sample allusion words to obtain the second sample translation; The second type of training samples are composed of the second sample text to be translated, the second sample thought chain, and the second sample translation, and the multiple sets of training samples include the second type of training samples.

[0160] Optionally, the first acquisition module 501 is further configured to: When it is determined that the extended meaning of the sample allusion translation corresponding to the second sample allusion matches the extended meaning of the sample allusion, the sample allusion translation corresponding to the second sample allusion is determined as the second sample allusion translation.

[0161] Optionally, the first training module 502 is further configured to: The second sample text to be translated from the second type of training samples is input into the second translation model to obtain the second thought chain and the second translation generated by the second translation model. The second translation is generated based on the second thought chain. Based on the second translation and the second sample translation, determine the loss value of the second translation model; Adjust the parameters of the second translation model based on the loss value of the second translation model; The first determining module 503 is further configured to: The second translation model that meets the training termination condition is determined as the second target translation model, and the target translation model includes the second target translation model.

[0162] Optionally, there may be multiple second translations, and the first training module 502 is further configured to: for each second translation, determine a reward for the second translation based on the second sample text to be translated, the second sample translation, and the second translation; determine an advantage value for the second translation based on the rewards of the multiple second translations; and determine the loss value of the second translation model based on the advantage value of each second translation.

[0163] Optionally, the first training module 502 is further configured to: determine a fifth reward based on the second sample text to be translated, the second sample translation, and the second translation; determine a sixth reward for the second translation based on the second translation and the other second translations of the second sample text to be translated; determine an allusion correspondence reward based on the second translation and the second sample allusion translation; and determine the reward for the second translation based on the fifth reward, the sixth reward, and the allusion correspondence reward.

[0164] Optionally, the first training module 502 is further configured to: determine a third similarity between the second sample text to be translated and the second sample translation; determine a fourth similarity between the second sample text to be translated and the second translation; determine the BLEU score of the second translation against the second sample translation; determine the semantic role alignment between the second sample text to be translated and the second translation; determine the probability that the second translation contains the second sample allusion translation; and determine the fifth reward based on the third similarity, the fourth similarity, the BLEU score, the semantic role alignment, and the probability; and / or determine the average similarity between the second translation and the other second translations as the sixth reward for the second translation; and / or if the second translation includes the second sample allusion translation, determine the allusion correspondence reward to be 1, and if the second translation does not include the second sample allusion translation, determine the allusion correspondence reward based on the similarity between the second sample allusion translation and the second translation, as well as the fluency of the second translation.
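As a non-limiting illustration, the allusion correspondence reward described above can be sketched in Python. The function name, the caller-supplied `similarity_fn` and `fluency_fn` (both assumed to return values in [0, 1]), and the weighted-sum combination with `weight` are assumptions; the disclosure states only that the fallback reward depends on similarity and fluency, not how the two are combined.

```python
def allusion_correspondence_reward(translation, allusion_translation,
                                   similarity_fn, fluency_fn, weight=0.5):
    """Reward for rendering a known allusion translation in the output."""
    # Exact inclusion of the knowledge-base translation earns the full reward.
    if allusion_translation in translation:
        return 1.0
    # Otherwise fall back to similarity and fluency (combination assumed).
    sim = similarity_fn(allusion_translation, translation)
    flu = fluency_fn(translation)
    return weight * sim + (1.0 - weight) * flu
```

With scores bounded by 1, the fallback never exceeds the full reward, so the model is still encouraged to use the knowledge-base translation when it fits the sentence.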

[0165] Figure 6 is a block diagram illustrating a translation apparatus according to an exemplary embodiment. As shown in Figure 6, the translation device 600 may include: a second acquisition module 601, used to acquire the text to be translated, where the text to be translated includes at least one target allusion word; and a second determining module 602, used to obtain the target translation of the text to be translated based on the text to be translated and the target translation model, where the target translation model is obtained based on the translation model training method provided in this disclosure.

[0166] Optionally, the target translation model includes a first target translation model and a second target translation model, and the second determining module 602 is used to: if the preset allusion knowledge base contains no translation for any of the target allusion words, input the text to be translated into the first target translation model to obtain the target translation of the text to be translated; if the preset allusion knowledge base contains a translation for each target allusion word, input the text to be translated into the second target translation model to obtain the target translation of the text to be translated; and if there are multiple target allusions, including both allusions with corresponding translations and allusions without corresponding translations, divide the text to be translated into a first type of sub-text and a second type of sub-text, where the first type of sub-text includes only allusions without corresponding translations and the second type of sub-text includes only allusions with corresponding translations, input the first type of sub-text into the first target translation model to obtain the translation of the first type of sub-text, input the second type of sub-text into the second target translation model to obtain the translation of the second type of sub-text, and obtain the target translation of the text to be translated based on the translations of the first type of sub-text and the second type of sub-text.

[0167] Regarding the apparatus in the above embodiments, the specific manner in which each module performs its operation has been described in detail in the embodiments related to the method, and will not be elaborated upon here.

[0168] Figure 7 is a block diagram illustrating an electronic device according to an exemplary embodiment. As shown in Figure 7, the electronic device 700 may include a processor 701 and a memory 702. The electronic device 700 may also include one or more of a multimedia component 703, an input / output (I / O) interface 704, and a communication component 705.

[0169] The processor 701 controls the overall operation of the electronic device 700 to complete all or part of the steps in the aforementioned translation model training method and / or translation method. The memory 702 stores various types of data to support the operation of the electronic device 700. This data may include, for example, instructions for any application or method operating on the electronic device 700, and application-related data such as contact data, sent and received messages, images, audio, video, etc. The memory 702 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk. Multimedia component 703 may include a screen and an audio component. The screen may be, for example, a touchscreen, and the audio component is used to output and / or input audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signals may be further stored in memory 702 or transmitted via communication component 705. The audio component also includes at least one speaker for outputting audio signals. I / O interface 704 provides an interface between processor 701 and other interface modules, such as a keyboard, mouse, buttons, etc. These buttons may be virtual or physical buttons. Communication component 705 is used for wired or wireless communication between the electronic device 700 and other devices. Wireless communication may include Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, or 4G, or a combination thereof; therefore, the corresponding communication component 705 may include a Wi-Fi module, a Bluetooth module, or an NFC module.

[0170] In an exemplary embodiment, the electronic device 700 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components to perform the translation model training method and / or translation method described above.

[0171] In another exemplary embodiment, a computer-readable storage medium including program instructions is also provided, which, when executed by a processor, implement the steps of the translation model training method and / or translation method described above. For example, the computer-readable storage medium may be the memory 702 including the program instructions described above, which may be executed by the processor 701 of the electronic device 700 to complete the translation model training method and / or translation method described above.

[0172] In another exemplary embodiment, a computer program product is also provided, which includes a computer program executable by a processor, wherein the computer program, when executed by the processor, implements the steps of the above-described translation model training method and / or translation method.

[0173] In another exemplary embodiment, a computer program product is also provided, which includes a computer program executable by a processor, wherein the computer program, when executed by the processor, implements the steps of the above-described translation model training method and / or translation method.

[0174] The preferred embodiments of this disclosure have been described in detail above with reference to the accompanying drawings. However, this disclosure is not limited to the specific details of the above embodiments. Within the scope of the technical concept of this disclosure, various simple modifications can be made to the technical solutions of this disclosure, and these simple modifications all fall within the protection scope of this disclosure.

[0175] It should also be noted that the various specific technical features described in the above embodiments can be combined in any suitable manner without contradiction. To avoid unnecessary repetition, this disclosure will not describe the various possible combinations separately.

[0176] Furthermore, the various embodiments of this disclosure can also be combined in any way; as long as such combinations do not depart from the spirit of this disclosure, they should likewise be regarded as content disclosed herein.

Claims

1. A method for training a translation model, characterized in that, The method includes: Multiple sets of training samples are obtained, including sample text to be translated, sample thought chains of the sample text to be translated, and sample translations generated based on the sample thought chains. The sample text to be translated includes sample allusions and words. The translation model is trained by using the sample text to be translated as the model input parameters and the sample thought chain and the sample translation as the model output parameters. The translation model that meets the training termination condition is identified as the target translation model.

2. The method according to claim 1, characterized in that, The acquisition of multiple sets of training samples includes: For each of the training samples, the following steps are performed: Obtain the sample text to be translated and identify the sample allusions and words included in the sample text to be translated; If the preset allusion knowledge base does not contain a translation of the sample allusion word corresponding to the sample allusion word, then the sample text to be translated is determined as the first sample text to be translated, the sample allusion word is determined as the first sample allusion word, and the first semantic information corresponding to the first sample allusion word and the translation of the first semantic information corresponding to the first semantic information are determined. The first semantic information includes the literal meaning and / or the extended meaning. The allusion knowledge base includes at least multiple allusion words and translations of some allusion words. Determine the first sample thought chain of the first sample text to be translated, and translate the sample text to be translated based on the first sample thought chain and the first semantic information translation to obtain the first sample translation; The first type of training samples is composed of the first sample text to be translated, the first sample thought chain, and the first sample translation, and the multiple sets of training samples include the first type of training samples.

3. The method according to claim 2, characterized in that, The first sample thought chain includes the first sample allusion words, the first semantic information corresponding to the first sample allusion words, the first semantic information translation corresponding to the first semantic information, the step-by-step translation result of the first sample text to be translated, and the first sample translation.

4. The method according to claim 2, characterized in that, The step of training the translation model by using the sample text to be translated as the model input parameters and the sample thought chain and the sample translation as the model output parameters includes: Input the first sample text to be translated from the first type of training samples into the first translation model to obtain the first thought chain and the first translation generated by the first translation model. The first translation is generated based on the first thought chain. Based on the first translation and the first sample translation, determine the loss value of the first translation model; Adjust the parameters of the first translation model based on the loss value of the first translation model; The step of determining the translation model that meets the training termination condition as the target translation model includes: The first translation model that meets the training termination condition is determined as the first target translation model, and the target translation model includes the first target translation model.

5. The method according to claim 4, characterized in that, There are multiple first translations, and the step of determining the loss value of the first translation model based on the first translation and the first sample translation includes: For each of the first translations, a reward for the first translation is determined based on the first sample text to be translated, the first sample translation, and the first translation; and an advantage value for the first translation is determined based on the rewards of the plurality of first translations. The loss value of the first translation model is determined based on the advantage value of each of the first translations.

6. The method according to claim 5, characterized in that, The step of determining the reward for the first translation based on the first sample text to be translated, the first sample translation, and the first translation includes: The first reward is determined based on the first sample text to be translated, the first sample translation, and the first translation. The second reward is determined based on the cultural background knowledge related to the first translation and the first sample allusions and words; Based on the first translation and other first translations of the first sample text to be translated, determine the third reward for the first translation; The fourth reward for the first translation is determined based on the first translation; The reward for the first translation is determined based on the first reward, the second reward, the third reward, and the fourth reward.

7. The method according to claim 6, characterized in that, The determination of the first reward based on the first sample text to be translated, the first sample translation, and the first translation includes: Determine the first similarity between the first sample text to be translated and the first sample translation; Determine the second similarity between the first sample text to be translated and the first translation; Determine the bilingual evaluation understudy (BLEU) score of the first translation against the first sample translation. A first reward is determined based on the first similarity, the second similarity, and the BLEU score; and / or The step of determining the second reward based on the cultural background knowledge related to the first translation and the first sample allusions includes: determining the similarity between the first translation and the cultural background knowledge related to the first sample allusions as the second reward; and / or The step of determining the third reward for the first translation based on the first translation and other first translations of the first sample text to be translated includes: using the average similarity between the first translation and other first translations as the third reward for the first translation; and / or The step of determining the fourth reward of the first translation based on the first translation includes: determining the perplexity of the first translation as the fourth reward of the first translation.

8. The method according to claim 5, characterized in that determining the advantage value of the first translation based on the rewards of the multiple first translations includes: determining the difference between the reward for the first translation and the average reward of the multiple first translations; and determining the ratio of the difference to the standard deviation of the rewards of the multiple first translations as the advantage value of the first translation.
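
The computation in claim 8 can be sketched as follows. This is a minimal example that assumes the population standard deviation and adds a small constant `eps` to avoid division by zero when all rewards are equal; neither choice is specified in the claim, and the function name is hypothetical.

```python
from statistics import mean, pstdev

def advantage_values(rewards, eps=1e-8):
    """Return (reward - group mean) / (group standard deviation) for each reward.

    Assumptions: population standard deviation (pstdev), plus eps in the
    denominator so that identical rewards give an advantage of zero.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

For rewards `[1.0, 2.0, 3.0]`, the middle translation receives an advantage of zero, while the first and third receive negative and positive values of equal magnitude.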

9. The method according to claim 1, characterized in that acquiring the multiple sets of training samples includes performing the following steps for each training sample: obtaining the sample text to be translated and identifying the sample allusion words included in the sample text to be translated; if a preset allusion knowledge base contains a sample allusion word translation corresponding to the sample allusion word, determining the sample text to be translated as a second sample text to be translated, determining the sample allusion word as a second sample allusion word, and determining the sample allusion word translation corresponding to the second sample allusion word as a second sample allusion word translation, wherein the allusion knowledge base includes at least multiple allusion words and the allusion word translations of some of those allusion words; determining a second sample thought chain of the second sample text to be translated, and translating the second sample text to be translated according to the second sample thought chain and the second sample allusion word translation to obtain a second sample translation; and forming a second type of training sample from the second sample text to be translated, the second sample thought chain, and the second sample translation, wherein the multiple sets of training samples include the second type of training samples.
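
The sample construction in claim 9 can be sketched as follows, assuming the allusion word has already been identified and the knowledge base is a dictionary that maps each allusion word to its translation, or to `None` when no translation is stored. The names `TrainingSample`, `build_second_type_sample`, `make_chain`, and `translate_with_chain` are hypothetical; the two callables stand in for whatever produces the thought chain and the reference translation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass
class TrainingSample:
    source_text: str    # sample text to be translated
    thought_chain: str  # sample thought chain
    translation: str    # sample translation

def build_second_type_sample(
    text: str,
    allusion: str,
    kb: Dict[str, Optional[str]],
    make_chain: Callable[[str], str],
    translate_with_chain: Callable[[str, str, str], str],
) -> Optional[TrainingSample]:
    """Build a second-type sample only if the knowledge base has a translation."""
    allusion_translation = kb.get(allusion)
    if allusion_translation is None:
        return None  # no stored translation: not a second-type sample
    chain = make_chain(text)
    translation = translate_with_chain(text, chain, allusion_translation)
    return TrainingSample(text, chain, translation)
```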

10. The method according to claim 9, characterized in that determining the sample allusion word translation corresponding to the second sample allusion word as the second sample allusion word translation includes: when it is determined that the extended meaning of the sample allusion word translation corresponding to the second sample allusion word matches the extended meaning of the sample allusion word, determining the sample allusion word translation corresponding to the second sample allusion word as the second sample allusion word translation.

11. The method according to claim 9, characterized in that training the translation model by using the sample text to be translated as the model input parameters and the sample thought chain and the sample translation as the model output parameters includes: inputting the second sample text to be translated from the second type of training samples into a second translation model to obtain a second thought chain and a second translation generated by the second translation model, wherein the second translation is generated based on the second thought chain; determining a loss value of the second translation model based on the second translation and the second sample translation; and adjusting the parameters of the second translation model based on the loss value of the second translation model; and determining the translation model that meets the training termination condition as the target translation model includes: determining the second translation model that meets the training termination condition as a second target translation model, wherein the target translation model includes the second target translation model.

12. The method according to claim 11, characterized in that there are a plurality of second translations, and determining the loss value of the second translation model based on the second translation and the second sample translation includes: for each second translation, determining a reward for the second translation based on the second sample text to be translated, the second sample translation, and the second translation, and determining an advantage value for the second translation based on the rewards of the plurality of second translations; and determining the loss value of the second translation model based on the advantage value of each second translation.

13. The method according to claim 12, characterized in that determining the reward for the second translation based on the second sample text to be translated, the second sample translation, and the second translation includes: determining a fifth reward based on the second sample text to be translated, the second sample translation, and the second translation; determining a sixth reward for the second translation based on the second translation and the other second translations of the second sample text to be translated; determining an allusion correspondence reward based on the second translation and the second sample allusion word translation; and determining the reward for the second translation based on the fifth reward, the sixth reward, and the allusion correspondence reward.

14. The method according to claim 13, characterized in that determining the fifth reward based on the second sample text to be translated, the second sample translation, and the second translation includes: determining a third similarity between the second sample text to be translated and the second sample translation; determining a fourth similarity between the second sample text to be translated and the second translation; determining a BLEU score between the second sample translation and the second translation; determining the semantic role alignment between the second sample text to be translated and the second translation; determining the probability that the second translation contains the second sample allusion word translation; and determining the fifth reward based on the third similarity, the fourth similarity, the BLEU score, the semantic role alignment, and the probability; and/or determining the sixth reward for the second translation based on the second translation and the other second translations of the second sample text to be translated includes: determining the average similarity between the second translation and the other second translations as the sixth reward for the second translation; and/or determining the allusion correspondence reward based on the second translation and the second sample allusion word translation includes: if the second translation includes the second sample allusion word translation, determining the allusion correspondence reward to be 1; and if the second translation does not include the second sample allusion word translation, determining the allusion correspondence reward based on the similarity between the second sample allusion word translation and the second translation and on the fluency of the second translation.
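
The allusion correspondence reward of claim 14 can be sketched as follows. The claim states only that the fallback depends on a similarity and a fluency score; the case-insensitive substring check, the product used to combine the two scores, and the function name are assumptions for illustration, and `similarity` and `fluency` are caller-supplied scoring functions.

```python
from typing import Callable

def allusion_correspondence_reward(
    translation: str,
    allusion_translation: str,
    similarity: Callable[[str, str], float],
    fluency: Callable[[str], float],
) -> float:
    """Return 1 if the reference allusion translation appears in the output,
    otherwise fall back to a score built from similarity and fluency."""
    # Assumption: "includes" is tested as a case-insensitive substring match.
    if allusion_translation.lower() in translation.lower():
        return 1.0
    # Assumption: the two scores are combined by multiplication.
    return similarity(allusion_translation, translation) * fluency(translation)
```

With scoring functions bounded in [0, 1], the product keeps the fallback reward at or below the exact-match reward of 1.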

15. A translation method, characterized in that the method includes: obtaining a text to be translated, wherein the text to be translated includes at least one target allusion word; and obtaining a target translation of the text to be translated based on the text to be translated and a target translation model, wherein the target translation model is obtained by the translation model training method according to any one of claims 1-14.

16. The method according to claim 15, characterized in that the target translation model includes a first target translation model and a second target translation model, and obtaining the target translation of the text to be translated based on the text to be translated and the target translation model includes: if the preset allusion knowledge base contains no translation for any of the target allusion words, inputting the text to be translated into the first target translation model to obtain the target translation of the text to be translated; if the preset allusion knowledge base contains a translation for each target allusion word, inputting the text to be translated into the second target translation model to obtain the target translation of the text to be translated; and if there are multiple target allusion words, and the multiple target allusion words include both allusion words with corresponding translations and allusion words without corresponding translations, dividing the text to be translated into a first type of sub-text and a second type of sub-text, wherein the first type of sub-text includes only allusion words without corresponding translations and the second type of sub-text includes only allusion words with corresponding translations; inputting the first type of sub-text into the first target translation model to obtain the translation of the first type of sub-text; inputting the second type of sub-text into the second target translation model to obtain the translation of the second type of sub-text; and obtaining the target translation of the text to be translated based on the translations of the first type of sub-text and the second type of sub-text.
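
The three-way routing in claim 16 can be sketched as follows. The models, the splitting step, and the merging step are passed in as plain callables because the claim does not say how they are implemented; `route_translation`, `split_fn`, and `merge_fn` are hypothetical names, and the knowledge base is assumed to map each allusion word to its translation or to `None`.

```python
from typing import Callable, Dict, List, Optional, Tuple

def route_translation(
    text: str,
    allusions: List[str],
    kb: Dict[str, Optional[str]],
    first_model: Callable[[str], str],
    second_model: Callable[[str], str],
    split_fn: Callable[[str, List[str], List[str]], Tuple[str, str]],
    merge_fn: Callable[[str, str], str],
) -> str:
    """Choose the model(s) according to knowledge-base coverage of the allusions."""
    with_tr = [a for a in allusions if kb.get(a) is not None]
    without_tr = [a for a in allusions if kb.get(a) is None]
    if not with_tr:
        return first_model(text)   # no allusion word has a stored translation
    if not without_tr:
        return second_model(text)  # every allusion word has a stored translation
    # Mixed case: split into sub-texts, translate each, then merge the results.
    first_sub, second_sub = split_fn(text, without_tr, with_tr)
    return merge_fn(first_model(first_sub), second_model(second_sub))
```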

17. A translation model training device, characterized in that the translation model training device includes: a first acquisition module, used to acquire multiple sets of training samples, wherein the training samples include a sample text to be translated, a sample thought chain of the sample text to be translated, and a sample translation generated based on the sample thought chain, and the sample text to be translated includes sample allusion words; a first training module, used to train a translation model by taking the sample text to be translated as the model input parameters and the sample thought chain and the sample translation as the model output parameters; and a first determination module, used to determine the translation model that meets the training termination condition as the target translation model.

18. A translation device, characterized in that the translation device includes: a second acquisition module, used to acquire a text to be translated, wherein the text to be translated includes at least one target allusion word; and a second determination module, used to obtain a target translation of the text to be translated based on the text to be translated and a target translation model, wherein the target translation model is obtained by the translation model training method according to any one of claims 1-14.

19. A computer-readable storage medium having a computer program stored thereon, characterized in that, when executed by a processor, the program implements the steps of the translation model training method according to any one of claims 1-14, and/or the steps of the translation method according to claim 15 or 16.

20. An electronic device, characterized in that it includes: a memory on which a computer program is stored; and a processor for executing the computer program in the memory to implement the steps of the translation model training method according to any one of claims 1-14, and/or the steps of the translation method according to claim 15 or 16.