Training method and device for a speech translation model

A speech translation model training technology, applied in the field of speech translation. It addresses wrong or inaccurate translation results and translation performance that needs improvement, and achieves more accurate model parameters and improved translation performance.

Active Publication Date: 2019-05-21
IFLYTEK CO LTD
Cites: 8; Cited by: 17

AI Technical Summary

Problems solved by technology

[0003] However, combining speech recognition technology and text translation technology to perform speech translation has the disadvantage of error accumulation: a misrecognized word yields a wrong translation. Errors from the speech recognition stage carry over into the text translation stage and produce inaccurate translation results. In other words, the translation performance of existing speech translation models needs to be improved.



Examples


Example 1

[0061] Note that the traditional speech translation method usually first performs speech recognition on the speech to obtain text in the same language, and then uses text translation technology to translate the recognized text into text in another language, thereby realizing speech translation. However, this traditional method often suffers from error accumulation: if an error occurs during speech recognition, it carries over into the subsequent text translation and leads to inaccurate translation results.

[0062] Therefore, to address the above defect, speech translation is performed with the end-to-end speech translation model shown in Figure 1. The speech translation model includes an encoder, an attention layer (Attention) and a decoder. Through this speech translation model, the source language speech can not be speech recogn...
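To make the role of the attention layer between encoder and decoder more concrete, here is a minimal sketch in plain Python. The source does not say which form of attention the model uses; dot-product scoring, the function names `softmax` and `attend`, and the toy vectors are all assumptions made for illustration only.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(
    decoder_state: List[float],
    encoder_states: List[List[float]],
) -> Tuple[List[float], List[float]]:
    """Dot-product attention (an assumed form, not taken from the patent).

    Scores every encoder state against the current decoder state, then
    returns the attention weights and the weighted sum of encoder states
    (the context vector the decoder would consume).
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context
```

In an end-to-end model of this shape, the encoder states would come from the source language speech frames and the decoder would emit target language text directly, so no intermediate recognized text is needed at translation time.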

Example 2

[0139] The above is a specific implementation of the speech translation model training method provided in the first embodiment of the present application. Based on the speech translation model trained in that embodiment, the present application also provides a speech translation method.

[0140] See Figure 8, which shows a flow chart of a speech translation method provided by an embodiment of the present application. As shown in Figure 8, the method includes:

[0141] S801: Obtain a target speech to be translated.

[0142] In this embodiment, any speech to be translated by this embodiment is defined as the target speech. The language of the target speech is the same as that of the sample speech in the first embodiment above.

[0143] It can be understood that the target speech can be obtained by recording, according to actual needs. For example, the voice of a telephone conversation in people's daily life, or a recording of a meeting can be us...

Example 3

[0148] This embodiment introduces a training device for a speech translation model. For related content, refer to the method embodiments above.

[0149] See Figure 9, which is a schematic diagram of the composition of a speech translation model training device provided in this embodiment. The device 900 includes:

[0150] A training data acquisition unit 901, configured to acquire model training data, the model training data including each sample speech;

[0151] A translation text obtaining unit 902, configured to use the current speech translation model to directly translate the sample speech to obtain a predicted translation text, wherein the speech translation model shares some model parameters with a speech recognition model;

[0152] A recognition text obtaining unit 903, configured to use the current speech recognition model to recognize the sample speech to obtain a predicted recognition text;

[0153] The model parameter updating unit 904 is config...
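The way units 901 to 904 cooperate can be sketched as a small Python class. This is only an illustration under stated assumptions: the class name `TrainingDevice`, its field names, the `run_once` method, and the use of strings as stand-ins for speech are hypothetical, and the behaviour of unit 904 is taken from the abstract (updating parameters according to the predicted translation text and the predicted recognition text), not from the truncated paragraph above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TrainingDevice:
    """Hypothetical stand-in for device 900; each field mirrors one unit."""

    acquire_training_data: Callable[[], List[str]]      # unit 901
    translate: Callable[[str], str]                     # unit 902
    recognize: Callable[[str], str]                     # unit 903
    update_parameters: Callable[[str, str, str], None]  # unit 904

    def run_once(self) -> int:
        """Run one pass over the training data; return the sample count."""
        samples = self.acquire_training_data()
        for speech in samples:
            # Both predictions come from the same sample speech.
            predicted_translation = self.translate(speech)
            predicted_recognition = self.recognize(speech)
            # The update sees both predictions, as the abstract describes.
            self.update_parameters(
                speech, predicted_translation, predicted_recognition
            )
        return len(samples)
```

Passing the four units in as callables keeps the sketch independent of any particular model library; a real device would bind them to the current speech translation and speech recognition models.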


Abstract

The invention discloses a training method and device for a speech translation model. The method first obtains model training data that includes each piece of sample speech, then uses the current speech translation model to translate the sample speech directly, obtaining predicted translation text; meanwhile, it uses the current speech recognition model to recognize the sample speech, obtaining predicted recognition text. The parameters of the speech translation model and the speech recognition model are then updated according to the predicted translation text and the predicted recognition text. Because the speech translation model and the speech recognition model share part of their model parameters, updating the parameters of the speech recognition model also updates the shared parameters in the speech translation model, which makes the parameters of the speech translation model more accurate. As a result, translation performance improves when speech is translated with the speech translation model.
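The effect of parameter sharing described in the abstract can be illustrated with a deliberately tiny numeric sketch. Everything here is an assumption made for illustration: the scalar "models", the squared-error losses, the weighted sum of the two losses, the plain gradient-descent step, and the names `Params` and `joint_step`. The patent text shown here does not specify the loss or the optimizer.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Params:
    shared: float    # parameter shared by both models (e.g. an encoder weight)
    st_head: float   # parameter used only by the speech translation model
    asr_head: float  # parameter used only by the speech recognition model


def joint_step(
    p: Params,
    x: float,
    st_target: float,
    asr_target: float,
    lr: float = 0.01,
    asr_weight: float = 1.0,
) -> Tuple[Params, float]:
    """One gradient-descent step on an assumed joint loss.

    loss = (st_pred - st_target)^2 + asr_weight * (asr_pred - asr_target)^2
    """
    hidden = p.shared * x                # shared representation
    st_err = p.st_head * hidden - st_target
    asr_err = p.asr_head * hidden - asr_target
    loss = st_err ** 2 + asr_weight * asr_err ** 2

    # Gradients of the joint loss with respect to each parameter.
    grad_st_head = 2.0 * st_err * hidden
    grad_asr_head = asr_weight * 2.0 * asr_err * hidden
    # The shared parameter receives gradient from BOTH tasks.
    grad_shared = (
        2.0 * st_err * p.st_head * x
        + asr_weight * 2.0 * asr_err * p.asr_head * x
    )

    updated = Params(
        shared=p.shared - lr * grad_shared,
        st_head=p.st_head - lr * grad_st_head,
        asr_head=p.asr_head - lr * grad_asr_head,
    )
    return updated, loss
```

Note the case where the translation prediction is already exact but the recognition prediction is not: the translation-only parameter stays put, yet the shared parameter still moves because of the recognition error. That is the mechanism the abstract relies on, shown here on a toy problem rather than on real speech models.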

Description

Technical field

[0001] The present application relates to the technical field of speech translation, in particular to a method and device for training a speech translation model.

Background

[0002] Existing speech translation methods generally include two steps, i.e. the speech translation model is implemented through speech recognition and text translation. Specifically, a piece of speech is first recognized as text in the same language through speech recognition technology, and the recognized text is then translated into text in another language using text translation technology, thereby realizing the speech translation process.

[0003] However, combining speech recognition technology and text translation technology for speech translation has the disadvantage of error accumulation: a misrecognized word yields a wrong translation. It can be seen that the errors in the speech recognition stage will accumulate in the text translation stage, resulting in inaccurate transla...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G10L15/00, G10L15/06, G10L15/16, G06F17/28
Inventor: 马志强, 刘俊华, 魏思, 胡国平
Owner: IFLYTEK CO LTD