Block-based self-attention real-time end-to-end speech translation method

A speech translation and attention technology, applied in speech analysis, speech recognition, instruments, etc. It addresses problems such as the reliance on monotonic alignment between input and output and the resulting poor translation quality, and it achieves strong portability and convenient deployment.

Pending Publication Date: 2022-03-04

沈阳雅译网络技术有限公司


AI Technical Summary

Problems solved by technology

Speech recognition imposes a relatively strong constraint, namely that the input and output are monotonically aligned; speech translation has no such constraint.

Therefore, directly transferring block-based real-time speech recognition methods to real-time speech translation is less effective.



Embodiment Construction

[0037] The present invention will be further elaborated below in conjunction with the accompanying drawings of the description.

[0038] Conventional attention is computed over the complete input, which makes it difficult to apply to real-time speech translation. By using block attention, the present invention lets the real-time speech translation model obtain a certain amount of context information during decoding and translate in real time, thereby improving translation speed and reducing latency.
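The idea of restricting attention to blocks can be illustrated with an attention mask. The sketch below is only an assumption about how such a mask might look: the excerpt does not spell out the exact masking rule, so the function name `block_attention_mask` and the `left_blocks` parameter (how many earlier blocks a frame may see) are hypothetical.

```python
def block_attention_mask(num_frames, block_size, left_blocks=1):
    """Build a num_frames x num_frames boolean mask for block attention.

    mask[i][j] is True when frame i may attend to frame j.
    Assumption (not taken from the patent text): a frame sees every frame
    in its own block and in the `left_blocks` blocks before it, and never
    sees a later block.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if left_blocks < 0:
        raise ValueError("left_blocks must be non-negative")
    mask = []
    for i in range(num_frames):
        query_block = i // block_size
        row = []
        for j in range(num_frames):
            key_block = j // block_size
            row.append(query_block - left_blocks <= key_block <= query_block)
        mask.append(row)
    return mask


if __name__ == "__main__":
    # Six frames, blocks of two frames, one block of left context.
    for row in block_attention_mask(6, 2, left_blocks=1):
        print("".join("x" if allowed else "." for allowed in row))
```

Because no frame attends to a later block, the encoder output for a block can be computed as soon as that block has arrived, which is what makes block-wise streaming possible in this sketch.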

[0039] The present invention provides a block-based self-attention real-time end-to-end speech translation method, comprising the following steps:

[0040] 1) Preprocess the recorded audio training data: map the ID of each utterance to its storage path and to the corresponding target-language text, and construct two mapping files;

[0041] 2) Extract the acoustic features of the audio file, extracting the two acoustic features of...
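Step 1) can be sketched as follows. This is a minimal illustration under stated assumptions: the file names `id2path.txt` and `id2text.txt`, the space-separated line layout, and the function name `write_mapping_files` are all hypothetical, since the text only says that two mapping files are constructed.

```python
from pathlib import Path


def write_mapping_files(records, out_dir):
    """Write two mapping files keyed by utterance ID.

    records: iterable of (utt_id, audio_path, target_text) tuples.
    Produces (hypothetical names and layout):
      id2path.txt  ->  "<utt_id> <audio_path>"
      id2text.txt  ->  "<utt_id> <target_text>"
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    path_file = out / "id2path.txt"
    text_file = out / "id2text.txt"
    seen = set()
    with path_file.open("w", encoding="utf-8") as fp, \
            text_file.open("w", encoding="utf-8") as ft:
        # Sort so both files list utterances in the same, stable order.
        for utt_id, audio_path, target_text in sorted(records):
            if utt_id in seen:
                raise ValueError("duplicate utterance ID: " + utt_id)
            seen.add(utt_id)
            # Collapse runs of whitespace so each entry stays on one line.
            normalized = " ".join(target_text.split())
            fp.write(utt_id + " " + audio_path + "\n")
            ft.write(utt_id + " " + normalized + "\n")
    return path_file, text_file
```

Keeping both files keyed by the same ID means later stages can join audio and text without relying on line order.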


Abstract

The invention discloses a block-based self-attention real-time end-to-end speech translation method comprising the following steps: preprocessing the recorded audio training data, mapping the ID of each utterance to its storage path and to the corresponding target-language text, and constructing two mapping files; extracting two acoustic features of the audio, namely Mel filter bank features and Mel-frequency cepstral coefficients; constructing a target-language dictionary from the training data, which is used to generate the target-language text sequence during decoding; cleaning the training data and converting it into the file format required by the end-to-end speech translation model; initializing the end-to-end speech translation model and training it on the data files in that format; and, in the inference stage, setting the block size and using the trained model to dynamically encode the source speech so that the target sentence is generated in real time. The method gives the model real-time speech translation capability and improves its decoding speed without reducing its performance.
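The inference stage described above (set a block size, encode the source speech block by block, emit target text as it becomes available) might be organized roughly as in the sketch below. Everything here is an assumption for illustration: `iter_blocks`, `translate_stream`, and the `step` callback are hypothetical names, and `toy_step` is a placeholder that stands in for the trained model rather than reproducing it.

```python
def iter_blocks(frames, block_size):
    """Yield consecutive blocks of at most block_size feature frames."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    block = []
    for frame in frames:
        block.append(frame)
        if len(block) == block_size:
            yield block
            block = []
    if block:
        # Final, possibly shorter block at the end of the utterance.
        yield block


def translate_stream(frames, block_size, step):
    """Yield the partial translation after each incoming block.

    step(state, block) -> (state, new_tokens) is a hypothetical interface
    standing in for the trained end-to-end model: it consumes one block,
    updates its running state, and returns any newly decoded tokens.
    """
    state = None
    output = []
    for block in iter_blocks(frames, block_size):
        state, new_tokens = step(state, block)
        output.extend(new_tokens)
        yield list(output)


def toy_step(state, block):
    """Placeholder model: emits one dummy token per block."""
    count = (state or 0) + 1
    return count, ["tok" + str(count)]


if __name__ == "__main__":
    # Five dummy frames, processed two at a time.
    for partial in translate_stream(range(5), 2, toy_step):
        print(partial)
```

A smaller block size lets output start sooner but gives the model less context per step, so in a setup like this the block size is the main latency versus quality setting.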

Description

Technical field

[0001] The invention relates to an end-to-end real-time speech translation method, in particular to a block-based self-attention real-time end-to-end speech translation method.

Background art

[0002] Speech translation broadly refers to the process of translating speech in one language into speech or text in the target language. More narrowly, speech translation usually refers to translating speech into the corresponding target-language text, while speech-to-speech translation specifically refers to translating speech into the corresponding target-language speech. Speech translation has a wide range of application scenarios, such as subtitle generation and simultaneous interpretation at conferences, and plays an important role in cross-language communication.

[0003] In the past, speech translation was usually carried out in a cascaded manner, that is, a speech recognition system was used to recogn...


Application Information


IPC(8): G10L15/02, G10L15/06, G10L15/26, G10L25/24

CPC: G10L15/02, G10L15/063, G10L15/26, G10L25/24

Inventor: 徐萍, 宁义明

Owner: 沈阳雅译网络技术有限公司
