End-to-end sign language translation method and system

A sign language translation technology, applied in the field of sign language translation, that addresses the desynchronization between signing and the translated text. It enriches contextual semantic information, enhances feature expression ability, and improves translation performance.

Pending Publication Date: 2021-11-16
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

Latency is critical for sign language translation. However, SLT studies to date must read the full sign language video before translation can start, which leads to significant desynchronization between the signer and the translated text generated by the model.


Examples

Embodiment

[0102] In this example, the SimulSLT model proposed by the present invention is evaluated on the RWTH-PHOENIX-Weather 2014T (PHOENIX14T) dataset, which is the only publicly available large-scale SLT dataset. Its data is collected from the weather forecasts of the German public television station PHOENIX and includes parallel sign language videos, gloss annotations, and the corresponding target text sequences. The official dataset partition protocol is followed: the training, validation, and test sets contain 7096, 519, and 642 samples, respectively. The dataset contains continuous sign language videos from 9 different signers and 1066 distinct sign language words (glosses). The text in the dataset is spoken German, with a vocabulary of 2887 distinct words.

[0103] In this embodiment, the number of hidden units, the number of attention heads, the number of encoder layers, and the number of decoder layers of the SimulSLT model are set to 512, 8, 3, and 3 respectively, and dropouts of 0....
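The hyperparameters in paragraph [0103] can be collected into a small configuration object. The following is only a sketch: the class name `SimulSLTConfig` and its field names are hypothetical, the four values are read as hidden size, attention heads, encoder layers, and decoder layers (an interpretation of the paragraph's wording), and the dropout value is left unset because it is truncated in the source. The per-head dimension follows the usual multi-head attention convention of splitting the hidden size evenly across heads, which the source does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SimulSLTConfig:
    """Hypothetical container for the values listed in paragraph [0103]."""

    hidden_size: int = 512          # number of hidden units
    num_heads: int = 8              # number of attention heads
    num_encoder_layers: int = 3     # assumption: first of the two layer counts
    num_decoder_layers: int = 3     # assumption: second of the two layer counts
    dropout: Optional[float] = None  # truncated in the source, so left unset

    def head_dim(self) -> int:
        """Per-head width under the usual even split of the hidden size."""
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads


config = SimulSLTConfig()
print(config.head_dim())
```

With the values above, each attention head would operate on 512 / 8 = 64 dimensions.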


Abstract

The invention discloses an end-to-end sign language translation method and belongs to the technical field of sign language translation. The method comprises the following steps: 1) acquiring a sign language video together with its corresponding target annotation sequence and target text sequence; 2) establishing a sign language translation model, in which a feature extractor extracts visual features from the sign language video, a masked encoder encodes them, and the encoding result is passed to three branches for decoding: in the first branch, a boundary predictor first predicts word boundaries, and an auxiliary annotation decoder then predicts an annotation sequence using the output of the boundary predictor; in the second branch, the encoding result is linearly mapped and used as the input of a CTC decoder to generate a predicted annotation sequence; in the third branch, the encoding result is used as the input of a wait-k decoder, which outputs a predicted text sequence; and 3) using the trained sign language translation model to extract features from and encode a sign language video to be translated, feeding the encoding result to the wait-k decoder, and generating a predicted text sequence as the translation result.
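Two of the decoding ideas named in the abstract can be illustrated with a short sketch. It is not the patent's implementation. The first function follows the standard wait-k policy from simultaneous translation: before emitting target token t (1-indexed), the decoder has read min(k + t - 1, S) source units, where S is the total number of source units. Treating the words delimited by the boundary predictor as the source units is an assumption here. The second function shows ordinary greedy CTC post-processing (merge repeated labels, then drop blanks). The function names and the blank index 0 are hypothetical.

```python
from typing import List


def wait_k_visible(k: int, num_source: int, num_target: int) -> List[int]:
    """Number of source units read before each target token under wait-k.

    Entry t-1 of the result is min(k + t - 1, num_source) for t = 1..num_target.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return [min(k + t - 1, num_source) for t in range(1, num_target + 1)]


def ctc_greedy_collapse(frame_labels: List[int], blank: int = 0) -> List[int]:
    """Collapse a per-frame label path: merge repeats, then remove blanks."""
    collapsed: List[int] = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            collapsed.append(label)
        previous = label
    return collapsed


# With k = 2 and 5 source units, the decoder starts after 2 units and
# afterwards reads one more unit per emitted token until the source ends.
print(wait_k_visible(k=2, num_source=5, num_target=6))
print(ctc_greedy_collapse([0, 1, 1, 0, 1, 2, 2, 0]))
```

A smaller k lowers latency because translation begins after fewer source units, at the cost of less context per prediction.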

Description

Technical field

[0001] The invention relates to the technical field of sign language translation, and in particular to an end-to-end sign language translation method and system.

Background technique

[0002] Sign language is a visual language widely used by about 466 million hearing-impaired people, who convey information through gestures, movements, mouth shapes, facial expressions, and other means. However, few people without hearing impairment have received sign language education, which makes it difficult for them to understand the meaning of sign language expressions. Sign language translation (SLT) uses AI technology to convert sign language videos into spoken language (or text) so that more people can understand them.

[0003] The study of sign language translation has a long history. In recent years, with the rise of deep learning, many researchers have tried neural network methods for SLT tasks and have achieved good resul...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/00, G06N5/02, G06N20/00
CPC: G06N5/02, G06N20/00
Inventor: 赵洲, 程诗卓, 沈子栋, 尹傲雄
Owner: ZHEJIANG UNIV