All-end-to-end Chinese and English mixed air traffic control voice recognition method and device

A speech recognition and speech recognition model technology, applied in speech recognition, speech analysis, instruments, etc. It addresses problems such as poor speech signal quality, scattered features, and the difficulty of determining the scale of pronunciation and word meaning, with the effects of better handling of pronunciation scale, enhanced learning, and improved efficiency.

Active Publication Date: 2021-02-26
SICHUAN UNIV

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to overcome two problems in the prior art: poor voice signal quality with scattered features, and the difficulty of accurately determining the scale of pronunciation and word meaning in Chinese-English mixed recognition. To that end, it provides a fully end-to-end Chinese-English mixed air traffic control speech recognition method and device.


Examples


Embodiment 1

[0069] As shown in Figure 1, a fully end-to-end Chinese-English mixed air traffic control speech recognition method includes the following steps:

[0070] a: collect air traffic control speech and preprocess it, where the air traffic control speech is audio data in mixed Chinese and English;

[0071] b: input the air traffic control speech into the pre-established Chinese-English mixed air traffic control speech recognition model;

[0072] c: output the instruction information corresponding to the air traffic control speech.

[0073] The Chinese-English mixed air traffic control speech recognition model includes a feature learning module and a speech recognition module; the feature learning module is used to extract the speech features of the air traffic control speech, and the speech recognition module is used to optimize model parameters and output corresponding instruction information.
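Steps a to c and the two-module split in [0073] can be pictured with a minimal sketch. Everything below is an assumption made for illustration: the class `AtcRecognizer`, the helpers `peak_normalize`, `frame_energy_features`, and `dummy_recognizer`, and the toy features are hypothetical stand-ins, not the patent's trained modules.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Frame = List[float]


@dataclass
class AtcRecognizer:
    """Hypothetical wrapper mirroring steps a-c: preprocess, model, output."""
    preprocess: Callable[[Sequence[float]], List[float]]
    feature_learner: Callable[[Sequence[float]], List[Frame]]
    recognizer: Callable[[List[Frame]], str]

    def transcribe(self, waveform: Sequence[float]) -> str:
        cleaned = self.preprocess(waveform)        # step a: preprocessing
        features = self.feature_learner(cleaned)   # step b: feature learning module
        return self.recognizer(features)           # steps b-c: recognition module output


def peak_normalize(waveform: Sequence[float]) -> List[float]:
    """Toy preprocessing: scale so the largest absolute sample is 1.0."""
    peak = max((abs(s) for s in waveform), default=0.0)
    return [s / peak for s in waveform] if peak > 0 else list(waveform)


def frame_energy_features(waveform: Sequence[float], frame_len: int = 4) -> List[Frame]:
    """Toy stand-in for learned features: one mean-energy value per frame."""
    frames = []
    for i in range(0, len(waveform) - frame_len + 1, frame_len):
        chunk = waveform[i:i + frame_len]
        frames.append([sum(s * s for s in chunk) / frame_len])
    return frames


def dummy_recognizer(features: List[Frame]) -> str:
    """Placeholder decoder: reports the frame count instead of real text."""
    return f"<{len(features)} frames>"


pipeline = AtcRecognizer(peak_normalize, frame_energy_features, dummy_recognizer)
print(pipeline.transcribe([0.0, 0.5, -1.0, 0.25, 0.5, 0.5, 0.5, 0.5]))
```

The point of the sketch is only the data flow: the raw waveform goes in once, and no separate language-identification stage sits between the feature learner and the recognizer.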

[0074] Wherein, the training of the Chinese-Englis...

Embodiment 2

[0100] As shown in Figure 2, this embodiment details the training process of the Chinese-English mixed air traffic control speech recognition model described in Embodiment 1. The specific steps are as follows:

[0101] Step 1: Preprocess the speech recognition training samples, as follows:

[0102] Step 1-1: First, use voice activity detection (VAD) to split the continuous original dialogue speech into individual audio files. Each file contains the speech of a single speaker only, that is, the content of a single control instruction, with silence and noise removed.

[0103] Step 1-2: Based on the air traffic control speech content covered by this scheme, annotate each audio file with its readable instruction text using Chinese characters and English words. The outputs are the unlabeled original speech signals and the segmented, labeled single-utterance speech signals.
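The segmentation in Step 1-1 can be illustrated with a simple energy-threshold splitter. This is a minimal sketch under stated assumptions: the text does not say which VAD algorithm is used, so the function `split_on_silence` and its parameters `frame_len`, `energy_threshold`, and `min_silence_frames` are hypothetical, and the sketch only separates speech from silence (it does not tell speakers apart).

```python
from typing import List, Sequence, Tuple


def split_on_silence(
    samples: Sequence[float],
    frame_len: int = 160,
    energy_threshold: float = 1e-3,
    min_silence_frames: int = 3,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of voiced segments, end exclusive.

    A frame counts as voiced when its mean squared amplitude reaches
    energy_threshold. A segment closes after min_silence_frames
    consecutive silent frames.
    """
    segments: List[Tuple[int, int]] = []
    start = None          # sample index where the current segment began
    silent_run = 0        # consecutive silent frames inside a segment
    n_frames = len(samples) // frame_len

    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        energy = sum(s * s for s in frame) / frame_len
        if energy >= energy_threshold:
            if start is None:
                start = f * frame_len
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                # Close the segment at the end of the last voiced frame.
                end = (f - silent_run + 1) * frame_len
                segments.append((start, end))
                start = None
                silent_run = 0

    if start is not None:
        # Audio ended mid-segment: trim any trailing silent frames.
        segments.append((start, (n_frames - silent_run) * frame_len))
    return segments


# Synthetic signal: silence, speech, silence, short speech, silence.
signal = [0.0] * 8 + [0.5] * 8 + [0.0] * 8 + [0.5] * 4 + [0.0] * 4
print(split_on_silence(signal, frame_len=4, energy_threshold=0.01, min_silence_frames=2))
```

Each returned index pair would correspond to one audio file handed to the annotation in Step 1-2. Noisy VHF audio would likely need a more robust detector than a fixed energy threshold.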

[0104] Step 2: Bu...

Embodiment 3

[0138] The difference between this embodiment and Embodiment 1 and Embodiment 2 is that the Chinese-English mixed air traffic control speech recognition model also includes a Chinese-English instruction vocabulary.

[0139] In terms of pronunciation, Chinese characters are monosyllabic, while English words (and some letters) are generally polysyllabic. From a linguistic point of view, Chinese characters are the basic morphological units of Chinese, but it is Chinese phrases that express complete meanings; for English, letters are the basic morphological units, and words are the smallest language units with complete meanings. Chinese phrases typically contain 2-4 Chinese characters, while English words may contain as many as 10 letters. Chinese and English are therefore not on the same scale in either pronunciation or linguistic units; therefore, a new method for training and building Chinese and ...
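The scale mismatch described in [0139] shows up as soon as a mixed transcript is split into modeling units. The sketch below is an illustration only: `tokenize_mixed` and its `split_english_letters` flag are hypothetical, the sample instruction is made up, and the sketch does not claim to reproduce the vocabulary the patent actually builds. It simply contrasts two candidate granularities for the English part (whole words versus letters) alongside per-character Chinese tokens.

```python
import re
from typing import List

# CJK Unified Ideographs (basic block), ASCII-letter words, or single digits.
_CJK = r"[\u4e00-\u9fff]"
_TOKEN = re.compile(rf"{_CJK}|[A-Za-z]+|\d")


def tokenize_mixed(text: str, split_english_letters: bool = False) -> List[str]:
    """Split a mixed Chinese-English transcript into modeling units.

    Chinese text is always split per character. English is kept as whole
    lowercase words, or split into letters when split_english_letters is
    True. Whitespace and punctuation are dropped.
    """
    tokens: List[str] = []
    for match in _TOKEN.finditer(text):
        tok = match.group(0)
        if split_english_letters and tok[0].isascii() and tok[0].isalpha():
            tokens.extend(tok.lower())   # one token per letter
        else:
            tokens.append(tok.lower())   # lower() leaves Chinese characters unchanged
    return tokens


sample = "国航四三幺 climb to 八千"   # made-up instruction for illustration
print(tokenize_mixed(sample))
print(tokenize_mixed(sample, split_english_letters=True))
```

With word-level English, "climb" is one unit spanning a whole syllable, comparable to a single Chinese character; with letter-level English the same word becomes five units and the word boundary is lost, which is one way to see why the two languages do not share a natural unit size.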


Abstract

The invention relates to the field of civil aviation air traffic control and speech recognition, in particular to a fully end-to-end Chinese-English mixed air traffic control speech recognition method and device. Speech features are extracted in advance by a feature learning module, so that the Chinese-English mixed air traffic control speech recognition model can extract more discriminative speech features and adapt better to speech signals from different scenes. A unified framework covers the whole processing paradigm from the original speech signal to the readable instruction text, which avoids the language-identification stage required by existing systems that recognize each language independently and simplifies the architecture of mixed speech recognition. The speech features can also be applied more reasonably and effectively in the model's recognition, so that pronunciation and meaning are judged accurately and the recognition performance and practicality of mixed speech recognition are improved.

Description

Technical field

[0001] The invention relates to the field of civil aviation air traffic control and speech recognition, in particular to a fully end-to-end Chinese-English mixed air traffic control speech recognition method and device.

Background

[0002] In civil aviation air traffic control, controllers and pilots communicate and coordinate in real time by voice over radio to ensure the safety of local air traffic operations. In the current control system, control speech is transmitted over VHF (Very High Frequency); the reliability of that transmission greatly affects the quality of the control speech, which in turn affects speech recognition performance. In addition, because communication resources are limited, a controller generally communicates with multiple pilots in the control sector over the same communication frequency. Therefore, the speaker, communication equipment error, and tra...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G10L15/06, G10L15/02, G10L15/00, G10L15/20, G10L19/04, G10L25/30
CPC: G10L15/063, G10L15/02, G10L15/005, G10L15/20, G10L19/04, G10L25/30, Y02T10/40
Inventor: 林毅, 杨波, 张建伟
Owner: SICHUAN UNIV