Fast language recognition method based on time delay neural network

A neural network and language recognition technology, applied in speech recognition, speech analysis, instruments, etc., that addresses the failure of existing methods to meet recognition performance requirements and achieves strong robustness

Inactive Publication Date: 2020-09-11
因诺微科技(天津)有限公司

AI Technical Summary

Problems solved by technology

The traditional i-vector method, which is based on total variability space analysis of a statistical model, and the PRLM method, which is based on a phoneme language model, cannot meet the recognition performance requirements in short spee


Embodiment Construction

[0038] The present invention will be further described below in conjunction with the accompanying drawings and examples. The following examples are only used to explain the content of the present invention, and are not intended to limit the protection scope of the present invention.

[0039] Figures 1-3 show the overall implementation process of the fast language recognition method based on a time-delay neural network of the present invention. Taking FDLP (frequency domain linear prediction) features as an example, the implementation process includes the following steps:

[0040] Step 1. Splice or cut the input voice signal to obtain a fixed-length voice signal frame sequence. In this example, the fixed length is 1 s and the sampling frequency of the signal is 8000 Hz; the frame extraction window parameters are fixed at a 25 ms window and a 10 ms frame shift;

[0041] Step 2, extracting speech sig...
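
Step 1 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how a short signal is spliced, so repeating the signal until it reaches the fixed length is assumed here, and the function names `fix_length` and `split_frames` are hypothetical. The 1 s length, 8000 Hz sampling frequency, 25 ms window and 10 ms frame shift come from the example.

```python
SAMPLE_RATE = 8000    # Hz, from the example in Step 1
FIXED_SECONDS = 1     # fixed signal length, from the example
WINDOW_MS = 25        # analysis window length
SHIFT_MS = 10         # frame shift


def fix_length(samples, target_len):
    """Cut a long signal, or splice a short one, to exactly target_len samples.

    Assumption: "splicing" is done by repeating the signal end to end.
    """
    if not samples:
        raise ValueError("empty signal")
    if len(samples) >= target_len:
        return list(samples[:target_len])
    repeats = -(-target_len // len(samples))  # ceiling division
    return (list(samples) * repeats)[:target_len]


def split_frames(samples, window_len, shift_len):
    """Slice a signal into overlapping frames; a trailing partial frame is dropped."""
    last_start = len(samples) - window_len
    return [samples[s:s + window_len] for s in range(0, last_start + 1, shift_len)]


target_len = SAMPLE_RATE * FIXED_SECONDS       # 8000 samples
window_len = SAMPLE_RATE * WINDOW_MS // 1000   # 200 samples
shift_len = SAMPLE_RATE * SHIFT_MS // 1000     # 80 samples

short_signal = [0.0] * 3000                    # a 0.375 s placeholder signal
fixed = fix_length(short_signal, target_len)
frames = split_frames(fixed, window_len, shift_len)
print(len(fixed), len(frames), len(frames[0]))  # 8000 98 200
```

With these parameters a 1 s signal yields 98 full frames of 200 samples each, because the last frame that fits entirely inside the signal starts at sample 7760.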


Abstract

The invention discloses a fast language recognition method based on a time delay neural network. The method comprises the following steps: 1, inputting a voice signal and processing it to obtain a voice signal frame sequence with a fixed length; 2, extracting the underlying acoustic features of the voice signal frame sequence frame by frame; 3, inputting the underlying acoustic features into a Real TDNN residual block structure for calculation to obtain M*64 abstract features; 4, carrying out Attention calculation; 5, carrying out global average pooling on the Attention features in the time frame dimension to obtain an Embedded vector; 6, carrying out two-layer DNN extraction on the Embedded vector to obtain a language vector; and 7, inputting the language vectors into an ArcFaceStatic loss function, and inputting the underlying acoustic features into the trained neural network to obtain the probabilities of all recognizable languages. The method is highly robust on short voice segments, so that the language can be recognized quickly and accurately.
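
Two of the steps in the abstract, the global average pooling of step 5 and the margin-based loss of step 7, can be illustrated with a small Python sketch. It rests on several assumptions: the abstract does not define the "ArcFaceStatic" variant, so the standard ArcFace additive angular margin is used in its place; the `scale` and `margin` values are placeholders; the Attention calculation and the two-layer DNN are omitted because their form is not given; and all function names are hypothetical.

```python
import math


def global_average_pool(frames):
    """Average an M x D list of per-frame features over the time axis."""
    m = len(frames)
    return [sum(column) / m for column in zip(*frames)]


def l2_normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def arcface_logits(vector, class_weights, target=None, scale=30.0, margin=0.5):
    """Cosine logits with an angular margin added to the target class only.

    Assumption: standard ArcFace; the patent's "ArcFaceStatic" may differ.
    With target=None (inference) no margin is applied.
    Real implementations also guard the case theta + margin > pi.
    """
    v = l2_normalize(vector)
    logits = []
    for j, w in enumerate(class_weights):
        cos_theta = sum(a * b for a, b in zip(v, l2_normalize(w)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
        if j == target:
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy data: M = 2 frames with 2 features each, and 2 language classes.
frames = [[1.0, 0.0], [3.0, 2.0]]
class_weights = [[1.0, 0.0], [0.0, 1.0]]

embedded = global_average_pool(frames)            # [2.0, 1.0]
train_logits = arcface_logits(embedded, class_weights, target=0)
probabilities = softmax(arcface_logits(embedded, class_weights))
```

During training the margin lowers the logit of the correct class, which forces the network to separate the languages by a wider angle; at inference the margin is dropped and the softmax over the plain cosine logits gives the per-language probabilities.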

Description

Technical field

[0001] The invention relates to the technical field of speech recognition, in particular to a method applied to language recognition.

Background art

[0002] Since the beginning of the 21st century, with the rapid development of pattern recognition, artificial intelligence and other disciplines, human development has entered the era of intelligence. As a key technology in the field of human-computer interaction, speech recognition has received great attention and has shown great practical value. Depending on the information extracted, speech technology can be divided into speech recognition systems, which deal with the words and content of a passage; speaker recognition systems, which deal with the identity of the speaker of the passage; and language recognition systems, which deal with the language of the passage.

[0003] At present, in the field of language recognition, the recognition accuracy of long speech segments longer than 10s is good enough, but the complex test environment...


Application Information

IPC(8): G10L15/00, G10L15/04, G10L15/06, G10L15/10, G10L15/16
CPC: G10L15/005, G10L15/04, G10L15/10, G10L15/16, G10L15/063
Inventor: 刘俊南江海王化刘文龙
Owner: 因诺微科技(天津)有限公司