Tibetan language speech recognition method based on HMM and DNN

A Tibetan speech recognition technology, applied in speech recognition, speech analysis, instruments, and related fields, with the aim of improving human-computer interaction efficiency.

Pending Publication Date: 2020-09-22

TIANJIN UNIV


Problems solved by technology

However, there is currently no effective speech recognition system for Tibetan on the market.


Embodiment Construction

[0043] Construction of the HMM-DNN-based Tibetan speech recognition system includes the following steps:

[0044] Step 1: Record Tibetan speech data, and label the recorded Tibetan speech data to establish a database.

[0045] Step 2: Prepare the data: organize the files required for training the model, extract MFCC (mel-frequency cepstral coefficient) features, and apply cepstral mean and variance normalization (CMVN).
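The normalization named in Step 2 can be illustrated with a minimal sketch. It assumes per-utterance normalization over a list of feature vectors; the function name `cmvn`, the `eps` floor, and the toy numbers are illustrative assumptions and are not taken from the patent.

```python
import math

def cmvn(frames, eps=1e-10):
    """Per-utterance cepstral mean and variance normalization.

    frames: list of feature vectors (lists of floats), one per frame.
    Returns a new list in which every dimension has zero mean and,
    unless it is constant, unit variance.
    """
    if not frames:
        return []
    n, dim = len(frames), len(frames[0])
    # Mean of each feature dimension across all frames.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Population standard deviation, floored at eps to avoid division by zero.
    stds = [
        max(math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n), eps)
        for d in range(dim)
    ]
    return [[(f[d] - means[d]) / stds[d] for d in range(dim)] for f in frames]

# Toy 2-dimensional "MFCC" frames (made-up numbers, for illustration only).
feats = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normed = cmvn(feats)
```

After normalization, both dimensions share the same scale even though the raw values differ by a factor of ten, which is the point of applying CMVN before acoustic model training.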

[0046] In a speech recognition system, the first step is feature extraction. Information such as pitch reflects a speaker's speech characteristics, and these characteristics are in turn reflected in the shape of the vocal tract. If that shape is known accurately, the phonemes being produced can be described accurately. The shape of the vocal tract shows up in the envelope of the short-time power spectrum of speech, and MFCCs are features that accurately describe this envelope.

[0047] First, pre-emphasize, frame and window the speech; then analyze each short time window, ...
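The three operations named at the start of this paragraph (pre-emphasis, framing, windowing) can be sketched as follows. The pre-emphasis coefficient 0.97, the Hamming window, and the 25 ms / 10 ms framing at 16 kHz are common defaults assumed here for illustration; the visible text of the patent does not specify them, and the function names are hypothetical.

```python
import math

def pre_emphasize(signal, coeff=0.97):
    """Apply y[n] = x[n] - coeff * x[n-1] to boost high frequencies."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[i] - coeff * signal[i - 1] for i in range(1, len(signal))
    ]

def frame_signal(signal, frame_len, hop_len):
    """Split a signal into overlapping frames, dropping any trailing partial frame."""
    if len(signal) < frame_len:
        return []
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    return [signal[i * hop_len : i * hop_len + frame_len] for i in range(n_frames)]

def hamming(n):
    """Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def window_frames(frames):
    """Multiply every frame sample-by-sample with a Hamming window."""
    if not frames:
        return []
    w = hamming(len(frames[0]))
    return [[s * wi for s, wi in zip(frame, w)] for frame in frames]

# Assumed setup: 16 kHz audio, 25 ms frames (400 samples), 10 ms hop (160 samples).
signal = [1.0] * 1000  # placeholder samples, not real speech
frames = window_frames(frame_signal(pre_emphasize(signal), 400, 160))
```

Each windowed frame would then go on to the per-window spectral analysis that the paragraph mentions next.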


Abstract

The invention relates to the field of artificial intelligence and provides a Tibetan speech recognition system based on an HMM-DNN (hidden Markov model-deep neural network). The method combines deep learning model training with Tibetan, a language with a low-resource corpus, to train a Tibetan-specific model and recognize Tibetan speech, improving the efficiency of human-computer interaction for Tibetan speakers. The method comprises the following steps: 1, recording Tibetan speech data; 2, carrying out data preparation; 3, constructing a language model and a pronunciation dictionary; 4, training a monophone model; 5, training a triphone model; 6, performing linear discriminant analysis (LDA) and maximum likelihood linear transformation (MLLT), followed by decoding and alignment; 7, carrying out speaker adaptive training; and 8, carrying out model training. The method is mainly applied to automatic recognition of Tibetan speech.
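To make the difference between steps 4 and 5 concrete: a monophone model treats each phone independently, while a triphone model conditions each phone on its left and right neighbours. The sketch below expands a phone sequence into triphone labels. The `left-center+right` label format, the `sil` boundary symbol, and the phone labels are illustrative assumptions, not the patent's phone set or toolkit.

```python
def to_triphones(phones, boundary="sil"):
    """Expand a phone sequence into context-dependent triphone labels.

    Each phone is written as left-center+right, with `boundary` standing in
    for the missing context at the start and end of the utterance.
    """
    padded = [boundary] + list(phones) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]

# Hypothetical three-phone sequence, used only to show the expansion.
labels = to_triphones(["b", "o", "d"])
```

The number of distinct triphone labels grows quickly with the size of the phone set, which is one reason a monophone model is usually trained first and used to produce alignments for the context-dependent stage.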

Description

Technical field

[0001] The present invention relates to the field of artificial intelligence, in particular to a method and system for training a speech recognition model for Tibetan, a language with a low-resource corpus.

Background

[0002] In today's society, artificial intelligence, virtual reality, wearable devices, and similar technologies have become frontiers and hotspots of technology industry research. These fields inevitably require human-computer interaction, and speech recognition is undoubtedly the most convenient and direct means of human-computer interaction. Speech recognition is the process of allowing computers to understand human speech and convert it into equivalent text.

[0003] For a long time, acoustic modeling in the field of speech recognition has used the GMM-HMM model (Gaussian mixture model-hidden Markov model), which has reliable accuracy and has a mature expectation-maximization algorithm (EM algorithm...


Application Information


IPC(8): G10L15/00, G10L15/06, G10L15/14, G10L15/16, G10L25/24

CPC: G10L15/063, G10L15/005, G10L15/144, G10L15/16, G10L25/24

Inventor: 韩智丞, 魏建国, 吕绪康

Owner: TIANJIN UNIV
