Deep learning-based unusual speech distinguishing method

A deep learning technique for abnormal speech, applied in speech recognition, speech analysis, instruments, etc., that addresses interference with speaker information and the resulting degradation of recognition performance.

Active Publication Date: 2018-11-06

SOUTH CHINA UNIV OF TECH

7 Cites, 30 Cited by



Problems solved by technology

[0004] Another problem is that traditional speech recognition systems often use linear predictive cepstral coefficients and Mel-frequency cepstral coefficients. These low-level acoustic features mainly carry the phonetic content of the spoken text, so speaker information is easily interfered with by that content, by the channel, and by noise, which reduces the recognition performance of the system.


Examples


Embodiment 1

[0104] A method for distinguishing abnormal speech based on deep learning, comprising the following steps:

[0105] Step 1: Obtain the input speech and preprocess it by resampling, pre-emphasis, framing, and windowing to obtain the preprocessed speech;
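Of the operations named in Step 1, framing and windowing can be sketched as follows. This is a minimal illustration, not the patent's implementation: the visible text does not state the window type, frame length, or frame shift, so the Hamming window and the `frame_len` and `hop_len` values below are assumptions, and the function names are hypothetical.

```python
import math


def hamming(n):
    """Return a Hamming window of length n (window type is an assumption)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_and_window(signal, frame_len, hop_len):
    """Split a signal into overlapping frames and window each frame.

    Trailing samples that do not fill a whole frame are dropped.
    """
    if len(signal) < frame_len:
        return []
    window = hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    frames = []
    for k in range(n_frames):
        start = k * hop_len
        frames.append([signal[start + i] * window[i] for i in range(frame_len)])
    return frames


# Hypothetical sizes: 400-sample frames with a 200-sample shift.
frames = frame_and_window([1.0] * 1000, frame_len=400, hop_len=200)
```

Overlapping frames keep the analysis roughly stationary within each frame, and the tapered window reduces spectral leakage at the frame edges.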

[0106] Resampling: input speech may arrive with different sampling frequencies and encoding methods. To simplify data processing and analysis, the original input speech signal is resampled so that the sampling frequency and encoding are unified: the sampling frequency is 22.05 kHz and the encoding is WAV format.
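The rate-unification step can be illustrated with a small sketch. It assumes the audio is already decoded into a list of samples and uses plain linear interpolation, which is a simplification: a production resampler would also apply an anti-aliasing low-pass filter, and the source does not say which resampling algorithm is used. The function name `resample_linear` is hypothetical; only the 22.05 kHz target comes from the text.

```python
def resample_linear(samples, src_rate, dst_rate=22050):
    """Resample to dst_rate by linear interpolation between neighbours.

    Simplified sketch: no anti-aliasing filter is applied before
    downsampling, so high-frequency content may alias.
    """
    if not samples:
        return []
    n_out = int(round(len(samples) * dst_rate / src_rate))
    out = []
    for j in range(n_out):
        pos = j * src_rate / dst_rate  # position in the source signal
        i = int(pos)
        frac = pos - i
        if i + 1 < len(samples):
            out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
        else:
            out.append(samples[-1])  # past the last pair: hold the last sample
    return out


# A ramp sampled at 44.1 kHz, halved to 22.05 kHz.
ramp = [float(v) for v in range(100)]
halved = resample_linear(ramp, src_rate=44100)
```

Every input, whatever its original rate, ends up on the same 22.05 kHz grid, so later frame lengths correspond to the same duration for all recordings.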

[0107] Pre-emphasis: the power spectrum of an audio signal decreases as frequency increases, and most of the energy is concentrated in the low-frequency range. To boost the high-frequency part of the original audio signal, the original input audio signal is pre-emphasized. order FIR high-pass filter, its tra...
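The filter description above is truncated in the source, so its order and coefficient are not recoverable here. As a hedged illustration only, the sketch below uses the common textbook pre-emphasis filter, a first-order FIR high-pass with transfer function H(z) = 1 - αz⁻¹. Both the first-order form and the value α = 0.97 are assumptions, not values confirmed by the patent text.

```python
def pre_emphasis(samples, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1] (assumed first-order FIR high-pass).

    alpha = 0.97 is a conventional choice, not a value taken from the source.
    """
    if not samples:
        return []
    out = [samples[0]]  # the first sample has no predecessor
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out


# A constant (low-frequency) signal is strongly attenuated ...
flat = pre_emphasis([1.0, 1.0, 1.0, 1.0])
# ... while a rapidly alternating (high-frequency) signal is amplified.
alternating = pre_emphasis([1.0, -1.0, 1.0, -1.0])
```

The two example signals show the high-pass behaviour the paragraph describes: slowly varying content is suppressed and fast variations are emphasized.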


Abstract

The invention discloses a deep learning-based method for distinguishing abnormal speech. The method comprises the following steps: input speech is acquired and preprocessed by resampling, pre-emphasis, framing, and windowing to obtain preprocessed speech; Mel-frequency cepstral coefficient (MFCC) feature vectors are extracted from the preprocessed speech; speech segments with different numbers of frames are regularized to a fixed number of frames, so that each speech segment has a corresponding MFCC feature vector; a convolutional deep belief network is built; the MFCC feature vectors are input to the convolutional deep belief network for training, and the state of the input speech is classified; and, according to the classification result, a hidden Markov model is called for template matching to obtain the speech recognition result. The multiple nonlinear transform layers of the convolutional deep belief network map the input MFCC features to a higher-dimensional space, and the hidden Markov model then models the different states of speech, which improves speech recognition accuracy.
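The abstract states that segments with different numbers of frames are regularized to a fixed number of frames, but it does not say how. One plausible way to do this, shown here purely as an assumption, is to interpolate the per-frame feature vectors linearly along the time axis. The function name `regularize_frames` and the target length in the example are hypothetical.

```python
def regularize_frames(features, target_frames):
    """Stretch or shrink a sequence of feature vectors to target_frames.

    features is a list of T equal-length vectors (one per frame). The
    output has target_frames vectors obtained by linear interpolation
    along time. This is an assumed strategy, not the patent's stated one.
    """
    t = len(features)
    if t == 0:
        raise ValueError("need at least one frame")
    if t == 1 or target_frames == 1:
        return [list(features[0]) for _ in range(target_frames)]
    out = []
    for j in range(target_frames):
        pos = j * (t - 1) / (target_frames - 1)  # fractional source frame
        i = min(int(pos), t - 2)
        frac = pos - i
        out.append([a * (1 - frac) + b * frac
                    for a, b in zip(features[i], features[i + 1])])
    return out


# Three 2-dimensional frames stretched to five frames.
short = [[0.0, 0.0], [2.0, 4.0], [4.0, 8.0]]
stretched = regularize_frames(short, target_frames=5)
```

A fixed frame count gives every segment the same input shape, which a network with a fixed-size input layer requires.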

Description

Technical field

[0001] The invention relates to the field of intelligent speech processing research, in particular to a method for distinguishing abnormal speech based on deep learning.

Background

[0002] Speech is one of the important ways for humans and machines to interact. After decades of research, speech recognition technology has developed greatly and has entered daily life. However, existing research on speech recognition has the following problems:

[0003] In real life, the speaker's abnormal health or other causes can shift the input speech from normal speech to abnormal speech and introduce more noise interference. Abnormal speech generally refers to speech with complex background noise, speech in which the speaker intentionally changes speaking manner or habits, speech affected by lesions of the vocal organs, etc.

[0004] Another problem is that traditional speech recognition systems often use linear predictive cepstral coefficients and...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L15/06, G10L15/08, G10L15/14, G10L15/16, G10L15/26, G10L25/24, G10L25/30

CPC: G10L15/063, G10L15/08, G10L15/142, G10L15/16, G10L15/26, G10L25/24, G10L25/30

Inventor: 奉小慧, 陈光科, 贺前华, 巫小兰, 李艳雄

Owner: SOUTH CHINA UNIV OF TECH
