Far-field speech recognition processing method and device

A far-field speech recognition processing technology, applied in speech recognition, speech analysis, and related fields. It addresses the poor denoising performance and high equipment cost of existing approaches, and aims for a better denoising effect, low equipment cost, and a simple, convenient implementation.

Active Publication Date: 2017-01-11
BEIJING UNISOUND INFORMATION TECH +1

AI Technical Summary

Problems solved by technology

[0006] The present invention provides a far-field speech recognition processing method and device that address the high equipment cost and poor denoising performance of far-field speech denoising in the prior art, and that obtain better far-field speech processing results without additional equipment investment.


Examples


Embodiment 1

[0058] Embodiment 1 of the present invention provides a far-field speech recognition processing method. Its flow is shown in Figure 1 and includes the following steps:

[0059] Step S101: receiving far-field voice.

[0060] The far-field speech processing device receives far-field speech through its receiving module. The speech then undergoes de-reverberation and denoising to obtain better-quality speech.

[0061] Step S102: Input the received far-field speech into the pre-trained speech training model based on the neural network.

[0062] After the far-field speech is received, it is input into the speech training model for de-reverberation and denoising. The speech training model may be a pre-trained model based on a deep neural network (DNN).

[0063] The training process of the voice training model is also a learning process. By recording near-field sounds, near-field audio f...
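As an illustration of what a DNN-based mapping from far-field features to near-field features could look like, the following is a minimal sketch of a feed-forward network's forward pass. The source does not specify the network architecture or feature type; the layer sizes, the ReLU activation, the 8-dimensional feature frame, and the names `make_layer`, `dense`, and `enhance_frame` are all assumptions made for illustration. The weights are random and untrained, so the output only demonstrates the data flow.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Create one dense layer with random weights and zero biases (untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(x, layer, activation=None):
    """Apply one dense layer to the vector x, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    if activation == "relu":
        out = [max(0.0, v) for v in out]
    return out

def enhance_frame(far_frame, layers):
    """Map one far-field feature frame to an estimated near-field feature frame."""
    hidden = far_frame
    for layer in layers[:-1]:
        hidden = dense(hidden, layer, "relu")
    # The last layer is linear so the output can take any real value.
    return dense(hidden, layers[-1])

rng = random.Random(0)
FEATURE_DIM = 8  # assumed feature size, not taken from the source
layers = [
    make_layer(FEATURE_DIM, 16, rng),
    make_layer(16, 16, rng),
    make_layer(16, FEATURE_DIM, rng),
]
far_frame = [rng.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]
near_estimate = enhance_frame(far_frame, layers)
print(len(near_estimate))
```

In a trained system the weights would be learned from paired far-field and near-field features rather than drawn at random.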

Embodiment 2

[0070] Embodiment 2 of the present invention provides the training process of the neural network-based speech training model used in the above far-field speech recognition processing method. Its flow is shown in Figure 2 and includes the following steps:

[0071] Step S201: Record near-field voice.

[0072] The training of the neural network-based speech training model is actually a learning process. First, the characteristics of the near-field speech are learned by recording the near-field speech.

[0073] Step S202: Obtain near-field audio features from the recorded near-field voice.

[0074] After the near-field speech is recorded, near-field audio features are extracted from it so that the characteristics of near-field speech can be learned.

[0075] Step S203: adding the ambient sound of the far-field speech to the near-field speech to obtain the simulated far-field speech.

[0076] In the training process, after learning the audio characteristic...
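Steps S201 to S203 can be sketched as follows, under stated assumptions. The source says only that the ambient sound of far-field speech is added to the near-field speech; mixing at a chosen signal-to-noise ratio, the per-frame log-energy feature, the synthetic sine "speech", and the names `mix_at_snr` and `frame_log_energy` are hypothetical choices for this example, not details from the patent.

```python
import math
import random

def rms(signal):
    """Root-mean-square amplitude of a sample sequence."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))

def mix_at_snr(near, ambient, snr_db):
    """Add ambient sound to near-field speech at the requested SNR (in dB)."""
    if len(ambient) < len(near):
        raise ValueError("ambient recording must be at least as long as the speech")
    ambient = ambient[:len(near)]
    ambient_rms = rms(ambient)
    if ambient_rms == 0.0:
        raise ValueError("ambient recording is silent")
    # Scale the ambient sound so that rms(near) / rms(scaled ambient) matches snr_db.
    gain = rms(near) / (10 ** (snr_db / 20)) / ambient_rms
    return [s + gain * n for s, n in zip(near, ambient)]

def frame_log_energy(signal, frame_len):
    """One log-energy value per non-overlapping frame (an assumed, very simple feature)."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        feats.append(math.log(energy + 1e-10))
    return feats

rng = random.Random(0)
SAMPLE_RATE = 16000
# Stand-ins for real recordings: a sine tone as "near-field speech", Gaussian noise as ambient sound.
near = [0.5 * math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(1600)]
ambient = [rng.gauss(0.0, 1.0) for _ in range(1600)]

far = mix_at_snr(near, ambient, snr_db=10.0)      # simulated far-field speech
near_feats = frame_log_energy(near, 160)          # near-field audio features
far_feats = frame_log_energy(far, 160)            # simulated far-field audio features
training_pairs = list(zip(far_feats, near_feats)) # (input, target) pairs for training
```

Each pair couples a simulated far-field feature with the near-field feature of the same frame, which is the kind of supervision a far-field to near-field mapping would need.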

Embodiment 3

[0089] Embodiment 3 of the present invention provides a specific implementation of far-field speech recognition processing. Its flow is shown in Figure 4 and includes the following steps:

[0090] Step S301: Receive far-field voice.

[0091] Step S302: Input the received far-field speech into the pre-trained speech training model based on the neural network.

[0092] The neural network-based speech training model in this embodiment is a speech training model that does not incorporate an acoustic model, and this model only realizes the processing from far-field speech to near-field speech.

[0093] Step S303: Obtain the audio features of the far-field speech and the near-field speech included in the speech training model.

[0094] Step S304: According to the acquired audio features, de-interference processing is performed on the audio features of the received far-field voice to obtain the processed far-field voice.

[0095] Step S305: Input the pro...
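The overall flow of this embodiment, in which the speech training model performs only the far-field to near-field processing and recognition happens in a separate stage, could be wired together roughly as below. This is a sketch only: `recognize_far_field`, `toy_enhance`, and `toy_recognize` are hypothetical names, and the two toy functions are trivial placeholders standing in for the trained model and the recognizer, neither of which the source describes in detail.

```python
from typing import Callable, List, Sequence

Frame = List[float]

def recognize_far_field(
    far_frames: Sequence[Frame],
    enhance: Callable[[Frame], Frame],
    recognize: Callable[[List[Frame]], str],
) -> str:
    """Run de-interference on every frame, then pass the result to the recognizer."""
    processed = [enhance(frame) for frame in far_frames]
    return recognize(processed)

def toy_enhance(frame: Frame) -> Frame:
    """Placeholder for the trained model: subtract a fixed noise floor."""
    return [max(0.0, value - 0.5) for value in frame]

def toy_recognize(frames: List[Frame]) -> str:
    """Placeholder for the recognizer: label the input by its total energy."""
    energy = sum(value * value for frame in frames for value in frame)
    return "speech" if energy > 1.0 else "silence"

received = [[0.6, 2.0], [0.4, 1.5]]  # stand-in for received far-field feature frames
result = recognize_far_field(received, toy_enhance, toy_recognize)
print(result)
```

Passing the enhancement and recognition stages as callables keeps them independent, which matches the separation described in paragraph [0092].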


Abstract

The invention discloses a far-field speech recognition processing method and device. The method comprises: receiving far-field speech; inputting the far-field speech into a pre-trained, neural network-based speech training model; performing interference removal on the audio features of the received far-field speech, using the audio features of far-field speech and near-field speech in the speech training model, to obtain processed far-field speech; and recognizing the processed far-field speech. The method optimizes far-field speech processing, obtains a better processing result, and reduces equipment cost.

Description

Technical field

[0001] The invention relates to the technical field of speech processing, in particular to a far-field speech recognition processing method and device based on a neural network model.

Background art

[0002] Speech is a common way to carry information in daily life. With the development of speech technology, more and more speech recognition systems have appeared. Depending on the distance of the speech source, speech can be classified as far-field or near-field. When performing speech recognition, different processing strategies can be applied to different kinds of speech in order to obtain clear and recognizable speech information.

[0003] Far-field speech in particular may contain more interference because of its longer transmission distance. To reduce the impact of this interference, processing such as denoising and echo removal is generally required.

[0004] In the prior art, the ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/16, G10L15/20, G10L15/06, G10L25/51
CPC: G10L15/063, G10L15/16, G10L15/20, G10L25/51
Inventor: 江巍, 关海欣, 苏牧, 张军
Owner: BEIJING UNISOUND INFORMATION TECH