
Speech recognition method and apparatus, terminal, and computer readable storage medium

A speech recognition technology, applied in the fields of speech recognition, speech analysis, and instruments. It addresses the problem that a correct recognition result can have low confidence while a wrong recognition result has high confidence, which keeps the false recognition rate high. The effects are a reduced false recognition rate and avoiding the recognition of noise as command words.

Active Publication Date: 2018-01-30
北京如布科技有限公司
8 Cites, 24 Cited by

AI Technical Summary

Problems solved by technology

In a noisy environment, the correct recognition result often has very low confidence while a wrong recognition result has very high confidence, so the false recognition rate remains high.



Examples


Embodiment 1

[0025] Figure 1 is a flow chart of the speech recognition method provided in Embodiment 1 of the present invention. This embodiment is applicable to command word recognition. The method can be executed by a speech recognition device and specifically includes the following steps:

[0026] Step 110: according to the acoustic features of the collected speech, calculate the acoustic similarity probability between the speech and each phoneme sequence in the decoding network;

[0027] Here, the decoding network includes multiple groups of phoneme sequences, and each group of phoneme sequences corresponds to a preset command word content or to noise content. Because the embodiment of the present invention is applied to the recognition of voice commands, any non-command-word speech is interference from the point of view of command word recognition and is therefore treated as noise; the noise in the embodiment of the present invention refers to any non-command wor...
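The structure described in [0027] can be pictured as a flat list of phoneme-sequence groups, some labelled with a command word and some labelled as noise. The following is only a minimal sketch under stated assumptions: the names `PhonemeGroup` and `build_decoding_network`, the phoneme symbols, and the example command words are all hypothetical, and the patent's actual network representation is not shown in this excerpt.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class PhonemeGroup:
    """One path in the decoding network (hypothetical representation)."""
    phonemes: Tuple[str, ...]  # phoneme sequence for this path
    content: Optional[str]     # command word text, or None for a noise path

    @property
    def is_noise(self) -> bool:
        return self.content is None


def build_decoding_network(
    commands: Dict[str, Sequence[str]],
    noise_sequences: Sequence[Sequence[str]],
) -> List[PhonemeGroup]:
    """Collect command paths and noise paths into one candidate list."""
    groups = [PhonemeGroup(tuple(p), word) for word, p in commands.items()]
    groups += [PhonemeGroup(tuple(p), None) for p in noise_sequences]
    return groups


# Illustrative data only: phoneme symbols and commands are made up.
network = build_decoding_network(
    {"turn on": ["t", "er", "n", "aa", "n"], "stop": ["s", "t", "aa", "p"]},
    [["sil"], ["n", "oy", "z"]],
)
```

Because noise paths sit in the same candidate list as command paths, non-command speech has somewhere to land other than the nearest command word.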

Embodiment 2

[0040] Figure 2 is a flow chart of the speech recognition method provided by Embodiment 2 of the present invention. This embodiment is applicable to command word recognition, and the method can be executed by a speech recognition device. On the basis of the speech recognition method in Embodiment 1, this embodiment adds a step of automatically adjusting the decoding network parameters, so that the speech recognition method can dynamically modify its parameters and continuously reduce the misrecognition rate. The speech recognition method provided in this embodiment includes:

[0041] Step 210: according to the acoustic features of the collected speech, calculate the acoustic similarity probability between the speech and each phoneme sequence in the decoding network, where the decoding network includes multiple groups of phoneme sequences and each group of phoneme sequences corresponds to a preset command word content or to noise content;

[0042] Step 220, ac...

Embodiment 3

[0051] Figure 3 is a schematic structural diagram of the speech recognition device provided by Embodiment 3 of the present invention. The speech recognition device includes:

[0052] A calculation module 310, configured to calculate the acoustic similarity probability between the speech and each phoneme sequence in the decoding network according to the acoustic features of the collected speech, where the decoding network includes multiple groups of phoneme sequences and each group of phoneme sequences corresponds to a preset command word content or to noise content;

[0053] A matching module 320, configured to obtain a matching probability between the speech and the phoneme sequence according to the acoustic similarity probability;

[0054] A recognition module 330, configured to recognize the speech as the content corresponding to the phoneme sequence with the highest matching probability.
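The matching and recognition modules above can be sketched as follows. This is an illustration under assumptions, not the patent's implementation: the excerpt does not say how the matching probability is derived from the acoustic similarity probability, so the sketch simply normalizes per-path log scores with a softmax. The function names, the `"<noise>"` label, and the score values are hypothetical, and the output of the calculation module is taken as a given input.

```python
import math
from typing import Dict, Optional, Set


def matching_probabilities(acoustic_log_scores: Dict[str, float]) -> Dict[str, float]:
    """Assumed matching step: softmax over per-path acoustic log scores."""
    peak = max(acoustic_log_scores.values())  # subtract the max for numerical stability
    exp_scores = {k: math.exp(v - peak) for k, v in acoustic_log_scores.items()}
    total = sum(exp_scores.values())
    return {k: v / total for k, v in exp_scores.items()}


def recognize(acoustic_log_scores: Dict[str, float], noise_labels: Set[str]) -> Optional[str]:
    """Pick the path with the highest matching probability.

    Returns the command word, or None when a noise path wins.
    """
    probs = matching_probabilities(acoustic_log_scores)
    best = max(probs, key=probs.get)
    return None if best in noise_labels else best


# Hypothetical scores standing in for the calculation module's output.
noisy_input = {"turn on": -12.0, "stop": -15.0, "<noise>": -9.5}
clear_input = {"turn on": -8.0, "stop": -15.0, "<noise>": -9.5}

rejected = recognize(noisy_input, {"<noise>"})   # noise path wins
accepted = recognize(clear_input, {"<noise>"})   # command path wins
```

Note that no confidence threshold appears anywhere: rejection happens because a noise path outscores every command path, which matches the abstract's statement that no post-recognition confidence calculation is needed.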

[0055] Preferably, said decoding network is constructed using weighted fin...



Abstract

The invention discloses a speech recognition method. The method comprises: according to the acoustic features of collected speech, calculating the acoustic similarity probabilities between the speech and the phoneme sequences in a decoding network, wherein the decoding network includes a plurality of phoneme sequence groups and each phoneme sequence group corresponds to one preset command word content or a noise content; on the basis of the acoustic similarity probabilities, obtaining matching probabilities between the speech and the phoneme sequences; and recognizing the speech as the content corresponding to the phoneme sequence with the highest matching probability. Correspondingly, the invention also discloses a speech recognition apparatus, a terminal, and a computer readable storage medium. As a result, noise is no longer recognized as a command word, and no confidence level needs to be calculated after speech recognition, so the false recognition rate is reduced.

Description

Technical field

[0001] Embodiments of the present invention relate to speech recognition technology, and in particular to a speech recognition method, device, terminal, and computer-readable storage medium.

Background

[0002] In speech command word recognition technology, misrecognition has always been a relatively difficult problem to solve. The misrecognition rate of command word recognition is relatively high because the command word recognition method in the prior art is generally realized by constructing a decoding network that includes multiple sets of phoneme sequences corresponding to preset command words. For any input speech, the decoder searches the decoding network for the best-matching phoneme sequence, which causes misrecognition.

[0003] The current solution to the problem of recognizing noise as a command word is to calculate the confidence of the recognition result. When the confidence is greater than a preset threshold, it ...


Application Information

IPC(8): G10L15/197, G10L15/26
Inventor: 何金来, 雷宇
Owner: 北京如布科技有限公司