Speech recognition device and speech recognition method

A speech recognition technology, applied in speech recognition, speech analysis, instruments, etc., that addresses problems such as misrecognition and non-verbal sounds failing to obtain high sound scores by raising the useless information sound score, achieving high practical value.

Inactive Publication Date: 2005-11-16
PANASONIC INTELLECTUAL PROPERTY CORP OF AMERICA

AI Technical Summary

Problems solved by technology

[0025] However, the above conventional example has the following problem: if the vocabulary to be recognized contains a word sequence that sounds similar to a non-linguistic sound such as a stammer, that sound will be misrecognized.
[0029] The reason is that the useless information sound model is learned from all the sound data of useless words, including stammering sounds, so the di

Examples

Embodiment 1

[0074] Fig. 3 is a block diagram showing the functional structure of the speech recognition device according to Embodiment 1 of the present invention. Embodiment 1 describes, as an example, the case where the non-linguistic sound to be estimated is stammering.

[0075] Speech recognition device 1 is a computer device that operates a television by speech recognition. As shown in Fig. 3, it comprises a feature amount calculation unit 101, a network dictionary storage unit 102, a route calculation unit 103, a candidate route storage unit 104, a recognition result output unit 105, a language model storage unit 106, a language score calculation unit 107, a word sound model storage unit 108, a word sound score calculation unit 109, a useless information sound model storage unit 110, a useless information sound score calculation unit 111, a non-linguistic sound inference unit 112, a useless information sound score correction unit 113, and the like.

[0076] Wherein, each part...

Embodiment 2

[0131] Next, a speech recognition device according to Embodiment 2 of the present invention will be described.

[0132] Fig. 6 is a block diagram showing the functional structure of a speech recognition device according to Embodiment 2 of the present invention. Embodiment 2 describes, as an example, the case where the non-linguistic sound to be estimated is laughter. Parts corresponding to those of speech recognition device 1 of Embodiment 1 are assigned the same reference numerals, and their detailed description is omitted.

[0133] Like speech recognition device 1, speech recognition device 2 is a computer device that operates a television set by speech recognition. As shown in Fig. 6, in addition to comprising the feature amount calculation unit 101, network dictionary storage unit 102, route calculation unit 103, candidate route storage unit 104, recognition result output unit 105, language model storage unit 106, language score calculation unit 107, w...

Embodiment 3

[0158] Next, a speech recognition device according to Embodiment 3 of the present invention will be described.

[0159] Fig. 8 is a block diagram showing the functional structure of the speech recognition device according to Embodiment 3 of the present invention, and Fig. 9 is a schematic diagram of a user facing a camera-equipped mobile phone and entering an email by voice. Embodiment 3 describes, as an example, the case where a camera-equipped mobile phone detects a smile or a cough from the camera image and uses that detection to correct the useless information sound score for speech recognition. Components corresponding to those of speech recognition device 1 according to Embodiment 1 are given the same reference numerals, and their description is omitted.
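A minimal sketch of how camera detections might be applied to the per-frame useless information sound scores. The source does not say how camera output is aligned with audio frames, so the frame rates, the fixed additive boost, and the function names `video_flags_to_audio_frames` and `correct_with_camera` are all assumptions made for illustration.

```python
from typing import List


def video_flags_to_audio_frames(
    video_flags: List[bool],
    video_fps: int,
    audio_frame_ms: int,
    n_audio_frames: int,
) -> List[bool]:
    """Map per-video-frame detections (True = smile or cough seen)
    onto audio analysis frames by start time. Hypothetical helper."""
    flags = []
    for i in range(n_audio_frames):
        t_ms = i * audio_frame_ms              # start time of audio frame i
        v = t_ms * video_fps // 1000           # video frame covering that time
        flags.append(v < len(video_flags) and video_flags[v])
    return flags


def correct_with_camera(
    garbage_scores: List[float],
    audio_flags: List[bool],
    boost: float,
) -> List[float]:
    """Raise the useless information sound score (log domain, assumed)
    only in frames where the camera saw a smile or cough."""
    return [s + boost if f else s for s, f in zip(garbage_scores, audio_flags)]


# Example: 10 fps video, 50 ms audio frames, detection in the second video frame.
flags = video_flags_to_audio_frames([False, True], video_fps=10,
                                    audio_frame_ms=50, n_audio_frames=4)
corrected = correct_with_camera([-5.0, -5.0, -5.0, -5.0], flags, boost=2.0)
```

With these assumed rates, each video frame covers two audio frames, so only the last two scores are raised.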

[0160] The voice recognition device 3 is a computer device, such as a mobile phone, that uses voice recognition to create emails. As shown in FIG. Output unit 105, lang...

Abstract

The speech recognition apparatus (1) comprises: a garbage acoustic model storage unit (110) which stores a garbage acoustic model learned from a collection of unnecessary words; a feature value calculation unit (101) which calculates the feature parameters necessary for recognition by acoustically analyzing, per frame (the unit of speech analysis), unidentified input speech that includes non-language speech; a garbage acoustic score calculation unit (111) which calculates a garbage acoustic score by comparing the feature parameters with the garbage acoustic model; a garbage acoustic score correction unit (113) which corrects the garbage acoustic score calculated by the garbage acoustic score calculation unit (111) so as to raise it in frames where non-language speech is input; and a recognition result output unit (105) which outputs, as the recognition result of the unidentified input speech, the word string with the highest cumulative score of the language score, the word acoustic score, and the garbage acoustic score corrected by the garbage acoustic score correction unit (113).
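The scoring described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: scores are treated as log-domain values that add, the correction is a weighted per-frame bonus, and `Hypothesis`, `correct_garbage_scores`, `cumulative_score`, `best_word_string`, and the `weight` parameter are hypothetical names that do not appear in the source.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Hypothesis:
    """One candidate word string with its scores (hypothetical layout)."""
    words: List[str]
    language_score: float
    word_acoustic_scores: List[float]     # frames aligned to vocabulary words
    garbage_frames: List[int]             # frame indices aligned to the garbage model
    garbage_acoustic_scores: List[float]  # raw garbage score for each of those frames


def correct_garbage_scores(frame_ids: Sequence[int], scores: Sequence[float],
                           non_language: Sequence[float], weight: float) -> List[float]:
    """Raise the garbage score in frames where non-language speech is likely.
    `non_language[f]` is an assumed per-frame estimate (0 = no evidence)."""
    return [s + weight * non_language[f] for f, s in zip(frame_ids, scores)]


def cumulative_score(hyp: Hypothesis, non_language: Sequence[float],
                     weight: float = 1.0) -> float:
    """Language score + word acoustic score + corrected garbage acoustic score."""
    corrected = correct_garbage_scores(hyp.garbage_frames,
                                       hyp.garbage_acoustic_scores,
                                       non_language, weight)
    return hyp.language_score + sum(hyp.word_acoustic_scores) + sum(corrected)


def best_word_string(hyps: Sequence[Hypothesis], non_language: Sequence[float],
                     weight: float = 1.0) -> List[str]:
    """Return the word string with the highest cumulative score."""
    return max(hyps, key=lambda h: cumulative_score(h, non_language, weight)).words


# Toy example: frames 1 and 2 contain a stammer before the word "two".
as_word = Hypothesis(["to", "two"], -2.0, [-1.0, -4.0, -4.0, -1.0], [], [])
as_garbage = Hypothesis(["two"], -1.0, [-1.0, -1.0], [1, 2], [-5.0, -5.0])
uncorrected = best_word_string([as_word, as_garbage], [0.0, 0.0, 0.0, 0.0])
corrected = best_word_string([as_word, as_garbage], [0.0, 3.0, 3.0, 0.0])
```

In this toy data the stammer is misread as the word "to" when no correction is applied (cumulative scores -12 versus -13), and is absorbed by the garbage model once its score is raised in the two stammer frames (-7 versus -12).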

Description

Technical field

[0001] The present invention relates to a speech recognition device and a speech recognition method for continuous word speech recognition that tolerates useless words whose meaning does not need to be distinguished.

Background art

[0002] Conventionally, there is a word speech recognition device that handles useless words whose meaning does not need to be distinguished by using a useless information sound model, that is, a sound model learned in advance from a collection of useless words (see, for example, Inokami Naoki and two others (Japan), "Using a Useless Information HMM to Process Useless Words in Natural Speech Sentences", Journal of Electronics, Information and Communications Society A, Vol. J77-A, No. 2, pp. 215-222, February 1994).

[0003] Fig. 1 is a configuration diagram showing a conventional speech recognition device.

[0004] As shown in Fig. 1, the speech recognition device ...

Application Information

IPC(8): G10L15/20
Inventors: 山田麻纪, 西崎诚, 中藤良久, 芳泽伸一
Owner: PANASONIC INTELLECTUAL PROPERTY CORP OF AMERICA