Speech recognition method and device

A speech recognition and phonetic annotation technology, applied in speech recognition, speech analysis, instruments, etc., that addresses speech recognition errors caused by the failure to consider polyphonic characters, and achieves the effect of ensuring accuracy.

Inactive Publication Date: 2018-02-16
BEIJING SINOVOICE TECH CO LTD
Cites: 4; Cited by: 33

AI Technical Summary

Problems solved by technology

[0007] In view of this, the present invention aims to propose a speech recognition method and device to solve the problem of speech recognition errors in the prior art due to the lack of consideration

Examples

Embodiment 1

[0049] Referring to Figure 1, which is a flow chart of a speech recognition method according to an embodiment of the present invention, the method may specifically include the following steps:

[0050] Step 101, perform preprocessing on the preset corpus; the preprocessing at least includes: phonetic annotation of polyphonic characters.

[0051] In the embodiment of the present invention, after a corpus is obtained, the data in the corpus is preprocessed (cleaning, word segmentation, etc.) to obtain corpus data in units of phrases. Many of the polyphonic characters that appear in these phrases are pronounced differently depending on the phrase. As shown in Figure 1A, "De#1" means that the character "De" is a polyphonic character, and the "De" that follows "breath" has only one pronunciation. For example, the 4-gram "The breath of life" is changed to "The breath of life #1". After manually marking all the polyphonic characters in the exp...
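The tagging convention in [0051] can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the romanized tokens are stand-ins for the segmented Chinese phrase, and the `annotate_phrase` helper and the idea of keying each manual mark on the preceding token are assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical store of manual marks: (previous token, polyphonic token) -> reading index.
ManualMarks = Dict[Tuple[str, str], int]


def annotate_phrase(tokens: List[str], marks: ManualMarks) -> List[str]:
    """Append '#k' to every token whose reading is fixed by the token before it."""
    annotated: List[str] = []
    prev: Optional[str] = None
    for tok in tokens:
        # The first token has no left context, so it is never tagged here.
        k = marks.get((prev, tok)) if prev is not None else None
        annotated.append(f"{tok}#{k}" if k is not None else tok)
        prev = tok
    return annotated


# Assumed mark: "De" directly after "QiXi" (breath) takes reading 1.
marks: ManualMarks = {("QiXi", "De"): 1}
phrase = ["ShengMing", "De", "QiXi", "De"]
annotated = annotate_phrase(phrase, marks)
print(annotated)
```

Only the final "De" is tagged, because it is the only occurrence whose left context appears in the marks. A real annotation pass would likely need richer context than one preceding token.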

Embodiment 2

[0068] Referring to Figure 2, which is a flow chart of a speech recognition method according to an embodiment of the present invention, the method may specifically include the following steps:

[0069] Step 201, establishing a preset corpus according to the collected Chinese corpus data; the Chinese corpus is extracted from the same language field.

[0070] In the embodiment of the present invention, a corpus in the popular sense refers to a library of language material. In the strict sense, a corpus is a large-scale electronic text library of a certain capacity, built by collecting naturally occurring continuous texts or discourse fragments according to certain linguistic principles and using random sampling. Where the purpose of establishing the corpus is speech recognition in a particular information field, language text from that specific field is selected and, after preprocessing such as sampling, a corpus for that field is generated.
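As a rough sketch of the corpus construction described in [0070], the following Python selects texts from one field and draws a random sample. The `build_domain_corpus` name, the (field, text) pair layout, the field labels, and the fixed seed are hypothetical choices for illustration; the source does not specify them.

```python
import random
from typing import List, Tuple


def build_domain_corpus(
    texts: List[Tuple[str, str]], field: str, sample_size: int, seed: int = 0
) -> List[str]:
    """Keep only texts labelled with `field`, then sample without replacement."""
    in_field = [text for label, text in texts if label == field]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = min(sample_size, len(in_field))  # never ask for more than is available
    return rng.sample(in_field, k)


# Hypothetical labelled texts.
texts = [
    ("navigation", "turn left at the next junction"),
    ("navigation", "continue straight for two kilometres"),
    ("navigation", "you have reached your destination"),
    ("dictation", "dear sir comma new paragraph"),
]
corpus = build_domain_corpus(texts, "navigation", 2)
print(corpus)
```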

[0071] It should be...

Embodiment 3

[0095] Referring to Figure 3, which is a structural block diagram of a speech recognition device according to an embodiment of the present invention, the device includes:

[0096] A corpus preprocessing module 301, configured to perform preprocessing on the preset corpus; the preprocessing includes at least: phonetic annotation of polyphonic characters;

[0097] A language model training module 302, configured to perform language model training according to the preprocessed preset corpus;

[0098] A pronunciation dictionary generation module 303, configured to add the polyphonic entries carrying the polyphonic phonetic annotation to a preset dictionary to generate a pronunciation dictionary;

[0099] An acoustic model composition generation module 304, configured to generate an acoustic model composition after the speech recognition network is built according to the language model and the pronunciation dictionary.
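The role of the pronunciation dictionary generation module (303) can be sketched as follows. This is an illustrative assumption about the data layout, not the patent's implementation: the dictionary is modelled as a mapping from token to a list of readings, and the `build_pronunciation_dict` helper, the romanized readings, and the choice of "De" as reading 1 are all hypothetical.

```python
from typing import Dict, List


def build_pronunciation_dict(
    preset: Dict[str, List[str]], polyphone_entries: Dict[str, str]
) -> Dict[str, List[str]]:
    """Return a copy of the preset dictionary extended with tagged polyphone entries."""
    # Copy the lists so the preset dictionary is left untouched.
    lexicon = {token: list(readings) for token, readings in preset.items()}
    for tagged_token, reading in polyphone_entries.items():
        # A tagged entry such as "De#1" carries exactly one reading.
        lexicon[tagged_token] = [reading]
    return lexicon


# Hypothetical preset dictionary: the untagged "De" is ambiguous.
preset = {"De": ["De", "Di"], "QiXi": ["QiXi"]}
# Hypothetical annotated entry produced by the preprocessing step.
entries = {"De#1": "De"}

lexicon = build_pronunciation_dict(preset, entries)
print(lexicon["De"])    # still lists both readings
print(lexicon["De#1"])  # lists a single reading
```

Under this layout, a recognition network built from annotated text looks up "De#1" and gets one reading, while untagged text still sees both.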

[0100] Refer to Figure 4, a schematic diagram of the relationship between modules in the embod...

Abstract

The invention provides a speech recognition method and device. The method includes: preprocessing a preset corpus, the preprocessing at least including polyphone phonetic annotation; performing language model training according to the preprocessed preset corpus; adding polyphone vocabulary entries with polyphone phonetic annotation to a preset dictionary to generate a pronunciation dictionary; and, after building a speech recognition network according to the language model and the pronunciation dictionary, generating an acoustic model composition. This solves the problems in the prior art of complex speech recognition steps and polyphone recognition errors, which arise because polyphones are not considered in speech recognition.
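To illustrate why annotation comes before language model training, the sketch below counts bigrams over an annotated corpus. The source does not say which kind of language model is trained, so plain bigram counting is a stand-in chosen for brevity; the `bigram_counts` helper, the sentence boundary symbols, and the romanized tokens are assumptions.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def bigram_counts(phrases: Iterable[List[str]]) -> Counter:
    """Count adjacent token pairs, with <s> and </s> marking phrase boundaries."""
    counts: Counter = Counter()
    for tokens in phrases:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        counts.update(zip(padded, padded[1:]))
    return counts


# Hypothetical annotated corpus: "De#1" is a separate vocabulary item from "De".
corpus = [
    ["ShengMing", "De", "QiXi", "De#1"],
    ["QiXi", "De#1"],
]
counts = bigram_counts(corpus)
pair: Tuple[str, str] = ("QiXi", "De#1")
print(counts[pair])
```

Because the tag is part of the token, statistics for the annotated reading are kept apart from those of the untagged character.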

Description

Technical field

[0001] The invention relates to the technical field of speech recognition, and in particular to a speech recognition method and device.

Background technique

[0002] Automatic Speech Recognition (ASR) is a technology that studies how to convert human speech into text. It can be applied to services such as voice dialing, voice navigation, indoor device control, voice document retrieval, and simple dictation data entry.

[0003] The problem of polyphonic characters arises in the network construction process during speech recognition system training. The existing network construction technology cannot disambiguate polyphonic characters. For example, for "... Why is it not the breath of life?", the pronunciations are then mapped through the pronunciation dictionary as:

[0004] You ShenMeLiYouBuShiShengMing De QIXiDi Ne (first sentence)

[0005] You ShenMeLiYouBuShiShengMing De QIXi Di Ne (second sentence)

[0006] Therefore, in many cases, due to the lack of co...
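The ambiguity described in paragraphs [0003] to [0005] can be reproduced with a toy dictionary: when one token has two readings, the same token sequence expands into more than one pronunciation sequence. This is only a sketch under assumed data; the `pronunciation_sequences` helper and the three-token example are hypothetical, and the readings "De" and "Di" are borrowed from the romanized lines above.

```python
from itertools import product
from typing import Dict, List


def pronunciation_sequences(tokens: List[str], lexicon: Dict[str, List[str]]) -> List[str]:
    """Expand a token sequence into every pronunciation sequence the lexicon allows."""
    options = [lexicon[token] for token in tokens]
    return [" ".join(readings) for readings in product(*options)]


# Hypothetical dictionary in which "De" is polyphonic.
lexicon = {"QiXi": ["QiXi"], "De": ["De", "Di"], "Ne": ["Ne"]}
sequences = pronunciation_sequences(["QiXi", "De", "Ne"], lexicon)
for seq in sequences:
    print(seq)
```

Without extra information, nothing in the dictionary says which of the two expansions is the intended one, which is the gap the annotation step is meant to close.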

Application Information

IPC(8): G10L15/18, G10L15/14, G10L15/06
CPC: G10L15/063, G10L15/142, G10L15/18, G10L15/26
Inventor: 郑晓明, 李健
Owner: BEIJING SINOVOICE TECH CO LTD