Pronunciation dictionary generation method and word speech recognition method and device

A speech recognition and speech recognition model technology, applied in speech recognition, speech analysis, instruments, etc., that addresses problems such as reduced accuracy, unsatisfactory results, and the large amount of data required.

Pending Publication Date: 2020-12-04
BEIJING SINOVOICE TECH CO LTD

AI Technical Summary

Problems solved by technology

However, training a neural network requires a large amount of data, and the results are unsatisfactory when little data is available.
In addition, the phonetic notation used across the many pronunciation dictionaries that can be found may not be uniform. Although pronunciation dictionaries generally use the International Phonetic Alphabet, other notations also exist, and dictionaries obtained from different sources may use different notations, which affects accuracy.
Therefore, for minority languages, existing pronunciation dictionaries contain only a small number of word samples. In this case it is difficult to extend recognition to more words, and the limitations of G2P technology become apparent.
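The notation mismatch described above can be illustrated with a minimal sketch that maps a second notation onto one unified symbol set before dictionaries are merged. The function names and the merge policy are assumptions for illustration, not part of the patent; the mapping is a small subset of ARPAbet-to-IPA correspondences with stress markers omitted.

```python
# Illustrative subset of an ARPAbet-to-IPA mapping (stress digits omitted).
# A real table would need to cover the full phoneme inventory.
ARPABET_TO_IPA = {"K": "k", "AE": "æ", "T": "t", "AA": "ɑ"}


def normalize_entry(phonemes, mapping):
    """Translate each source symbol to the unified symbol; fail on unknown symbols."""
    unknown = [p for p in phonemes if p not in mapping]
    if unknown:
        raise KeyError(f"no unified symbol for: {unknown}")
    return [mapping[p] for p in phonemes]


def merge_dictionaries(unified_dict, other_dict, mapping):
    """Merge a dictionary in another notation into one already in the unified notation.

    Hypothetical policy: entries already present in unified_dict are kept as-is.
    """
    merged = {word: list(seq) for word, seq in unified_dict.items()}
    for word, phonemes in other_dict.items():
        if word not in merged:
            merged[word] = normalize_entry(phonemes, mapping)
    return merged


ipa_source = {"cat": ["k", "æ", "t"]}
arpabet_source = {"cat": ["K", "AE", "T"], "cot": ["K", "AA", "T"]}
merged = merge_dictionaries(ipa_source, arpabet_source, ARPABET_TO_IPA)
```

Raising on an unknown symbol, rather than passing it through, keeps a mixed-notation entry from silently entering the merged dictionary.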



Embodiment Construction

[0049] In order to make the above objects, features and advantages of the present invention more comprehensible, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0050] Referring to Figure 1, a flow chart of the steps of an embodiment of the pronunciation dictionary generation method of the present invention is shown. The method may specifically include the following steps:

[0051] Step 101, obtaining a training corpus, the training corpus including a first phoneme sequence corresponding to one or more substantive words, and pronunciation rules corresponding to the language to which the substantive words belong;

[0052] In the embodiment of the present invention, the training corpus includes existing words of a given language, the phoneme sequences corresponding to those words, and pronunciation rules based on language classification. Substantive words refer to existing words with actual ...
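As a rough illustration of the training corpus described in Step 101, the sketch below holds the substantive words, their first phoneme sequences, and the pronunciation rules in one structure. The class name, field names, and the grapheme-to-phoneme representation of the rules are all assumptions; the source text does not specify how the rules are encoded.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class TrainingCorpus:
    """Hypothetical container for the corpus of Step 101."""
    language: str
    # Substantive (existing) word -> its first phoneme sequence.
    first_phoneme_sequences: Dict[str, List[str]]
    # Assumed rule encoding: grapheme unit -> phoneme it is pronounced as.
    pronunciation_rules: Dict[str, str]

    def uncovered_phonemes(self) -> Set[str]:
        """Phonemes used by substantive words that no pronunciation rule produces."""
        rule_phonemes = set(self.pronunciation_rules.values())
        used = {p for seq in self.first_phoneme_sequences.values() for p in seq}
        return used - rule_phonemes


# Toy data for an invented language, for illustration only.
corpus = TrainingCorpus(
    language="toy",
    first_phoneme_sequences={"ba": ["b", "a"], "tak": ["t", "a", "k"]},
    pronunciation_rules={"b": "b", "a": "a", "t": "t"},
)
missing = corpus.uncovered_phonemes()
```

A check like `uncovered_phonemes` is one way to notice that the rules and the word list disagree before any new words are constructed from the rules.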


Abstract

The embodiment of the invention provides a pronunciation dictionary generation method, a word speech recognition method, a word speech recognition device, electronic equipment, and a storage medium. The pronunciation dictionary generation method comprises the steps of: acquiring a training corpus which comprises a first phoneme sequence corresponding to one or more notional words, and a pronunciation rule corresponding to the language to which the notional words belong; constructing one or more function words according to the pronunciation rule, wherein each function word has a corresponding second phoneme sequence; and generating a pronunciation dictionary from the notional words, the first phoneme sequence, the function words, and the second phoneme sequence. The method ensures the data volume of the pronunciation dictionary: when facing an unknown minority language, a pronunciation dictionary with sufficient words can be generated from less training corpus than is normally needed to train a pronunciation dictionary. A small corpus is thereby expanded into a large one, so that the pronunciation of words to be recognized can be identified accurately.
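The three steps in the abstract can be sketched as follows. This is only one possible reading: it assumes the pronunciation rule maps grapheme units to phonemes and that new words (the "function words" of the abstract) are built by concatenating those units. The function names and the `max_units` parameter are hypothetical.

```python
from itertools import product


def construct_words(rules, max_units, existing):
    """Build new words by concatenating 1..max_units grapheme units.

    rules: grapheme unit -> list of phonemes (assumed rule encoding).
    Returns constructed word -> second phoneme sequence.
    """
    constructed = {}
    units = sorted(rules)
    for n in range(1, max_units + 1):
        for combo in product(units, repeat=n):
            word = "".join(combo)
            # Skip existing words, and keep the first reading of an ambiguous spelling.
            if word in existing or word in constructed:
                continue
            constructed[word] = [p for unit in combo for p in rules[unit]]
    return constructed


def generate_pronunciation_dictionary(substantive, rules, max_units=2):
    """Combine existing words and constructed words into one dictionary."""
    dictionary = {word: list(seq) for word, seq in substantive.items()}
    dictionary.update(construct_words(rules, max_units, substantive))
    return dictionary


# Toy data for an invented language, for illustration only.
rules = {"ka": ["k", "a"], "ti": ["t", "i"]}
substantive = {"kati": ["k", "a", "t", "i"]}
dictionary = generate_pronunciation_dictionary(substantive, rules, max_units=2)
```

In this toy run, one existing word and two rules yield six dictionary entries, which shows how a small corpus can be expanded. Exhaustive enumeration grows exponentially with `max_units`, so a practical variant would need to sample or filter candidates.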

Description

Technical field

[0001] The invention relates to the field of speech recognition, and in particular to a method and a device for generating a pronunciation dictionary, a method and a device for recognizing words, electronic equipment, and a storage medium.

Background

[0002] The pronunciation dictionary is an integral part of speech recognition and one of its important links. It represents the pronunciation of each entry (generally in units of words) as the corresponding phonemes. The notation is generally the standard International Phonetic Alphabet, but a phonetic symbol is in fact only a symbol, a means of representation; it is only necessary to ensure that the same pronunciation always has the same symbol. For unknown minority languages, pronunciation dictionaries are generally constructed in one of two ways: G2P (grapheme-to-phoneme) and the method of direct construction according to...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/06, G10L15/18, G10L15/14
CPC: G10L15/06, G10L15/142, G10L15/18
Inventor: 刘羽辰, 李健, 武卫东
Owner: BEIJING SINOVOICE TECH CO LTD