Speech recognition method and device

A technology of speech recognition and speech annotation, applied in speech recognition, speech analysis, instruments, etc., that addresses speech recognition errors caused by the failure to consider polyphonic characters, and achieves the effect of ensuring accuracy.

Status: Inactive
Publication Date: 2018-02-16

BEIJING SINOVOICE TECH CO LTD

4 Cites, 33 Cited by

AI Technical Summary

Problems solved by technology

[0007] In view of this, the present invention proposes a speech recognition method and device to solve the prior-art problem of speech recognition errors caused by the failure to consider polyphonic characters during speech recognition.

It solves the problem that the existing network construction technology cannot disambiguate polyphonic characters, which results in speech recognition errors.

Examples

Embodiment 1

[0049] Refer to Figure 1, which is a flow chart of a speech recognition method according to an embodiment of the present invention. The method may specifically include the following steps:

[0050] Step 101, perform preprocessing on the preset corpus; the preprocessing at least includes: phonetic annotation of polyphonic characters.

[0051] In the embodiment of the present invention, after a corpus is obtained, its data is preprocessed (cleaning, word segmentation, etc.) to yield corpus data in units of phrases. Many of these phrases contain polyphonic characters, whose pronunciation differs depending on the phrase. As shown in Figure 1A, "De#1" indicates that the character "De" is a polyphonic character and that it has only the one pronunciation "De" after "breath". For example, the 4-gram "The breath of life" is changed to "The breath of life #1". After manually marking all the polyphonic characters in the exp...
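The annotation idea in [0051] can be pictured with a minimal sketch. It is not the patent's implementation: the source describes the marking as manual, so the lookup table `POLYPHONE_CONTEXT`, the function `annotate_phrase`, the romanized tokens, and the rule of keying on the preceding word are all assumptions introduced for illustration.

```python
# Hypothetical table standing in for manual annotations:
# (preceding word, polyphonic character) -> reading index.
# The single entry below is illustrative only.
POLYPHONE_CONTEXT = {
    ("QiXi", "De"): 1,
}


def annotate_phrase(tokens, context=POLYPHONE_CONTEXT):
    """Return a copy of `tokens` in which a polyphonic character whose
    reading is fixed by the preceding token gets a '#n' suffix."""
    annotated = []
    for i, tok in enumerate(tokens):
        prev = tokens[i - 1] if i > 0 else None
        index = context.get((prev, tok))
        annotated.append(f"{tok}#{index}" if index is not None else tok)
    return annotated


# Assumed romanized 4-gram; only the "De" that follows "QiXi" is marked.
phrase = ["ShengMing", "De", "QiXi", "De"]
print(annotate_phrase(phrase))  # ['ShengMing', 'De', 'QiXi', 'De#1']
```

Because the suffix becomes part of the token, the annotated form is a distinct vocabulary item from the unannotated character in any later processing of the corpus.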

Embodiment 2

[0068] Refer to Figure 2, which is a flow chart of a speech recognition method according to an embodiment of the present invention. The method may specifically include the following steps:

[0069] Step 201, establishing a preset corpus according to the collected Chinese corpus data; the Chinese corpus data is extracted from the same language field.

[0070] In the embodiment of the present invention, a corpus in the general sense is a library of language material. A corpus in the strict sense is a large-scale electronic text library of a certain capacity, built by collecting naturally occurring continuous texts or discourse fragments according to certain linguistic principles and using random sampling methods. When the corpus is established for speech recognition in a particular information field, language text from that specific field is selected, and a corpus for the field is generated after preprocessing such as sampling.

[0071] It should be...

Embodiment 3

[0095] Refer to Figure 3, which is a structural block diagram of a speech recognition device according to an embodiment of the present invention.

[0096] The corpus preprocessing module 301 is configured to perform preprocessing on the preset corpus; the preprocessing includes at least: phonetic annotation of polyphonic characters;

[0097] The language model training module 302 is configured to perform language model training according to the preprocessed preset corpus;

[0098] The pronunciation dictionary generating module 303 is configured to add the polyphonic word entries carrying the polyphonic phonetic annotation to a preset dictionary and generate a pronunciation dictionary;

[0099] The acoustic model composition generation module 304 is configured to generate an acoustic model composition after the speech recognition network is built according to the language model and the pronunciation dictionary.
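One way to picture how the output of module 301 feeds modules 302 and 303 is the sketch below. It is an illustration under stated assumptions, not the patent's implementation: the function names, the romanized tokens, the `#n` entry format, and the phone strings in `base_dict` and `polyphone_readings` are hypothetical. N-gram counting stands in for language model training, and the dictionary step adds one single-pronunciation entry per annotated polyphonic token to a preset dictionary.

```python
from collections import Counter


def count_ngrams(sentences, n=2):
    """Count n-grams over annotated sentences (a stand-in for LM training)."""
    counts = Counter()
    for tokens in sentences:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def build_pronunciation_dict(base_dict, polyphone_readings, sentences):
    """Copy the preset dictionary and add an entry for every annotated
    polyphonic token that occurs in the corpus."""
    lexicon = {word: list(prons) for word, prons in base_dict.items()}
    for tokens in sentences:
        for tok in tokens:
            if "#" in tok and tok in polyphone_readings:
                lexicon[tok] = [polyphone_readings[tok]]
    return lexicon


# Hypothetical annotated corpus and dictionaries (all values are assumptions).
corpus = [["ShengMing", "De", "QiXi", "De#1"]]
base_dict = {
    "ShengMing": ["sh eng m ing"],
    "De": ["d e", "d i"],  # unannotated entry keeps both readings
    "QiXi": ["q i x i"],
}
polyphone_readings = {"De#1": "d e"}

bigrams = count_ngrams(corpus, n=2)
lexicon = build_pronunciation_dict(base_dict, polyphone_readings, corpus)
print(bigrams[("QiXi", "De#1")])  # 1
print(lexicon["De#1"])            # ['d e']
```

In this sketch the annotated token appears both in the n-gram statistics and as a dictionary key with exactly one pronunciation, so a recognition network built from the two would not need to choose between readings for that token.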

[0100] Refer to Figure 4, which is a schematic diagram of the relationship between modules in the embod...

Abstract

The invention provides a speech recognition method and device. The method includes: preprocessing a preset corpus, the preprocessing at least including phonetic annotation of polyphonic characters; performing language model training according to the preprocessed preset corpus; adding polyphonic word entries carrying the phonetic annotation to a preset dictionary to generate a pronunciation dictionary; and, after building a speech recognition network according to the language model and the pronunciation dictionary, generating an acoustic model composition. This solves the problem in the prior art that, because polyphonic characters are not considered in speech recognition, the speech recognition steps are complex and polyphonic characters are recognized incorrectly.

Description

Technical field

[0001] The invention relates to the technical field of speech recognition, and in particular to a speech recognition method and device.

Background art

[0002] Automatic Speech Recognition (ASR) is a technology that studies how to convert human speech into text. It can be applied to services such as voice dialing, voice navigation, indoor device control, voice document retrieval, and simple dictation data entry.

[0003] The problem of polyphonic characters arises in the process of network construction during speech recognition system training. The existing network construction technology cannot disambiguate polyphonic characters. Why is it not the breath of life?" Their pronunciations are then mapped through the pronunciation dictionary as:

[0004] You ShenMeLiYouBuShiShengMing De QIXiDi Ne (first sentence)

[0005] You ShenMeLiYouBuShiShengMing De QIXi Di Ne (second sentence)

[0006] Therefore, in many cases, due to the lack of co...

Application Information

IPC(8): G10L15/18, G10L15/14, G10L15/06

CPC: G10L15/063, G10L15/142, G10L15/18, G10L15/26

Inventors: 郑晓明, 李健

Owner: BEIJING SINOVOICE TECH CO LTD
