
Method and system for optimizing voice recognition acoustic model

This acoustic model optimization technology for speech recognition, applied in the computer field, addresses the low efficiency of optimizing speech recognition acoustic models and improves optimization efficiency and quality.

Active Publication Date: 2013-06-19
BEIJING BAIDU NETCOM SCI & TECH CO LTD
5 Cites, 21 Cited by

AI Technical Summary

Problems solved by technology

[0005] The present invention provides a method and system for optimizing a speech recognition acoustic model, to solve the problem that existing methods for optimizing speech recognition acoustic models are inefficient.



Examples


Embodiment 1

[0022] Embodiment 1. This embodiment provides a method for optimizing a speech recognition acoustic model, applicable to, but not limited to, voice search and voice input systems. As shown in Figure 1, the method includes the following steps:

[0023] S11. Recognize the input speech segment by using the speech recognition acoustic model to obtain a recognition result, and obtain the annotation script of the input speech segment.

[0024] In this embodiment, the user continuously inputs voice to perform a voice search operation. The input includes several voice segments, and each voice segment includes voice data representing the speech components and silence data representing the noise (silence) components.

[0025] In this embodiment, the processing of one voice segment is taken as an example; other voice segments are processed in the same way and are not described again. For example, the user speaks the query sentence "how to change the WeChat interface", and the server receives and stores t...

Embodiment 2

[0033] Embodiment 2. This embodiment provides a method for optimizing a speech recognition acoustic model, applicable to, but not limited to, voice search and voice input systems. As shown in Figure 2, the method includes the following steps:

[0034] S21. Recognize the input speech segment by using the speech recognition acoustic model to obtain a recognition result, and obtain an annotation script of the input speech segment.

[0035] The specific description is consistent with S11 and will not be repeated here.

[0036] S22. Compare the recognition result with the annotation script to obtain the wrongly recognized speech segment.

[0037] The specific description is consistent with that of S12 and will not be repeated here.

[0038] S23. Update the training data of the speech recognition acoustic model with the wrongly recognized speech segment.

[0039] In this embodiment, the speech segment acquired in step S22 is further filtered, and the training data of the speech recogn...
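
Steps S22 and S23 can be sketched roughly as follows. This is a minimal Python illustration, not the patent's implementation: the `Segment` fields, the `normalize` helper, and the exact-match comparison are assumptions made for the example, and the additional filtering mentioned in [0039] is omitted because its criteria are not given in this excerpt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One input speech segment (all field names are hypothetical)."""
    audio_id: str     # reference to the stored audio
    recognized: str   # recognition result from the acoustic model
    annotation: str   # annotation script (reference transcript)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def find_misrecognized(segments):
    """S22: keep segments whose recognition result differs from the annotation."""
    return [s for s in segments
            if normalize(s.recognized) != normalize(s.annotation)]


def update_training_data(training_data, misrecognized):
    """S23: add each misrecognized segment together with its annotation script.

    Returns a new list of (audio_id, annotation) pairs; the input list is
    not mutated, and segments already present are not added twice.
    """
    known = {audio_id for audio_id, _ in training_data}
    updated = list(training_data)
    for seg in misrecognized:
        if seg.audio_id not in known:
            updated.append((seg.audio_id, seg.annotation))
            known.add(seg.audio_id)
    return updated


reference = "how to change the WeChat interface"
segments = [
    Segment("seg-001", "how to change the wechat interface", reference),
    Segment("seg-002", "how to chain the wechat interface", reference),
]
wrong = find_misrecognized(segments)
training_data = update_training_data([("seg-000", "hello")], wrong)
```

Only `seg-002` is selected here, because the first segment differs from its annotation in letter case alone. A real system would likely compare at the word or phone level rather than by whole-string equality.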

Embodiment 3

[0051] Embodiment 3. This embodiment provides a system for optimizing a speech recognition acoustic model, applicable to, but not limited to, the field of voice search or voice input. As shown in Figure 4, the system includes: an acquisition unit 31, a comparison unit 32, an update unit 33, and a training unit 34.

[0052] The acquisition unit 31 is configured to use the speech recognition acoustic model to recognize the input speech segment to obtain a recognition result, and to obtain the annotation script of the input speech segment.

[0053] In this embodiment, the user continuously inputs voice to the system for optimizing the speech recognition acoustic model to perform a voice search operation. The input includes several voice segments, and each voice segment includes voice data representing the speech components and silence data representing the noise (silence) components.

[0054] In this embodiment, take the processing of one voice segment as an example, ot...
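
The four units of Embodiment 3 could be composed along the following lines. This is a hedged Python sketch: the class and method names are hypothetical, the recognizer and trainer are injected as plain callables, and `toy_train` is a lookup-table stand-in used only to make the example runnable, not an acoustic model.

```python
from typing import Callable, List, Tuple

Sample = Tuple[str, str]                         # (audio_id, annotation script)
Recognizer = Callable[[str], str]                # audio_id -> recognition result
Trainer = Callable[[List[Sample]], Recognizer]   # training data -> new model


class AcquisitionUnit:
    """Unit 31: recognize each segment and pair the result with its annotation."""
    def __init__(self, recognize: Recognizer):
        self.recognize = recognize

    def run(self, inputs: List[Sample]) -> List[Tuple[str, str, str]]:
        return [(aid, self.recognize(aid), script) for aid, script in inputs]


class ComparisonUnit:
    """Unit 32: keep segments whose result differs from the annotation script."""
    def run(self, results: List[Tuple[str, str, str]]) -> List[Sample]:
        return [(aid, script) for aid, result, script in results if result != script]


class UpdateUnit:
    """Unit 33: add misrecognized segments and their scripts to the training data."""
    def run(self, training_data: List[Sample], wrong: List[Sample]) -> List[Sample]:
        updated = list(training_data)
        for sample in wrong:
            if sample not in updated:
                updated.append(sample)
        return updated


class TrainingUnit:
    """Unit 34: retrain the model on the updated training data."""
    def __init__(self, train: Trainer):
        self.train = train

    def run(self, training_data: List[Sample]) -> Recognizer:
        return self.train(training_data)


def optimize_once(recognize, train, training_data, inputs):
    """Run the four units in order and return (new_model, new_training_data)."""
    results = AcquisitionUnit(recognize).run(inputs)
    wrong = ComparisonUnit().run(results)
    updated = UpdateUnit().run(training_data, wrong)
    return TrainingUnit(train).run(updated), updated


def toy_train(data: List[Sample]) -> Recognizer:
    """Toy stand-in for training: memorize the (audio_id, script) pairs."""
    table = dict(data)
    return lambda audio_id: table.get(audio_id, "")


training_data = [("seg-1", "weather today")]
model = toy_train(training_data)
inputs = [("seg-1", "weather today"),
          ("seg-2", "how to change the WeChat interface")]
new_model, new_data = optimize_once(model, toy_train, training_data, inputs)
```

Injecting the recognizer and trainer keeps the data-selection logic independent of any particular speech toolkit, so the same loop can be repeated as new misrecognized segments arrive.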


Abstract

The invention provides a method and a system for optimizing a voice recognition acoustic model, and belongs to the field of computer technology. The method and the system address the low efficiency of existing methods for optimizing a voice recognition acoustic model. The method comprises: (1) recognizing an input voice segment with the voice recognition acoustic model to obtain a recognition result, and obtaining a labeled script of the input voice segment; (2) comparing the recognition result with the labeled script to obtain the misrecognized voice segments; (3) updating the training data of the voice recognition acoustic model with the misrecognized voice segments and their labeled scripts; and (4) retraining the voice recognition acoustic model with the updated training data. The system comprises an acquisition unit, a comparison unit, an updating unit and a training unit. By optimizing the training data of the voice recognition acoustic model and improving its quality, the method and the system improve the efficiency of optimizing the model.

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a method for optimizing an acoustic model for speech recognition and a corresponding system.

Background technique

[0002] Speech recognition is an interdisciplinary subject. Speech recognition is gradually becoming the key technology of man-machine interface in information technology. The combination of speech recognition technology and speech synthesis technology enables people to get rid of the keyboard and operate through voice commands. The application of voice technology has become a competitive new high-tech industry. Several basic methods of speech recognition include: methods based on vocal tract acoustics and speech knowledge, template matching methods, and methods using artificial neural networks.

[0003] In the voice search or voice input system, users continuously input voice data, and the recognition results obtained by voice recognition sometimes have deviation...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/06, G10L15/05
Inventor: 苏丹
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD