
Voice endpoint detection method based on real-time decoding

A voice endpoint detection technology, applicable to voice analysis, voice recognition, voice synthesis, and related fields. It addresses the low real-time performance of existing endpoint detection and its inability to perform targeted detection of the voice that users care about, and it achieves short response time, good detection results, and high real-time performance.

Active Publication Date: 2013-03-20
ANHUI IFLYREC TECH CO LTD
3 Cites, 35 Cited by

AI Technical Summary

Problems solved by technology

[0014] The problem solved by the present invention: it overcomes the deficiencies of the prior art and provides a voice endpoint detection method based on real-time decoding. The method addresses two shortcomings of existing endpoint detection technology when the voice recognition text is already determined: its real-time performance is low, and it cannot perform targeted detection of the voice that users care about.



Examples


Embodiment Construction

[0059] The present invention is a new text-dependent endpoint detection method. Viterbi decoding is used here as an example decoding method (the present invention is not limited to Viterbi decoding). The flow chart of the present invention is shown in Figure 1.
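Since Viterbi decoding is named as the example decoder, a minimal sketch of the algorithm may help. It is not the patent's decoder: the two-state model (`sil`, `speech`), the discretized observations (`low`, `high`), and all probabilities below are made-up assumptions chosen only to keep the example small.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence for a list of observations."""
    # best[t][s]: best log-probability of any path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p] + log_trans[p][s]) for p in states),
                key=lambda item: item[1],
            )
            scores[s] = score + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state to recover the full path.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {
        k: to_log(v) if isinstance(v, dict) else math.log(v)
        for k, v in table.items()
    }


def demo_path():
    """Decode a toy observation sequence with a hypothetical two-state model."""
    states = ["sil", "speech"]
    log_start = to_log({"sil": 0.8, "speech": 0.2})
    log_trans = to_log({
        "sil": {"sil": 0.7, "speech": 0.3},
        "speech": {"sil": 0.3, "speech": 0.7},
    })
    log_emit = to_log({
        "sil": {"low": 0.9, "high": 0.1},
        "speech": {"low": 0.2, "high": 0.8},
    })
    observations = ["low", "low", "high", "high", "low"]
    return viterbi(observations, states, log_start, log_trans, log_emit)
```

Calling `demo_path()` yields one decoded state per observation. In the method described here the states would be the modeling units of the decoding network (for example phonemes) rather than two coarse classes.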

[0060] Step 1: Enter text related to speech recognition and parse the text:

[0061] The input text is the reading content predetermined by the user, and it is also one of the bases for constructing the decoding network. This step mainly completes two tasks. First, the encoding format of the text is converted uniformly, for example to UTF-8; the advantage is that only one set of text-parsing code needs to be implemented. Second, the text is parsed at a chosen granularity (such as words, syllables, or phonemes; phonemes are generally the better modeling unit, and the following description uses phonemes as the example), generating a tree structure of analysis results, which i...
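The two tasks of this step can be sketched as follows. This is only an illustration under stated assumptions: the function names, the dictionary-based tree layout, and the toy pronunciation lexicon `TOY_LEXICON` are hypothetical and do not come from the patent, which does not specify them in the text available here.

```python
# Hypothetical pronunciation lexicon: word -> phoneme list (illustrative only).
TOY_LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Task 1: convert text from a known source encoding to UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def parse_text(utf8_bytes: bytes, lexicon: dict) -> dict:
    """Task 2: parse UTF-8 text into a small tree: text -> words -> phonemes."""
    text = utf8_bytes.decode("utf-8")
    tree = {"text": text, "words": []}
    for word in text.lower().split():
        if word not in lexicon:
            raise KeyError(f"no pronunciation for {word!r}")
        tree["words"].append({"word": word, "phonemes": list(lexicon[word])})
    return tree


def phoneme_sequence(tree: dict) -> list:
    """Flatten the tree into the phoneme sequence the reader is expected to say."""
    return [p for node in tree["words"] for p in node["phonemes"]]
```

For example, `parse_text(to_utf8("hello world".encode("utf-16"), "utf-16"), TOY_LEXICON)` produces a tree with two word nodes, and `phoneme_sequence` flattens it into the eight phonemes in reading order. Converting to a single encoding first means `parse_text` never has to handle any encoding other than UTF-8.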


Abstract

A voice endpoint detection method based on real-time decoding includes the following steps: inputting text related to voice recognition, parsing the text, and constructing a decoding network from the parsing results; inputting voice and extracting its acoustic features; decoding the acoustic features on the constructed decoding network to obtain a decoded voice unit sequence; and performing voice endpoint judgment on the decoded voice unit sequence to determine whether a voice endpoint exists, where voice endpoints are divided into voice starting points and voice ending points. If an endpoint is found, the endpoint information is fed back to the external application system; if not, the method continues from step 2. The voice starting point judgment in the third step is optional: if the external application system does not care about the voice starting point, it is not judged. The method resolves the problem that traditional endpoint detection technology has low real-time performance under the condition that voice recognition texts are not determined, and therefore cannot perform targeted detection of the voice that users care about.
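The endpoint judgment on the decoded unit sequence can be sketched as below. The abstract does not state the decision rule, so the rule used here is an assumption: the start point is the first non-silence frame, and the end point is declared once every expected unit has been seen in order and is followed by `trailing_sil_frames` silence frames. The names `detect_endpoints`, `Endpoints`, and the `sil` label are hypothetical, and the input is assumed to be one decoded unit label per frame.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

SIL = "sil"  # assumed label for silence in the decoded unit sequence


@dataclass
class Endpoints:
    start_frame: Optional[int] = None
    end_frame: Optional[int] = None


def detect_endpoints(
    frame_units: Iterable[str],
    expected_units: List[str],
    trailing_sil_frames: int = 3,
    want_start: bool = True,
) -> Endpoints:
    """Scan per-frame decoded units and report start and end points.

    Simplification: consecutive identical units in expected_units are not
    distinguished from one long unit.
    """
    endpoints = Endpoints()
    matched = 0  # number of expected units seen so far, in order
    sil_run = 0  # length of the current run of silence frames
    for t, unit in enumerate(frame_units):
        if unit == SIL:
            sil_run += 1
        else:
            sil_run = 0
            # Start-point judgment is optional, as in the abstract.
            if want_start and endpoints.start_frame is None:
                endpoints.start_frame = t
            if matched < len(expected_units) and unit == expected_units[matched]:
                matched += 1
        # End point: whole text covered, then enough trailing silence.
        if matched == len(expected_units) and sil_run >= trailing_sil_frames:
            endpoints.end_frame = t - sil_run + 1  # first frame of the silence run
            break  # report to the caller instead of decoding further
    return endpoints
```

Because the loop stops as soon as the end condition holds, the caller learns the end point after only `trailing_sil_frames` frames of silence; if the text has not been fully read, the loop keeps consuming frames, which mirrors the "continue from step 2" branch.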

Description

Technical field

[0001] The invention relates to a method for detecting speech endpoints based on decoding results, and in particular to a method capable of reporting the speech end point promptly.

Background

[0002] Speech endpoint detection determines where speech starts and ends, and excludes silent segments from the speech signal. The accuracy of endpoint detection has a great influence on speech recognition performance. In a speech evaluation system, the content the user records is fixed in advance by the text of the test paper. Reporting the speech end point promptly once the user has finished reading that content, and stopping computation, helps improve system performance and the evaluation result. In the external application system, the quality of endpoint detection directly affects the user experience.

[0003] For example, in the speech learning software, the end point detection is carried o...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L25/87, G10L13/08, G10L15/02
Inventors: 吴玲, 王兵, 赵乾, 潘颂声, 何春江, 朱群
Owner ANHUI IFLYREC TECH CO LTD