
Speech pitch estimation method and device

A pitch period estimation technology, applied in the field of speech coding, addresses the problems that speech waveforms are susceptible to formants and noise and that the pitch period has a large variation range, both of which make pitch estimation difficult. It overcomes frequency-doubling and frequency-halving errors, reduces computational complexity, and improves noise robustness.

Active Publication Date: 2016-05-11
广东广晟研究开发院有限公司

AI Technical Summary

Problems solved by technology

[0003] (1) The speech signal changes in a very complicated way: the glottal excitation waveform is not a perfectly periodic pulse train, and the period of the speech waveform is time-varying.
[0004] (2) At the beginning and end of speech, there is no periodicity of the kind produced by vocal fold vibration. For some transitional sounds, such as those between unvoiced and voiced segments, it is difficult to determine whether the signal is periodic or non-periodic, so the pitch period cannot be estimated.
[0005] (3) It is difficult to remove the influence of the vocal tract from the speech signal and directly extract information that relates only to the vibration of the vocal cords.
[0006] (4) The difficulty of defining the exact start and end of each pitch period in a voiced segment limits the reliable measurement of pitch, not only because the speech signal itself is quasi-periodic (that is, the pitch changes), but also because the waveform is easily affected by formants, noise, and so on.
[0008] (6) The large range over which the pitch period varies also makes accurate pitch detection difficult.
However, among common time-domain methods, the autocorrelation function (ACF) method is prone to frequency-doubling ("multiple frequency") and frequency-halving ("half frequency") errors, and the average magnitude difference function (AMDF) method cannot effectively track rapid changes in speech frequency.
Frequency-domain methods generally use the cepstrum. The logarithm operation it introduces greatly increases the amount of computation, and the method is easily affected by noise.



Examples


Embodiment Construction

[0046] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0047] Figure 1 shows a flow chart of a speech pitch period estimation method 100 according to an embodiment of the present invention. As shown in Figure 1, the speech pitch estimation method 100 includes:

[0048] In step 110, the speech signal is preprocessed by removing the DC component, perceptual weighting and signal downsampling.
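The three preprocessing operations named in step 110 can be sketched as follows. This is only an illustrative sketch, not the patent's implementation: the function names, the use of a plain FIR filter as a stand-in for perceptual weighting, the default identity taps, and the block-averaging downsampler are all assumptions, since the text here does not specify the weighting filter or the downsampling factor.

```python
def remove_dc(frame):
    """Subtract the frame mean so the signal has no DC component."""
    mean = sum(frame) / len(frame)
    return [s - mean for s in frame]


def fir_filter(frame, taps):
    """Apply y[n] = sum_k taps[k] * x[n-k] with zero initial state.

    Used here as a hypothetical stand-in for perceptual weighting;
    the actual weighting filter is not given in this text.
    """
    out = []
    for n in range(len(frame)):
        acc = 0.0
        for k, b in enumerate(taps):
            if n - k >= 0:
                acc += b * frame[n - k]
        out.append(acc)
    return out


def downsample(frame, factor):
    """Average each block of `factor` samples and keep one value per block.

    The averaging acts as a crude low-pass filter before decimation.
    """
    n_blocks = len(frame) // factor
    return [
        sum(frame[i * factor:(i + 1) * factor]) / factor
        for i in range(n_blocks)
    ]


def preprocess(frame, weighting_taps=(1.0,), factor=2):
    """DC removal, then weighting, then downsampling (assumed order)."""
    x = remove_dc(frame)
    x = fir_filter(x, weighting_taps)
    return downsample(x, factor)
```

Downsampling before the correlation search shortens the frame and the lag range by the same factor, which is one plausible way the stated reduction in computational complexity could arise.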

[0049] In step 120, the normalized autocorrelation function value of the preprocessed speech signal is calculated. The present invention uses the following normalized autocorrelation function:

[0050] ρ ( τ ...
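The formula in paragraph [0050] is truncated in this copy, so the patent's exact definition cannot be reproduced here. As a rough illustration of what a normalized autocorrelation function computes, the sketch below uses a widely used textbook form, the cross-product of the frame and its lagged copy divided by the geometric mean of their energies. The function name and this particular normalization are assumptions and may differ from the patent's.

```python
import math


def normalized_autocorrelation(x, lag):
    """Textbook normalized autocorrelation of x at the given lag.

    rho(lag) = sum(x[n] * x[n+lag]) / sqrt(sum(x[n]^2) * sum(x[n+lag]^2))
    The result lies in [-1, 1]; values near 1 mean the frame closely
    repeats itself after `lag` samples.
    """
    n = len(x) - lag
    if lag < 0 or n <= 0:
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    num = sum(x[i] * x[i + lag] for i in range(n))
    energy_head = sum(x[i] ** 2 for i in range(n))
    energy_tail = sum(x[i + lag] ** 2 for i in range(n))
    denom = math.sqrt(energy_head * energy_tail)
    # A silent frame has zero energy; report no correlation in that case.
    return num / denom if denom > 0 else 0.0
```

For a pure sinusoid with a period of 20 samples, this function is close to 1 at lag 20 and close to -1 at lag 10. Normalizing by energy makes the value insensitive to the overall signal level.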


Abstract

The invention relates to a speech pitch period estimation method and device. The device comprises a signal preprocessing unit, a normalized autocorrelation function computing unit and a pitch period postprocessing unit. The method includes the following steps: first, preprocessing the speech signal by DC component removal, perceptual weighting and signal downsampling; second, computing the normalized autocorrelation function values of the preprocessed speech signal; third, determining the maximum of the normalized autocorrelation function values within the pitch period search range, and taking the pitch period candidate corresponding to that maximum as the pitch period estimate of the speech signal. The method and device overcome frequency-doubling and frequency-halving errors in pitch period estimation, improve the noise robustness of the estimate, reduce the complexity of the algorithm, and improve the efficiency of the corresponding digital audio/speech coding. They can be applied to the pitch search of various speech coding and decoding algorithms and have a wide range of applications.
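The third step of the abstract, picking the lag whose normalized autocorrelation value is largest within the search range, can be sketched as below. This is a minimal illustration under assumptions: the function name, the dictionary input mapping lag to correlation value, and the tie-breaking rule (prefer the smaller lag) are not taken from the patent.

```python
def estimate_pitch_period(nacf_by_lag, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest NACF value.

    nacf_by_lag maps a candidate lag (in samples) to its normalized
    autocorrelation value. Ties are resolved in favour of the smaller
    lag, which is an assumption made for this sketch.
    """
    candidates = {
        lag: value
        for lag, value in nacf_by_lag.items()
        if min_lag <= lag <= max_lag
    }
    if not candidates:
        raise ValueError("no candidate lag inside the search range")
    # sorted() puts smaller lags first, and max() keeps the first maximum.
    return max(sorted(candidates), key=candidates.get)
```

Restricting the search range matters: a waveform with period T also correlates strongly at 2T, so a range that excludes the true period can lock onto a multiple of it, which is the kind of halving or doubling error the abstract refers to.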

Description

Technical field

[0001] The present invention relates to speech coding technology, more specifically, to a speech pitch period estimation method and device.

Background technique

[0002] The pitch period refers to the period in which the vocal cords vibrate when a person makes a sound. Pitch period is an important issue in speech coding, and its accuracy will directly affect the coding quality and efficiency of the speech coder. Accurate pitch periodicity analysis can effectively remove redundancy in the speech coding process, reduce the number of coding bits, and realize low bit rate high-quality speech coding. However, due to the particularity of speech, the accurate search of the pitch period will face the following difficulties:

[0003] (1) The change of the speech signal is very complicated, the glottal excitation waveform is not a complete periodic pulse train, and the period of the speech waveform is time-varying.

[0004] (2) At the beginning and end of speech, th...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L19/00, G10L25/48
Inventor: 闫建新, 张勇
Owner: 广东广晟研究开发院有限公司