Pitch detection method, device and medium based on discrete logarithmic Fourier transform

A technique based on the discrete logarithmic Fourier transform, applied in speech analysis, instruments, and related fields, that addresses problems affecting correlation and pitch detection accuracy

Inactive Publication Date: 2011-12-21
CANON KK

AI Technical Summary

Problems solved by technology

In the latter case (frame i+1 in Figure 4), the absence of the last few harmonics affects the correlation between the analyzed wave segment and the template, and thus the accuracy of the detected pitch.

Examples

Example 1

[0035] As described in the Background of the Invention section, in logarithmic frequency space the distance from the pitch to a predetermined harmonic (such as the third harmonic) is constant. To avoid increasing the amount of computation, the frequency search window can be fixed to that distance. However, because the pitch varies greatly, a fixed window does not work, or does not work well, when the pitch is too low or too high: the actual pitch falls outside the frequency window, or some harmonics fall outside the window and affect the correlation value, as shown in Figure 4.
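The constant-distance property stated in [0035] follows from log(k·f0) − log(f0) = log(k), which does not depend on f0. A minimal Python sketch, assuming a base-2 logarithmic axis (octaves); the function name `log_distance_to_harmonic` is hypothetical and not taken from the patent:

```python
import math


def log_distance_to_harmonic(f0_hz: float, k: int) -> float:
    """Distance, in octaves, from the fundamental f0 to its k-th harmonic."""
    return math.log2(k * f0_hz) - math.log2(f0_hz)


# The distance to the third harmonic is the same for low and high pitches.
for f0 in (80.0, 160.0, 320.0):
    print(f0, log_distance_to_harmonic(f0, 3))
```

Each printed distance equals log2(3) up to floating-point rounding, which is why a window of fixed width on the logarithmic axis can in principle span the pitch and the same set of harmonics.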

[0036] So, in theory, if the fixed-size frequency search window is moved up or down appropriately, it can always cover the same number of harmonics (and the fundamental).

[0037] Based on such an idea, in the first embodiment, the inventor proposes to move the window according to the latest valid pitch of the previously detected wave segment.

[0038] In this example, if ...
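The window move described in [0036] and [0037] can be sketched as follows. This is only an illustration under stated assumptions: the window is a (low, high) pair on a log2(Hz) axis, and the lower edge is placed a fixed margin below the latest valid pitch. The names `move_window` and `margin_octaves`, and the margin value, are hypothetical; the patent text available here does not give the exact placement rule.

```python
import math
from typing import Optional, Tuple

Window = Tuple[float, float]  # (low, high) edges on a log2(Hz) axis


def move_window(window: Window,
                last_valid_pitch_hz: Optional[float],
                margin_octaves: float = 0.5) -> Window:
    """Shift a fixed-size window so it starts just below the latest valid pitch.

    The width is preserved, so the computation per wave segment is unchanged.
    The margin is an assumed parameter, not a value from the source.
    """
    if last_valid_pitch_hz is None:
        return window  # no valid pitch yet: keep the current window
    low, high = window
    width = high - low
    new_low = math.log2(last_valid_pitch_hz) - margin_octaves
    return (new_low, new_low + width)


# Example: a 2.5-octave window starting at 100 Hz, re-positioned for a 200 Hz pitch.
start = (math.log2(100.0), math.log2(100.0) + 2.5)
moved = move_window(start, 200.0)
```

Keeping the width fixed is the point of the design: the window follows the speaker's pitch range without enlarging the spectrum that has to be computed.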

Example 2

[0045] In the second embodiment, the method may further include a step of calculating a score (S500), in which a score is calculated based on the correlation result obtained in the step of correlating the spectrum with the template (S200) during pitch detection of the previous wave segment. Correspondingly, the frequency window modifier may further include a score calculator 310 for performing step S500.

[0046] In this case, the move operation is performed only when the score is within a certain range, thereby avoiding unnecessary move operations that increase the amount of computation. For example, when the present invention is applied to human speech recognition, a score value that is too low likely means that the current wave is not human speech, and therefore no action is required. That is, for example, the move operation is performed only if the score is above a first threshold.

[0047] The score reflects the confidence value of the detected pi...
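The score gating in [0046] can be sketched as a guard in front of the move operation. This is a minimal illustration, assuming a score normalized to the range 0 to 1; the threshold value, the margin, and the name `move_if_confident` are hypothetical and not taken from the patent.

```python
import math
from typing import Tuple

Window = Tuple[float, float]  # (low, high) edges on a log2(Hz) axis


def move_if_confident(window: Window,
                      last_pitch_hz: float,
                      score: float,
                      first_threshold: float = 0.3,
                      margin_octaves: float = 0.5) -> Window:
    """Move the window only when the previous segment's score exceeds the threshold.

    A low score suggests the previous segment was not speech, so the window
    is left untouched and no extra work is done.
    """
    if score <= first_threshold:
        return window
    low, high = window
    new_low = math.log2(last_pitch_hz) - margin_octaves
    return (new_low, new_low + (high - low))


window = (6.0, 8.5)
unchanged = move_if_confident(window, 200.0, score=0.1)
shifted = move_if_confident(window, 200.0, score=0.9)
```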

Example 3

[0054] In the third embodiment, the moving means 314 (correspondingly, the step of moving the frequency window) in the second embodiment may be replaced by the expanding means 316 (correspondingly, the step of expanding the frequency window).

[0055] In this case, the expansion operation is performed only when the score is within a certain range, thereby avoiding unnecessary expansion operations that increase the amount of computation. In particular, if the score is high enough, there is no need to expand the frequency window, and a standard-sized frequency window can be used. For example, if the score is lower than the second threshold, indicating that the confidence value of the latest detected pitch is relatively low, the frequency window may be expanded to cover more possible pitch values.

[0056] As a criterion for performing the step of expanding the frequency window (S600), the second threshold of the score depends on a specific method of calculating the score. For example, ...
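The expansion rule in [0055] can be sketched in the same style. This is an illustration only: the symmetric widening, the threshold value, and the names `expand_if_uncertain` and `extra_octaves` are assumptions, since the source states that the second threshold depends on how the score is calculated.

```python
from typing import Tuple

Window = Tuple[float, float]  # (low, high) edges on a log2(Hz) axis


def expand_if_uncertain(window: Window,
                        score: float,
                        second_threshold: float = 0.6,
                        extra_octaves: float = 0.5) -> Window:
    """Widen the window on both sides when the latest pitch has low confidence.

    With a high enough score the standard-sized window is kept, which avoids
    the extra computation of a larger spectrum.
    """
    if score >= second_threshold:
        return window
    low, high = window
    return (low - extra_octaves, high + extra_octaves)


standard = (6.0, 8.5)
kept = expand_if_uncertain(standard, score=0.9)
widened = expand_if_uncertain(standard, score=0.2)
```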



Abstract

The present invention relates to a pitch detection method based on the discrete logarithmic Fourier transform (DLFT), and to a corresponding device and medium. The method includes the following steps: calculating the DLFT spectrum of an input wave segment in a frequency window; performing a correlation between the spectrum and a standard template; and using the correlation result to estimate the pitch. The method is characterized in that it further includes a step of modifying the frequency window before the DLFT spectrum is calculated.
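The three steps named in the abstract can be sketched in plain Python. This is a rough illustration under several assumptions, not the patented implementation: the DLFT is taken to be the Fourier transform evaluated at logarithmically spaced frequencies, a Hann taper is applied to the frame, and a simple comb of equally weighted harmonics stands in for the standard template, which the text here does not specify. All function names and parameter values are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def dlft_magnitude(frame: Sequence[float], sample_rate: float,
                   f_low: float, f_high: float,
                   bins_per_octave: int = 48) -> Tuple[List[float], List[float]]:
    """Magnitude spectrum at log-spaced frequencies inside [f_low, f_high]."""
    n = len(frame)
    # Hann taper (an assumption) to limit leakage between harmonics.
    tapered = [x * (0.5 - 0.5 * math.cos(2 * math.pi * t / (n - 1)))
               for t, x in enumerate(frame)]
    n_bins = int(bins_per_octave * math.log2(f_high / f_low)) + 1
    freqs = [f_low * 2 ** (i / bins_per_octave) for i in range(n_bins)]
    mags = []
    for f in freqs:
        w = -2j * math.pi * f / sample_rate
        mags.append(abs(sum(x * cmath.exp(w * t) for t, x in enumerate(tapered))))
    return freqs, mags


def harmonic_template(n_harmonics: int, bins_per_octave: int = 48) -> List[int]:
    """Bin offsets of harmonics 1..n relative to the fundamental (constant in log space)."""
    return [round(bins_per_octave * math.log2(k)) for k in range(1, n_harmonics + 1)]


def estimate_pitch(freqs: Sequence[float], mags: Sequence[float],
                   offsets: Sequence[int]) -> float:
    """Slide the comb template along the spectrum and return the best-matching pitch."""
    n_candidates = len(mags) - max(offsets)  # template must fit inside the window
    best = max(range(n_candidates),
               key=lambda i: sum(mags[i + o] for o in offsets))
    return freqs[best]


# Synthetic test frame: 150 Hz fundamental with three harmonics, 8 kHz sampling.
sample_rate = 8000.0
frame = [sum(math.sin(2 * math.pi * k * 150.0 * t / sample_rate) / k for k in (1, 2, 3))
         for t in range(512)]
freqs, mags = dlft_magnitude(frame, sample_rate, f_low=60.0, f_high=1000.0)
pitch = estimate_pitch(freqs, mags, harmonic_template(3))
print(pitch)
```

Because the template must fit inside the window, the highest pitch candidate is the upper window edge divided by the highest harmonic number in the template. That is the limitation the window modification step is meant to address.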

Description

Technical field

[0001] The present application generally relates to pitch detection in the analysis and modeling of speech intonation, and more particularly to pitch detection based on the discrete logarithmic Fourier transform (DLFT).

Background art

[0002] For the analysis and modeling of speech intonation, reliable pitch detection is critical. Pitch has also been found to be closely related to prosodic features such as word stress, tone, and sentence intonation, which provide important perceptual cues for human speech communication. Various pitch detection algorithms (PDAs) have been developed in the past. One of them is the DLFT method, described by Wang, C. and S. Seneff in "Robust Pitch Tracking for Prosodic Modeling in Telephone Speech", Proc. ICASSP'00, Istanbul, Turkey, pp. 1143-1146. The DLFT method is based on frequency-domain analysis of speech signals and is designed to be particularl...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L11/04, G10L11/00, G10L25/06
Inventor: 李云飞
Owner: CANON KK