Method and system for reconstructing vocoder residual spectrum amplitude parameters

A vocoder parameter technology, applied to the reconstruction of vocoder residual spectrum amplitude parameters, that addresses the low naturalness of synthesized speech and improves naturalness while ensuring intelligibility.

Active Publication Date: 2021-11-16
南京梧桐微电子科技有限公司
8 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0003] The technical problem to be solved by the present invention is to overcome the defects of the prior art by providing a method and system for reconstructing vocoder residual spectrum amplitude parameters, thereby solving the problem of low naturalness of synthesized speech in the prior art.

Examples

Embodiment

[0044] Embodiment: The speech training set is sampled at 8 kHz with 16-bit quantization. The residual spectrum amplitude parameters, of dimension 10, are extracted following the method used in the MELP vocoder and collected into the residual spectrum amplitude set.

[0045] (12) Use vector clustering to train a residual spectrum amplitude codebook of size 1024 from the above residual spectrum amplitude set;

[0046] Embodiment: The LBG algorithm is applied to the residual spectrum amplitude set generated in step (11) to produce a residual spectrum amplitude codebook C of size 1024.
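
The codebook training in steps (11) and (12) can be sketched in Python as follows. This is a minimal illustration of the generic LBG (split and refine) procedure, not the patent's own implementation: the function names, the additive split offset `epsilon`, the stopping tolerance, and the random toy data are all assumptions made for the example. The patent uses 10-dimensional vectors and a codebook of size 1024; the demo uses a codebook of size 4 so that it runs quickly.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(vector, codebook):
    """Index of the codeword closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def mean_distortion(training_set, codebook):
    """Average squared error when each vector is mapped to its nearest codeword."""
    total = sum(squared_distance(v, codebook[nearest_index(v, codebook)])
                for v in training_set)
    return total / len(training_set)


def lbg_train(training_set, codebook_size, epsilon=0.01, max_iters=20, tol=1e-6):
    """Grow a codebook by doubling (splitting) and refine it with Lloyd iterations.

    `epsilon`, `max_iters` and `tol` are illustrative defaults (assumptions).
    """
    if codebook_size < 1 or codebook_size & (codebook_size - 1):
        raise ValueError("codebook_size must be a power of two")
    codebook = [centroid(training_set)]
    while len(codebook) < codebook_size:
        # Split every codeword into two slightly offset copies.
        codebook = [[c + sign * epsilon for c in cw]
                    for cw in codebook for sign in (1, -1)]
        previous = float("inf")
        for _ in range(max_iters):
            # Assign each training vector to its nearest codeword.
            cells = [[] for _ in codebook]
            distortion = 0.0
            for v in training_set:
                i = nearest_index(v, codebook)
                cells[i].append(v)
                distortion += squared_distance(v, codebook[i])
            # Move each codeword to the centroid of its cell; an empty cell
            # keeps its old codeword.
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
            if previous - distortion <= tol * max(distortion, 1e-12):
                break
            previous = distortion
    return codebook


# Toy stand-in for the residual spectrum amplitude set (dimension 10).
random.seed(0)
training_set = ([[random.gauss(0.0, 1.0) for _ in range(10)] for _ in range(50)]
                + [[random.gauss(5.0, 1.0) for _ in range(10)] for _ in range(50)])
codebook = lbg_train(training_set, codebook_size=4)
print(len(codebook), "codewords of dimension", len(codebook[0]))
```

Doubling the codebook at each split reaches 1024 codewords after ten splits, which is why the sketch restricts `codebook_size` to powers of two.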

[0047] (13) Extract parameters such as line spectrum frequency, bandpass voicing degree, pitch period, energy, and residual spectrum amplitude frame by frame from the training speech set;

[0048] Embodiment: Refer to the parameter extraction method in the MELP vocoder to extract parameters such as line spectrum frequency, bandpass voicing degree, pitch period, energy...

Abstract

The invention discloses a method and system for reconstructing the residual spectrum amplitude parameters of a vocoder. The line spectrum frequency, bandpass voicing degree, pitch period, and energy parameters input at the vocoder decoding end are obtained, and a preliminary synthesized speech is produced from them. The preliminary synthesized speech is converted into an image matrix, which is input to a trained deep convolutional network to obtain a quantization index. A pre-generated residual spectrum amplitude parameter codebook is searched according to the quantization index to obtain the reconstructed residual spectrum amplitude parameters. The reconstructed residual spectrum amplitude parameters are then combined with the obtained line spectrum frequency, bandpass voicing degree, pitch period, and energy parameters to synthesize the final speech. Advantages: the residual spectrum amplitude parameters are not encoded and transmitted. When the vocoder is working, the deep convolutional network generated by training reconstructs the residual spectrum amplitude parameters, which further improves the naturalness of the synthesized speech while ensuring speech intelligibility.
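
The codebook search at the decoder can be illustrated with a short Python sketch. It assumes, hypothetically, that the network emits one score per codeword and that the index is taken as the position of the highest score; the abstract only states that the network yields an index, so `index_from_scores` is an assumption, as are the function names and the three-dimensional toy codebook (the embodiment uses dimension 10 and 1024 codewords).

```python
def index_from_scores(scores):
    """Pick the position of the largest score (assumed network output format)."""
    return max(range(len(scores)), key=scores.__getitem__)


def reconstruct_residual_amplitudes(index, codebook):
    """Look up the codeword selected by `index` in the pre-generated codebook."""
    if not 0 <= index < len(codebook):
        raise IndexError("index outside the codebook")
    return list(codebook[index])


# Toy codebook: 4 codewords of dimension 3, for illustration only.
toy_codebook = [
    [1.0, 1.0, 1.0],
    [1.2, 0.9, 0.8],
    [0.7, 1.1, 1.3],
    [0.9, 1.4, 0.6],
]
toy_scores = [0.1, 0.7, 0.15, 0.05]  # hypothetical network output
reconstructed = reconstruct_residual_amplitudes(
    index_from_scores(toy_scores), toy_codebook)
print(reconstructed)
```

Because only a table lookup is needed once the index is known, no bits have to be spent on transmitting the residual spectrum amplitudes themselves.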

Description

Technical field

[0001] The invention relates to a method and system for reconstructing a vocoder residual spectrum amplitude parameter, and belongs to the technical field of speech coding.

Background technique

[0002] Speech coding is widely used in communication systems, recording and playback systems, and consumer products with voice functions. In recent years, the International Telecommunication Union (ITU), 3GPP, some regional organizations and countries have successively formulated a series of speech compression coding standards. One of the important development trends is that the coding rate is getting lower and the synthetic voice quality is getting higher and higher. At present, low-rate high-quality speech compression coding algorithms are still in urgent demand in the fields of wireless communication, secure communication, underwater acoustic communication, satellite communication, etc., and have been extensively studied. Among various low-rate speech coding mode...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L19/087, G10L13/04, G10L25/30
CPC: G10L13/04, G10L19/087, G10L25/30
Inventor: 颜夕宏, 张生平, 王主磊, 吴子晧, 颜明
Owner: 南京梧桐微电子科技有限公司