Perceptual domain audio encoding method and system based on a Gaussian mixture model

The invention applies a Gaussian mixture model to audio coding in the perceptual domain. It addresses low coding efficiency and the fact that listeners cannot perceive all signal components, and it reduces the coding bit rate.

Active Publication Date: 2014-04-30
WUHAN UNIV

AI Technical Summary

Problems solved by technology

The human auditory system has limitations and cannot perceive all signal components in the sound it receives.
After a traditional perceptual domain audio coding method transforms the audio signal into the perceptual domain, a large number of redundant pulse signals are generated, so coding efficiency is low.



Examples


Embodiment Construction

[0038] The technical solutions of the present invention will be further described below in conjunction with the drawings and specific embodiments.

[0039] Referring to Figure 1, the Gaussian mixture model-based perceptual domain audio coding method provided by the present invention can be carried out automatically in computer software and comprises the following steps:

[0040] In step 1, an auditory filter is used to filter the input audio signal to obtain a sub-band signal.

[0041] The input audio signal is sampled at 16 kHz and divided into 65 sub-band channels. The center frequency of the first sub-band filter is 26.03 Hz and that of the 65th sub-band filter is 7743 Hz; each sub-band filter is an FIR filter. In this specific implementation, a gammatone filter bank filters the input audio signal, yielding 65 sub-band signals.
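As an illustration of the filter bank in paragraph [0041], the sketch below builds 65 center frequencies between 26.03 Hz and 7743 Hz and a truncated (FIR) gammatone impulse response for one channel. The text gives only the two end frequencies, so the uniform spacing on the Glasberg-Moore ERB-number scale, the filter order of 4, the bandwidth factor 1.019 and the 25 ms tap length are assumptions taken from common gammatone practice, not from the patent. All function names are hypothetical.

```python
import math

def hz_to_erb_number(f_hz):
    """Glasberg-Moore ERB-number scale (assumed spacing rule)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

def erb_number_to_hz(e):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def center_frequencies(f_lo=26.03, f_hi=7743.0, n=65):
    """n center frequencies spaced uniformly on the ERB-number scale."""
    e_lo, e_hi = hz_to_erb_number(f_lo), hz_to_erb_number(f_hi)
    return [erb_number_to_hz(e_lo + (e_hi - e_lo) * i / (n - 1)) for i in range(n)]

def gammatone_fir(fc, fs=16000, order=4, duration=0.025):
    """Gammatone impulse response truncated to `duration` seconds (FIR taps)."""
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)   # equivalent rectangular bandwidth in Hz
    b = 1.019 * erb                            # assumed bandwidth factor
    n_taps = int(round(duration * fs))
    taps = []
    for n in range(n_taps):
        t = n / fs
        taps.append(t ** (order - 1)
                    * math.exp(-2.0 * math.pi * b * t)
                    * math.cos(2.0 * math.pi * fc * t))
    peak = max(abs(v) for v in taps)
    return [v / peak for v in taps]            # normalize peak magnitude to 1

def fir_filter(x, taps):
    """Direct-form FIR convolution, truncated to the input length."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(min(n + 1, len(taps))):
            acc += taps[k] * x[n - k]
        y.append(acc)
    return y

cfs = center_frequencies()
bank = [gammatone_fir(fc) for fc in cfs]       # one FIR filter per sub-band
```

Filtering the input once with each entry of `bank` via `fir_filter` would give the 65 sub-band signals; a production implementation would normally use an FFT-based convolution instead of the direct loop.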

[0042] Step 2, extract the Hilbert envelope of the sub-band signal, and perform smo...
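The envelope extraction in step 2 can be sketched as follows. The Hilbert envelope is the magnitude of the analytic signal, computed here with a plain DFT so the example needs only the standard library. The smoothing filter is not specified in the text available here, so the centered moving average and its width of 5 are placeholders chosen for illustration. Function names are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform; adequate for short frames."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out

def hilbert_envelope(x):
    """Magnitude of the analytic signal of a real sequence x."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0          # keep DC (and Nyquist for even n) unchanged
        elif k < (n + 1) // 2:
            gain = 2.0          # double the positive frequencies
        else:
            gain = 0.0          # zero the negative frequencies
        spectrum[k] *= gain
    return [abs(v) for v in dft(spectrum, inverse=True)]

def moving_average(values, width=5):
    """Centered moving average; the window shrinks at the edges (assumed smoother)."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

# A unit-amplitude cosine with a whole number of cycles has a flat envelope.
N = 64
tone = [math.cos(2.0 * math.pi * 4 * n / N) for n in range(N)]
envelope = hilbert_envelope(tone)
smoothed = moving_average(envelope)
```

For real sub-band signals an FFT routine would replace the naive `dft`, and the smoother would be whatever low-pass filter the full specification defines.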



Abstract

The invention discloses a perceptual domain audio encoding method and system based on a Gaussian mixture model. The method comprises: (1) filtering an input audio signal with an auditory filter to obtain sub-band signals; (2) extracting the Hilbert envelope of each sub-band signal and smoothing it to obtain the sub-band spectral envelope; (3) obtaining an absolute masking threshold for the sub-band spectral envelope from a psychoacoustic model and applying an auditory threshold decision to the envelope according to that threshold; (4) replacing the sub-band spectral envelope with a multiplexing masking model; (5) fitting Gaussian mixture model parameters to the sub-band spectral envelope with the Gauss-Newton algorithm; and (6) quantizing and encoding the fitted Gaussian mixture model parameters. The method and system can be applied to high-quality speech coding at medium and low bit rates and can greatly reduce the coding bit rate.
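To make step (5) concrete, the sketch below fits a sum of Gaussians to a sampled envelope with Gauss-Newton iterations. Several details are assumptions made for illustration and are not stated in the abstract: the model is an unnormalized sum of (amplitude, mean, width) Gaussians over an index axis, the cost is the sum of squared errors, and a step-halving safeguard is added so each iteration cannot increase the error. All names are hypothetical.

```python
import math

def gmm_curve(params, xs):
    """Sum of Gaussians; params = [a1, mu1, s1, a2, mu2, s2, ...]."""
    out = []
    for x in xs:
        total = 0.0
        for k in range(0, len(params), 3):
            a, mu, s = params[k:k + 3]
            total += a * math.exp(-((x - mu) ** 2) / (2.0 * s * s))
        out.append(total)
    return out

def jacobian(params, xs):
    """Partial derivatives w.r.t. each (a, mu, s); one row per sample x."""
    rows = []
    for x in xs:
        row = []
        for k in range(0, len(params), 3):
            a, mu, s = params[k:k + 3]
            d = x - mu
            g = math.exp(-(d * d) / (2.0 * s * s))
            row += [g, a * g * d / (s * s), a * g * d * d / s ** 3]
        rows.append(row)
    return rows

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def sse(params, xs, ys):
    """Sum of squared errors between the data and the model curve."""
    return sum((y - f) ** 2 for y, f in zip(ys, gmm_curve(params, xs)))

def gauss_newton(xs, ys, init, iters=50):
    """Gauss-Newton: solve (J^T J) delta = J^T r, then update the parameters."""
    p = list(init)
    n = len(p)
    for _ in range(iters):
        base = sse(p, xs, ys)
        if base < 1e-20:
            break
        r = [y - f for y, f in zip(ys, gmm_curve(p, xs))]
        J = jacobian(p, xs)
        JtJ = [[sum(row[i] * row[j] for row in J) for j in range(n)] for i in range(n)]
        Jtr = [sum(row[i] * ri for row, ri in zip(J, r)) for i in range(n)]
        delta = solve(JtJ, Jtr)
        t = 1.0
        while t > 1e-6:                      # step halving (added safeguard)
            trial = [pi + t * di for pi, di in zip(p, delta)]
            if sse(trial, xs, ys) < base:
                p = trial
                break
            t *= 0.5
        else:
            break                            # no improving step was found
    return p

# Synthetic two-component envelope over 65 sub-band indices (made-up values).
xs = list(range(65))
true_params = [1.0, 20.0, 4.0, 0.6, 45.0, 5.0]
ys = gmm_curve(true_params, xs)
fit = gauss_newton(xs, ys, [0.9, 21.0, 4.5, 0.55, 44.0, 4.5])
```

Only the fitted triples in `fit` would then need to be quantized and encoded in step (6), rather than every envelope sample. Gauss-Newton is sensitive to the starting point, so a real encoder would need a principled initialization, for example from envelope peaks.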

Description

Technical field

[0001] The present invention relates to the field of perceptual domain audio coding, in particular to a Gaussian mixture model-based perceptual domain audio coding method and system.

Background art

[0002] With the rapid development of computer, network and communication technology, human society has entered the digital age. Digital versions of important signals such as voice, music and video involve huge volumes of data and are costly to transmit and store. Moreover, as new technologies and applications continue to emerge, sources with even higher data rates may appear. Transmitting and storing this data is a major problem, and coding technology is the solution to it. Audio coding, as one of the key technologies in these applications, has played an important enabling role. The human auditory system has limitations and cannot perceive all signal components in the sound it receives....


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L19/04
Inventor: 高戈, 陈怡, 吕亚平, 张康, 杨玉红
Owner: WUHAN UNIV