Loudness equalization processing method for voice
A speech loudness equalization method that stabilizes perceived speech intensity and eliminates unstable factors, thereby improving perceived quality.
Examples
Embodiment 1
[0037] The voice loudness equalization processing method of the present invention is mainly applied to voice output in telephone conferences, video conferences and VoIP, so as to solve the problem that, in practical applications, the loudness of the output voice fluctuates, being sometimes too loud and sometimes too quiet.
[0038] This embodiment takes voice output in VoIP as an example. In this embodiment, loudness equalization is performed on the decoded output speech.
[0039] As shown in Figure 1, for the type judgment, the input signal is first transformed from the time domain to the frequency domain by a radix-2 FFT. The spectrum is then divided into two sub-bands according to the psychoacoustic model, that is, a low-frequency band and a high-frequency band. The signal energy is calculated within each of the two bands, and the ratio of the high- and low-frequency energy is calculated, and the ratio of the high and low frequency ener...
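The band-energy step described above can be sketched roughly as follows. This is an illustration only, not the patent's implementation: the function names `fft_radix2` and `band_energy_ratio`, the 2000 Hz band edge `split_hz`, and the choice to report the ratio as high-band energy divided by low-band energy are all assumptions, since the source does not give the band boundary or the direction of the ratio.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT.

    The input length must be a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def band_energy_ratio(frame, sample_rate, split_hz=2000.0):
    """Return (low_energy, high_energy, high/low ratio) for one frame.

    split_hz is a hypothetical band edge, not a value from the source.
    """
    spectrum = fft_radix2(frame)
    n = len(frame)
    low = 0.0
    high = 0.0
    for k in range(n // 2 + 1):  # non-negative frequencies only
        freq = k * sample_rate / n
        power = abs(spectrum[k]) ** 2
        if freq < split_hz:
            low += power
        else:
            high += power
    ratio = high / low if low > 0 else float("inf")
    return low, high, ratio
```

A frame dominated by low-frequency content yields a ratio well below 1, and a frame dominated by high-frequency content yields a ratio well above 1. How the ratio is then mapped to a signal type is not recoverable from the text above.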
Embodiment 2
[0059] The signal type judgment in this embodiment is performed in the time domain, and the specific adjustment process after the type judgment is the same as that in Embodiment 1. The data segment is classified in the time domain by calculating the short-term signal energy and the short-term zero-crossing rate.
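The two time-domain features named above can be computed per frame as in the following sketch. The function names are hypothetical, and the normalizations (mean of squared samples, crossings divided by the number of adjacent sample pairs) are common conventions assumed here rather than taken from the source.

```python
def short_term_energy(frame):
    """Mean of the squared samples in one frame."""
    if not frame:
        raise ValueError("frame must not be empty")
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ.

    Zero is treated as non-negative, which is an assumed convention.
    """
    if len(frame) < 2:
        raise ValueError("frame needs at least two samples")
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)
```

Voiced speech typically shows relatively high energy with a low zero-crossing rate, while noise-like or unvoiced segments tend toward a higher zero-crossing rate, which is why the two features are often used together.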
[0060] As shown in Figure 4, the input data segment is first high-pass filtered to weaken the signal energy dominated by noise. The segment is then windowed, the average energy of the frame is calculated, and this short-term energy is used for the preliminary voice activity detection (VAD) decision. If the average energy is greater than the threshold, the frame is judged to be the second type of data; if the average energy is smaller than the threshold, it is judged to be low-energy data. VAD smoothing is performed on frames judged as low-energy data, that is, with reference to the preceding three frames: if the first three fr...
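The preliminary decision, up to but not including the smoothing step, might look like the sketch below. Several details are assumptions because the source does not specify them: the first-order high-pass filter and its coefficient `alpha`, the Hamming window, the threshold value passed by the caller, and the labels `"second_type"` and `"low_energy"`. The smoothing rule over the preceding frames is left out because its description is truncated.

```python
import math


def high_pass(frame, alpha=0.95):
    """First-order high-pass y[n] = x[n] - alpha * x[n-1].

    The actual filter used in the source is not specified; this is a stand-in.
    """
    out = []
    prev = 0.0
    for s in frame:
        out.append(s - alpha * prev)
        prev = s
    return out


def hamming(n):
    """Hamming window of length n (assumed window type)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def preliminary_vad(frame, threshold):
    """Classify one frame by its average energy after filtering and windowing.

    Returns (label, average_energy). The threshold is supplied by the caller.
    """
    if not frame:
        raise ValueError("frame must not be empty")
    filtered = high_pass(frame)
    window = hamming(len(filtered))
    windowed = [s * w for s, w in zip(filtered, window)]
    avg_energy = sum(s * s for s in windowed) / len(windowed)
    label = "second_type" if avg_energy > threshold else "low_energy"
    return label, avg_energy
```

For example, `preliminary_vad(frame, threshold=1e-3)` returns the label together with the measured energy, so a caller could log the energy while tuning the threshold for a particular input level.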