A loudspeaker-based personalized sound image reproduction method and device

A loudspeaker-based sound image reproduction technology, applied in the fields of loudspeaker signal distribution, frequency/directional characteristic arrangements, and neural network learning methods. It addresses the listener's poor spatial perception, improving spatial perception while reducing computational complexity and reconstruction error.

Active Publication Date: 2018-12-21
WUHAN UNIV

AI Technical Summary

Problems solved by technology

[0005] Embodiments of the present invention provide a loudspeaker-based personalized sound image reproduction method and device. They address a technical problem in existing loudspeaker audio reconstruction technology: errors introduced by HRTF personalization are amplified in the crosstalk cancellation module, which degrades the listener's spatial perception.


Examples


Embodiment 1

[0056] This embodiment provides a loudspeaker-based method for personalized sound image reproduction. Referring to Figure 1, the method includes:

[0057] Step S1: Determine the orientations of the loudspeakers and the target orientation, where there are at least two loudspeakers and the target orientation is the orientation of the ideal reconstructed sound image.

[0058] Specifically, the target azimuth is the azimuth of the sound image that the loudspeakers are expected to synthesize. For example, if the loudspeakers are expected to synthesize a sound image at azimuth A, then A is the target azimuth. The number of loudspeakers can be set according to the actual situation, for example 2, 3, or 4. Using multiple loudspeakers makes personalized sound image reproduction possible over a small range and gives a better azimuth rendering effect.

[0059] In the specific implementation process, taking two loudspeakers as an example, an appropriate coordinate system can be establ...

Embodiment 2

[0109] This embodiment provides a loudspeaker-based personalized sound image reproduction device. Referring to Figure 3, the device comprises:

[0110] The orientation determining module 301, configured to determine the orientations of the loudspeakers and the target orientation, where there are at least two loudspeakers and the target orientation is the orientation of the ideal reconstructed sound image;

[0111] The first weight vector calculation module 302, configured to determine the corresponding HRTFs according to the orientation of each loudspeaker and the target orientation, where the HRTFs are stored in an HRTF database that records each HRTF together with the corresponding complete set of human body parameters; and, based on the HRTF database, to establish an equation relating the binaural signal of the virtual sound image to the binaural signal of the target sound image and to calculate the first weight vector corresponding to each loudspeaker;

[0112] The ...
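The equation that module 302 sets up can be illustrated for the two-loudspeaker case. The sketch below is an illustration under stated assumptions, not the patent's own formula: it assumes that, at a single frequency bin, the binaural signal of the virtual sound image is the weighted sum of each loudspeaker's HRTF pair, and that the weights are chosen so this sum equals the HRTF pair of the target orientation. The function name `solve_weights` and the idea of solving bin by bin are hypothetical choices made for this example.

```python
def solve_weights(h_spk1, h_spk2, h_target):
    """Solve w1*h_spk1 + w2*h_spk2 = h_target at one frequency bin.

    Each argument is a (left_ear, right_ear) pair of complex HRTF values.
    The 2x2 complex system is solved with Cramer's rule.
    """
    a, c = h_spk1      # loudspeaker 1 to left ear, right ear
    b, d = h_spk2      # loudspeaker 2 to left ear, right ear
    t_left, t_right = h_target
    det = a * d - b * c
    if abs(det) < 1e-12:
        # The two loudspeakers look identical to both ears at this bin,
        # so no unique pair of weights exists.
        raise ValueError("HRTF matrix is singular at this frequency bin")
    w1 = (t_left * d - b * t_right) / det
    w2 = (a * t_right - c * t_left) / det
    return w1, w2


def virtual_binaural(h_spk1, h_spk2, w1, w2):
    """Binaural signal produced by the weighted loudspeakers (assumed model)."""
    left = w1 * h_spk1[0] + w2 * h_spk2[0]
    right = w1 * h_spk1[1] + w2 * h_spk2[1]
    return left, right
```

Calling `solve_weights` once per frequency bin would give one weight per loudspeaker per bin; with more than two loudspeakers the system becomes underdetermined and would need a least-squares or similar formulation, which this sketch does not cover.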

Embodiment 3

[0128] Based on the same inventive concept, the present application also provides a computer-readable storage medium 400. Referring to Figure 4, a computer program 411 is stored on the medium, and the method of Embodiment 1 is implemented when the program is executed.

[0129] The computer-readable storage medium introduced in Embodiment 3 is the medium used to implement the loudspeaker-based personalized sound image reproduction method of Embodiment 1. Based on the method introduced in Embodiment 1, those skilled in the art can understand the specific structure and variations of this computer-readable storage medium, so the details are not repeated here. Any computer-readable storage medium used to implement the method of Embodiment 1 falls within the scope of protection of the present invention.


Abstract

The invention provides a loudspeaker-based personalized sound image reproduction method and device. The method includes: first, determining the orientations of the loudspeakers and the target orientation; then calculating, based on an HRTF database, a first weight vector corresponding to the loudspeakers; then screening the key human body parameters; next, designing a neural network to establish the mapping relationship between the key human body parameters and the first weight vector; then measuring the listener's key human body parameters and predicting the corresponding second weight vector with the neural network model; then calculating the frequency-domain pre-filter of each loudspeaker according to the second weight vector; and finally, filtering the sound source signal with the frequency-domain pre-filters and outputting the result through the loudspeakers. The invention achieves the technical effect of improving the listener's spatial perception.
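The final step of the method, filtering the sound source signal in the frequency domain and sending one filtered signal to each loudspeaker, can be sketched as follows. This is a minimal illustration under assumptions: the text here does not say how the per-loudspeaker filter is derived from the second weight vector, so the sketch simply takes one complex gain per frequency bin as given. The names `dft`, `idft`, and `apply_prefilters` are hypothetical, and a plain DFT from the standard library stands in for a production FFT.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a sequence of samples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def apply_prefilters(source, filters):
    """Filter one block of source samples with one filter per loudspeaker.

    `filters` holds, for each loudspeaker, one complex gain per DFT bin
    (assumed to be already computed from the predicted weight vector).
    Returns one list of real output samples per loudspeaker.
    """
    spectrum = dft(source)
    feeds = []
    for gains in filters:
        if len(gains) != len(spectrum):
            raise ValueError("filter length must match the block length")
        shaped = [s * g for s, g in zip(spectrum, gains)]
        # Keep the real part; for a conjugate-symmetric filter the
        # imaginary part is only numerical noise.
        feeds.append([v.real for v in idft(shaped)])
    return feeds
```

Multiplying DFT bins corresponds to circular convolution within the block, so a streaming implementation would normally add overlap-add or overlap-save processing around this step.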

Description

Technical field

[0001] The invention relates to the technical field of multimedia signal processing, and in particular to a loudspeaker-based method and device for personalized sound image reproduction.

Background technique

[0002] Sound source localization is a necessary technology for realizing the immersive experience of virtual reality (VR). Amplitude panning (AP) is simple to implement and is therefore commonly used to reproduce 3D audio over loudspeakers. Representative AP technologies include Vector Base Amplitude Panning (VBAP) and Multiple-Direction Amplitude Panning (MDAP). The basic idea of this type of technology is that the loudspeakers and the listening point form a simple geometric model, and the gain of each loudspeaker is obtained according to the principle of vector decomposition. Loudspeaker signals with different gains cause the listener to perceive a sound...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04S7/00, H04R1/20, H04R3/04, H04R3/12, G06N3/08
CPC: H04R1/20, H04R3/04, H04R3/12, H04S7/308, G06N3/08
Inventor: 涂卫平, 郑佳玺, 翟双星, 张雄, 沈晨
Owner: WUHAN UNIV