Three-dimensional realistic generation system for voice sound source spatial sound effects

A sound effect generation technology for voice sound sources, applied to stereo systems, electrical components, and similar fields. It addresses the lack of effective control over the ratio of direct to reverberant sound, the difficulty of implementing existing reconstruction methods in practice, and the listener's inability to perceive sound source distance, so that both the direction and the distance of the sound source are restored.

Pending Publication Date: 2021-08-27

高小翎

Problems solved by technology

Three-dimensional audio describes a virtual sound image by direction information (horizontal angle x, elevation angle y) and distance information z. The sound image reconstructed by traditional stereo and surround sound has degrees of freedom only in the horizontal and elevation directions and carries no distance information, so it does not meet the definition of three-dimensional audio. As a result, the audio and video experiences are inconsistent when a 3D multimedia system is reproduced, and the audience is not given a real auditory sense of immersion and envelopment.

[0008] First, the sound image reconstructed by traditional stereo and surround sound has degrees of freedom only in the horizontal and elevation directions, which does not meet the definition of 3D audio. The audio and video experiences are therefore inconsistent when a 3D multimedia system is reproduced, and the audience cannot be given a real auditory sense of immersion and envelopment. Most existing research on 3D audio focuses on restoring the direction of the sound source. The few studies on distance restoration concentrate on the free sound field, where intensity cues provide only the relative distance of the sound source and cannot give the listener accurate absolute distance information. Research on distance perception in reverberant environments is limited to capturing the listening position in the reconstructed sound field with a microphone array and then analyzing the relationship between the sound source distance and the factors that affect it. So far there is no general sound source distance restoration model that controls the signals played by the speakers according to the distance to be restored. Consequently, virtual sound image direction and distance parameters consistent with the original sound source cannot be obtained by controlling the speaker signals, and a virtual sound image consistent with the spatial information of the target sound source in the 3D video cannot be constructed.
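The point that free-field intensity cues give only relative distance can be illustrated with the inverse-distance law for a point source. The following is a minimal sketch and not part of the patent; the function name `level_difference_db` is a hypothetical helper introduced here for illustration.

```python
import math

def level_difference_db(d_ref: float, d: float) -> float:
    """Level change in dB when a free-field point source moves from d_ref to d.

    Assumes the inverse-distance law: sound pressure is proportional to 1/d.
    """
    if d_ref <= 0 or d <= 0:
        raise ValueError("distances must be positive")
    return -20.0 * math.log10(d / d_ref)

# The same ratio of distances yields the same level change,
# whatever the absolute distances are.
print(round(level_difference_db(1.0, 2.0), 2))   # source moves from 1 m to 2 m
print(round(level_difference_db(5.0, 10.0), 2))  # source moves from 5 m to 10 m
```

Both calls return the same value, so a listener who hears only the level change cannot tell a 1 m to 2 m move from a 5 m to 10 m move. This is why an additional cue, such as the reverberation discussed later in the text, is needed for absolute distance.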

[0009] Second, when speakers are used as playback devices, restoration based on physical sound field reconstruction is more accurate and covers a large area, but it requires a large number of speakers and places strict restrictions on their placement, which makes it difficult to implement in practice.

[0010] Third, the rapid development of the 3D film and television industry and the start of 3D audio standardization by the Moving Picture Experts Group (MPEG) have drawn wide attention to 3D audio technology. Because 2D audio systems based on traditional stereo or surround sound cannot express the three-dimensional spatial information of the sound source, they seriously damage the audience's real and complete spatial experience of audio-visual events.

Both Ambisonics and wave field synthesis (WFS) in the prior art require a large number of speakers and place strict restrictions on their arrangement. The sound image formed by perception-based sound field reconstruction lies only on the spherical surface where the speakers are located, so the listener cannot perceive sound source distance information off that surface. Methods based on the head-related transfer function (HRTF) mainly use headphones for playback and depend strongly on individual differences between listeners, which is a major limitation.

[0011] Fourth, the spatial sound effect reverberation model still has the following deficiencies when the reverberation of a specific room must be simulated more accurately and comprehensively. First, the delay parameters and attenuation coefficients of its 19th-order FIR filter are still those used to simulate the Boston Symphony Hall, and no method is proposed for setting them for a specific venue. Second, the feedback gain coefficients of the comb filters are all calculated from the average reverberation time of the room. However, the absorption of sound energy by air depends on signal frequency, so sound in different frequency bands decays at different rates in different environments and the corresponding reverberation times differ, especially for high-frequency signals. Although the model adds a low-pass filter that changes the reverberation time of the high-frequency part of the sound, it controls the reverberation time only roughly and cannot accurately reflect the reverberation times of signals at different frequencies.
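To make the comb filter criticism concrete, the sketch below uses the standard relation between a feedback comb filter's loop gain and reverberation time: the signal must fall by 60 dB after a time T60, so g = 10^(-3·τ/T60) for a loop delay τ. This is a textbook relation, not a formula quoted from the patent, and the per-band reverberation times are hypothetical values chosen only for illustration.

```python
import math

def comb_feedback_gain(delay_s: float, rt60_s: float) -> float:
    """Feedback gain that makes a comb filter with the given loop delay
    decay by 60 dB in rt60_s seconds."""
    if delay_s <= 0 or rt60_s <= 0:
        raise ValueError("delay and reverberation time must be positive")
    return 10.0 ** (-3.0 * delay_s / rt60_s)

delay = 0.05  # loop delay in seconds (assumed value)

# Hypothetical reverberation times per frequency band, in seconds.
# High frequencies decay faster because air absorbs them more strongly.
rt60_by_band = {"250 Hz": 1.8, "1 kHz": 1.5, "4 kHz": 0.9}

for band, rt60 in rt60_by_band.items():
    print(band, round(comb_feedback_gain(delay, rt60), 4))

# A single gain derived from the average reverberation time
# cannot match all three bands at once.
average_rt60 = sum(rt60_by_band.values()) / len(rt60_by_band)
print("average", round(comb_feedback_gain(delay, average_rt60), 4))
```

The per-band gains differ noticeably from the single gain computed from the average, which is the mismatch the paragraph describes.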

[0012] Fifth, the prior art does not restore the distance of the sound image in a non-free field, and no effective method has been proposed to control the energy ratio of direct sound to reverberant sound. Appropriate reverberation makes the sound clearer and brighter for the listener, but excessive reverberation is unpleasant. Although current VBAP technology can reconstruct the direction information of the original sound source, it conveys no sound image distance, and the lack of distance perception seriously affects the listener's overall experience of watching 3D video.
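For readers unfamiliar with VBAP (vector base amplitude panning), the following is a minimal two-dimensional sketch for one speaker pair. It solves g1·l1 + g2·l2 = p for the gains, where l1 and l2 are unit vectors toward the speakers and p is the unit vector toward the source, then normalizes to constant power. The function name and the ±30° layout are assumptions for illustration, and this is not claimed to be the gain distribution used in the patent.

```python
import math

def vbap_pair_gains(source_deg: float, spk1_deg: float, spk2_deg: float):
    """Constant-power 2D VBAP gains for a source panned between two speakers."""
    def unit(angle_deg):
        r = math.radians(angle_deg)
        return math.cos(r), math.sin(r)

    px, py = unit(source_deg)
    l1x, l1y = unit(spk1_deg)
    l2x, l2y = unit(spk2_deg)

    det = l1x * l2y - l1y * l2x
    if abs(det) < 1e-9:
        raise ValueError("speaker directions must not be collinear")

    # Cramer's rule for g1*l1 + g2*l2 = p.
    g1 = (px * l2y - py * l2x) / det
    g2 = (l1x * py - l1y * px) / det

    # Normalize so that g1^2 + g2^2 = 1 (constant perceived loudness).
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm

print(vbap_pair_gains(0.0, 30.0, -30.0))   # source centered between speakers
print(vbap_pair_gains(30.0, 30.0, -30.0))  # source at the first speaker
```

The gains depend only on angles, so the output is identical for a near and a far source in the same direction. That is the missing distance cue the paragraph points out. The sketch does not check that the source lies between the two speakers; outside that arc one gain becomes negative.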


Embodiment Construction

[0063] The technical solution of the three-dimensional realistic generation system for voice sound source spatial sound effects provided by the present invention is further described below with reference to the accompanying drawings, so that those skilled in the art can better understand and implement the invention.

[0064] The rapid development of the 3D film and television industry and the start of 3D audio standardization by the Moving Picture Experts Group (MPEG) have drawn wide attention to 3D audio technology. Because two-dimensional audio systems based on traditional stereo or surround sound cannot express the three-dimensional spatial information of the sound source, they seriously damage the audience's real and complete spatial experience of audio-visual events. Therefore, the 3D audio processing technology based on the same audiovisual perception has become an important breakthrough direction to reali...


Abstract

The method aims to make the subjective auditory impression of a voice sound source in the reconstructed 3D sound field consistent with the subjective visual impression in the reconstructed 3D video scene. Given the spatial information of the sound source, the sound source signal, and the conditions of the reproduction environment, an artificial 3D reverberation model that accounts for air attenuation is constructed from the factors that influence human distance perception. The model corrects and controls the sound source signal and adds sound source distance information to it. An inverse filter that removes room reverberation at the listening position is then obtained, which cancels the effect of propagation in the reverberant environment on the signal played by the loudspeaker, so that the signal received by the listener matches the loudspeaker input signal and the distance of the sound source is restored. Finally, gain factors are distributed over the loudspeaker group signals to restore the direction of the sound source, completing the restoration of its 3D spatial position. The invention can realistically restore the 3D spatial information of a voice sound source.
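The idea of encoding distance through reverberation can be sketched with a simplified diffuse-field model. Direct sound pressure falls as 1/d while the reverberant level is treated as constant across the room, so the direct-to-reverberant ratio is 20·log10(d_c/d), where d_c is the critical distance at which the two are equal. This is a common textbook approximation and an assumption here. It is not the patent's reverberation model, which additionally considers air attenuation, and the function names are hypothetical.

```python
import math

def direct_and_reverb_gains(distance_m: float, critical_distance_m: float):
    """Linear gains for the direct and reverberant paths (simplified model).

    The direct path follows the inverse-distance law. The reverberant
    level is held constant and equals the direct level at the critical
    distance.
    """
    if distance_m <= 0 or critical_distance_m <= 0:
        raise ValueError("distances must be positive")
    direct = 1.0 / distance_m
    reverb = 1.0 / critical_distance_m
    return direct, reverb

def drr_db(distance_m: float, critical_distance_m: float) -> float:
    """Direct-to-reverberant energy ratio in dB for the model above."""
    direct, reverb = direct_and_reverb_gains(distance_m, critical_distance_m)
    return 20.0 * math.log10(direct / reverb)

critical = 2.0  # assumed critical distance of the room, in meters
for d in (1.0, 2.0, 4.0, 8.0):
    print(d, round(drr_db(d, critical), 2))
```

Unlike level alone, the ratio changes with the absolute distance in a given room, which is why controlling the mix of direct and reverberant sound can convey how far away a source is.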

Description

Technical field

[0001] The invention relates to a three-dimensional generation system for sound source spatial sound effects, in particular to a three-dimensional realistic generation system for voice sound source spatial sound effects, and belongs to the technical field of three-dimensional generation of spatial sound effects.

Background technique

[0002] The huge success of 3D movies at the box office has brought the film and television industry into the era of 3D multimedia. In recent years the output of 3D movies has also increased, bringing more of them before audiences. The maturity of 3D video technology gives the audience a pronounced three-dimensional visual experience, and a large number of devices supporting 3D visual effects are on the market. However, as the sound system of 3D video, the products currently on the market still use stereo or surround sound technology, which cannot provide the aud...


Application Information

Patent Type & Authority: Applications (China)

IPC(8): H04S7/00

CPC: H04S7/305, H04S7/303, H04S2400/01

Inventor: 高小翎, 刘勇

Owner: 高小翎
