A loudspeaker audio rendering method, apparatus, device and medium

By introducing compensation dimension and function fitting into the speaker audio rendering, and adjusting the speaker gain coefficient, the problems of inaccurate sound image positioning and auditory discontinuity in the existing technology are solved, achieving a high-precision, highly compatible, and naturally sounding audio rendering effect.

CN122205352A · Pending · Publication Date: 2026-06-12 · MALANSHAN AUDIO & VIDEO LABORATORY


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: MALANSHAN AUDIO & VIDEO LABORATORY
Filing Date: 2026-03-27
Publication Date: 2026-06-12

Application Information


IPC: H04S7/00; H04R1/40

AI Tagging

Application Domain: Frequency/directions obtaining arrangements; Stereophonic systems
Technical Efficacy Phrases: Avoid distortion; High precision


AI Technical Summary

Technical Problem

Existing speaker spatial audio rendering methods suffer from inaccurate sound image localization, auditory discontinuity, and sound image pulling distortion in complex layouts. In particular, in panoramic sound environments containing overhead speakers, it is difficult to achieve high-precision and highly compatible audio rendering.

Method used

By determining the distance from the sound source to each speaker, calculating the initial gain coefficient, and introducing a compensation dimension for function fitting, the total gain coefficient of the speakers is adjusted, including gain, delay, and phase adjustment, thereby optimizing the audio rendering effect.

Benefits of technology

It significantly improves the accuracy and auditory realism of audio rendering, enhances the adaptability to different speaker layouts and sound-emitting object properties, and avoids the sound image pulling distortion caused by traditional normalization processing.


Figure CN122205352A_ABST (abstract drawing)

Abstract

The application discloses a loudspeaker audio rendering method, apparatus, device and medium, and relates to the technical field of computers. The method comprises the following steps: determining the distance from a sound source to each loudspeaker, and calculating an initial gain coefficient corresponding to each loudspeaker based on the distance; determining a compensation dimension corresponding to each loudspeaker, and performing function fitting based on preset key points and corresponding compensation gain values to obtain a compensation gain coefficient or a compensation method for each loudspeaker in the compensation dimension; determining a total gain coefficient of each loudspeaker according to the initial gain coefficient of each loudspeaker and the compensation gain coefficient or the compensation method, and performing overall gain fitting on the total gain coefficient adjusted in each compensation dimension to adjust the total gain coefficient, so that the audio signal is rendered according to the adjusted total gain coefficient. In this way, high-precision, highly compatible and natural-sounding loudspeaker spatial audio rendering can be realized.


Description

Technical Field

[0001] This invention relates to the field of computer technology, and in particular to a speaker audio rendering method, apparatus, device, and medium.

Background Technology

[0002] In recent years, surround sound and immersive sound playback systems have been widely used in automotive, home theater, and professional cinema scenarios, raising user expectations for the accuracy of spatial sound positioning. Accurately reproducing sound image positions in complex speaker layouts has become a core challenge for spatial audio rendering technology. Current mainstream rendering methods are mainly distance-based amplitude panning (DBAP) and vector-based amplitude panning (VBAP). DBAP allocates gain according to the distance between the sound source and each speaker, offering good layout compatibility and smooth transitions. However, in immersive sound environments containing overhead speakers, the pull of the overhead speakers on the sound image tends to bias the overall sound upwards. VBAP instead selects the 2-3 speakers closest to the direction of the sound source and solves a vector equation for their gains. Although its computational load is small, it also suffers from an upward-biased sound image. It supports only point sources and can only represent a source located on the surface of the sound field, not in its interior, so it cannot convey distance within the sound field, and the sound becomes discontinuous when the source moves. It is also poorly compatible with non-standard layouts. More importantly, both methods apply gain normalization, which often amplifies speakers whose gain should be low. This distorts the sound image, especially when the sound source lies outside the sound field, and makes accurate positioning impossible.

[0003] As can be seen from the above, how to achieve high-precision, highly compatible, and naturally sounding speaker spatial audio rendering is an urgent problem to be solved.

Summary of the Invention

[0004] In view of this, the purpose of this invention is to provide a speaker audio rendering method, apparatus, device, and medium capable of achieving high-precision, highly compatible, and naturally audible speaker spatial audio rendering. The specific solution is as follows:
In a first aspect, this application provides a speaker audio rendering method, including:
determining the distance from the sound-emitting object to each loudspeaker, and calculating the initial gain coefficient corresponding to each loudspeaker based on the distance;
determining the compensation dimension corresponding to each speaker, and performing function fitting based on preset key points and corresponding compensation gain values to obtain the compensation gain coefficient or compensation method for each speaker under the compensation dimension, where the compensation method includes gain adjustment, delay adjustment, phase adjustment and effects processing;
determining the total gain coefficient of each speaker based on the initial gain coefficient of each speaker and the compensation gain coefficient or the compensation method, and performing overall gain fitting on the total gain coefficient after adjustment of each compensation dimension to adjust the total gain coefficient, so as to render the audio signal according to the adjusted total gain coefficient.

[0005] Optionally, the compensation dimension is one or any combination of the following: the three-dimensional coordinates of the sound source, the three-dimensional rotation angle of the sound source, the size of the sound source, the orientation of the sound source, and the distance from the sound source to the loudspeaker.

[0006] Optionally, the function fitting based on preset key points and corresponding compensation gain values includes: An elementary function is selected to fit the preset key points, so that the error between the value of the fitted function at the preset key points and the compensation gain value is within a preset range. The preset key points are connected by straight lines, convex curves, concave curves, or S-curves.

[0007] Optionally, the calculation of the initial gain coefficient corresponding to each speaker based on the distance includes: Based on the distance, the initial gain coefficient corresponding to each speaker is calculated using distance-based amplitude rendering or vector-based amplitude rendering.

[0008] Optionally, determining the total gain coefficient of each loudspeaker based on its initial gain coefficient and the compensation gain coefficient or the compensation method includes: multiplying the initial gain coefficient of each loudspeaker by the corresponding compensation gain coefficient to obtain the total gain coefficient of each loudspeaker; or scaling or mapping the initial gain coefficient according to the compensation method to obtain the total gain coefficient of the loudspeaker.

[0009] Optionally, the step of performing overall gain fitting on the total gain coefficient after adjustment of each compensation dimension to adjust the total gain coefficient includes: determining the overall compensation dimension, and performing function fitting based on the preset overall key points and corresponding overall compensation values to obtain the overall compensation coefficient; and adjusting the overall gain coefficient of all speakers using the overall compensation coefficient.

[0010] Optionally, when the compensation dimension is the height difference between the sound-emitting object and the loudspeaker, the height difference is calculated based on the three-dimensional coordinates of the sound-emitting object, and the compensation gain coefficient is negatively correlated with the height difference; When the compensation dimension is the depth of the sound-emitting object engulfing the loudspeaker, the depth of the engulfment is calculated based on the size of the sound-emitting object and the distance from the sound-emitting object to the loudspeaker, and the compensation gain coefficient is positively correlated with the depth of the engulfment. When the compensation dimension is the distance from the sound source to the sound field, the distance is calculated based on the three-dimensional coordinates of the sound source and the spatial distribution of all loudspeakers, and the total gain coefficient is attenuated as a whole based on the negative correlation with the distance.

[0011] In a second aspect, this application provides a speaker audio rendering apparatus, comprising:
an initial gain coefficient determination module, used to determine the distance from the sound-emitting object to each speaker, and calculate the initial gain coefficient corresponding to each speaker based on the distance;
a compensation determination module, used to determine the compensation dimension corresponding to each speaker, and to perform function fitting based on preset key points and corresponding compensation gain values to obtain the compensation gain coefficient or compensation method for each speaker under the compensation dimension, where the compensation method includes gain adjustment, delay adjustment, phase adjustment and effects processing;
a rendering module, used to determine the total gain coefficient of each speaker based on the initial gain coefficient of each speaker and the compensation gain coefficient or the compensation method, and to perform overall gain fitting on the total gain coefficient after adjustment of each compensation dimension to adjust the total gain coefficient, so as to render the audio signal based on the adjusted total gain coefficient.

[0012] In a third aspect, this application provides an electronic device, comprising: a memory, used to store a computer program; and a processor, used to execute the computer program to implement the aforementioned speaker audio rendering method.

[0013] In a fourth aspect, this application provides a computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the aforementioned speaker audio rendering method.

[0014] This application provides a loudspeaker audio rendering method, which involves determining the distance from the sound-emitting object to each loudspeaker, calculating an initial gain coefficient corresponding to each loudspeaker based on the distance, determining a compensation dimension corresponding to each loudspeaker, and performing function fitting based on preset key points and corresponding compensation gain values to obtain a compensation gain coefficient or compensation method for each loudspeaker under the compensation dimension, determining a total gain coefficient for each loudspeaker based on the initial gain coefficient and the compensation gain coefficient or the compensation method, and rendering the audio signal based on the total gain coefficient.

[0015] As can be seen from the above, this application corrects the initial gain coefficient by introducing a compensation dimension and obtains the compensation gain coefficient or compensation method by function fitting based on preset key points. This allows for adaptive adjustment of the total gain of each speaker according to the actual spatial relationship between the sound source and the speaker, effectively avoiding the sound image pulling distortion caused by traditional normalization processing, significantly improving the accuracy and auditory realism of audio rendering, and enhancing adaptability to different speaker layouts and sound source properties. Thus, it achieves high-precision, highly compatible, and naturally sounding speaker spatial audio rendering.

Attached Figure Description

[0016] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.

[0017] Figure 1 is a flowchart of a speaker audio rendering method disclosed in this invention; Figure 2 is a schematic diagram of a speaker audio rendering device disclosed in this invention; Figure 3 is a structural diagram of an electronic device disclosed in this invention.

Detailed Implementation

[0018] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0019] In recent years, surround sound and immersive sound playback systems have been widely used in automotive, home theater, and professional cinema scenarios, raising user expectations for the accuracy of spatial sound positioning. Accurately reproducing sound image positions in complex speaker layouts has become a core challenge for spatial audio rendering technology. Current mainstream rendering methods are mainly distance-based amplitude panning (DBAP) and vector-based amplitude panning (VBAP). DBAP allocates gain according to the distance between the sound source and each speaker, offering good layout compatibility and smooth transitions. However, in immersive sound environments containing overhead speakers, the pull of the overhead speakers on the sound image tends to bias the overall sound upwards. VBAP instead selects the 2-3 speakers closest to the direction of the sound source and solves a vector equation for their gains. While this involves less computation, it also suffers from an upward-biased sound image. Furthermore, it supports only point sources and can only represent a source located on the surface of the sound field, not in its interior, so it cannot convey distance within the sound field, and the sound becomes discontinuous when the source moves. It is also poorly compatible with non-standard layouts. More importantly, both methods apply gain normalization, which often amplifies speakers with inherently low gain. This distorts the sound image, especially when the sound source lies outside the sound field, and makes accurate positioning impossible. Therefore, this application provides a speaker audio rendering method, apparatus, device, and medium capable of achieving high-precision, highly compatible, and naturally audible speaker spatial audio rendering.

[0020] As shown in Figure 1, this application discloses a speaker audio rendering method, including:
Step S11: Determine the distance from the sound-emitting object to each speaker, and calculate the initial gain coefficient corresponding to each speaker based on the distance.

[0021] In this embodiment, the position coordinates of the sound-emitting object (i.e., the virtual sound source) in three-dimensional space, as well as the spatial position coordinates of each physical speaker, are first obtained. The sound-emitting object can be a point sound source or a volume sound source of a certain size. A distance set is obtained by calculating the Euclidean distance between the sound-emitting object and each speaker. After obtaining the distances, an existing amplitude panning algorithm is used to calculate the initial gain coefficient. Specifically, the calculation of the initial gain coefficient corresponding to each speaker based on the distance can include: calculating the initial gain coefficient corresponding to each speaker using distance-based amplitude panning or vector-based amplitude panning based on the distance. For example, in one specific implementation, the distance-based amplitude panning (DBAP) method can be used to calculate the initial gain coefficient. Alternatively, the vector-based amplitude panning (VBAP) method can be used, which selects the 2-3 closest speakers based on the direction vector of the sound-emitting object, establishes vector equations, solves for the gain coefficients of these speakers, and sets the gain of the remaining speakers to zero. It should be noted that the calculation method for the initial gain coefficient is not limited to the two methods mentioned above. Any algorithm that can allocate gain according to the spatial relationship between the sound-emitting object and the loudspeaker can be used as the basis for this application.
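Step S11 can be illustrated with a minimal sketch, assuming a DBAP-style inverse-distance law with unit-power scaling. The patent text does not give the formula, so the function name `dbap_initial_gains`, the `rolloff_db` and `blur` parameters, and the example speaker layout are all assumptions made for illustration.

```python
import math

def dbap_initial_gains(source, speakers, rolloff_db=6.0, blur=1e-3):
    """Return one initial gain G0 per speaker (hypothetical helper).

    Assumed law: gain proportional to 1 / distance**a, scaled so that
    the sum of squared gains equals 1.
    """
    # Exponent derived from the assumed rolloff in dB per doubling of distance.
    a = rolloff_db / (20.0 * math.log10(2.0))
    # Euclidean distance to each speaker; blur keeps the distance above zero.
    dists = [
        math.sqrt(sum((s - p) ** 2 for s, p in zip(source, spk)) + blur ** 2)
        for spk in speakers
    ]
    raw = [1.0 / d ** a for d in dists]
    k = 1.0 / math.sqrt(sum(g * g for g in raw))
    return [k * g for g in raw]

# Assumed layout: four ear-level speakers plus one overhead speaker.
speakers = [(-1.0, 1.0, 0.0), (1.0, 1.0, 0.0),
            (-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (0.0, 0.0, 1.0)]
g0 = dbap_initial_gains((0.5, 0.5, 0.0), speakers)
```

In this layout the front-right speaker is nearest to the source and so receives the largest initial gain, while the overhead speaker still receives a non-zero share, which is the behavior the later compensation steps are meant to correct.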

[0022] Step S12: Determine the compensation dimension corresponding to each speaker, and perform function fitting based on preset key points and corresponding compensation gain values to obtain the compensation gain coefficient or compensation method of each speaker under the compensation dimension.

[0023] In this embodiment, the compensation dimension refers to a feature quantity that affects the sound image rendering effect, and it can be selected according to actual needs. Specifically, the compensation dimension includes, but is not limited to, one or any combination of the following: the three-dimensional coordinates of the sound source, the three-dimensional rotation angle of the sound source, the size of the sound source, the orientation of the sound source, and the distance from the sound source to the speaker. For example, in a panoramic sound environment, the height difference is a key compensation dimension; when the sound source is large, its size and the depth to which it engulfs a speaker become important compensation dimensions; when the sound source is located outside the sound field formed by the speaker array, the distance from the sound source to the sound field becomes the compensation dimension.

[0024] In this embodiment, several key points (i.e., specific values on that dimension) are first calibrated for the selected compensation dimension, and a corresponding compensation gain value (or compensation rule) is preset for each key point. For example, in one specific implementation, in a panoramic sound environment containing overhead speakers, the traditional DBAP algorithm tends to bias the overall sound image upwards, so a height difference dimension is introduced to correct the sound image height. First, the initial gain coefficient G0 of each speaker is calculated using the DBAP algorithm. Then, the compensation dimension is determined as the height difference Δh between the sound source and the speaker, where Δh is the absolute value of the difference between their Z-axis coordinates. Key points and compensation gain values can be preset as follows: when Δh is 0, the compensation gain coefficient G1 is set to 1; when Δh is 0.5, G1 is set to 0.5; and when Δh is 1, G1 is set to 0. A linear function fitted to these points gives the relationship between the compensation gain coefficient and the height difference: G1 = 1 - Δh for Δh from 0 to 1, and G1 can be taken as 0 when Δh > 1. The initial gain coefficient of each speaker is multiplied by the corresponding compensation gain coefficient to obtain the total gain coefficient g. Since overhead speakers typically have a large height difference from the sound source, their compensation gain coefficient is small, which suppresses their gain and restores the sound image height to normal. Finally, g can be normalized as needed.
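The height compensation in this paragraph can be sketched as follows, using the fitted relationship G1 = 1 - Δh with G1 held at 0 for Δh > 1. The function names and the example gains are hypothetical and only serve the illustration.

```python
def height_compensation(delta_h):
    """G1 = 1 - |delta_h| on [0, 1], held at 0 once |delta_h| exceeds 1."""
    return max(0.0, 1.0 - abs(delta_h))

def apply_height_compensation(source, speakers, g0):
    """Multiply each initial gain G0 by the G1 of that speaker."""
    return [
        g * height_compensation(source[2] - spk[2])  # index 2 is the Z axis
        for g, spk in zip(g0, speakers)
    ]

# Assumed example: two ear-level speakers and one overhead speaker.
speakers = [(-1.0, 1.0, 0.0), (1.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
g = apply_height_compensation((0.0, 0.5, 0.0), speakers, [0.5, 0.5, 0.7])
# g == [0.5, 0.5, 0.0]: the overhead speaker (delta_h = 1) is fully suppressed.
```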

[0025] In another specific implementation, when the sound-emitting object has a certain volume (such as a large object like an airplane or train) and engulfs a speaker, the volume of that speaker should in theory be increased to simulate the feeling of being surrounded by the object. This can be compensated for by introducing a depth dimension. Depth is defined as the degree to which the sound-emitting object engulfs the speaker: when the normalized radius r of the sound-emitting object (assuming the maximum sound field range is -1 to 1) is greater than or equal to the normalized distance d from the sound-emitting object to the speaker, the depth is r - d; when r is less than d, the depth is 0 (i.e., the speaker is not engulfed). First, the DBAP algorithm is used to calculate the initial gain coefficient G0. Key points and compensation gain values can then be preset as follows: when the depth is 0, the compensation gain coefficient G2 is set to 1; when the depth is 1, G2 is set to 2. To prevent the gain from growing without bound for an excessively large sound source (e.g., r - d > 1), a convex curve can be used for fitting. For example, a sine function can be chosen: G2 = sin(π·(r - d) / 2) + 1, with G2 held at 2 when r - d > 1. Multiplying the initial gain coefficient by the compensation gain coefficient yields the total gain coefficient. In this way, speakers with a greater engulfment depth obtain a higher gain, which enhances the sense of immersion.
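A minimal sketch of the engulfment compensation follows, reading the depth as the normalized radius minus the normalized distance, as defined in this paragraph, and using the sine fit through the key points (0, 1) and (1, 2). The function names are hypothetical.

```python
import math

def engulfment_depth(radius, distance):
    """Depth is radius - distance when the speaker is engulfed, else 0."""
    return max(0.0, radius - distance)

def engulfment_compensation(radius, distance):
    """G2 = sin(pi * depth / 2) + 1, held at 2 once depth exceeds 1."""
    depth = min(engulfment_depth(radius, distance), 1.0)
    return math.sin(math.pi * depth / 2.0) + 1.0

g2_outside = engulfment_compensation(0.2, 0.5)  # not engulfed, depth 0
g2_partial = engulfment_compensation(0.8, 0.3)  # depth 0.5
g2_capped = engulfment_compensation(2.5, 0.5)   # depth 2, clamped to 1
```

Because the sine curve lies above the straight line between the two key points, a partially engulfed speaker is boosted faster than a linear fit would boost it, and the clamp keeps the boost from exceeding a factor of 2.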

[0026] In another specific implementation, when the sound source is located outside the sound field formed by the speaker array (usually a polyhedron enclosed by all the speakers), traditional normalization processing amplifies gains that should be very small, leading to positioning errors. The overall gain can instead be attenuated by introducing a dimension for the distance from the sound source to the sound field. First, the shortest distance D from the sound source to the sound field is calculated. The sound field can be defined as a convex polyhedron formed by the positions of all speakers, and D is the shortest Euclidean distance from the sound source to this polyhedron. Key points and normalization values N can be preset as follows: when D is 0, N is set to 1; when D is 5, N is set to 0. Linear function fitting gives N = 1 - D / 5 (N = 0 when D > 5). Then, the overall gain coefficient is adjusted according to N. One approach is to multiply the initial gain coefficient G0 by N to obtain the overall gain coefficient g; another approach is to scale the gain of all speakers according to N and use the result directly as the overall gain without further normalization. In this way, when the sound source moves away from the sound field, the overall gain decreases, which conforms to the physical law that sound decays with distance and avoids the false amplification caused by normalization.
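The out-of-field attenuation can be sketched as below. As a labeled simplification, the sound field is approximated here by the axis-aligned bounding box of the speaker positions rather than the convex polyhedron described in the text; the function names, the `cutoff` parameter, and the example layout are assumptions.

```python
import math

def distance_to_bounding_box(source, speakers):
    """Shortest distance from the source to the speakers' bounding box.

    Simplification: a true implementation would measure the distance to
    the convex polyhedron spanned by the speakers.
    """
    d2 = 0.0
    for axis in range(3):
        lo = min(spk[axis] for spk in speakers)
        hi = max(spk[axis] for spk in speakers)
        # Amount by which the source lies outside [lo, hi] on this axis.
        excess = max(lo - source[axis], 0.0, source[axis] - hi)
        d2 += excess ** 2
    return math.sqrt(d2)

def field_attenuation(D, cutoff=5.0):
    """N = 1 - D / cutoff, held at 0 once D exceeds the cutoff."""
    return max(0.0, 1.0 - D / cutoff)

speakers = [(-1.0, 1.0, 0.0), (1.0, 1.0, 0.0),
            (-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (0.0, 0.0, 1.0)]
D = distance_to_bounding_box((3.0, 0.0, 0.0), speakers)  # 2 units outside
N = field_attenuation(D)
attenuated = [N * g for g in [0.6, 0.6, 0.3, 0.3, 0.3]]  # no renormalization
```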

[0027] It is worth mentioning that the three implementation methods described above can be used independently or in combination depending on the actual scenario. For example, when both height and large objects are present, height compensation and size compensation can be applied sequentially, i.e., g = G0 × G1 × G2, and then an overall attenuation can be performed based on the external field conditions.
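The sequential combination described here, g = G0 × G1 × G2 followed by an overall attenuation, can be written as a short sketch. The function name `combine_gains` and the sample values are hypothetical; the per-speaker factors are assumed to have been computed already.

```python
def combine_gains(g0, g1, g2, n=1.0):
    """Per-speaker product G0 * G1 * G2, then one overall factor n."""
    return [n * a * b * c for a, b, c in zip(g0, g1, g2)]

# Assumed values: speaker 0 is engulfed (G2 = 2), speaker 1 is overhead (G1 = 0),
# and the source sits outside the field so the whole set is halved (n = 0.5).
g = combine_gains([0.5, 0.8], [1.0, 0.0], [2.0, 1.0], n=0.5)
# g == [0.5, 0.0]
```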

[0028] Furthermore, elementary functions (such as linear functions, power functions, trigonometric functions, etc.) are used to fit the aforementioned key points to obtain the functional relationship between the compensation gain coefficient and this dimension. Specifically, the function fitting based on preset key points and corresponding compensation gain values may include: selecting elementary functions to fit the preset key points, such that the error between the value of the fitted function at the preset key points and the compensation gain value is within a preset range; wherein, the preset key points are transitioned using straight lines, convex curves, concave curves, or S-curves. That is, the fitting must meet two conditions: first, the error between the value of the fitted function at the key points and the preset compensation gain value is within an allowable range; second, the transition method between key points can be selected as a straight line, convex curve, concave curve, or S-curve, etc., as needed. Through fitting, the compensation gain coefficient of each speaker in this dimension (related to the speaker's own properties) or a general compensation method (such as an overall scaling rule) can be obtained.
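One way to satisfy both fitting conditions is piecewise interpolation with a selectable transition shape, sketched below under stated assumptions. The shape names, the particular easing formulas, and the helper `fit_key_points` are illustrative choices, not taken from the patent; "convex" follows the sine example used earlier in the text. Key points are assumed to have distinct positions.

```python
import math

# Easing curves on [0, 1]; each maps 0 to 0 and 1 to 1 so key points are hit.
SHAPES = {
    "line": lambda t: t,
    "convex": lambda t: math.sin(math.pi * t / 2.0),
    "concave": lambda t: 1.0 - math.cos(math.pi * t / 2.0),
    "s_curve": lambda t: 0.5 - 0.5 * math.cos(math.pi * t),
}

def fit_key_points(key_points, shape="line"):
    """Return f(x) passing through (x, gain) key points with the given shape."""
    pts = sorted(key_points)
    ease = SHAPES[shape]

    def f(x):
        # Hold the end values outside the calibrated range.
        if x <= pts[0][0]:
            return pts[0][1]
        if x >= pts[-1][0]:
            return pts[-1][1]
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)
                return y0 + (y1 - y0) * ease(t)

    return f

height_curve = fit_key_points([(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)], "line")
depth_curve = fit_key_points([(0.0, 1.0), (1.0, 2.0)], "convex")
```

With these choices the fitted function reproduces the key point values up to floating-point error, so the first condition holds with a tolerance close to zero, and the `shape` argument covers the second.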

[0029] Step S13: Determine the total gain coefficient of each speaker based on the initial gain coefficient of each speaker and the compensation gain coefficient or the compensation method, and perform overall gain fitting on the total gain coefficient after adjustment of each compensation dimension to adjust the total gain coefficient so as to render the audio signal according to the adjusted total gain coefficient.

[0030] In this embodiment, determining the total gain coefficient of each speaker based on its initial gain coefficient and the compensation gain coefficient or the compensation method can include: multiplying the initial gain coefficient of each speaker by its corresponding compensation gain coefficient to obtain the total gain coefficient of each speaker; this method is suitable for situations where the compensation gain coefficient acts independently on each speaker, such as height compensation or size compensation. Alternatively, scaling or mapping the initial gain coefficient according to the compensation method to obtain the total gain coefficient of the speaker. This method is suitable for situations where the compensation method is not applied individually to each speaker, but rather uniformly adjusted for all speakers, such as overall attenuation in field compensation.

[0031] After independently adjusting each compensation dimension, to further optimize the overall auditory effect or adapt to specific acoustic environment requirements, an overall gain fitting can be performed on the total gain coefficient. This overall gain fitting aims to perform a secondary adjustment on the total gain of all speakers through a unified compensation dimension, in order to correct the deviation that may be introduced by the superposition of multiple compensation dimensions, or to achieve a smooth change in overall volume. Specifically, the overall gain fitting of the total gain coefficient after adjustment of each compensation dimension to adjust the total gain coefficient may include: First, determining the overall compensation dimension. The overall compensation dimension can be a parameter related to the overall sound field, such as the distance from the sound source to the entire speaker array, the ambient reverberation time, the overall volume requirement of the listening area, etc., or it can be a comprehensive statistical characteristic of the aforementioned multiple compensation dimensions. Second, for this overall compensation dimension, several overall key points (i.e., specific values on this dimension) are calibrated, and a corresponding overall compensation value is preset for each overall key point. For example, the overall compensation value can be calibrated to be 1 when the sound source is located at the center of the sound field, and the overall compensation value decreases according to a certain rule when the sound source is far away from the edge of the sound field. Then, elementary functions are used to fit the overall key points to obtain the functional relationship between the overall compensation coefficient and the overall compensation dimension. 
The transition between overall key points can likewise be a straight line, convex curve, concave curve, or S-curve, provided that the error of the fitted function at the key points is within the allowable range. Finally, the overall gain coefficient of all speakers is adjusted using the obtained overall compensation coefficient, for example, by multiplying the overall gain coefficient of each speaker by the overall compensation coefficient, or by overall scaling according to the overall compensation method.

[0032] Furthermore, after obtaining the overall gain coefficient of each speaker after overall gain fitting, normalization can be performed as needed. Normalization can keep the total power constant and avoid volume overload or distortion caused by gain adjustment. However, normalization is not a necessary step and can be determined based on the application scenario and auditory effect requirements. For example, when the sound source is located outside the sound field, normalization can be canceled to retain the natural attenuation. The specific method of normalization can be found in existing technologies and will not be elaborated here.
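The optional normalization can be sketched as a constant-power scaling with a switch to skip it. The patent defers the method to existing technologies, so the unit-power target, the function name `normalize_power`, and the `enabled` flag are assumptions.

```python
import math

def normalize_power(gains, enabled=True):
    """Scale gains so the sum of squares is 1, or pass them through unchanged."""
    if not enabled:
        return list(gains)  # e.g. source outside the field: keep the attenuation
    power = sum(g * g for g in gains)
    if power == 0.0:
        return list(gains)  # nothing to scale when every gain is zero
    scale = 1.0 / math.sqrt(power)
    return [g * scale for g in gains]

inside = normalize_power([3.0, 4.0])                  # roughly [0.6, 0.8]
outside = normalize_power([0.3, 0.4], enabled=False)  # left as-is
```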

[0033] As can be seen from the above, the embodiments of this application correct the initial gain coefficient by introducing a compensation dimension, and obtain the compensation gain coefficient or compensation method by function fitting based on preset key points. This allows the total gain of each speaker to be adjusted adaptively according to the actual spatial relationship between the sound source and the speaker, providing fine compensation for scenarios such as height deviation, the sense of immersion of large objects, and positioning outside the sound field. It avoids the sound image pulling distortion caused by traditional normalization processing, improves the accuracy and auditory realism of audio rendering, and enhances adaptability to different speaker layouts and sound source properties.

[0034] As shown in Figure 2, this application discloses a speaker audio rendering apparatus, including: an initial gain coefficient determination module 11, used to determine the distance from the sound-emitting object to each speaker and to calculate the initial gain coefficient corresponding to each speaker based on the distance.

[0035] The compensation determination module 12 is used to determine the compensation dimension corresponding to each speaker and perform function fitting based on preset key points and corresponding compensation gain values to obtain the compensation gain coefficient or compensation method for each speaker under the compensation dimension. The compensation method includes gain adjustment, delay adjustment, phase adjustment, and effects processing. The compensation dimension is one or any combination of the following: the three-dimensional coordinates of the sound source, the three-dimensional rotation angle of the sound source, the size of the sound source, the orientation of the sound source, and the distance from the sound source to the speaker. When the compensation dimension is the height difference between the sound source and the speaker, the height difference is calculated based on the three-dimensional coordinates of the sound source, and the compensation gain coefficient is negatively correlated with the height difference. When the compensation dimension is the depth to which the sound source engulfs the speaker, the depth is calculated based on the size of the sound source and the distance from the sound source to the speaker, and the compensation gain coefficient is positively correlated with the depth. When the compensation dimension is the distance from the sound source to the sound field, the distance is calculated based on the three-dimensional coordinates of the sound source and the spatial distribution of all speakers, and the total gain coefficient is attenuated overall based on the negative correlation with the distance.
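The three derived compensation dimensions named in this paragraph can be sketched as follows. The sketch assumes a spherical sound source, linear relationships, and an axis-aligned bounding box around the speakers as the "sound field"; all function names and constants (`max_diff`, `floor`, `boost`, `rolloff`) are hypothetical and chosen only to show the stated negative and positive correlations.

```python
import math


def height_compensation(source_z, speaker_z, max_diff=3.0, floor=0.2):
    """Gain falls linearly from 1.0 to `floor` as the height difference grows (negative correlation)."""
    t = min(abs(source_z - speaker_z) / max_diff, 1.0)
    return 1.0 - t * (1.0 - floor)


def engulf_depth(source_radius, distance):
    """Depth by which a spherical source covers the speaker; 0 if the speaker is outside it."""
    return max(source_radius - distance, 0.0)


def engulf_compensation(source_radius, distance, boost=0.5):
    """Gain rises from 1.0 towards 1.0 + boost with engulfment depth (positive correlation)."""
    if source_radius <= 0.0:
        return 1.0
    return 1.0 + boost * engulf_depth(source_radius, distance) / source_radius


def distance_to_field(source, speakers):
    """Distance from the source to the bounding box of all speaker positions (0 if inside)."""
    lo = [min(c) for c in zip(*speakers)]
    hi = [max(c) for c in zip(*speakers)]
    excess = [max(l - s, 0.0, s - h) for s, l, h in zip(source, lo, hi)]
    return math.sqrt(sum(e * e for e in excess))


def field_attenuation(distance, rolloff=1.0):
    """Overall attenuation applied to every total gain; decreases as the distance grows."""
    return (1.0 + distance) ** -rolloff
```

In practice the linear ramps here would be replaced by functions fitted to calibrated key points; the sketch only fixes the direction of each correlation.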

[0036] The rendering module 13 is used to determine the total gain coefficient of each speaker based on the initial gain coefficient of each speaker and the compensation gain coefficient or the compensation method, and to perform overall gain fitting on the total gain coefficient after adjustment of each compensation dimension to adjust the total gain coefficient so as to render the audio signal based on the adjusted total gain coefficient.
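To make the final rendering step concrete, the following sketch applies the adjusted total gain coefficients to a mono source signal, producing one output channel per speaker. It assumes static gains for the whole block of samples and gain-only compensation (no delay, phase, or effects processing); the function name `render` and its parameters are hypothetical.

```python
def render(mono, initial, compensation, overall=1.0):
    """Return one list of samples per speaker: sample * (initial * compensation * overall)."""
    total = [g * c * overall for g, c in zip(initial, compensation)]
    return [[s * g for s in mono] for g in total]
```

For example, `render([1.0, -1.0], [0.5, 1.0], [1.0, 0.5], overall=0.5)` yields a total gain of 0.25 for both speakers, so each output channel is `[0.25, -0.25]`.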

[0037] In some specific embodiments, the initial gain coefficient determination module 11 may specifically include: An initial gain coefficient determination unit is used to calculate the initial gain coefficient corresponding to each speaker based on the distance using distance-based amplitude rendering or vector-based amplitude rendering.
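A minimal sketch of a distance-based initial gain calculation is given below. It uses inverse-distance weighting with a power normalization, in the spirit of distance-based amplitude rendering; the rolloff exponent, the small `blur` term that keeps the gain finite when the source coincides with a speaker, and the name `initial_gains` are assumptions rather than details from this application.

```python
import math


def initial_gains(source, speakers, rolloff=1.0, blur=0.1):
    """Inverse-distance gains, scaled so that the squared gains sum to 1."""
    # hypot(d, blur) = sqrt(d**2 + blur**2): never zero, even at a speaker position
    dists = [math.hypot(math.dist(source, spk), blur) for spk in speakers]
    raw = [d ** -rolloff for d in dists]
    norm = math.sqrt(sum(r * r for r in raw))
    return [r / norm for r in raw]
```

With speakers at `(-1, 0, 0)` and `(1, 0, 0)`, a source at the origin receives equal gains, and moving the source toward one speaker raises that speaker's gain relative to the other.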

[0038] In some specific embodiments, the compensation determination module 12 may specifically include: The function fitting unit is used to select an elementary function to fit the preset key points, so that the error between the value of the fitted function at the preset key points and the compensation gain value is within a preset range; wherein, the preset key points are transitioned by a straight line, a convex curve, a concave curve or an S-curve. In some specific embodiments, the rendering module 13 may specifically include: The first total gain coefficient determination unit is used to multiply the initial gain coefficient of each loudspeaker by the corresponding compensation gain coefficient to obtain the total gain coefficient of each loudspeaker. The second total gain coefficient determination unit is used to scale or map the initial gain coefficient according to the compensation method to obtain the total gain coefficient of the loudspeaker.
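The four transition shapes between key points, and the multiplication performed by the first total gain coefficient determination unit, can be sketched as follows. The particular elementary functions chosen (square for convex, square root for concave, a cubic smoothstep for the S-curve) and the names `transition`, `fit_between` and `total_gains` are illustrative assumptions; "convex" and "concave" follow the mathematical convention here.

```python
import math


def transition(t, shape="line"):
    """Map t in [0, 1] to [0, 1] using the chosen curve shape."""
    t = min(max(t, 0.0), 1.0)
    if shape == "line":
        return t
    if shape == "convex":
        return t * t                      # slow start, fast finish
    if shape == "concave":
        return math.sqrt(t)               # fast start, slow finish
    if shape == "s":
        return t * t * (3.0 - 2.0 * t)    # smooth at both ends
    raise ValueError(f"unknown shape: {shape}")


def fit_between(x, p0, p1, shape="line"):
    """Compensation gain at x, between key points p0 = (x0, y0) and p1 = (x1, y1)."""
    (x0, y0), (x1, y1) = p0, p1
    t = (x - x0) / (x1 - x0)
    return y0 + (y1 - y0) * transition(t, shape)


def total_gains(initial, compensation):
    """Total gain of each speaker = initial gain * compensation gain."""
    return [g * c for g, c in zip(initial, compensation)]
```

Every shape passes exactly through both key points, so the error at the preset key points is zero regardless of which transition is selected.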

[0039] The overall compensation coefficient determination unit is used to determine the overall compensation dimension. It performs function fitting based on the preset overall key points and the corresponding overall compensation values to obtain the overall compensation coefficient. The total gain coefficient adjustment unit is used to adjust the total gain coefficient of all speakers using the overall compensation coefficient.

[0040] Furthermore, embodiments of this application also disclose an electronic device. Figure 3 is a structural diagram of an electronic device 20 according to an exemplary embodiment; the content of the diagram should not be construed as limiting the scope of this application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26. The memory 22 stores a computer program, which is loaded and executed by the processor 21 to implement the relevant steps of the speaker audio rendering method disclosed in any of the foregoing embodiments. Furthermore, the electronic device 20 in this embodiment may specifically be a computer.

[0041] In this embodiment, the power supply 23 is used to provide operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and external devices, and the communication protocol it follows can be any communication protocol applicable to the technical solution of this application, and is not specifically limited here; the input / output interface 25 is used to acquire external input data or output data to the outside world, and its specific interface type can be selected according to specific application needs, and is not specifically limited here.

[0042] In addition, the memory 22, as a carrier for resource storage, can be a read-only memory, random access memory, disk or optical disk, etc. The resources stored thereon can include operating system 221, computer program 222, etc., and the storage method can be temporary storage or permanent storage.

[0043] The operating system 221, which may be Windows Server, Netware, Unix, Linux, etc., is used to manage and control the hardware devices on the electronic device 20 and the computer program 222. In addition to a computer program capable of performing the speaker audio rendering method disclosed in any of the foregoing embodiments when executed by the electronic device 20, the computer program 222 may further include computer programs capable of performing other specific tasks.

[0044] Furthermore, this application also discloses a computer-readable storage medium for storing a computer program; wherein, when the computer program is executed by a processor, it implements the aforementioned speaker audio rendering method. Specific steps of this method can be found in the corresponding content disclosed in the foregoing embodiments, and will not be repeated here.

[0045] The embodiments in this specification are described in a progressive manner, with each embodiment focusing on its differences from the others. For parts that are the same or similar, the embodiments can be cross-referenced. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.

[0046] Those skilled in the art will further recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the various examples have been generally described in terms of functionality in the foregoing description. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0047] The steps of the methods or algorithms described in conjunction with the embodiments disclosed herein can be implemented directly by hardware, a software module executed by a processor, or a combination of both. The software module can be located in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, removable disk, CD-ROM, or any other form of storage medium known in the art.

[0048] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0049] The technical solutions provided in this application have been described in detail above. Specific examples have been used to illustrate the principles and implementation methods of this application. The descriptions of the above embodiments are only for the purpose of helping to understand the methods and core ideas of this application. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this application. Therefore, the content of this specification should not be construed as a limitation of this application.

Claims

1. A speaker audio rendering method, characterized in that it includes: determining the distance from the sound-emitting object to each loudspeaker, and calculating the initial gain coefficient corresponding to each loudspeaker based on the distance; determining the compensation dimension corresponding to each speaker, and performing function fitting based on preset key points and corresponding compensation gain values to obtain the compensation gain coefficient or compensation method for each speaker under the compensation dimension, wherein the compensation method includes gain adjustment, delay adjustment, phase adjustment and effects processing; and determining the total gain coefficient of each speaker based on the initial gain coefficient of each speaker and the compensation gain coefficient or the compensation method, and performing overall gain fitting on the total gain coefficient after adjustment of each compensation dimension to adjust the total gain coefficient, so as to render the audio signal according to the adjusted total gain coefficient.

2. The speaker audio rendering method according to claim 1, characterized in that the compensation dimension is one or any combination of the following: the three-dimensional coordinates of the sound source, the three-dimensional rotation angle of the sound source, the size of the sound source, the orientation of the sound source, and the distance from the sound source to the loudspeaker.

3. The speaker audio rendering method according to claim 1, characterized in that the function fitting based on preset key points and corresponding compensation gain values includes: selecting an elementary function to fit the preset key points, so that the error between the value of the fitted function at the preset key points and the compensation gain value is within a preset range, wherein the preset key points are connected by straight lines, convex curves, concave curves, or S-curves.

4. The speaker audio rendering method according to claim 1, characterized in that calculating the initial gain coefficient corresponding to each speaker based on the distance includes: calculating, based on the distance, the initial gain coefficient corresponding to each speaker using distance-based amplitude rendering or vector-based amplitude rendering.

5. The speaker audio rendering method according to claim 1, characterized in that determining the total gain coefficient of each loudspeaker based on its initial gain coefficient and the compensation gain coefficient or the compensation method includes: multiplying the initial gain coefficient of each loudspeaker by the corresponding compensation gain coefficient to obtain the total gain coefficient of each loudspeaker; or scaling or mapping the initial gain coefficient according to the compensation method to obtain the total gain coefficient of the loudspeaker.

6. The speaker audio rendering method according to claim 1, characterized in that performing overall gain fitting on the total gain coefficient after adjustment of each compensation dimension to adjust the total gain coefficient includes: determining the overall compensation dimension, and performing function fitting based on preset overall key points and corresponding overall compensation values to obtain the overall compensation coefficient; and adjusting the total gain coefficient of all speakers using the overall compensation coefficient.

7. The speaker audio rendering method according to claim 2, characterized in that: when the compensation dimension is the height difference between the sound-emitting object and the loudspeaker, the height difference is calculated based on the three-dimensional coordinates of the sound-emitting object, and the compensation gain coefficient is negatively correlated with the height difference; when the compensation dimension is the depth to which the sound-emitting object engulfs the loudspeaker, the depth of the engulfment is calculated based on the size of the sound-emitting object and the distance from the sound-emitting object to the loudspeaker, and the compensation gain coefficient is positively correlated with the depth of the engulfment; and when the compensation dimension is the distance from the sound source to the sound field, the distance is calculated based on the three-dimensional coordinates of the sound source and the spatial distribution of all loudspeakers, and the total gain coefficient is attenuated as a whole based on a negative correlation with the distance.

8. A speaker audio rendering device, characterized in that it includes: an initial gain coefficient determination module, used to determine the distance from the sound-emitting object to each speaker, and to calculate the initial gain coefficient corresponding to each speaker based on the distance; a compensation determination module, used to determine the compensation dimension corresponding to each speaker, and to perform function fitting based on preset key points and corresponding compensation gain values to obtain the compensation gain coefficient or compensation method for each speaker under the compensation dimension, wherein the compensation method includes gain adjustment, delay adjustment, phase adjustment and effects processing; and a rendering module, used to determine the total gain coefficient of each speaker based on the initial gain coefficient of each speaker and the compensation gain coefficient or the compensation method, and to perform overall gain fitting on the total gain coefficient after adjustment of each compensation dimension to adjust the total gain coefficient, so as to render the audio signal based on the adjusted total gain coefficient.

9. An electronic device, characterized in that it includes: a memory, used to store a computer program; and a processor, used to execute the computer program to implement the speaker audio rendering method as described in any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that it is used to store a computer program, wherein the computer program, when executed by a processor, implements the speaker audio rendering method as described in any one of claims 1 to 7.