Intelligent voice mixing method and device for multi-party voice communication
A voice call and voice channel technology, applied in the multimedia field, that addresses low speech volume, interference, and the difficulty listeners have in identifying what is being said and who is speaking.
Examples
Embodiment 1
[0099] An embodiment of the present invention provides an intelligent mixing method for multi-party voice calls; see Figure 1.
[0100] The method flow includes:
[0101] 101: During a voice call, obtain the current frame data of each active voice channel except the local end;
[0102] 102: Obtain the voice activity detection results of the current frame data of each active voice channel and the short-term average energy of each active voice channel;
[0103] 103: Select the voice channels to be mixed according to the voice activity detection results of the current frame data of each active voice channel, the short-term average energy of each active voice channel, the number of voice channels carrying valid speech, and the gating identifier corresponding to each active voice channel. The gating identifier is the selection result recorded for each active voice channel during the previous voice channel selection;
[0104] 104: Perform superimposed sound mixing process...
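The flow of steps 101 to 104 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the selection rule of the invention, which is not given in this excerpt. The energy-threshold voice activity detector, the smoothing factor `alpha`, the cap `max_mix`, the `hold_bonus` that favors channels selected last time, and all function and class names are hypothetical. Samples are assumed to be 16-bit PCM integers.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ChannelState:
    """Per-channel state kept between frames (hypothetical structure)."""
    energy: float = 0.0   # smoothed short-term average energy
    gated: bool = False   # gating identifier: selected in the previous round?


def frame_energy(frame: List[int]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def simple_vad(frame: List[int], threshold: float = 1.0e4) -> bool:
    """Stand-in voice activity detector: a plain energy threshold (assumption)."""
    return frame_energy(frame) > threshold


def select_channels(frames: Dict[str, List[int]],
                    states: Dict[str, ChannelState],
                    max_mix: int = 2,
                    alpha: float = 0.5,
                    hold_bonus: float = 1.5) -> List[str]:
    """Pick up to max_mix voiced channels, favoring previously gated ones."""
    # Step 102: update the smoothed short-term average energy per channel.
    for ch, frame in frames.items():
        st = states.setdefault(ch, ChannelState())
        st.energy = alpha * st.energy + (1.0 - alpha) * frame_energy(frame)

    # Step 103: only channels whose current frame is detected as speech compete.
    voiced = [ch for ch, frame in frames.items() if simple_vad(frame)]

    def score(ch: str) -> float:
        st = states[ch]
        return st.energy * (hold_bonus if st.gated else 1.0)

    selected = sorted(voiced, key=score, reverse=True)[:max_mix]

    # Record this round's result as the gating identifier for the next round.
    for ch in frames:
        states[ch].gated = ch in selected
    return selected


def mix(frames: Dict[str, List[int]], selected: List[str],
        frame_len: int) -> List[int]:
    """Step 104: superimpose the selected frames and clip to 16-bit range."""
    out = [0] * frame_len
    for ch in selected:
        for i, s in enumerate(frames[ch][:frame_len]):
            out[i] += s
    return [max(-32768, min(32767, s)) for s in out]
```

The `hold_bonus` multiplier is one simple way to make the previous selection result influence the current one, so that a channel already being mixed is not dropped because of a small energy fluctuation. Other rules are equally plausible.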
Embodiment 2
[0113] An embodiment of the present invention provides an intelligent mixing method for multi-party voice calls; see Figure 3.
[0114] 301: During a voice call, obtain the current frame data of each active voice channel except the local end.
[0115] This step is executed for each frame of data of each active voice channel as soon as voice data from that channel is received. The voice data sent by each active voice channel is divided into frames to obtain the current frame data.
[0116] Step 301 can be implemented through the following process:
[0117] Obtain the voice data stream of each active voice channel except the local end, and perform frame division processing on the voice data stream of each active voice channel to obtain the current frame data in the voice data stream of each active voice channel.
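The frame division described here could be sketched as follows. This is a minimal illustration that assumes each voice data stream is already available as a list of PCM samples and that frames are fixed-length and non-overlapping. The function names, the `local_id` parameter, and the choice to drop a trailing partial frame are assumptions made for the example.

```python
from typing import Dict, List


def split_into_frames(samples: List[int], frame_len: int) -> List[List[int]]:
    """Split samples into consecutive non-overlapping frames.

    A trailing partial frame is dropped (an assumption for this sketch).
    """
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def current_frames(streams: Dict[str, List[int]], local_id: str,
                   frame_len: int, frame_index: int) -> Dict[str, List[int]]:
    """Return frame number frame_index of every channel except the local end."""
    result = {}
    for channel, samples in streams.items():
        if channel == local_id:
            continue  # the local end is excluded from the mix
        frames = split_into_frames(samples, frame_len)
        if frame_index < len(frames):
            result[channel] = frames[frame_index]
    return result
```

For example, with a frame length of 4, a 10-sample stream yields two complete frames and the last two samples are left out.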
[0118] 302: Acquire the voice activity detection results of the current frame data of each active voice channel and the short-term a...
Embodiment 3
[0166] An embodiment of the present invention provides an intelligent mixing method for multi-party voice calls; see Figure 4.
[0167] 401: During a voice call, obtain the current frame data of each active voice channel except the local end.
[0168] This step is executed for each frame of data of each active voice channel as soon as voice data from that channel is received. The voice data sent by each active voice channel is divided into frames to obtain the current frame data.
[0169] Step 401 can be implemented through the following process:
[0170] Obtain the voice data stream of each active voice channel except the local end, and perform frame division processing on the voice data stream of each active voice channel to obtain the current frame data in the voice data stream of each active voice channel.
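Because voice data arrives incrementally during a call, frame division may in practice be done with a small per-channel buffer. The sketch below shows one possible arrangement. The `StreamFramer` class and its `push` method are hypothetical names, and fixed-length, non-overlapping frames are assumed.

```python
from typing import Dict, List


class StreamFramer:
    """Accumulates incoming samples per channel and emits complete frames."""

    def __init__(self, frame_len: int) -> None:
        if frame_len <= 0:
            raise ValueError("frame_len must be positive")
        self.frame_len = frame_len
        self._buffers: Dict[str, List[int]] = {}

    def push(self, channel: str, samples: List[int]) -> List[List[int]]:
        """Add newly received samples; return every frame that is now complete."""
        buf = self._buffers.setdefault(channel, [])
        buf.extend(samples)
        frames = []
        while len(buf) >= self.frame_len:
            frames.append(buf[:self.frame_len])
            del buf[:self.frame_len]  # leftover samples wait for the next push
        return frames
```

Each returned frame can then be treated as the current frame data of that channel, and leftover samples stay buffered until enough data has arrived.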
[0171] 402: Obtain the voice activity detection results of the current frame data of each active voice channel and the short-term a...