Multi-channel double-speaker separation method and system
A speaker separation technology, applied in speech analysis, instruments, and related fields, that addresses degraded speech separation performance, interference from other speakers' voices, and the inability to know the target speaker's specific location in advance.
Examples
Embodiment 1
[0055] Based on the above-mentioned ideal sound source position estimation network and speaker masking estimation network, an embodiment of the present application provides a multi-channel dual-speaker separation method, the method comprising:
[0056] S31. Obtain mixed speech audio containing the voices of two speakers, and apply framing, windowing, and a Fourier transform to it to obtain the frequency spectrum of each audio frame.
[0057] In one feasible implementation, S31 comprises the following steps S311-S313:
[0058] S311. Divide the mixed speech audio to be separated into frames, each 25 milliseconds long, with a frame shift of 6.25 milliseconds;
[0059] S312. Apply a window to each frame, using a Hamming window as the window function;
[0060] S313. Perform a 512-point Fourier transform on each audio frame to obtain the frequency spectrum of that frame.
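Steps S311-S313 can be sketched in Python with NumPy as follows. This is only an illustrative sketch, not the applicant's implementation. The text does not state a sampling rate, so 16 kHz is assumed here; at that rate a 25 ms frame is 400 samples and a 6.25 ms shift is 100 samples. The function name `frame_spectra`, the zero-padding of each 400-sample frame to 512 points, and the use of a one-sided transform (257 frequency bins) are likewise assumptions.

```python
import numpy as np


def frame_spectra(audio, sample_rate=16000, frame_ms=25.0, shift_ms=6.25, n_fft=512):
    """Frame, window, and transform multi-channel audio (hypothetical helper).

    audio: array of shape (channels, samples) or (samples,).
    Returns a complex array of shape (channels, frames, n_fft // 2 + 1).
    sample_rate is an assumption; the source does not specify it.
    """
    audio = np.atleast_2d(np.asarray(audio, dtype=np.float64))
    frame_len = int(round(sample_rate * frame_ms / 1000.0))  # 400 samples at 16 kHz
    hop = int(round(sample_rate * shift_ms / 1000.0))        # 100 samples at 16 kHz
    if frame_len > n_fft:
        raise ValueError("frame length exceeds the FFT size")
    n_samples = audio.shape[1]
    if n_samples < frame_len:
        raise ValueError("audio is shorter than one frame")

    # S311: split into overlapping frames (trailing partial frame is dropped).
    n_frames = 1 + (n_samples - frame_len) // hop
    index = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = audio[:, index]  # shape: (channels, frames, frame_len)

    # S312: apply a Hamming window to every frame.
    windowed = frames * np.hamming(frame_len)

    # S313: 512-point Fourier transform; frames are zero-padded to n_fft.
    return np.fft.rfft(windowed, n=n_fft, axis=-1)
```

Under these assumptions, one second of two-channel audio yields 157 frames of 257 complex bins per channel. The transform is applied to each microphone channel independently, so the per-channel phase is preserved in the complex output.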
[0061] S32, input the frequency spectrum of ea...