Speech enhancement method and system based on generative adversarial network
A speech enhancement technology, applied in biological neural network models, speech analysis, neural learning methods, etc. It solves the problem that the timing characteristics of speech are not taken into account, and achieves high speech quality and high short-term intelligibility.
Examples
Embodiment 1
[0037] This embodiment provides a speech enhancement method based on a generative adversarial network.
[0038] The speech enhancement method based on a generative adversarial network includes:
[0039] S101: Obtain a noisy speech signal;
[0040] S102: Input the noisy speech signal into the trained generative adversarial network, and output the enhanced speech signal;
[0041] Wherein, the generative adversarial network includes two generators and two discriminators;
[0042] In the generative adversarial network, the adversarial game between the two generators and the two discriminators during training improves the generators' ability to approximate the target signal.
[0043] The training process of the two generators is to minimize the following loss function:
[0044]
[0045] The training process of the two discriminators is to minimize the following loss function:
[0046]
[0047] During training, the training input to the generator is ...
Embodiment
[0085] Example: Determine the optimizer to be RMSProp.
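For orientation, the standard RMSProp update rule can be sketched in plain Python as below. The learning rate, decay factor, and epsilon are illustrative assumptions, since the source does not state them, and the helper name `rmsprop_step` is hypothetical.

```python
import math

def rmsprop_step(params, grads, cache, lr=1e-3, decay=0.9, eps=1e-8):
    """Apply one RMSProp update in place and return (params, cache).

    lr, decay and eps are assumed values, not taken from the source.
    """
    for i, g in enumerate(grads):
        # Running average of squared gradients.
        cache[i] = decay * cache[i] + (1.0 - decay) * g * g
        # Scale the step by the root of that running average.
        params[i] -= lr * g / (math.sqrt(cache[i]) + eps)
    return params, cache

# Toy usage: minimize f(w) = w^2, whose gradient is 2w.
w, cache = [1.0], [0.0]
for _ in range(2000):
    rmsprop_step(w, [2.0 * w[0]], cache)
```

In practice a deep learning framework's built-in RMSProp optimizer would be used for the generators and discriminators; the loop above only illustrates the per-parameter scaling.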
[0086] (1.3) Rescale the amplitude range of the random noise and the clean speech, and at the same time apply pre-emphasis with a coefficient between 0.9 and 1.
[0087] Example: Rescale the random noise and the clean speech to the range -1 to 1 to prevent problems such as gradient explosion, and apply pre-emphasis with a coefficient of 0.95 so that the high-frequency characteristics are better represented.
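Step (1.3) can be sketched as follows. This is a minimal illustration that assumes 16-bit PCM input and the common first-order pre-emphasis filter y[n] = x[n] - a * x[n-1]; the source gives only the coefficient, not the input format or the filter form, and both function names are hypothetical.

```python
def rescale_to_unit_range(samples, max_abs=32768.0):
    """Map integer samples to floats in [-1, 1).

    max_abs=32768 assumes 16-bit PCM input (an assumption, not from the source).
    """
    return [s / max_abs for s in samples]

def pre_emphasis(signal, coeff=0.95):
    """First-order pre-emphasis: y[0] = x[0], y[n] = x[n] - coeff * x[n-1]."""
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - coeff * signal[n - 1])
    return out

# Usage: a constant (low-frequency) signal is strongly attenuated after the
# first sample, which is how the filter emphasizes high frequencies.
scaled = rescale_to_unit_range([16384, 16384, 16384])
emphasized = pre_emphasis(scaled)
```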
[0088] (1.4) Put random noise and clean speech into the queue, and take out the required batch of enhanced speech and clean speech each time.
[0089] Example: the batch size is 50, and each item is 16384 frames long.
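The queueing in step (1.4) might look like the sketch below. It assumes that each clean chunk is paired with uniform random noise in [-1, 1] of the same length and that one batch is dequeued per training step; the function names and the use of `collections.deque` are illustrative choices, not details from the source.

```python
from collections import deque
import random

BATCH_SIZE = 50    # from the example above
FRAME_LEN = 16384  # from the example above

def split_into_chunks(signal, length=FRAME_LEN):
    """Cut a signal into non-overlapping chunks of `length`, dropping any remainder."""
    return [signal[i:i + length] for i in range(0, len(signal) - length + 1, length)]

def fill_queue(clean_chunks, rng):
    """Pair each clean chunk with same-length random noise and enqueue the pair."""
    queue = deque()
    for clean in clean_chunks:
        noise = [rng.uniform(-1.0, 1.0) for _ in range(len(clean))]
        queue.append((noise, clean))
    return queue

def next_batch(queue, batch_size=BATCH_SIZE):
    """Dequeue one batch and return (noise_batch, clean_batch)."""
    if len(queue) < batch_size:
        raise ValueError("not enough items queued for a full batch")
    items = [queue.popleft() for _ in range(batch_size)]
    return [n for n, _ in items], [c for _, c in items]
```

With the defaults, each call to `next_batch` yields two lists of 50 items, each item 16384 values long.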
[0090] Because multiple generators can cooperate to generate speech in multiple stages, two generators are trained to reconstruct and generate clean speech.
[0091] Further, the initialization steps of the first generator and the second generator specifically include:
[0092] (2.1) Take out the random noise separately and adjust its dimensions.
[0...
Embodiment 2
[0120] This embodiment provides a speech enhancement system based on a generative adversarial network.
[0121] The speech enhancement system based on a generative adversarial network includes:
[0122] an acquisition module, which is configured to: acquire a speech signal with noise;
[0123] a speech enhancement module, which is configured to: input the noisy speech signal into the trained generative adversarial network, and output the enhanced speech signal;
[0124] Wherein, the generative adversarial network includes two generators and two discriminators;
[0125] In the generative adversarial network, the adversarial game between the two generators and the two discriminators during training improves the generators' ability to approximate the target signal.
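The two modules described above could be organized as in the following sketch. The class names are hypothetical, the generators are placeholder callables rather than trained networks, and applying them one after another is an assumption about how the two generators cooperate at inference time; the discriminators are used only during training and so do not appear.

```python
class AcquisitionModule:
    """Acquires a noisy speech signal (here simply from an in-memory sequence)."""

    def __init__(self, source):
        self._source = source

    def acquire(self):
        return list(self._source)


class SpeechEnhancementModule:
    """Feeds the noisy signal through the trained generators and returns the result."""

    def __init__(self, generators):
        # `generators` are callables mapping a signal to a signal.
        # Running them in sequence is an assumption, not stated in the source.
        self._generators = list(generators)

    def enhance(self, noisy):
        signal = noisy
        for generator in self._generators:
            signal = generator(signal)
        return signal


def placeholder_generator(signal):
    """Stand-in for a trained generator: halves the amplitude."""
    return [0.5 * v for v in signal]


acquisition = AcquisitionModule([1.0, -1.0, 0.5])
enhancer = SpeechEnhancementModule([placeholder_generator, placeholder_generator])
enhanced = enhancer.enhance(acquisition.acquire())
```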
[0126] It should be noted here that the above acquisition module and speech enhancement module correspond to steps S101 to S102 in the first embodiment, and the examples and application scenarios ...