Audio event detection method and system
An audio event detection technology, applied in speech analysis, speech recognition, instruments, and related fields, that addresses the problems of long training time and poor recognition performance, with the stated effects of eliminating interference and improving the recognition rate.
Embodiment 1
[0037] Detecting audio events with a GMM (Gaussian mixture model) is, in effect, supervised audio event recognition.
[0038] As shown in Figure 1, training is performed first: the classified training data is used to train the GMMs one class at a time. For example, ringtone-type audio event segments (segments cut so that the audio content is purely ringtone, so that the trained GMM describes the ringtone feature space) are split into frames, MFCC features are extracted, and the MFCC features are then used to fit the model. The resulting GMM is labeled as the ringtone model.
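The framing step that precedes MFCC extraction can be sketched as follows. This is a minimal illustration only: the function name `frame_signal` and the frame and hop lengths are assumptions chosen for the example, not values given in this description, and MFCC extraction itself is not shown.

```python
def frame_signal(samples, frame_len, hop_len):
    """Split a 1-D sample sequence into overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    frame_len and hop_len are in samples (illustrative values below).
    """
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame_len and hop_len must be positive")
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(list(samples[start:start + frame_len]))
        start += hop_len
    return frames


# Toy signal of 10 samples, frames of 4 samples with 50% overlap.
frames = frame_signal(list(range(10)), frame_len=4, hop_len=2)
for f in frames:
    print(f)
```

Each frame would then be passed to a feature extractor to obtain one MFCC vector per frame, and those vectors form the training data for the class model.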
[0039] By training a GMM on each type of audio event segment, a set of models that can recognize multiple types of audio events is obtained.
[0040] Traditional GMM training often uses the EM algorithm, which is prone to converging to a local optimum, so model training takes a long time and recognition performance is poor.
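One common way to use a set of per-class GMMs for recognition is to score the feature frames of a segment against every class model and pick the label with the highest average log-likelihood. The sketch below assumes diagonal-covariance GMMs stored as lists of `(weight, mean, variance)` tuples. The function names, the "alarm" class, and all parameter values are hypothetical, and the decision rule is a standard one rather than one confirmed by this description.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood under a GMM.

    gmm is a list of (weight, mean, var) tuples (assumed layout).
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        mx = max(comps)  # log-sum-exp for numerical stability
        total += mx + math.log(sum(math.exp(c - mx) for c in comps))
    return total / len(frames)


def classify(frames, models):
    """Return the best-scoring label and the score of every class model."""
    scores = {label: gmm_avg_log_likelihood(frames, g) for label, g in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical 2-D "feature" models; real MFCC vectors have more dimensions.
models = {
    "ringtone": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "alarm": [(1.0, [5.0, 5.0], [1.0, 1.0])],
}
label, scores = classify([[0.1, -0.2], [0.9, 1.1]], models)
print(label, scores)
```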
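For reference, the traditional EM procedure mentioned above can be sketched for a one-dimensional GMM. This is a textbook illustration, not the training method of this embodiment. The function name `em_gmm_1d`, the synthetic data, and the starting means are all assumptions made for the example. Each iteration does not decrease the log-likelihood, but the point EM converges to depends on where it starts, which is the source of the local-optimum problem.

```python
import math
import random


def em_gmm_1d(data, init_means, n_iter=50):
    """Plain EM for a 1-D GMM, started from the given component means.

    Returns the fitted means and the log-likelihood at each iteration.
    """
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    history = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        loglik = 0.0
        for x in data:
            p = [
                w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            s = sum(p)
            loglik += math.log(s)
            resp.append([pj / s for pj in p])
        history.append(loglik)
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-3)  # floor guards against collapse
    return means, history


# Synthetic data: two clusters around 0 and 10 (illustrative only).
random.seed(0)
data = [random.gauss(0, 1) for _ in range(100)] + [random.gauss(10, 1) for _ in range(100)]
means, history = em_gmm_1d(data, init_means=[-1.0, 11.0])
print(sorted(means), history[-1])
```

Running the function from several different `init_means` and comparing the final log-likelihoods is one simple way to see how much the result depends on initialization.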
[0041] Th...
Embodiment 2
[0077] This embodiment provides an audio event detection system, including:
[0078] The audio collection terminal collects the user's voice stream in real time and sends the audio data to the recognition module. On receiving the audio data, the recognition module recognizes it using the method described above. If audio events occur in the current voice stream, the type of each event is identified and the user is notified by voice or other means, giving the user a way to identify the content of a piece of audio.
[0079] Because the network structure used in this embodiment is end-to-end, the recognition result is output directly and recognition is fast, which makes the approach suitable for low-power smart devices.
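The data flow between the collection terminal, the recognition module, and the user notification can be sketched as a simple loop. Everything here is hypothetical: `run_detection`, `toy_recognize`, the energy threshold, and the "loud_event" label are placeholders standing in for the actual recognition method, which is not reproduced in this sketch.

```python
def run_detection(chunks, recognize, notify):
    """Feed audio chunks to `recognize` and call `notify` for each detected event.

    chunks    -- iterable of sample lists, standing in for the real-time stream
    recognize -- callable returning a list of event-type labels for a chunk
    notify    -- callable invoked once per detected event (e.g. a voice prompt)
    """
    detected = []
    for chunk in chunks:
        for event in recognize(chunk):
            notify(event)
            detected.append(event)
    return detected


def toy_recognize(chunk):
    """Placeholder recognizer: flags a chunk whose mean energy is high."""
    energy = sum(s * s for s in chunk) / len(chunk)
    return ["loud_event"] if energy > 0.25 else []


messages = []  # stands in for the user-facing notification channel
stream = [[0.0, 0.1], [0.9, -0.8], [0.05, 0.0]]
events = run_detection(stream, toy_recognize, messages.append)
print(events)
```

Passing the recognizer and the notifier in as callables keeps the loop independent of how recognition and notification are actually implemented.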