Audio event detection method and system

A technology in the field of audio event detection, applied in speech analysis, speech recognition, instruments, etc. It addresses the problems of long training time and poor recognition, and achieves the effect of eliminating interference and improving the recognition rate.

Pending Publication Date: 2020-11-20
XIAMEN KUAISHANGTONG TECH CORP LTD
3 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

Traditional neural network models used to detect audio events take a long time to train and have a poor recognition effect.

Method used

A GMM model based on fuzzy clustering is used. Fuzzy clustering of the training data yields a codebook set of M Gaussian atoms; the mean and variance vectors of each atom serve as the initial mean and variance of the corresponding Gaussian component, and the model parameters are then iteratively optimized until training is complete.

Examples

Embodiment 1

[0037] Detecting audio events with a GMM model is, in effect, supervised audio event recognition.

[0038] As shown in Figure 1, training is performed first: the classified training data is used to train the GMM models in turn. For example, ringtone-type audio event segments (audio clips cut so that their content is ringtone only, so that the trained GMM model describes the ringtone feature space) are split into frames, MFCC features are extracted from the frames, and the MFCC features are then used to train the model. The resulting GMM model is labeled as the ringtone model.
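The framing and MFCC extraction step can be sketched as follows. This is a minimal illustration using only NumPy, not the patent's implementation: the 16 kHz sample rate, 25 ms frames with a 10 ms hop, 26 mel filters, 13 coefficients, and all function names are assumptions chosen for the example.

```python
import numpy as np

def frame_signal(signal, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames and apply a Hamming window."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            bank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            bank[m - 1, k] = (right - k) / max(right - center, 1)
    return bank

def mfcc(signal, sample_rate=16000, n_fft=512, n_filters=26, n_coeffs=13):
    """Return one MFCC vector per frame, shape (n_frames, n_coeffs)."""
    frames = frame_signal(signal)
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft
    bank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_energy = np.log(power @ bank.T + 1e-10)
    # DCT-II along the filterbank axis decorrelates the log energies.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_energy @ basis.T

# Stand-in for a one-second ringtone clip (random noise, for illustration only).
clip = np.random.default_rng(0).standard_normal(16000)
features = mfcc(clip)
print(features.shape)  # (98, 13)
```

Each row of `features` is one frame's MFCC vector; a set of such rows from clips of a single class is what the GMM for that class would be trained on.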

[0039] By training the GMM model on different types of audio event segments, a model that can recognize multiple types of audio events is obtained.
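One common way to use several class-specific GMMs for recognition is to score a segment under each model and pick the label with the highest likelihood. The sketch below assumes diagonal-covariance GMMs and average per-frame log-likelihood scoring; the source does not spell out the decision rule, and the "other" class and all parameter values are hypothetical.

```python
import numpy as np

def gmm_log_likelihood(features, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    features: (T, D); weights: (M,); means and variances: (M, D).
    """
    diff = features[:, None, :] - means[None, :, :]                 # (T, M, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_weighted = log_comp + np.log(weights)                       # (T, M)
    # Log-sum-exp over components, stabilized by subtracting the row maximum.
    peak = log_weighted.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return log_frame.mean()

def classify(features, models):
    """Return the best-scoring label and the score of every model."""
    scores = {label: gmm_log_likelihood(features, *params)
              for label, params in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical, hand-set models: (weights, means, variances) per label.
models = {
    "ringtone": (np.array([0.5, 0.5]),
                 np.array([[0.0, 0.0], [1.0, 1.0]]),
                 np.ones((2, 2))),
    "other": (np.array([1.0]), np.array([[6.0, 6.0]]), np.ones((1, 2))),
}
segment = np.random.default_rng(0).normal(loc=0.5, scale=0.5, size=(50, 2))
label, scores = classify(segment, models)
print(label)  # ringtone
```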

[0040] Traditional GMM training often uses the EM algorithm, which is prone to falling into a local optimum, so model training takes a long time and the recognition effect is poor.

[0041] Th...

Embodiment 2

[0077] This embodiment provides an audio event detection system, including:

[0078] The audio collection terminal collects the user's voice stream in real time and sends the audio data to the recognition module. On receiving the audio data, the recognition module recognizes it using the method described above. If audio events occur in the current voice stream, the type of each audio event is identified and the user is notified by voice or other means, which gives the user a way to identify the content of a piece of audio.
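The data flow between the collection terminal, the recognition module, and the user notification can be sketched as a simple loop. Everything here is hypothetical: `run_detection`, the energy-threshold `toy_recognizer`, and the list-based notifier stand in for components the source describes only at a high level.

```python
from typing import Callable, Iterable, List, Optional, Sequence

def run_detection(chunks: Iterable[Sequence[float]],
                  recognize: Callable[[Sequence[float]], Optional[str]],
                  notify: Callable[[str], None]) -> List[str]:
    """Pass each audio chunk to the recognizer and notify on every detected event."""
    detected = []
    for chunk in chunks:
        label = recognize(chunk)   # None means no audio event in this chunk
        if label is not None:
            notify(label)
            detected.append(label)
    return detected

def toy_recognizer(chunk):
    """Placeholder recognizer: flags high-energy chunks as 'ringtone'."""
    energy = sum(x * x for x in chunk) / len(chunk)
    return "ringtone" if energy > 0.5 else None

messages = []                      # stands in for a voice or on-screen notification
stream = [[0.0, 0.1, 0.0], [1.0, -1.0, 1.0], [0.1, 0.0, 0.1]]
events = run_detection(stream, toy_recognizer, messages.append)
print(events)  # ['ringtone']
```

In a real system the recognizer callback would wrap feature extraction and the trained models, and the chunks would arrive from a microphone or network stream.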

[0079] Because the network structure implemented in this embodiment is end-to-end, the recognition result is output directly and recognition is fast, which makes the system suitable for low-power smart devices.

Abstract

The invention discloses an audio event detection method and system. The method uses a GMM model based on fuzzy clustering. The GMM model is constructed as follows: the Gaussian mixture number of the GMM model is M (M is a positive integer); fuzzy clustering is performed on the training data (which is described in the specification) to obtain a codebook set X = {X1, X2, ..., XM}, where the sample space size of the i-th Gaussian atom Xi (i = 1, 2, ..., M) is di; the mean vector of the i-th Gaussian atom Xi serves as the initial mean μi of the i-th Gaussian component of the GMM model, and the variance vector of the i-th Gaussian atom Xi serves as the initial variance σi of the i-th Gaussian component; the model parameters are then iteratively optimized until training of the GMM model is complete. According to the invention, the training time of a neural network model for detecting an audio event can be reduced, and the recognition effect is improved.
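The initialization described in the abstract can be illustrated with a small NumPy sketch. It uses standard fuzzy c-means as the fuzzy clustering step, which is an assumption, since the abstract does not name the algorithm. Assigning each sample to its highest-membership cluster to form the atoms, and setting the initial mixture weights to di divided by the total sample count, are also assumptions made for this example.

```python
import numpy as np

def fuzzy_c_means(data, n_clusters, fuzziness=2.0, n_iter=100, seed=0):
    """Standard fuzzy c-means; returns cluster centers and soft memberships."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = dist ** (-2.0 / (fuzziness - 1.0))
        memberships = inv / inv.sum(axis=1, keepdims=True)      # rows sum to 1
        powered = memberships ** fuzziness
        centers = powered.T @ data / powered.sum(axis=0)[:, None]
    return centers, memberships

def init_gmm(data, n_components):
    """Initial GMM parameters from the atoms found by fuzzy clustering."""
    centers, memberships = fuzzy_c_means(data, n_components)
    hard = memberships.argmax(axis=1)       # assumption: atom = highest membership
    means = np.empty((n_components, data.shape[1]))
    variances = np.empty_like(means)
    sizes = np.empty(n_components)
    for i in range(n_components):
        atom = data[hard == i]
        sizes[i] = len(atom)                # d_i, the sample count of atom i
        if len(atom) > 1:
            means[i] = atom.mean(axis=0)
            variances[i] = atom.var(axis=0) + 1e-6
        else:                               # fallback for an empty or single-sample atom
            means[i] = centers[i]
            variances[i] = data.var(axis=0) + 1e-6
    sizes = np.maximum(sizes, 1.0)
    mix = sizes / sizes.sum()               # assumption: weights proportional to d_i
    return mix, means, variances

rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(10.0, 1.0, (200, 2))])
mix, means, variances = init_gmm(data, n_components=2)
print(mix, means, variances, sep="\n")
```

The returned `mix`, `means`, and `variances` would then be handed to the iterative optimization step as its starting point, in place of a random initialization.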

Description

Technical field

[0001] The invention relates to the technical field of audio recognition, in particular to an audio event detection method and system.

Background

[0002] An audio event is an audio segment with specific semantics or content. Audio event processing is divided into classification and detection. Detection of audio events includes locating and identifying the events: usually a segmentation algorithm is used to locate the position of the audio event, and the type of the audio event is then identified by a neural network model. Traditional neural network models for detecting audio events take a long time to train and have a poor recognition effect.

Summary of the invention

[0003] In order to solve the above problems, the present invention provides an audio event detection method and system, which reduces the training time of the neural network model used to detect audio events, and improves the...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L25/30, G10L25/33, G10L25/87, G10L25/78, G10L15/04
CPC: G10L25/30, G10L25/33, G10L25/87, G10L25/78, G10L15/04, G10L2025/783
Inventor: 陈剑超, 肖龙源, 李稀敏, 刘晓葳, 叶志坚
Owner: XIAMEN KUAISHANGTONG TECH CORP LTD