Acoustic scene identification method based on data enhancement

An acoustic scene identification method, applied in the fields of audio signal processing and deep learning, that addresses the high cost of manually labeling data and the resulting shortage of labeled training data.
CN109978034A | Active | Publication Date: 2019-07-05 | SOUTH CHINA UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
SOUTH CHINA UNIV OF TECH
Publication Date
2019-07-05


Abstract

The invention discloses an acoustic scene identification method based on data enhancement. The method comprises the following steps. First, audio samples of different sound scenes are collected and labeled. Second, the audio samples are preprocessed by pre-emphasis, framing and windowing. Third, data enhancement is performed: the harmonic source and the impact (percussive) source of each audio sample are extracted to obtain additional audio samples; logarithmic Mel filter bank features are extracted from each audio sample, its harmonic source and its impact source; the three features are stacked into a three-channel high-dimensional feature; and a hybrid enhancement technique is used to construct a richer set of training samples. Finally, the three-channel high-dimensional features are input into an Xception network, which identifies the sound scene corresponding to each audio sample. The data enhancement method effectively improves the generalization capability of the Xception network classifier and stabilizes the training process of the network. When acoustic scenes are identified, the method achieves a better identification result.
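The preprocessing and enhancement steps named in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation. It assumes that "hybrid enhancement" refers to mixup-style blending of two samples and their labels. The pre-emphasis coefficient (0.97), frame length, hop size, Hamming window and mixup `alpha` are illustrative values that the abstract does not specify. Harmonic/impact source extraction and log-Mel feature computation are omitted; `stack_channels` takes already-computed feature maps as input.

```python
import numpy as np


def pre_emphasis(signal, coeff=0.97):
    """Boost high frequencies: y[n] = x[n] - coeff * x[n-1] (coeff is an assumed value)."""
    return np.append(signal[0], signal[1:] - coeff * signal[:-1])


def frame_and_window(signal, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames and apply a Hamming window.

    frame_len and hop are hypothetical defaults (25 ms / 10 ms at 16 kHz).
    """
    if len(signal) < frame_len:
        # Zero-pad short signals so that at least one frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Row i holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)


def stack_channels(feat_full, feat_harmonic, feat_impact):
    """Stack three equally shaped feature maps into one three-channel feature."""
    return np.stack([feat_full, feat_harmonic, feat_impact], axis=-1)


def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Blend two samples and their one-hot labels with a Beta-distributed weight.

    Assumes the "hybrid enhancement" of the abstract is mixup; alpha is illustrative.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y, lam
```

In such a pipeline, the three-channel output of `stack_channels` would be the input to the classifier, and `mixup` would be applied to pairs of training samples at training time only.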

Description

Technical field

[0001] The present invention relates to the technical fields of audio signal processing and deep learning, in particular to a sound scene recognition method based on data enhancement.

Background art

[0002] Audio signals contain rich information and have the advantages of being non-contact and natural. A sound scene is a high-level representation of an audio signal at the semantic level. The task of acoustic scene recognition is to associate semantic labels with audio streams in order to identify the category of the environment in which the sound was produced. This technology enables smart devices to perceive their surroundings from sound and make appropriate decisions. The volume of audio data is growing rapidly, but because manual labeling is time-consuming and labor-intensive, very few audio samples carry accurate labels. Unlabeled audio samples cannot be used directly to train a classifier. How to construct more diverse training dat...

Claims

the structure of the environmentally friendly knitted fabric provided by the present invention; Figure 2 is a flow chart of the yarn wrapping machine for environmentally friendly knitted fabrics and storage devices; Figure 3 is the parameter map of the yarn covering machine