Data-driven method for detecting note start points
A data-driven detection method, applicable to metadata-based audio data retrieval, audio data retrieval, digital data information retrieval, and related fields. It addresses poor accuracy, false and missed detections, and poor recognition performance in singing scenes, with the aim of improving recognition, reducing false and missed detections, and raising accuracy.
Examples
Embodiment 1
[0026] A method for constructing a data set for detecting and analyzing note start points in a cappella vocals comprises the following steps:
[0027] Step 1: Build a track list. The list covers different types of songs, as shown in Table 1. Each song may be a segment of 10-60 seconds. Songs are chosen to include as many musical elements as possible, so that the track list has good coverage and completeness.
[0028] Table 1: Song types
[0029] Track types: Chinese style, country, pop, nursery rhymes, folk songs, rap, rock, dance
[0030] Step 2: Construct the a cappella vocal audio collection module. A group of professional and amateur singers sing and record 10-60 seconds of audio according to the track list. The audio is stored as WAV files at a sampling rate of 16 kHz. The track type, song number, song name, singer type, lyrics, and other related information are recorded, and the audio is saved to the database...
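Steps 1 and 2 can be illustrated with a small record type and two checks. This is a minimal sketch, not the patent's implementation: the class name `Recording`, its field names, and the helper functions are hypothetical, and only the song types, the 10-60 second range, the WAV format, and the 16 kHz sampling rate come from the text above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

# Song types listed in Table 1.
TRACK_TYPES = {
    "Chinese style", "country", "pop", "nursery rhymes",
    "folk songs", "rap", "rock", "dance",
}
SINGER_TYPES = {"professional", "amateur"}
SAMPLE_RATE_HZ = 16_000               # sampling rate stated in Step 2
MIN_SECONDS, MAX_SECONDS = 10.0, 60.0  # segment length stated in Steps 1 and 2


@dataclass(frozen=True)
class Recording:
    """One a cappella take plus its metadata (field names are hypothetical)."""
    track_type: str
    song_number: int
    song_name: str
    singer_type: str
    lyrics: str
    wav_path: str
    sample_rate_hz: int
    duration_s: float


def validate(rec: Recording) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if rec.track_type not in TRACK_TYPES:
        problems.append(f"unknown track type: {rec.track_type!r}")
    if rec.singer_type not in SINGER_TYPES:
        problems.append(f"unknown singer type: {rec.singer_type!r}")
    if rec.sample_rate_hz != SAMPLE_RATE_HZ:
        problems.append(f"sample rate is {rec.sample_rate_hz} Hz, expected {SAMPLE_RATE_HZ}")
    if not MIN_SECONDS <= rec.duration_s <= MAX_SECONDS:
        problems.append(f"duration {rec.duration_s} s is outside 10-60 s")
    if not rec.wav_path.lower().endswith(".wav"):
        problems.append("audio file is not a .wav file")
    return problems


def missing_track_types(recordings: Iterable[Recording]) -> Set[str]:
    """Song types from Table 1 that no recording covers yet."""
    return TRACK_TYPES - {r.track_type for r in recordings}
```

Calling `missing_track_types` on the collected recordings gives a quick coverage check against Table 1 before the data set is considered complete.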
Embodiment 2
[0061] Convolutional and pooling layers extract feature information from the log mel spectrum of the audio, and a bidirectional long short-term memory (BiLSTM) recurrent neural network directly classifies the log mel spectrum segments to identify note start points. The method includes the following steps:
[0062] Step S1, preprocessing the original audio and label data;
[0063] Further, the process of step S1 is shown in Figure 3 and specifically includes:
[0064] Step S101: load the audio file according to the set sampling rate, and perform denoising processing;
[0065] Step S102: Calculate the log mel spectrum of the audio signal from the configured parameters: mel spectrum order, sampling rate, short-time Fourier transform window size, and overlap ratio. In this example, the preferred parameters are as follows:
[0066] Table II
[0067]
[0068]
[0069] Step S103: Compute the mean and variance i...
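The log mel spectrum computation in step S102 can be sketched in pure Python as follows. This is an illustration under stated assumptions rather than the patent's implementation: the Hann window, the HTK-style mel formula, the triangular filters, and the natural logarithm are common choices that the text does not specify, and all function names are hypothetical. The parameter values from Table II are not reproduced here, so any values passed in are placeholders.

```python
import cmath
import math


def hz_to_mel(f_hz):
    # HTK-style mel scale (an assumption; the source does not name a formula).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def hop_length(win_size, overlap):
    """Samples between frame starts for a window size and overlap ratio."""
    return max(1, int(round(win_size * (1.0 - overlap))))


def frame_signal(samples, win_size, hop):
    """Split into Hann-windowed frames; a trailing partial frame is dropped."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / win_size)
              for n in range(win_size)]
    return [
        [s * w for s, w in zip(samples[start:start + win_size], window)]
        for start in range(0, len(samples) - win_size + 1, hop)
    ]


def power_spectrum(frame):
    """Naive O(N^2) DFT power for bins 0..N/2 (kept simple, not fast)."""
    n = len(frame)
    out = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        out.append(abs(acc) ** 2)
    return out


def mel_filterbank(n_mels, win_size, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / win_size for k in range(win_size // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        bank.append([
            max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
            for f in bin_hz
        ])
    return bank


def log_mel_spectrum(samples, sample_rate, n_mels, win_size, overlap, eps=1e-10):
    """Return one row of n_mels log energies per frame."""
    hop = hop_length(win_size, overlap)
    bank = mel_filterbank(n_mels, win_size, sample_rate)
    rows = []
    for frame in frame_signal(samples, win_size, hop):
        power = power_spectrum(frame)
        rows.append([
            math.log(sum(w * p for w, p in zip(filt, power)) + eps)
            for filt in bank
        ])
    return rows
```

The naive DFT keeps the sketch dependency-free; a practical pipeline would more likely use an FFT-based library routine for the same computation.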


