Note starting point detection method based on data driving

A data-driven detection method, applied in audio metadata retrieval, audio data retrieval, digital information retrieval and related fields. It addresses the poor accuracy, false detections and missed detections of existing methods and their weak recognition of singing scenes, so as to improve the recognition effect, reduce false and missed detections, and improve accuracy.

Inactive Publication Date: 2021-02-02

JINAN UNIVERSITY

Cites: 8; Cited by: 0


AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology

Problems solved by technology

[0003] The Chinese invention patent with notification number CN1963919B discloses an energy-based note segmentation method. It calculates the energy characteristics of the audio signal and uses an energy threshold to pick note segmentation points. This method is simple but less accurate, and is only applicable to scenes with strong energy salience. For the spectral characteristics of the audio signal, the starting point of a note has to be determined by comparing the first speech spectrum parameters with the second speech spectrum parameters of each frequency band.

This method picks the starting point from the peaks of the speech spectrum parameter curve, and it is difficult to tell false starting points apart from true ones among those peaks. It also performs poorly on singing techniques such as vibrato and portamento, which leads to false detections and missed detections.
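The energy-threshold approach criticized above can be illustrated with a minimal sketch. This is not the implementation from CN1963919B; the function name `energy_onsets`, the frame length, the hop size and the threshold value are all assumptions chosen for illustration. It marks an onset wherever the normalized short-time energy rises above the threshold, which shows why soft onsets or vibrato with weak energy contrast are easily missed.

```python
import numpy as np

def energy_onsets(signal, sr, frame_len=512, hop=256, threshold=0.1):
    """Return onset times in seconds where frame energy rises above a threshold.

    frame_len, hop and threshold are illustrative values, not taken from the patent.
    """
    signal = np.asarray(signal, dtype=np.float64)
    n_frames = 1 + max(0, len(signal) - frame_len) // hop
    # Short-time energy: sum of squared samples in each frame.
    energy = np.array([
        np.sum(signal[i * hop : i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])
    peak = energy.max()
    if peak > 0:
        energy = energy / peak  # normalize so the threshold is relative
    above = energy > threshold
    # An onset is a frame above the threshold whose predecessor is not.
    previous = np.concatenate(([False], above[:-1]))
    rising = above & ~previous
    return np.flatnonzero(rising) * hop / sr
```

Because the decision depends on a single global threshold, a quiet note following a loud one may never cross it, which matches the limitation to "scenes with strong energy salience" stated above.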

Method used



Examples


Embodiment 1

[0026] A method for constructing a vocal a cappella note start point detection and analysis data set comprises the following steps:

[0027] Step 1: Build a list of tracks. The list includes different types of songs, as shown in Table 1. Each song can be a segment of 10-60 seconds. The songs are chosen to include as many musical elements as possible, so that the track list has good coverage and completeness.

[0028] Table 1: Song types

[0029] Track type: Chinese style, country, pop, nursery rhymes, folk songs, rap, rock, dance

[0030] Step 2: Construct the vocal a cappella audio collection module. A group of professional and amateur singers sing and record 10-60 seconds of audio for each entry in the track list. The audio is stored as WAV files at a sampling rate of 16 kHz. The track type, song number, song name, singer type, lyrics content and other related information are recorded, and the audio is saved to the database...
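The recording constraints in Step 2 can be checked automatically before a file enters the database. The sketch below is only an illustration under assumptions: the `Recording` class, its field names and the `check_recording` helper are hypothetical and do not come from the patent; only the 16 kHz sampling rate, the WAV format and the 10-60 second range are taken from the text. It uses Python's standard `wave` module.

```python
import wave
from dataclasses import dataclass

@dataclass
class Recording:
    """Metadata listed in Step 2; the field names are hypothetical."""
    track_type: str
    song_number: int
    song_name: str
    singer_type: str  # e.g. "professional" or "amateur"
    lyrics: str
    wav_path: str

def check_recording(rec, expected_sr=16000, min_s=10.0, max_s=60.0):
    """Return a list of problems; an empty list means the file fits the constraints."""
    problems = []
    with wave.open(rec.wav_path, "rb") as wf:
        sr = wf.getframerate()
        duration = wf.getnframes() / sr
    if sr != expected_sr:
        problems.append(f"sampling rate {sr} != {expected_sr}")
    if not (min_s <= duration <= max_s):
        problems.append(f"duration {duration:.1f}s outside {min_s}-{max_s}s")
    return problems
```

Returning a list of problems rather than raising on the first one lets a collection tool report everything wrong with a take in one pass.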

Embodiment 2

[0061] Convolutional and pooling layers extract feature information from the log mel spectrum of the audio, and a bidirectional long short-term memory recurrent neural network (BiLSTM) directly classifies the log mel spectral segments in order to identify the starting points of notes. The method includes the following steps:

[0062] Step S1, preprocessing the original audio and label data;

[0063] Further, the process of step S1 is shown in Figure 3 and specifically includes:

[0064] Step S101: load the audio file according to the set sampling rate, and perform denoising processing;

[0065] Step S102: Calculate the log mel spectrum of the audio signal according to set parameters such as the mel spectrum order, sampling rate, short-time Fourier transform window size, and overlap rate. In this example, the preferred parameters are as follows:

[0066] Table II

[0067]

[0068]

[0069] Step S103: Compute the mean and variance i...
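The log mel spectrum computation of step S102 can be sketched as follows. The preferred parameters of Table II are not available in this text, so the values used here (`n_fft=512`, `hop=160`, `n_mels=40`) are placeholder assumptions, and the function names are hypothetical. The sketch uses only NumPy: a Hann-windowed short-time Fourier transform, a triangular mel filterbank, and a logarithm.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    # n_mels + 2 edge points, equally spaced on the mel scale.
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        lo, center, hi = hz_points[m : m + 3]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        fb[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel_spectrogram(signal, sr=16000, n_fft=512, hop=160, n_mels=40, eps=1e-10):
    """Return the log mel spectrum as an array of shape (n_frames, n_mels)."""
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < n_fft:
        signal = np.pad(signal, (0, n_fft - len(signal)))
    n_frames = 1 + (len(signal) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # power spectrum per frame
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T  # project onto mel bands
    return np.log(mel + eps)                           # eps avoids log(0)
```

The overlap rate mentioned in S102 corresponds to `1 - hop / n_fft` in this sketch, which is roughly 69% for the placeholder values.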


Abstract

The invention discloses a data-driven note starting point detection method. The method comprises the following steps: 1, encoding the preprocessed log mel spectrum data and slicing it with a sliding time window to generate time-slice feature samples; 2, loading the time-slice samples and applying one-dimensional convolution and max pooling; 3, inputting the result of the pooling layer into a BiLSTM layer to extract the semantic information of the samples; 4, inputting the result of the BiLSTM layer into the action layer to strengthen the model's learning of key time-sequence samples; 5, inputting the result of the action layer into a softmax classification layer for discrimination; and 6, merging the discrimination results of the softmax layer with reference to a time threshold and outputting a note starting point sequence. According to the invention, multiple recordings of multiple audios can be compared, vibrato, portamento and the like can be screened, the recognition effect on singing scenes is improved, false detections and missed detections are reduced, and the accuracy of note starting point detection is improved.
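Steps 1 and 6 of the abstract, the sliding-window slicing and the merging of classifier outputs by a time threshold, can be sketched without any neural network code. Everything concrete here is an assumption: the function names, the window length, the step, the probability threshold and the 50 ms gap are illustrative, and the rule of keeping the more confident of two nearby detections is one plausible reading of "merging with reference to a time threshold", not a rule stated in the source.

```python
import numpy as np

def slice_time_windows(features, window=15, step=1):
    """Cut (n_frames, n_bins) features into overlapping time slices.

    Returns an array of shape (n_slices, window, n_bins).
    """
    features = np.asarray(features)
    n_slices = 1 + (features.shape[0] - window) // step
    if n_slices <= 0:
        return np.empty((0, window, features.shape[1]), dtype=features.dtype)
    return np.stack([features[i * step : i * step + window]
                     for i in range(n_slices)])

def merge_onsets(slice_probs, hop_s, prob_threshold=0.5, min_gap_s=0.05):
    """Turn per-slice onset probabilities into a list of onset times in seconds.

    Detections closer together than min_gap_s are treated as one note.
    """
    onsets = []  # list of (time, probability) pairs
    for i, p in enumerate(slice_probs):
        if p < prob_threshold:
            continue
        t = i * hop_s
        if onsets and t - onsets[-1][0] < min_gap_s:
            # Same note: keep the more confident of the two detections.
            if p > onsets[-1][1]:
                onsets[-1] = (t, p)
        else:
            onsets.append((t, p))
    return [t for t, _ in onsets]
```

In such a pipeline, `slice_probs` would be the onset-class probability that the softmax layer assigns to each time slice, and `hop_s` the time between the starts of consecutive slices.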

Description

Technical field

[0001] The invention relates to the technical field of computer applications, in particular to a data-driven method for detecting the starting point of a note.

Background technique

[0002] Note onset detection is the process of locating the start of an event in an audio signal, that is, finding the onsets of all notes in a music signal. As shown in Figure 1, it is fundamental to many advanced music analysis tasks such as beat detection, tempo estimation, pitch extraction and automatic transcription.

[0003] The Chinese invention patent with notification number CN1963919B discloses an energy-based note segmentation method. It calculates the energy characteristics of the audio signal and uses an energy threshold to pick note segmentation points. This method is simple but less accurate, and is only applicable to scenes with strong energy salience. For the spectral characteristics of the audio signal, i...

Claims


Application Information


IPC(8): G10L25/51, G10L25/87, G06F16/65, G06F16/68, G06F16/683, G06K9/00, G06K9/62, G06N3/04, G06N3/08, G10L25/18, G10L25/30

CPC: G10L25/51, G10L25/87, G10L25/18, G10L25/30, G06N3/049, G06N3/084, G06F16/686, G06F16/65, G06F16/683, G06N3/044, G06N3/045, G06F2218/12, G06F18/2415

Inventor: 雷小林, 蒋文颉, 胡健, 张震, 郑婧

Owner: JINAN UNIVERSITY
