Audio duplicate checking method and device

A technology for audio data, applied in the field of audio plagiarism checking. It addresses the difficulty of achieving efficient, general-purpose plagiarism checking of audio content, and it increases checking speed.

Pending Publication Date: 2021-01-19

北京晴数智慧科技有限公司

Cites: 9; Cited by: 1

Problems solved by technology

[0004] At present, there are at least the following problems: checking the text content carried by audio requires a speech recognition model for the corresponding language. Even when such a model is available, the speech must first be converted to text and the text then checked, which makes efficient, general-purpose plagiarism checking of audio content difficult to achieve.

Examples

Embodiment 1

[0054] Referring to Figure 1, which shows a schematic flow chart of an audio plagiarism checking method provided in an embodiment of the present application, the audio plagiarism checking method includes:

[0055] S101: Obtain target audio data.

[0056] S102: Divide the target audio data into multiple audio segments by using the utterance detection module.

[0057] Here, the utterance detection module can mark the voiced portions of the target audio data and segment the audio data according to those marks.
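As an illustration of the mark-then-segment idea in S102, the following is a minimal sketch that assumes a simple frame-energy threshold as the marking rule. The source does not say how the utterance detection module marks voice, so the function names, `frame_len`, `threshold`, and `min_frames` are all hypothetical.

```python
from typing import List, Tuple


def frame_energies(samples: List[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def segment_by_energy(
    samples: List[float],
    frame_len: int = 160,      # assumed frame size (10 ms at 16 kHz)
    threshold: float = 0.01,   # assumed energy threshold for "voice"
    min_frames: int = 2,       # runs shorter than this are discarded
) -> List[Tuple[int, int]]:
    """Return (start_sample, end_sample) spans of consecutive voiced frames."""
    # Step 1: mark each frame as voice or non-voice.
    marks = [e > threshold for e in frame_energies(samples, frame_len)]

    # Step 2: cut the audio wherever a run of voiced frames ends.
    segments: List[Tuple[int, int]] = []
    start = None
    for idx, is_voice in enumerate(marks + [False]):  # sentinel closes a trailing run
        if is_voice and start is None:
            start = idx
        elif not is_voice and start is not None:
            if idx - start >= min_frames:
                segments.append((start * frame_len, idx * frame_len))
            start = None
    return segments
```

A production system would more likely use a trained voice activity detector; the energy rule is only a stand-in for the marking step.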

[0058] S103: Extract bottleneck features from the audio clips by using a bottleneck feature extractor.

[0059] Specifically, S103 may perform frame-level bottleneck feature extraction on the audio clip through a deep neural network, using the bottleneck layer of the deep neural network, or the combination of the output layer and the first two layers, as the bottleneck feature of the audio clip.
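To make frame-level extraction concrete, here is a minimal sketch that takes the bottleneck-layer activations as the feature for each frame. The layer sizes, the tanh activation, and the randomly initialised (untrained) weights are assumptions for illustration only; the class name `BottleneckExtractor` is hypothetical, and a real extractor would load trained weights.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Placeholder weights; a trained network would supply these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def dense(x: Vector, w: Matrix) -> Vector:
    """Fully connected layer with tanh activation (bias omitted for brevity)."""
    return [math.tanh(sum(xi * wi for xi, wi in zip(x, row))) for row in w]


class BottleneckExtractor:
    """Hypothetical extractor: runs the layers up to the narrow bottleneck layer."""

    def __init__(self, in_dim: int = 13, hidden_dim: int = 32,
                 bottleneck_dim: int = 4, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Layers after the bottleneck matter only during training, so they
        # are not needed at extraction time and are left out here.
        self.layers = [
            random_matrix(hidden_dim, in_dim, rng),
            random_matrix(hidden_dim, hidden_dim, rng),
            random_matrix(bottleneck_dim, hidden_dim, rng),  # bottleneck layer
        ]

    def extract(self, frames: List[Vector]) -> List[Vector]:
        """Return one bottleneck vector per input frame."""
        features = []
        for frame in frames:
            h = frame
            for w in self.layers:
                h = dense(h, w)
            features.append(h)
        return features
```

Each input frame (for example, a 13-dimensional acoustic feature vector, which is itself an assumption) yields one low-dimensional vector, so a clip of N frames produces an N-by-4 feature matrix in this sketch.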

[0060] It can be understood that the bottleneck features of different speech...

Embodiment 2

[0075] Referring to Figure 2, which shows a schematic flow chart of another audio plagiarism checking method provided by an embodiment of the present application, the audio plagiarism checking method includes:

[0076] S201: Select a plurality of languages with large pronunciation phoneme differences to build a pronunciation dictionary.

[0077] Optionally, the multiple languages can be Mandarin Chinese, English, etc.
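One way to picture the multilingual pronunciation dictionary of S201 is as a merge of per-language lexicons into a single word-to-phoneme map. The sketch below assumes language-prefixed phoneme symbols so that the phoneme sets stay distinct; the helper names and the two example entries are hypothetical and are not taken from the application.

```python
from typing import Dict, List

Lexicon = Dict[str, List[str]]


def merge_lexicons(lexicons: Dict[str, Lexicon]) -> Lexicon:
    """Merge per-language lexicons, prefixing each phoneme with its language tag."""
    merged: Lexicon = {}
    for lang, lexicon in lexicons.items():
        for word, phones in lexicon.items():
            # Note: a word spelled identically in two languages would be
            # overwritten here; a fuller version would keep both entries.
            merged[word] = [f"{lang}_{p}" for p in phones]
    return merged


def phone_inventory(lexicon: Lexicon) -> List[str]:
    """Sorted list of every distinct phoneme symbol used in the lexicon."""
    return sorted({p for phones in lexicon.values() for p in phones})


# Illustrative entries only.
merged = merge_lexicons({
    "zh": {"你好": ["n", "i", "h", "ao"]},
    "en": {"hello": ["HH", "AH", "L", "OW"]},
})
```

Keeping the phoneme symbols language-specific is a design choice of this sketch; sharing symbols across languages would be an equally valid alternative.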

[0078] S202: According to the pronunciation dictionary, use the labeled audio data to train the forced phoneme alignment model, so as to obtain the audio data marked with the pronunciation state.

[0079] Here, the pronunciation state may include, but is not limited to, monophones, diphones and triphones.
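The three kinds of pronunciation state can be illustrated by expanding one phoneme sequence into monophone, diphone, and triphone labels. This sketch assumes the common `left-center+right` label notation and a `sil` boundary symbol; neither is specified in the source, and the function names are hypothetical.

```python
from typing import List


def monophones(phones: List[str]) -> List[str]:
    """Context-independent units: each phoneme on its own."""
    return list(phones)


def diphones(phones: List[str]) -> List[str]:
    """Units spanning each pair of adjacent phonemes."""
    return [f"{a}-{b}" for a, b in zip(phones, phones[1:])]


def triphones(phones: List[str], boundary: str = "sil") -> List[str]:
    """Each phoneme labelled with its left and right neighbours."""
    padded = [boundary] + list(phones) + [boundary]
    return [
        f"{left}-{center}+{right}"
        for left, center, right in zip(padded, padded[1:], padded[2:])
    ]
```

For the sequence `["n", "i", "h", "ao"]`, `triphones` yields `["sil-n+i", "n-i+h", "i-h+ao", "h-ao+sil"]`.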

[0080] S203: Train a bottleneck feature extractor using the audio data marked with the pronunciation state.

[0081] Specifically, the trained model may include, but is not limited to, deep learning structures such as DNN, TDNN, LSTM, and CNN.

[0082] S204: Obtain...

Embodiment 3

[0107] Referring to Figure 3, which shows a schematic structural diagram of an audio plagiarism checking device provided in an embodiment of the present application.

[0108] The embodiment of the present application provides an audio plagiarism checking device 30, which includes:

[0109] An acquisition module 301, configured to acquire target audio data;

[0110] A segmentation module 302, configured to segment the target audio data into a plurality of audio segments by means of an utterance detection module;

[0111] An extraction module 303, configured to extract bottleneck features from the audio segments by means of a bottleneck feature extractor;

[0112] A dimensionality reduction module 304, configured to perform dimensionality reduction on the bottleneck features to obtain the feature sequence of each audio segment;

[0113] A calculation module 305, configured to calculate the similarity of target audio data and the audi...
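The module structure listed above can be sketched as a small class that wires the stages together. This is only an illustration of the data flow between modules 302 to 305; the class name, field names, and type aliases are hypothetical, and each stage is injected as a plain callable because the source does not fix their implementations.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Audio = Sequence[float]
FrameFeatures = List[List[float]]
FeatureSequence = List[float]


@dataclass
class AudioDuplicateChecker:
    """Hypothetical wiring of the device's processing modules."""

    segment: Callable[[Audio], List[Audio]]               # segmentation module 302
    extract: Callable[[Audio], FrameFeatures]             # extraction module 303
    reduce: Callable[[FrameFeatures], FeatureSequence]    # dimensionality reduction module 304
    similarity: Callable[[List[FeatureSequence], List[FeatureSequence]], float]  # calculation module 305

    def feature_sequences(self, audio: Audio) -> List[FeatureSequence]:
        """Segment the audio, then extract and reduce features per segment."""
        return [self.reduce(self.extract(seg)) for seg in self.segment(audio)]

    def score(self, target: Audio, reference: Audio) -> float:
        """Similarity between target audio and one reference from the database."""
        return self.similarity(
            self.feature_sequences(target),
            self.feature_sequences(reference),
        )
```

Passing the stages in as callables keeps the sketch independent of any particular detector, network, or similarity measure.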

Abstract

The embodiment of the invention discloses an audio duplicate checking method and device. The method comprises the steps of obtaining target audio data; segmenting the target audio data into a plurality of audio clips through an utterance detection module; extracting bottleneck features from the audio clips through a bottleneck feature extractor; performing dimension reduction processing on the bottleneck features to obtain a feature sequence of each audio clip; calculating the similarity between the target audio data and audio data in a database according to the feature sequence; and judging, according to the calculated similarity, whether the target audio data duplicates audio data in the database. In the embodiment of the invention, duplicate checking of audio content is achieved through extraction of the bottleneck features without decoding the audio data into text, so that the duplicate checking speed is increased, and the method and device can be applied generally to languages for which no speech recognition technology is available.
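The last three steps (dimension reduction, similarity calculation, and the duplicate decision) can be sketched as follows. The source does not name the reduction method, the similarity measure, or the decision rule, so this example substitutes mean pooling over frames, cosine similarity, and a fixed threshold; all three, along with the function names and the value 0.95, are assumptions.

```python
import math
from typing import List, Tuple

Vector = List[float]


def mean_pool(frames: List[Vector]) -> Vector:
    """Assumed reduction: average the frame-level features of one clip."""
    return [sum(column) / len(frames) for column in zip(*frames)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def check_duplicate(
    target_clips: List[List[Vector]],
    database_clips: List[List[Vector]],
    threshold: float = 0.95,  # assumed decision threshold
) -> Tuple[float, bool]:
    """Average each target clip's best match in the database, then threshold."""
    target = [mean_pool(clip) for clip in target_clips]
    database = [mean_pool(clip) for clip in database_clips]
    if not target or not database:
        return 0.0, False
    score = sum(max(cosine(t, d) for d in database) for t in target) / len(target)
    return score, score >= threshold
```

Mean pooling discards the temporal order of frames; a sequence-aware comparison would be needed if order matters for the duplicate decision.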

Description

Technical field

[0001] The present application belongs to the technical field of speech recognition, and in particular relates to a method and device for audio plagiarism checking.

Background

[0002] With the development of science and technology, especially the development of communication, Internet big data and artificial intelligence, the application of Automatic Speech Recognition (ASR) technology in various fields has been promoted. An efficient and reliable speech recognition system is inseparable from high-quality speech training data corresponding to the scene. Usually, training data is obtained in two ways: manual recording and web crawling. Whether the data comes from manual recording or web crawling, duplicated audio data is unavoidable over a series of operations. To ensure the quality of the training data, it is necessary to perform a duplicate check on the text content carried by the audio data used to train the speech recognition model.

[0003] According ...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06F16/683

CPC: G06F16/683

Inventor: 张晴晴, 朱冬, 贾艳明

Owner: 北京晴数智慧科技有限公司
