
Audio duplicate checking method and device

This technology concerns audio and audio data and is applied in the field of audio plagiarism checking. It addresses the difficulty of achieving efficient, general-purpose plagiarism checking of audio content, and it increases checking speed.

Pending Publication Date: 2021-01-19
北京晴数智慧科技有限公司

AI Technical Summary

Problems solved by technology

[0004] At present, at least the following problem exists: checking the text content carried by audio requires a speech recognition model for the corresponding language. Even when such a model is available, the speech must first be converted to text and the text then checked, which makes efficient and general-purpose plagiarism checking of audio content difficult to achieve.



Examples


Embodiment 1

[0054] Referring to Figure 1, which shows a schematic flow chart of an audio plagiarism checking method provided in an embodiment of the present application, the audio plagiarism checking method includes:

[0055] S101: Obtain target audio data.

[0056] S102: Divide the target audio data into multiple audio segments by using the utterance detection module.

[0057] The utterance detection module can mark the speech regions in the target audio data and segment the audio data according to those marks.
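The marking-then-segmenting idea in S102 can be illustrated with a minimal sketch. The source does not say how the utterance detection module works, so the short-time energy rule used here, the function name `segment_by_energy`, and all parameter values are assumptions chosen only for illustration.

```python
def segment_by_energy(samples, frame_len=160, threshold=0.01, min_silence_frames=3):
    """Return (start, end) sample index pairs for regions marked as speech.

    Assumption: a frame counts as speech when its mean squared amplitude
    exceeds `threshold`; a segment is closed after `min_silence_frames`
    consecutive non-speech frames.
    """
    segments = []
    start = None        # sample index where the current segment began
    silence_run = 0     # consecutive non-speech frames seen inside a segment
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy > threshold:
            if start is None:
                start = i * frame_len
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_silence_frames:
                # End the segment at the last speech frame, not at the silence.
                segments.append((start, (i + 1 - silence_run) * frame_len))
                start, silence_run = None, 0
    if start is not None:
        segments.append((start, (n_frames - silence_run) * frame_len))
    return segments


# Synthetic signal: silence, a burst, a long pause, a second burst, silence.
samples = [0.0] * 480 + [0.5] * 320 + [0.0] * 640 + [0.5] * 160 + [0.0] * 160
print(segment_by_energy(samples))  # [(480, 800), (1440, 1600)]
```

Each returned pair delimits one audio segment that would then be passed on to feature extraction.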

[0058] S103: Extract bottleneck features from the audio clips by using a bottleneck feature extractor.

[0059] Specifically, S103 may perform frame-level bottleneck feature extraction on the audio clip through a deep neural network, using either the bottleneck layer of the deep neural network, or the combination of the output layer and the first two layers, as the bottleneck feature of the audio clip.
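As a rough sketch of frame-level extraction, the code below runs each frame through a small feed-forward network and keeps the activations of a narrow hidden layer. It covers only the bottleneck-layer variant described in [0059]. The layer sizes, the tanh activation, the random weights standing in for trained parameters, and all function names are assumptions, not details from the source.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases standing in for trained parameters."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Apply one fully connected tanh layer to a single frame vector."""
    weights, biases = layer
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def extract_bottleneck(frames, layers, bottleneck_index):
    """Run each frame through the network and keep the bottleneck activations."""
    features = []
    for frame in frames:
        h = frame
        # Layers after the bottleneck are not needed at extraction time.
        for layer in layers[:bottleneck_index + 1]:
            h = dense(h, layer)
        features.append(h)
    return features


rng = random.Random(0)
# Hypothetical sizes: 13-dim input frames, 32 hidden units,
# an 8-dim bottleneck, and 40 output states.
layers = [make_layer(13, 32, rng), make_layer(32, 8, rng), make_layer(8, 40, rng)]
frames = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(5)]
bottleneck = extract_bottleneck(frames, layers, bottleneck_index=1)
print(len(bottleneck), len(bottleneck[0]))  # 5 8
```

The result is one low-dimensional vector per frame, so a clip of N frames yields an N-by-8 feature sequence under these assumed sizes.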

[0060] It can be understood that the bottleneck features of different speech...

Embodiment 2

[0075] Referring to Figure 2, which shows a schematic flow chart of another audio plagiarism checking method provided by an embodiment of the present application, the audio plagiarism checking method includes:

[0076] S201: Select a plurality of languages with large differences in pronunciation phonemes to build a pronunciation dictionary.

[0077] Optionally, the multiple languages can be Mandarin Chinese, English, etc.

[0078] S202: According to the pronunciation dictionary, use labeled audio data to train a forced phoneme alignment model, so as to obtain audio data marked with pronunciation states.

[0079] The pronunciation state may include, but is not limited to, monophone, diphone and triphone.
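To make the three kinds of pronunciation state concrete, the sketch below expands a phone sequence into diphone and triphone labels. The `left-center+right` label format is a common convention in speech toolkits and is used here as an assumption. The source does not specify a label format, and the phone symbols and function names are hypothetical.

```python
def to_diphones(phones):
    """Pair each phone with its successor."""
    return [f"{a}-{b}" for a, b in zip(phones, phones[1:])]


def to_triphones(phones, boundary="sil"):
    """Label each phone with its left and right neighbours as left-center+right."""
    padded = [boundary] + list(phones) + [boundary]
    return [f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
            for i in range(1, len(padded) - 1)]


phones = ["n", "i", "h", "ao"]       # monophone sequence (hypothetical symbols)
print(to_diphones(phones))           # ['n-i', 'i-h', 'h-ao']
print(to_triphones(phones))          # ['sil-n+i', 'n-i+h', 'i-h+ao', 'h-ao+sil']
```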

[0080] S203: Train a bottleneck feature extractor using the audio data marked with the pronunciation state.

[0081] Specifically, the trained model may include, but is not limited to, deep learning structures such as DNN, TDNN, LSTM, and CNN.

[0082] S204: Obtain...

Embodiment 3

[0107] Refer to Figure 3, which shows a schematic structural diagram of an audio plagiarism checking device provided in an embodiment of the present application. The audio plagiarism checking device 30 includes:

[0108] The embodiment of the present application provides an audio plagiarism checking device 30, which includes:

[0109] An acquisition module 301, configured to acquire target audio data;

[0110] A segmentation module 302, configured to segment the target audio data into a plurality of audio segments through the utterance detection module;

[0111] An extraction module 303, configured to extract bottleneck features from the audio clips through the bottleneck feature extractor;

[0112] A dimensionality reduction module 304, configured to perform dimensionality reduction on the bottleneck features to obtain the feature sequence of each audio segment;

[0113] A calculation module 305, configured to calculate the similarity of the target audio data and the audi...


Abstract

The embodiment of the invention discloses an audio duplicate checking method and device. The method comprises the steps of obtaining target audio data; segmenting the target audio data into a plurality of audio clips through an utterance detection module; extracting bottleneck features from the audio clips through a bottleneck feature extractor; performing dimension reduction processing on the bottleneck features to obtain a feature sequence of each audio clip; calculating the similarity between the target audio data and audio data in a database according to the feature sequence; and judging whether the target audio data duplicates the audio data in the database according to the calculated similarity. In the embodiment of the invention, duplicate checking of the audio content can be achieved through extraction of the bottleneck features without decoding the audio data into text, so that the duplicate checking speed of the audio content is increased, and the method and device can be applied generally to other languages for duplicate checking of audio content without requiring speech recognition technology.
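The last three steps of the abstract (dimension reduction, similarity calculation, and the duplicate decision) can be sketched as follows. The abstract names none of the algorithms involved, so the random projection, the cosine-based dynamic time warping distance, the threshold value, and every function name below are assumptions chosen to give a small runnable illustration.

```python
import math
import random


def random_projection(n_in, n_out, seed=0):
    """Fixed Gaussian projection matrix; a stand-in for the unspecified reduction."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_out)
    return [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]


def reduce_sequence(features, projection):
    """Project every frame-level feature vector to the lower dimension."""
    return [[sum(w * v for w, v in zip(row, frame)) for row in projection]
            for frame in features]


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm > 0.0 else 1.0


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost, normalised by the summed sequence lengths."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = cosine_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def is_duplicate(target_seq, database, threshold=0.05):
    """Return (flag, best_id, best_distance) for the closest database entry."""
    best_id, best = None, float("inf")
    for audio_id, seq in database.items():
        d = dtw_distance(target_seq, seq)
        if d < best:
            best_id, best = audio_id, d
    return best <= threshold, best_id, best


rng = random.Random(1)
clip_a = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(20)]
clip_b = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(20)]
# The target is clip_a with every frame repeated, i.e. the same content played slower.
target = [frame for frame in clip_a for _ in range(2)]

projection = random_projection(n_in=8, n_out=4)
database = {"a": reduce_sequence(clip_a, projection),
            "b": reduce_sequence(clip_b, projection)}
flag, best_id, best = is_duplicate(reduce_sequence(target, projection), database)
print(flag, best_id)
```

Dynamic time warping is used in this sketch because it tolerates differences in speaking rate between two recordings of the same content; a fixed-length embedding with a plain cosine score would be a simpler alternative under different assumptions.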

Description

Technical field

[0001] The present application belongs to the technical field of speech recognition, and in particular relates to a method and device for audio plagiarism checking.

Background technique

[0002] With the development of science and technology, especially the development of communication, the Internet, big data and artificial intelligence, the application of Automatic Speech Recognition (ASR) technology in various fields has been promoted. An efficient and reliable speech recognition system is inseparable from high-quality speech training data corresponding to the scene. Usually, training data is obtained in two ways: manual recording and web crawling. Whether the data comes from manual recording or web crawling, audio data duplication is unavoidable in the series of operations involved. To ensure the quality of the training data, it is necessary to carry out a duplicate check on the text content carried by the audio data used to train the speech recognition model.

[0003] According ...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F16/683
CPC: G06F16/683
Inventor: 张晴晴, 朱冬, 贾艳明
Owner: 北京晴数智慧科技有限公司