A method of audio stream replacement in video based on speech recognition

This technology, in the field of speech recognition and video, applies to speech recognition, speech analysis, television, and related areas. It addresses problems such as out-of-sync sound and picture and heavy editing traces in video.

Active Publication Date: 2022-04-29

ZHEJIANG UNIV OF TECH

Cites: 7; Cited by: 0


AI Technical Summary

Problems solved by technology

[0003] Academia and industry have proposed many solutions to the problem of audio stream replacement. The one closest to the present invention is the invention patent with publication number CN 110019961A, which obtains the speech characteristics of the audio stream through speech recognition and speech synthesis in order to modify the audio stream in the video stream. However, that patent does not adjust the synthesized audio, which may lead to heavy clipping traces in the video, out-of-sync sound and picture on individual words, and similar defects.


Embodiment Construction

[0042] The specific implementation of the present invention will be described in detail below in conjunction with the examples, but the protection scope of the invention is not limited thereto.

[0043] The method of the present invention for replacing an audio stream in a video based on speech recognition comprises the following steps:

[0044] Step 1: Extract the audio in the video to be processed, and perform endpoint detection and noise reduction on the extracted audio, specifically:

[0045] Step 1.1: First divide the audio into frames according to the duration and sampling rate, calculate the duration of each frame according to formula (1), and finally multiply each frame by the Hamming window;

[0046] $T = \dfrac{n}{v} \qquad (1)$

[0047] where T is the duration of an audio frame, n is the number of sampling points in one AAC frame, and v is the sampling frequency;
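
Step 1.1 can be illustrated with a minimal Python sketch. It reads the frame duration from the definitions of T, n, and v as T = n / v, and applies the standard Hamming window formula. The function names, the choice of non-overlapping frames, and the example values (1024 sampling points per frame, a 44100 Hz sampling frequency) are assumptions made for illustration and are not taken from the patent text.

```python
import math


def frame_duration(n: int, v: float) -> float:
    """Duration T in seconds of one frame of n sampling points at frequency v (Hz)."""
    return n / v


def hamming_window(n: int) -> list:
    """Standard Hamming window: w[k] = 0.54 - 0.46 * cos(2*pi*k / (n - 1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def split_into_windowed_frames(samples: list, n: int) -> list:
    """Split samples into frames of n points and multiply each by a Hamming window.

    Assumption: frames do not overlap and a trailing partial frame is dropped.
    """
    window = hamming_window(n)
    frames = []
    for start in range(0, len(samples) - n + 1, n):
        frame = samples[start:start + n]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


# Example values (assumed): 1024 sampling points per frame at 44100 Hz.
T = frame_duration(1024, 44100)  # roughly 23.2 ms per frame
```

With these assumed values each frame lasts about 23 ms, short enough that speech can be treated as roughly stationary within a frame.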

[0048] Step 1.2: Calculate the energy value of each frame according to formu...
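
The energy formula referenced in Step 1.2 is cut off in this listing, so the sketch below is only a guess at the general idea. It assumes the common short-time energy definition (sum of squared samples per frame) and a simple fixed threshold for endpoint detection. Both the definition and the threshold rule are assumptions, and the function names are hypothetical.

```python
def frame_energy(frame: list) -> float:
    """Short-time energy of one frame (assumed definition: sum of squared samples)."""
    return sum(x * x for x in frame)


def detect_endpoints(energies: list, threshold: float):
    """Return (first, last) indices of frames whose energy exceeds the threshold.

    Assumption: a single fixed threshold separates speech from silence.
    Returns None when no frame exceeds the threshold.
    """
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0], active[-1]
```

A usage example under the same assumptions: for per-frame energies `[0.0, 0.1, 5.0, 6.0, 0.2]` and a threshold of `1.0`, speech would be taken to span frames 2 through 3.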


Abstract

The invention discloses a method for replacing an audio stream in a video based on speech recognition. First, endpoint detection is performed on the audio to obtain the start and end points of human speech. Noise reduction is then applied and feature values are extracted, after which speech recognition is performed using an acoustic model and a language model. The start and end time of each recognized word is calculated from its feature values. The replacement audio is then produced by combining the computed speech characteristics of the speaker with machine-synthesized audio, which completes the audio stream replacement in the video. Because the invention obtains the start time and end time of each word in the speech recognition result, the replacement is determined by calculation, which makes it more systematic and accurate. The method can be useful for evaluating speech recognition results and in video production.
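
The role of the per-word start and end times can be sketched as follows. Given the time slot of a word, a replacement segment is fitted to exactly that slot and spliced in, so the total audio length (and therefore sync with the picture) is unchanged. This is an illustrative sketch only: the linear-interpolation fitting is a simple stand-in that also shifts pitch, it is not the patent's synthesis or adjustment procedure, and all function names are hypothetical.

```python
def time_to_sample(t: float, rate: int) -> int:
    """Convert a time in seconds to the nearest sample index."""
    return int(round(t * rate))


def fit_to_length(segment: list, target_len: int) -> list:
    """Resample segment to exactly target_len samples by linear interpolation.

    Assumption: a crude stand-in for a proper time-scale modification method.
    """
    if target_len <= 0:
        return []
    if len(segment) == 1 or target_len == 1:
        return [segment[0]] * target_len
    scale = (len(segment) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(segment) - 1)
        frac = pos - lo
        out.append(segment[lo] * (1 - frac) + segment[hi] * frac)
    return out


def replace_word(samples: list, rate: int, start_s: float, end_s: float,
                 replacement: list) -> list:
    """Replace the samples between start_s and end_s with the fitted replacement."""
    a = time_to_sample(start_s, rate)
    b = time_to_sample(end_s, rate)
    if not (0 <= a < b <= len(samples)):
        raise ValueError("word boundaries fall outside the audio")
    if not replacement:
        raise ValueError("replacement audio is empty")
    return samples[:a] + fit_to_length(replacement, b - a) + samples[b:]
```

Keeping the output the same length as the input is the design choice that matters here: audio after the replaced word stays aligned with the video frames it originally accompanied.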

Description

Technical field

[0001] The invention belongs to the technical field of speech recognition and relates to a method for replacing an audio stream in a video based on speech recognition. Specifically, the start and end time of each word is calculated through audio analysis, so that when part of the audio stream in the video changes, the newly generated audio can seamlessly replace the corresponding audio in the original video.

Background

[0002] In recent years, with the development of natural language processing technology, intelligent speech recognition and speech synthesis have gradually entered production and daily life. However, work on speech recognition has mostly focused on recognizing different languages, on different recognition methods, and on achieving better recognition quality, faster recognition speed, and wider recognition range. For a video containing dialogues, speeches, etc., it is very d...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): H04N21/43, H04N21/439, H04N5/262, H04N5/04, G10L15/26, G10L25/24, G10L25/57, G10L25/51, G10L21/043, G10L21/0208

CPC: H04N21/4307, H04N21/439, H04N21/4394, H04N5/262, H04N5/04, G10L25/24, G10L25/57, G10L25/51, G10L21/043, G10L21/0208

Inventor: 徐浩然沈童潘晨高张鑫晟王英钒高飞

Owner: ZHEJIANG UNIV OF TECH
