A method of audio stream replacement in video based on speech recognition

A speech recognition and video technology, applicable to speech recognition, speech analysis, television, and related fields, that addresses problems such as out-of-sync sound and picture and heavy traces of video editing.

Active Publication Date: 2022-04-29
ZHEJIANG UNIV OF TECH
Cites: 7 · Cited by: 0
AI Technical Summary

Problems solved by technology

[0003] To solve the problem of audio stream replacement, academia and industry have proposed many solutions. The one closest to the present invention is the invention patent with publication number CN 110019961A, which obtains the speech characteristics of the audio stream through speech recognition and uses speech synthesis to modify the audio stream in the video stream. In that patent, however, the synthesized audio is not adjusted further, which may lead to heavy clipping traces in the video, sound and picture falling out of sync on individual words, and similar problems.

Embodiment Construction

[0042] The specific implementation of the present invention will be described in detail below in conjunction with the examples, but the protection scope of the invention is not limited thereto.

[0043] The speech-recognition-based method of the present invention for replacing an audio stream in a video specifically comprises the following steps:

[0044] Step 1: Extract the audio in the video to be processed, and perform endpoint detection and noise reduction on the extracted audio, specifically:

[0045] Step 1.1: First divide the audio into frames according to the duration and sampling rate, calculate the duration of each frame according to formula (1), and finally multiply each frame by the Hamming window;

[0046] $T = \dfrac{n}{v}$ (1)

[0047] where T is the duration of an audio frame, n is the number of sampling points in one AAC frame, and v is the sampling frequency;
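Step 1.1 can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names `frame_duration` and `split_into_windowed_frames` are hypothetical, frames are assumed to be non-overlapping, a trailing partial frame is dropped, and the values n = 1024 and v = 44100 Hz are example inputs only. The frame duration is taken as the number of samples per frame divided by the sampling frequency, following the definitions of T, n, and v above.

```python
import numpy as np


def frame_duration(n: int, v: float) -> float:
    """Duration T in seconds of a frame of n samples at sampling frequency v (Hz)."""
    return n / v


def split_into_windowed_frames(signal, n: int) -> np.ndarray:
    """Split a 1-D signal into frames of n samples and apply a Hamming window.

    Assumption: frames do not overlap, and a trailing partial frame is dropped.
    """
    signal = np.asarray(signal, dtype=np.float64)
    num_frames = len(signal) // n
    frames = signal[: num_frames * n].reshape(num_frames, n)
    # Broadcasting multiplies every frame (row) by the same window.
    return frames * np.hamming(n)


# Example values (assumed, not taken from the patent text).
v = 44100   # sampling frequency in Hz
n = 1024    # samples per frame
T = frame_duration(n, v)

rng = np.random.default_rng(0)
audio = rng.standard_normal(v)  # one second of synthetic noise as a stand-in
frames = split_into_windowed_frames(audio, n)
```

With these example values each frame lasts about 23 ms, and one second of audio yields 43 complete frames.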

[0048] Step 1.2: Calculate the energy value of each frame according to formu...

Abstract

The invention discloses a method for replacing an audio stream in a video based on speech recognition. First, the start and end endpoints of human speech in the audio are obtained by endpoint detection. Noise reduction is then performed on the audio and feature values are extracted, after which speech recognition is performed using an acoustic model and a language model. The start and end time of each word is then calculated from the feature values of the recognized words, and the replacement audio is synthesized from the computed speech characteristics of the speaker and the machine-synthesized audio, thereby replacing the audio stream in the video. Because the invention obtains the start time and end time in the audio of each word in the speech recognition result, the audio stream replacement in the video can be calculated more rigorously and accurately, which is useful for evaluating speech recognition results and for video production.
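To make the role of per-word start and end times concrete, the following Python sketch splices a replacement clip into an audio track between two timestamps. It is an illustration under assumptions, not the method claimed by the patent: the function name `replace_span` is hypothetical, and the replacement is fitted to the target duration by naive linear-interpolation resampling, which changes pitch and stands in for whatever duration and voice-characteristic adjustment the full method performs.

```python
import numpy as np


def replace_span(audio, sample_rate: int, start_s: float, end_s: float, new_audio):
    """Return a copy of `audio` with [start_s, end_s) replaced by `new_audio`.

    Assumption: `new_audio` is stretched or squeezed to the target length by
    linear interpolation, a crude stand-in for proper time-scale modification.
    """
    audio = np.asarray(audio, dtype=np.float64)
    new_audio = np.asarray(new_audio, dtype=np.float64)
    start = int(round(start_s * sample_rate))
    end = int(round(end_s * sample_rate))
    if not (0 <= start < end <= len(audio)):
        raise ValueError("span must lie inside the audio and be non-empty")
    if len(new_audio) == 0:
        raise ValueError("replacement audio must not be empty")

    target_len = end - start
    # Positions in the replacement clip that map onto each output sample.
    src_pos = np.linspace(0, len(new_audio) - 1, num=target_len)
    fitted = np.interp(src_pos, np.arange(len(new_audio)), new_audio)

    out = audio.copy()
    out[start:end] = fitted
    return out


# Example with synthetic data: one second of silence at 16 kHz, with a
# 0.25 s clip of ones stretched into the span from 0.25 s to 0.75 s.
original = np.zeros(16000)
clip = np.ones(4000)
result = replace_span(original, 16000, 0.25, 0.75, clip)
```

Because the output has the same length as the input, the video timeline is unaffected; only the samples inside the word's span change.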

Description

Technical field

[0001] The invention belongs to the technical field of speech recognition, and relates to a method for replacing an audio stream in a video based on speech recognition. Specifically, the start and end time of each word is calculated through audio analysis, so that when part of the audio stream in the video changes, the newly generated audio can seamlessly replace the corresponding audio in the original video.

Background

[0002] In recent years, with the development of natural language processing technology, intelligent speech recognition and speech synthesis technologies have gradually entered production and daily life. However, work on speech recognition has mostly focused on recognizing different languages, on different recognition methods, and on techniques for achieving better recognition results, faster recognition speed, and a wider recognition range. For a video containing dialogues, speeches, etc., it is very d...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): H04N21/43, H04N21/439, H04N5/262, H04N5/04, G10L15/26, G10L25/24, G10L25/57, G10L25/51, G10L21/043, G10L21/0208
CPC: H04N21/4307, H04N21/439, H04N21/4394, H04N5/262, H04N5/04, G10L25/24, G10L25/57, G10L25/51, G10L21/043, G10L21/0208
Inventor: 徐浩然, 沈童, 潘晨高, 张鑫晟, 王英钒, 高飞
Owner: ZHEJIANG UNIV OF TECH