A device and method for synchronously calibrating voice and video of a portrait

A technology for synchronously calibrating voice and video, applied in image communication, computing, and selective content distribution. It addresses three problems: voice information and video information falling out of sync, information without time stamps being unrecognizable to time-stamp-based methods, and motion-based matching being unable to judge motion that produces no sound. Its effects are a reduced amount of stored information and improved computing performance.

Active Publication Date: 2022-05-17

JIANGSU UNIV



Problems solved by technology

During the recording process, hardware or network problems can cause the voice information and video information to fall out of sync.

Traditional audio and video synchronization calibration generally relies on manually playing back the audio and video files frame by frame and calibrating by hand when an error is found, which requires a lot of work. Some synchronization methods add time stamps, but they can only recognize voice and video information that carries time stamps and cannot handle information to which no time stamp has been added. Other methods match the motion-amplitude features in the video frames against the features of the voice information; these require the movement to produce changes in the sound and cannot judge movement that produces no sound.


Examples


Embodiment 1

[0065] Embodiment 1: detection process for synchronized voice and video

[0066] Step S1: read the audio and video header file information to obtain the total duration of the audio and video, 72 seconds. Let t denote a moment in the audio and video, 1≤t≤72.

[0067] Step S2: set up the dynamic lip array P[k], 1≤k≤72, and initialize all elements of P to 0; set up the vocal array S[f], 1≤f≤72, and initialize all elements of S to 0.

[0068] Step S3: sequentially extract the picture frame at each time t of the video file. Figure 3 is the binary image of the picture frame extracted at the 32nd second of the video file, and Figure 6 is the binary image of the picture frame extracted at the 31st second. Face recognition technology is used to recognize the i-th face region M_{t,i} in the picture frame at time t, 1≤i≤I, with I=1. Figure 4 shows the face region M_{32,1} extracted from Figure 3, Figure 7 Fr...
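The data structures in steps S1 to S3 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the helper names (`make_flag_array`, `binarize`, `crop`), the threshold of 128, the tiny sample frame, and the bounding box are all assumptions made for the example. Real face recognition is not shown; a hypothetical bounding box stands in for the recognized face region.

```python
from typing import List


def make_flag_array(total_seconds: int) -> List[int]:
    """Zero-filled array addressed 1..total_seconds (index 0 is unused)."""
    return [0] * (total_seconds + 1)


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Map a grayscale frame (values 0-255) to a binary image of 0s and 1s."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


def crop(image: List[List[int]], top: int, left: int,
         height: int, width: int) -> List[List[int]]:
    """Cut a rectangular region (for example a face box) out of an image."""
    return [row[left:left + width] for row in image[top:top + height]]


# Steps S1 and S2: total duration of 72 seconds, both arrays start at 0.
T = 72
P = make_flag_array(T)  # dynamic lip array, P[k] for 1 <= k <= T
S = make_flag_array(T)  # vocal array, S[f] for 1 <= f <= T

# Step S3 (illustrative only): a made-up 3x4 grayscale frame.
frame = [
    [10, 200, 220, 30],
    [15, 210, 90, 25],
    [12, 40, 50, 20],
]
binary = binarize(frame)

# Hypothetical face box; in the patent this comes from face recognition.
face = crop(binary, top=0, left=1, height=2, width=2)
```

Keeping index 0 unused lets the code use the same 1-based indices as the text (1≤k≤72), at the cost of one wasted element per array.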

Embodiment 2

[0077] Embodiment 2: detection and calibration process for unsynchronized voice and video

[0078] Step S1: read the audio and video header file information to obtain the total duration of the audio and video, 58 seconds. Let t denote a moment in the audio and video, 1≤t≤58.

[0079] Step S2: set up the dynamic lip array P[k], 1≤k≤58, and initialize all elements of P to 0; set up the vocal array S[f], 1≤f≤58, and initialize all elements of S to 0.

[0080] Step S3: sequentially extract the picture frame at each time t of the video file. Figure 11 is the binary image of the picture frame extracted at the 19th second of the video file, and Figure 14 is the binary image of the picture frame extracted at the 18th second. Face recognition technology is used to recognize the i-th face region M_{t,i} in the picture frame at time t, 1≤i≤I, with I=3. Figure 12 shows the three face regions extracted from Figure 11, M_{19,1}, M ...


Abstract

The invention discloses a device and method for synchronously calibrating the voice and video of a portrait. It uses existing mature face recognition, dynamic lip recognition, and human voice extraction technologies, combined with the design of information processing means and hardware equipment, to realize synchronous calibration of portrait voice and video. The invention uses only left shift, right shift, and XOR operations, which have low time complexity, improving computing performance; it does not require time stamp information to be added to the voice and video files, reducing the amount of information stored. The invention can be applied to detecting whether the voice and video of a portrait are synchronized and to calibrating voice and video that are out of sync.
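The abstract names left shift, right shift, and XOR as the only operations used, but the text available here does not show how they are combined. The following is a minimal sketch of one plausible arrangement, under these assumptions: per-second lip-motion flags and voice flags are packed into integers, XOR counts the seconds where they disagree, and shifting the voice bits left or right searches for the offset with the fewest disagreements. The function names, the bit order, and the `max_shift` search bound are hypothetical.

```python
from typing import List, Tuple


def pack(flags: List[int]) -> int:
    """Pack 0/1 flags into an integer; the first second is the highest bit."""
    value = 0
    for bit in flags:
        value = (value << 1) | (bit & 1)
    return value


def mismatches(p_bits: int, s_bits: int) -> int:
    """Count the seconds where lip motion and voice disagree (XOR popcount)."""
    return bin(p_bits ^ s_bits).count("1")


def estimate_offset(p: List[int], s: List[int],
                    max_shift: int) -> Tuple[int, int]:
    """Return (shift, mismatch count) for the best alignment found.

    A positive shift is a left shift of the voice bits (voice moved earlier);
    a negative shift is a right shift (voice moved later). Bits shifted in
    are zero, so they count as silence.
    """
    mask = (1 << len(p)) - 1
    p_bits, s_bits = pack(p), pack(s)
    best = (0, mismatches(p_bits, s_bits))
    for d in range(1, max_shift + 1):
        candidates = ((d, (s_bits << d) & mask), (-d, s_bits >> d))
        for signed_shift, shifted in candidates:
            count = mismatches(p_bits, shifted)
            if count < best[1]:
                best = (signed_shift, count)
    return best


# Made-up example: the voice lags the lips by two seconds.
p_flags = [0, 1, 1, 0, 0, 1, 0, 0]
s_flags = [0, 0, 0, 1, 1, 0, 0, 1]
shift, remaining = estimate_offset(p_flags, s_flags, max_shift=3)
```

In this made-up example the search settles on a left shift of 2 with no remaining disagreements. A result of zero mismatches at shift 0 would correspond to the synchronized case; any other best shift suggests how far the voice track would need to move.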

Description

Technical field

[0001] The invention belongs to the technical field of multimedia information processing, and in particular relates to a device and method for synchronously calibrating the voice and video of a portrait.

Background art

[0002] With the popularity and development of multimedia and the Internet, portrait voice and video applications are used in various fields, such as talk and entertainment programs, online streamer programs, and massive open online courses. The voice information and video information used in portrait audio and video are generally recorded separately by different hardware and then processed together by a computer to synthesize a voice and video file that can be played directly. During the recording process, hardware or network problems can cause the voice information and video information to fall out of sync. Traditional audio and video synchronization calibration generally uses manual playback of audio and video files frame by frame. When a...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): H04N21/43, H04N21/8547, G06V40/16

CPC: H04N21/4307, H04N21/8547, G06V40/171

Inventor: 陈潇君, 苟建平, 詹天明, 成科扬, 陈小波, 詹永照, 毛启容, 柯佳, 汪满容

Owner: JIANGSU UNIV
