Method and terminal for positioning voice region in song video
A technique in the field of region positioning and video processing, applied in areas such as instruments, character and pattern recognition, and electrical components. It addresses accompaniment interference and the resulting inability to accurately locate the vocal region of a song, and achieves high accuracy and good results.
Examples
Embodiment 1
[0127] Referring to Figure 1 and Figure 2, a method for positioning the vocal region in a song video comprises the following steps:
[0128] S1. Obtain a video frame image corresponding to the song video, and determine the subtitle area of the video frame image;
[0129] Specifically, use the Roberts operator to extract the image edges of the video frame, then thin and binarize the extracted edges;
[0130] Count the total number of edge pixels in each row and in each column of the thinned and binarized edge image;
[0131] Determine whether a first pixel block exists, in which the total number of edge pixels in each row is greater than a first preset value and whose height is greater than a first preset height;
[0132] Determine whether a second pixel block exists, in which the total number of edge pixels in each column is greater than a second preset value, and the width of the second pixel block ...
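The edge extraction and row/column counting in step S1 can be illustrated with a minimal pure-Python sketch. It assumes a grayscale frame given as a list of rows of integer intensities; the function names, the `threshold` parameter, and the use of the sum of absolute Roberts responses as the edge strength are illustrative assumptions, not details taken from the patent. The thinning step is omitted for brevity.

```python
def roberts_edges(gray, threshold):
    """Binary edge map (1 = edge) from the Roberts cross operator.

    Assumption: edge strength is |gx| + |gy|, binarized with a fixed threshold.
    """
    h, w = len(gray), len(gray[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = gray[y][x] - gray[y + 1][x + 1]      # main-diagonal difference
            gy = gray[y][x + 1] - gray[y + 1][x]      # anti-diagonal difference
            if abs(gx) + abs(gy) > threshold:
                edges[y][x] = 1
    return edges


def row_counts(edges):
    """Number of edge pixels in each row."""
    return [sum(row) for row in edges]


def col_counts(edges):
    """Number of edge pixels in each column."""
    return [sum(col) for col in zip(*edges)]


def find_runs(counts, min_count, min_length):
    """Return (start, end) pairs, end exclusive, of consecutive indices whose
    count exceeds min_count and whose run length exceeds min_length."""
    runs, start = [], None
    for i, c in enumerate(counts):
        if c > min_count:
            if start is None:
                start = i
        elif start is not None:
            if i - start > min_length:
                runs.append((start, i))
            start = None
    if start is not None and len(counts) - start > min_length:
        runs.append((start, len(counts)))
    return runs
```

Under these assumptions, `find_runs(row_counts(edges), first_preset_value, first_preset_height)` would yield candidate row bands for the first pixel block, and the same call on `col_counts(edges)` would yield candidate column bands for the second pixel block.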
Embodiment 2
[0154] This embodiment differs from Embodiment 1 as follows:
[0155] Step S2 is:
[0156] S2. Perform the following steps S21 and S22 in parallel or sequentially:
[0157] S21. Identify the position to which the subtitle has advanced within the subtitle area;
[0158] S22. Segment the boundaries of all characters in the subtitle area and record the positions of the left and right boundaries of each character; the left and right boundaries together constitute the character area of that character;
[0159] Use OCR technology to identify the character corresponding to each character area;
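The boundary segmentation in step S22 could be sketched as follows. The patent text here does not state how the boundaries are found, so this example assumes a column-projection approach on a binarized subtitle area (1 = text pixel); the function name `segment_characters` and the `min_gap` parameter are hypothetical. The OCR step is not shown.

```python
def segment_characters(binary, min_gap=1):
    """Return (left, right) column boundaries of each character, right exclusive.

    Assumption: characters are separated by columns containing no text pixels.
    Runs separated by fewer than min_gap blank columns are merged, which helps
    with characters made of several horizontally separated components.
    """
    col_sums = [sum(col) for col in zip(*binary)]

    # Collect maximal runs of non-empty columns.
    boxes, start = [], None
    for x, s in enumerate(col_sums):
        if s > 0 and start is None:
            start = x
        elif s == 0 and start is not None:
            boxes.append([start, x])
            start = None
    if start is not None:
        boxes.append([start, len(col_sums)])

    # Merge runs whose separating gap is narrower than min_gap.
    merged = []
    for left, right in boxes:
        if merged and left - merged[-1][1] < min_gap:
            merged[-1][1] = right
        else:
            merged.append([left, right])
    return [tuple(b) for b in merged]
```

The choice of `min_gap` trades over-segmentation against merging neighbouring characters; a suitable value would depend on the font and resolution of the subtitles.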
[0160] Step S3 is:
[0161] Determine the start time and end time of each character according to the position to which the subtitle has advanced and the character area of each character;
[0162] Step S4 is:
[0163] Locate the vocal region of each character in the song video according to the start time and end time of each character;
[0164] This embodiment realizes the detection of the...
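As a rough illustration of step S3, the sketch below derives per-character start and end times from the subtitle advance position. It assumes the advance position is available as a time-sorted list of `(time_seconds, x)` samples, one per analysed frame, and that a character starts when the advance position passes its left boundary and ends when it reaches its right boundary. The function name and both conventions are assumptions made for the example, not details from the patent.

```python
def character_times(progress, boxes):
    """Map each character area to a (start, end) time pair.

    progress: list of (time_seconds, x) samples sorted by time, where x is the
              horizontal position the subtitle has advanced to.
    boxes:    list of (left, right) character areas in the same coordinates.
    A time is None if the advance position never reaches that boundary.
    """
    times = []
    for left, right in boxes:
        # First sample where the advance position has moved past the left edge.
        start = next((t for t, x in progress if x > left), None)
        # First sample where the advance position has reached the right edge.
        end = next((t for t, x in progress if x >= right), None)
        times.append((start, end))
    return times
```

The resulting time pairs would then delimit the vocal region of each character in the audio track, as described in step S4.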
Embodiment 3
[0166] Referring to Figure 3, a terminal 1 for positioning the vocal region in a song video comprises a memory 2, a processor 3, and a computer program stored in the memory 2 and executable on the processor 3; the processor 3 implements the steps of Embodiment 1 when executing the computer program.