Method of locating sound source from video
A technology for locating sound sources in video, applied in the field of cross-modal learning. It addresses the problems of blurred localization edges and low localization accuracy, and achieves high accuracy and high application value.
Embodiment Construction
[0039] The present invention proposes a method for locating a sound source from a video, which will be further described in detail below in conjunction with specific embodiments.
[0040] The present invention proposes a method for locating a sound source from a video, comprising the following steps:
[0041] (1) Training stage;
[0042] (1-1) Obtain training samples: obtain J video segments from any channel as training samples, each 10 seconds long. There is no special requirement on the content of the training sample videos, but the videos need to contain a variety of different object categories, and the object categories in each training sample video are manually labeled;
[0043] In the present embodiment, the training sample videos are drawn from 10 categories of the AudioSet dataset (automobile, motorcycle, helicopter, yacht, speech, dog, cat, pig, alarm clock, guitar). The present embodiment selects a total of J = 32469 video clips...
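As an illustration of step (1-1), the following is a minimal Python sketch of how labeled 10-second clips might be collected into a training set. It is not the patent's implementation: the `TrainingSample` record, its field names, the `select_training_samples` helper, and the example clip identifiers are all hypothetical, and filtering clips by the 10 listed categories is an assumption made for this example only.

```python
from dataclasses import dataclass

# The 10 AudioSet categories named in this embodiment.
CATEGORIES = (
    "automobile", "motorcycle", "helicopter", "yacht", "speech",
    "dog", "cat", "pig", "alarm clock", "guitar",
)
CLIP_SECONDS = 10.0  # each training sample is 10 seconds long


@dataclass(frozen=True)
class TrainingSample:
    """One manually labeled clip (hypothetical record layout)."""
    clip_id: str
    duration_s: float
    labels: tuple  # manually annotated object categories


def select_training_samples(candidates):
    """Keep clips that are 10 s long and carry at least one listed category.

    The category filter is an assumption of this sketch, not a stated
    requirement of the method.
    """
    selected = []
    for clip in candidates:
        has_known_label = any(label in CATEGORIES for label in clip.labels)
        if abs(clip.duration_s - CLIP_SECONDS) < 1e-6 and has_known_label:
            selected.append(clip)
    return selected


# Made-up candidate clips, for illustration only.
candidates = [
    TrainingSample("clip_a", 10.0, ("dog",)),
    TrainingSample("clip_b", 7.5, ("guitar",)),      # wrong length
    TrainingSample("clip_c", 10.0, ("waterfall",)),  # category not listed
    TrainingSample("clip_d", 10.0, ("speech", "cat")),
]
samples = select_training_samples(candidates)
J = len(samples)  # number of training samples in this toy set
```

In this toy set only `clip_a` and `clip_d` pass both checks; in the embodiment itself the corresponding count is J = 32469.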


