
Video content positioning method based on feature fusion and cascade learning

A video content positioning technology based on feature fusion, applied in the field of machine vision and deep learning, which can solve problems such as the loss of complementary high-level semantic information, reduced accuracy of video content positioning, and the difficulty of achieving accurate positioning of video content.

Active Publication Date: 2019-07-16
PEKING UNIV


Problems solved by technology

[0005] In existing video content localization methods, the feature extraction module of the neural network simply concatenates the feature vector of the video image with the feature vector of the video sound to form the feature vector of the video. This loses the complementary high-level semantic information contained in the image and sound features, greatly reducing the accuracy of video content positioning and making it difficult to achieve precise positioning of video content.
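The baseline criticized in [0005] can be illustrated with a minimal NumPy sketch. All names, vector sizes, and the random features are hypothetical; the point is only that plain concatenation places the two modalities side by side without modeling any interaction between them.

```python
import numpy as np

# Hypothetical per-segment feature vectors from separate image and audio
# encoders (sizes are illustrative, not from the patent).
rng = np.random.default_rng(0)
image_feat = rng.standard_normal(512)   # e.g. CNN image embedding
audio_feat = rng.standard_normal(128)   # e.g. audio/spectrogram embedding

# The criticized baseline: plain concatenation as the joint video feature.
# Downstream layers see both modalities, but nothing in the fusion step
# models their interaction, so complementary cross-modal semantics are lost.
video_feat = np.concatenate([image_feat, audio_feat])
print(video_feat.shape)  # (640,)
```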


Image

  • Video content positioning method based on feature fusion and cascade learning

Embodiment Construction

[0042] The present invention is described further below through embodiments in conjunction with the accompanying drawings, which do not limit the scope of the invention in any way.

[0043] The invention provides a video content localization method based on feature fusion. Using feature fusion and cascade learning, cascaded neural networks perform video feature extraction, feature fusion, and content localization. This solves the problem of losing the complementary high-level semantic information in video images and sounds, and achieves precise positioning of video content.

[0044] As shown in figure 1, the video content positioning method based on feature fusion and cascade learning according to the present invention is used to accurately locate content within a video. The video includes features of multiple modalities, such as image, sound, and optical flow. Assume that only sound and RGB images are used here, so n below is 2; the specific implementation incl...
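The two-modality (n = 2) pipeline described above can be sketched as follows. The one-layer `encoder`, all layer sizes, and the scoring head are illustrative stand-ins under stated assumptions, not the patent's actual networks.

```python
import numpy as np

rng = np.random.default_rng(1)

def encoder(x, w):
    """One-layer ReLU stand-in for a modality-specific feature extractor."""
    return np.maximum(x @ w, 0.0)

T, d_rgb, d_snd, d_h = 8, 32, 16, 24        # illustrative sizes only
rgb   = rng.standard_normal((T, d_rgb))     # per-segment RGB features
sound = rng.standard_normal((T, d_snd))     # per-segment sound features

# Stage 1: per-modality feature extraction; Stage 2: fusion of the two streams.
w_rgb = rng.standard_normal((d_rgb, d_h))
w_snd = rng.standard_normal((d_snd, d_h))
h = np.concatenate([encoder(rgb, w_rgb), encoder(sound, w_snd)], axis=1)

# Stage 3: localization head scoring each temporal segment; the top-scoring
# segment is the predicted location of the target content.
w_loc = rng.standard_normal((2 * d_h, 1))
scores = (h @ w_loc).ravel()
print(int(scores.argmax()))
```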



Abstract

The invention discloses a video content positioning method based on feature fusion. The method comprises a video feature extraction process, a feature fusion process and a video content positioning process. The feature fusion comprises pre-fusion and post-fusion; the pre-fusion comprises steps of cascading the low-level features of each dimension of the video, and fusing the low-level features through a neural network shown in the specification; and the post-fusion comprises steps of splicing the outputs of the neural network shown in the specification, and fusing the high-level semantic features of each dimension of the video through the neural network CF(.). According to the method and the device, the problem of loss of the complementary high-level semantic information contained in video images and sounds can be solved, and accurate positioning of video contents is realized.
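The pre-fusion/post-fusion scheme in the abstract can be illustrated with a minimal NumPy sketch. The one-layer `mlp` function and all dimensions are hypothetical stand-ins for the patent's networks, including its CF(.) operator.

```python
import numpy as np

rng = np.random.default_rng(2)

def mlp(x, w):
    """One-layer ReLU stand-in for each neural network in the scheme."""
    return np.maximum(x @ w, 0.0)

# Two modalities: low-level image and sound features for one segment
# (dimensions are illustrative only).
img_low = rng.standard_normal(64)
snd_low = rng.standard_normal(32)

# Pre-fusion: cascade (concatenate) the low-level features of every
# modality and fuse them with a shared network.
pre_in  = np.concatenate([img_low, snd_low])            # (96,)
pre_out = mlp(pre_in, rng.standard_normal((96, 48)))    # fused low-level

# Per-modality networks producing high-level semantic features.
img_high = mlp(img_low, rng.standard_normal((64, 48)))
snd_high = mlp(snd_low, rng.standard_normal((32, 48)))

# Post-fusion: splice the per-modality outputs with the pre-fused features
# and fuse the high-level semantics with a second network, standing in for
# the patent's CF(.).
post_in = np.concatenate([pre_out, img_high, snd_high])  # (144,)
fused   = mlp(post_in, rng.standard_normal((144, 48)))
print(fused.shape)  # (48,)
```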

Description

Technical field

[0001] The invention belongs to the technical field of machine vision and deep learning, and relates to a video content positioning technology, in particular to a video content positioning method using a cascaded neural network based on front and back fusion of features.

Background technique

[0002] Various videos contain a large number of events and actions, and the core content of a video lies in these events. However, most videos are redundant and long, and viewers need to find useful information within a large amount of redundant material, so a technology that can automatically locate the required content segments is urgently needed. Such video content positioning is very helpful for subsequent content analysis and classification, and has great application potential in the fields of security, education, and film and television.

[0003] In the prior art on video content location methods, the neural network method is adopted, but the inform...


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06K9/00; G06K9/46; G06K9/62; G06N3/04; G06N3/08; G10L15/02; G10L15/06
CPC: G06N3/08; G10L15/02; G10L15/063; G06V20/41; G06V10/56; G06N3/048; G06N3/045; G06F18/253
Inventor: 赵祈杰, 单开禹, 王勇涛, 汤帜
Owner: PEKING UNIV