
Video dense event description method based on multi-modal heterogeneous feature fusion

A technology for event description and feature fusion, applied to computer components, character and pattern recognition, biological neural network models, etc., achieving reasonable event extraction, good descriptive performance, and improved generalization ability

Pending Publication Date: 2022-04-15
COSCO SHIPPING TECH CO LTD +2
Cites: 0 | Cited by: 5

AI Technical Summary

Problems solved by technology

[0008] In order to effectively address the shortcomings of video description methods on long videos with dense events, where co-occurrence descriptions and other meaningless descriptions are frequently generated, and to meet the requirements of multi-scene applications, the present invention provides a video dense event description method based on multi-modal heterogeneous feature fusion.




Embodiment Construction

[0051] To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention, without creative effort, shall fall within the protection scope of the present invention.

[0052] To facilitate understanding of the technical solution of the present invention, the technical terms involved are explained in detail below.

[0053] The dense video event description task is a further refinement of the video description task, additionally solving the problem of locating...
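To make the task concrete, the following is a minimal, hypothetical sketch of the input/output contract of dense video event description: one long video in, a set of temporally localized natural-language captions out. The class, field names, and example captions are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class DescribedEvent:
    # Hypothetical output record for one localized event;
    # names and fields are illustrative, not from the patent.
    t_start: float   # event start time (seconds)
    t_end: float     # event end time (seconds)
    caption: str     # natural-language description of the event

# A dense event description system maps one long video to many
# (possibly overlapping) timestamped captions, e.g.:
events = [
    DescribedEvent(0.0, 12.4, "a crane lifts a container onto the deck"),
    DescribedEvent(10.2, 25.7, "workers secure the container with rods"),
]
for e in events:
    print(f"[{e.t_start:.1f}s - {e.t_end:.1f}s] {e.caption}")
```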



Abstract

The invention relates to the field of computer vision, in particular to a video event description method based on multi-modal heterogeneous feature fusion. In the method, an I3D convolutional network clips the video and extracts dynamic visual features, and a VGGish model extracts audio rhythm features; scene object information is represented semantically to generate a scene graph, yielding entity, attribute, and relation encodings, and the feature vectors are embedded via graph convolution; the three extracted features undergo triple multi-modal cyclic fusion; adaptive multi-modal data balancing aligns the dynamic visual and audio rhythm features with each other and ensures reasonable event extraction; and a description-reconstruction decoder detects video events with a description-reconstruction algorithm and generates descriptions of video scene events from a pre-trained language dictionary. The method effectively solves the problem that co-occurrence descriptions and other meaningless descriptions are frequently generated by video description methods, and effectively mines the relationships between scene events by making full use of multi-modal information.
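The abstract names the pipeline stages but this page includes no implementation, so below is a minimal PyTorch-style sketch of just the "triple multi-modal cyclic fusion" step, under the assumption that each modality cross-attends to the next one in a visual → audio → scene-graph cycle. The module layout, dimensions, and attention arrangement are all assumptions for illustration, not the patented method.

```python
import torch
import torch.nn as nn

class TripleCyclicFusion(nn.Module):
    """Sketch of one three-way cyclic fusion step (assumed design).

    The patent describes triple multi-modal cyclic fusion of dynamic
    visual (I3D), audio rhythm (VGGish), and scene-graph features but
    gives no code here; the cross-attention cycle below is an assumption.
    """
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.vis_from_aud = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.aud_from_graph = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.graph_from_vis = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, vis, aud, graph):
        # Each modality attends to the next one in the cycle:
        # visual <- audio, audio <- graph, graph <- visual.
        v, _ = self.vis_from_aud(vis, aud, aud)
        a, _ = self.aud_from_graph(aud, graph, graph)
        g, _ = self.graph_from_vis(graph, vis, vis)
        # Residual connections preserve each modality's own signal.
        return self.norm(vis + v), self.norm(aud + a), self.norm(graph + g)

# Example shapes: 64 video clips, 32 audio frames, 20 graph nodes, dim 512.
vis = torch.randn(1, 64, 512)
aud = torch.randn(1, 32, 512)
graph = torch.randn(1, 20, 512)
fused_vis, fused_aud, fused_graph = TripleCyclicFusion(512)(vis, aud, graph)
```

In this layout each modality keeps its own signal through a residual connection, so the adaptive balancing step described in the abstract could still reweight the modalities afterwards.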

Description

Technical Field

[0001] The invention relates to computer vision, in particular to the field of video description, and more specifically to a dense video event description method based on multi-modal heterogeneous feature fusion.

Background

[0002] Computer vision currently has the most extensive scope and scale of application in artificial intelligence, and has already penetrated many aspects of daily life and work, involving network security, system evaluation, monitoring, intelligent machines, and more, playing an important role in promoting the development and progress of society. Image recognition in computer vision is mainly divided into static images and dynamic images: static images mainly comprise pictures and similar content, while dynamic images mainly comprise video. For video description tasks, identifying the relationships between the events in a video and describing all events...


Application Information

IPC(8): G06V20/40, G06V10/762, G06V10/774, G06V10/82, G06V10/80, G06K9/62, G06N3/04
CPC: Y02D10/00
Inventor: 刘晋龚沛朱张喜亮吴中岱王骏翔郭磊胡蓉韩冰朱晓蓉
Owner: COSCO SHIPPING TECH CO LTD