Video understanding method based on deep learning

A deep learning-based video technology, applied in the field of video understanding, that achieves the effect of improving accuracy
CN107909014A · Inactive · Publication Date: 2018-04-13 · TIANJIN UNIV

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
TIANJIN UNIV
Publication Date
2018-04-13
Estimated Expiration
Not applicable · inactive patent


Abstract

The invention provides a video understanding method based on deep learning. The method comprises two steps. (1) A model based on an LSTM network is obtained through training: a C3D algorithm is used to extract image features; a PCA algorithm is used to reduce the dimension of the feature vector from 4096 to 128; time-domain aliasing and normalization are carried out to obtain a normalized feature vector; and the LSTM network is trained on the MSR-VTT database to obtain the LSTM network model. (2) The statement information of a video image sequence to be detected is obtained through the LSTM network-based model: a C3D algorithm is used to extract the feature vector of the video image sequence to be detected; a PCA algorithm is used for dimension reduction; time-domain aliasing and normalization are carried out to obtain a normalized feature vector; and the statement output for the video image sequence to be detected is obtained through the LSTM network-based model. According to the invention, the accuracy of the existing model can be improved, and the original model can be further optimized based on new data.
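The dimension-reduction and normalization stage described in the abstract can be illustrated with a minimal sketch. It assumes NumPy, uses random data as a stand-in for the 4096-dimensional C3D features, and assumes L2 normalization, since the abstract does not say which normalization is used. The time-domain aliasing step is omitted because the abstract does not define it. The function names `fit_pca` and `reduce_and_normalize` are hypothetical and do not come from the patent.

```python
import numpy as np


def fit_pca(features, n_components=128):
    """Fit PCA on an (n_samples, n_dims) matrix; return (mean, components)."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def reduce_and_normalize(features, mean, components, eps=1e-12):
    """Project features onto the PCA basis, then L2-normalize each row.

    L2 normalization is an assumption; the abstract only says "normalization".
    """
    reduced = (features - mean) @ components.T
    norms = np.linalg.norm(reduced, axis=1, keepdims=True)
    return reduced / np.maximum(norms, eps)


# Random stand-in for C3D features of 256 video clips (not real data).
rng = np.random.default_rng(0)
clip_features = rng.standard_normal((256, 4096))

mean, components = fit_pca(clip_features, n_components=128)
normalized = reduce_and_normalize(clip_features, mean, components)
print(normalized.shape)  # (256, 128)
```

In this sketch the PCA basis is fitted once on the training features and the same `mean` and `components` are reused for a sequence to be detected, which mirrors the way the abstract applies the same PCA step in both the training and the detection stage.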

Description

technical field

[0001] The invention relates to a video understanding method, and in particular to a video understanding method based on deep learning.

Background technique

[0002] With the rapid development of the Internet, humanity has gradually entered the era of big data. The Internet holds a large amount of picture and video data from many different sources, and most of it has no accompanying text description, which makes large-scale processing of this data considerably difficult. It is easy for a human to write a descriptive text based on the content of a picture or video, but quite difficult for a computer to perform the same task. The topic of image / video captioning has thus come into view. It is a comprehensive problem that combines computer vision, natural language processing and machine learning. It is similar to translating a picture / vi...

Claims
