Video recommendation method based on multimodal video content and multi-task learning

The technology applies multi-task learning and multimodal video content to video recommendation. It addresses the cold-start problem and the dependence on user-uploaded metadata that may describe videos inaccurately, and it reduces the scale of the model parameters.

Status: Inactive. Publication Date: 2021-05-25

SOUTH CHINA UNIV OF TECH +1

Problems solved by technology

[0003] At present, short video recommendation technology faces two important challenges:

(1) Most current recommendation algorithms make recommendations based on user preferences and user behavior and ignore the content of the items. They also suffer from a serious cold-start problem, which causes most videos to be ignored. Even traditional content-based recommendation methods do not perform well, because they rely on metadata rather than on the original video content. The metadata of micro-videos is uploaded by users and may describe the videos inaccurately, so how to use the multimodal information of videos effectively has become an important challenge for video recommendation.

(2) A single-task recommendation model cannot meet the current demand for multiple tasks. In video recommendation, it is necessary to predict not only whether the user will watch a video, but also the user's rating of the video, whether the user will like it, whether the user will forward it, and so on.


Embodiment

[0042] Figure 1 shows the flow chart of the video recommendation method based on multimodal video content and multi-task learning disclosed in the present invention, which comprises the following steps:

[0043] T1. Video multimodal feature extraction:

[0044] a. Video frame extraction. The OpenCV video reading class cv2.VideoCapture is used to capture the video frames as pictures and save them in the path folder, with frame numbering starting from 0. Because short videos are brief and information-dense, every frame of the video is captured, without skipping frames.
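The frame extraction step can be illustrated with a minimal sketch. It assumes the opencv-python package is installed; the function names, the output directory argument, and the zero-padded file name pattern are hypothetical choices made for illustration and are not taken from the patent text.

```python
import os


def frame_filename(index):
    """Hypothetical naming scheme: zero-padded frame index, starting from 0."""
    return f"frame_{index:06d}.jpg"


def extract_frames(video_path, out_dir):
    """Save every frame of the video as an image, without skipping frames.

    Returns the number of frames written.
    """
    import cv2  # imported here so the helper above works without OpenCV

    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"cannot open video: {video_path}")

    count = 0
    try:
        while True:
            ok, frame = cap.read()  # ok becomes False at the end of the stream
            if not ok:
                break
            cv2.imwrite(os.path.join(out_dir, frame_filename(count)), frame)
            count += 1
    finally:
        cap.release()
    return count
```

Calling `extract_frames("clip.mp4", "frames/clip")` (both paths are placeholders) would write one image per frame into the folder and return the frame count.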

[0045] b. Static video feature extraction. Each frame of the video is resized to [299, 299] and then input into the pre-trained Inception-V3 network, as shown in Figure 1. The input is mapped to a 2048-dimensional feature vector, which serves as the original static feature vector of the video frame. In order to preserve the information of each frame of the vide...
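One possible way to obtain a 2048-dimensional vector per frame is sketched below. It assumes PyTorch, torchvision (0.13 or later), NumPy, and opencv-python are installed. The choice of torchvision's ImageNet weights, the ImageNet mean and standard deviation normalization, and the function name are assumptions; the patent text only states the input size, the network, and the output dimension.

```python
INPUT_SIZE = (299, 299)  # frame size stated in the text
FEATURE_DIM = 2048       # output dimension stated in the text


def extract_static_features(frames_bgr):
    """Map a list of BGR frames (as read by OpenCV) to an (N, 2048) array."""
    # Heavy dependencies are imported lazily; all are assumed to be installed.
    import cv2
    import numpy as np
    import torch
    from torchvision.models import Inception_V3_Weights, inception_v3

    model = inception_v3(weights=Inception_V3_Weights.DEFAULT)
    model.fc = torch.nn.Identity()  # drop the classifier, keep pooled features
    model.eval()

    # Assumed normalization: standard ImageNet statistics.
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

    batch = []
    for frame in frames_bgr:
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        rgb = cv2.resize(rgb, INPUT_SIZE)
        x = (rgb.astype(np.float32) / 255.0 - mean) / std
        batch.append(x.transpose(2, 0, 1))  # HWC -> CHW

    with torch.no_grad():
        features = model(torch.from_numpy(np.stack(batch)))
    return features.numpy()
```

Replacing the final fully connected layer with an identity is a common way to expose the pooled 2048-dimensional activations; whether the patent uses exactly this layer is not stated in the available text.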


Abstract

The invention discloses a video recommendation method based on multimodal video content and multi-task learning, comprising the following steps: the visual, audio, and text features of short videos are extracted with pre-trained models; the multimodal features are fused; the DeepWalk method is used to learn a feature representation of the user's social relationships; a deep neural network model based on the attention mechanism is proposed to learn multi-domain feature representations; and the feature embeddings generated in the above steps are used as the shared layer of the multi-task model, after which the prediction result for each task is generated by a separate multi-layer perceptron. The invention uses the attention mechanism combined with user features to fuse the multimodal features of a video, making the recommendation richer and more personalized. At the same time, considering the importance of interaction features when learning multi-domain features for recommendation, the deep neural network model based on the attention mechanism enriches the learning of high-level features and provides users with more accurate personalized video recommendations.
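The idea of fusing modality features with user-conditioned attention and then feeding a shared embedding to per-task predictors can be sketched in plain Python. This is an illustration under stated assumptions, not the patented model: the scaled dot-product scoring, the assumption that all vectors have already been projected to the same dimension, the single sigmoid unit standing in for each task's multi-layer perceptron, and all names and numbers are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_modalities(user_vec, modality_vecs):
    """Attention-weighted sum of modality vectors, scored against the user vector.

    Assumes every vector has the same length as user_vec.
    """
    names = list(modality_vecs)
    scale = math.sqrt(len(user_vec))
    weights = softmax([dot(user_vec, modality_vecs[n]) / scale for n in names])
    fused = [
        sum(w * modality_vecs[n][i] for w, n in zip(weights, names))
        for i in range(len(user_vec))
    ]
    return fused, dict(zip(names, weights))


def task_head(shared, head_weights, bias):
    """One sigmoid unit as a stand-in for a per-task multi-layer perceptron."""
    return 1.0 / (1.0 + math.exp(-(dot(shared, head_weights) + bias)))


# Toy inputs (made-up numbers, for illustration only).
user = [1.0, 0.0]
modalities = {"visual": [2.0, 0.0], "audio": [0.0, 2.0], "text": [0.0, 0.0]}
fused, attention = fuse_modalities(user, modalities)

# The same fused vector is shared by every task; only the heads differ.
p_watch = task_head(fused, [0.5, -0.5], 0.0)
p_like = task_head(fused, [-0.2, 0.8], 0.1)
```

In this toy setup the visual modality receives the largest attention weight because it aligns best with the user vector, and both task heads read the same fused vector, which mirrors the shared-layer arrangement described in the abstract.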

Description

Technical field

[0001] The invention relates to the technical field of network video and recommendation systems, in particular to a video recommendation method based on multimodal video content and multi-task learning.

Background

[0002] With the rapid spread of smart mobile terminals and the development of multimedia technology, video has gradually become a carrier of information dissemination. In recent years, short videos have risen rapidly. Video has become a main form of entertainment, and users' interests have become more diverse. The rapid increase in the number of short videos has brought about a serious problem of information overload. How to find the videos that interest users in massive amounts of data has become a hot research topic. A good recommendation system can not only help consumers find interesting or even potentially interesting videos faster and more conveniently, but also help content providers increase profits an...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): H04N21/25, H04N21/466, G06F16/783, G06N3/04, G06N3/08

CPC: H04N21/251, H04N21/4666, H04N21/4668, G06F16/7844, G06F16/7834, G06F16/783, G06N3/084, G06N3/045

Inventor: 史景伦, 邓丽, 梁可弘, 傅钎栓, 林阳城

Owner: SOUTH CHINA UNIV OF TECH
