Multi-modal label recommendation model construction method and device of multi-level attention mechanism

A model construction method based on attention technology, applied in neural learning methods, biological neural network models, neural architectures, etc. It addresses poor recommendation quality caused by insufficient feature mining, specifically the failure to consider the spatial correlation of image features and the time-series correlation of text features. The stated benefits are support for rich scenarios, suitability for real production environments, and a simple structure.

Active Publication Date: 2020-07-28

NORTHWEST UNIV(CN)

3 Cites, 11 Cited by

Problems solved by technology

[0004] The purpose of the present invention is to provide a multi-modal label recommendation model construction method and device with a multi-level attention mechanism. It addresses a shortcoming of prior-art recommendation methods and devices: they do not consider the spatial correlation of the image features themselves or the time-series correlation of the text features themselves, so features are mined insufficiently and the recommendation effect is poor.

Examples

Embodiment 1

[0058] This embodiment provides a multi-modal label recommendation model construction method with a multi-level attention mechanism. The model built by this method introduces multi-modal bilinear fusion and a hierarchical attention mechanism, which not only fully integrates the multi-modal features but also mines more effective information from them, thereby improving recommendation performance.

[0059] The recommendation model provided by the present invention includes attention layers at two different levels for calculating attention factors. The low-level attention layers (the second attention layer and the third attention layer) operate on the single-modality image and text features respectively, and the high-level attention layer (the first attention layer) operates on the multi-modal features after fusion. The two-level attention ...
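The attention-factor calculation described above can be illustrated with a minimal NumPy sketch. It is not the patent's implementation: the scoring form (a tanh projection followed by a softmax), the parameter names `w` and `v`, and all shapes are assumptions chosen for illustration. The high-level layer is assumed to have the same form, applied to the fused features instead of single-modality ones.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_layer(units, w, v):
    """Compute one attention factor per unit and reweight the units.

    units: (n_units, d) features, e.g. image regions or text time steps.
    w (d, h) and v (h,) are hypothetical learned parameters.
    """
    scores = np.tanh(units @ w) @ v            # one score per unit
    factors = softmax(scores)                  # attention factors, sum to 1
    return factors, units * factors[:, None]   # element-wise reweighting

rng = np.random.default_rng(0)
d, h = 8, 4
image_regions = rng.normal(size=(49, d))  # assumed 7x7 grid of region features
text_steps = rng.normal(size=(20, d))     # assumed 20 time-step features

# Low level: a separate attention layer for each modality.
img_factors, img_attended = attention_layer(
    image_regions, rng.normal(size=(d, h)), rng.normal(size=h))
txt_factors, txt_attended = attention_layer(
    text_steps, rng.normal(size=(d, h)), rng.normal(size=h))
```

Each modality gets its own parameters, so the image layer can learn which regions matter while the text layer learns which time steps matter.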

Embodiment 2

[0126] A multi-modal tag recommendation method based on a multi-level attention mechanism, performed in the following steps:

[0127] Step A: obtain a set of data to be recommended, the data including image data and text data;

[0128] Step B: input the data to be recommended into the multi-modal tag recommendation model obtained by the model construction method of Embodiment 1, and obtain the recommendation result.
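As a rough illustration of how a recommendation result might be read off the model output in Step B, the sketch below picks the highest-scoring tags from a probability vector. The function name `recommend_tags`, the tag vocabulary, the probabilities, and the top-k selection rule are all hypothetical; the source does not state how the final tags are chosen.

```python
import numpy as np

def recommend_tags(probabilities, tag_vocabulary, k=5):
    """Return the k tags with the highest predicted probability."""
    probabilities = np.asarray(probabilities, dtype=float)
    k = min(k, len(tag_vocabulary))
    # Sort descending; a stable sort keeps vocabulary order on ties.
    top = np.argsort(-probabilities, kind="stable")[:k]
    return [tag_vocabulary[i] for i in top]

# Hypothetical classification-layer output for one image-text pair.
tags = ["#sunset", "#food", "#travel", "#cat"]
probs = [0.10, 0.05, 0.60, 0.25]
top_tags = recommend_tags(probs, tags, k=2)
print(top_tags)  # ['#travel', '#cat']
```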

[0129] In this embodiment, the provided label recommendation method is tested. The experiment uses the Linux 7.2.1511 operating system, an Intel(R) Xeon(R) E5-2643 CPU, 251GB of memory, and two GeForce GTX1080Ti graphics cards with 11GB of video memory each; the deep learning framework is Keras version 2.0. The dataset used is based on a Tweet dataset of real users, which contains 334,989 pictures and corre...

Embodiment 3

[0146] This embodiment provides a multi-modal label recommendation model construction device with a multi-level attention mechanism. The device includes a data acquisition module, a data preprocessing module and a network construction module;

[0147] The data acquisition module is used to obtain multiple sets of data and the labels corresponding to each set of data, forming a data set and a label set;

[0148] Each set of data includes image data and text data;

[0149] The data preprocessing module is used to preprocess the data set to obtain the preprocessed data set;

[0150] The preprocessing includes unifying the size of the image data;

[0151] The network construction module is used to train the neural network with the preprocessed data set as input and the label set as reference output;

[0152] The neural network includes a feature extraction module, a feature fusion layer, a first attention layer, and a classification layer connected in series;

[0153] The feature extraction m...
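The size-unification step of the data preprocessing module ([0150]) might look like the NumPy sketch below. The target size of 224x224, the nearest-neighbour sampling, and the helper names `unify_size` and `preprocess` are assumptions; the source only states that image sizes are unified.

```python
import numpy as np

def unify_size(image, size=(224, 224)):
    """Resize an (H, W, C) image to `size` by nearest-neighbour sampling."""
    h, w = image.shape[:2]
    new_h, new_w = size
    rows = np.arange(new_h) * h // new_h   # source row for each output row
    cols = np.arange(new_w) * w // new_w   # source column for each output column
    return image[rows[:, None], cols]

def preprocess(images, size=(224, 224)):
    # Stack the resized images into one (N, H, W, C) batch.
    return np.stack([unify_size(img, size) for img in images])

# Two placeholder images with different original sizes.
images = [np.zeros((300, 400, 3), dtype=np.uint8),
          np.ones((128, 96, 3), dtype=np.uint8)]
batch = preprocess(images)
print(batch.shape)  # (2, 224, 224, 3)
```

Once every image has the same shape, the data set can be fed to the network in fixed-size batches.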

Abstract

The invention discloses a multi-modal label recommendation model construction method and device with a multi-level attention mechanism. The method comprises: extracting features of the image, performing bilinear fusion on the image features using an outer product, obtaining an attention factor for each region of the image through an attention network layer, and multiplying the attention factors element-wise with the original features to obtain the final image feature expression; performing word embedding on the text, extracting text features with a Bi-LSTM network, and weighting the text features through an attention network layer to obtain the final text information expression; then fusing the image features and the text features through a bilinear fusion layer and inputting the fused features into a high-level attention layer to obtain the final joint feature expression; and finally sending the joint feature expression to a classification layer for label classification and recommendation. For multi-modal information processing, combining the hierarchical attention mechanism in this way improves recommendation accuracy.
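The bilinear fusion and classification steps of the abstract can be sketched in NumPy as follows. This is an illustrative simplification, not the patented network: it assumes each modality has already been reduced to a single vector, it omits the high-level attention layer, and the softmax classifier, dimensions, and random weights are all hypothetical.

```python
import numpy as np

def bilinear_fuse(image_vec, text_vec):
    # The outer product holds every pairwise image-text feature interaction;
    # flattening turns the (d_img, d_txt) matrix into one joint vector.
    return np.outer(image_vec, text_vec).ravel()

def classify(fused, weights, bias):
    # Hypothetical softmax classification layer over the tag vocabulary.
    logits = fused @ weights + bias
    e = np.exp(logits - logits.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d_img, d_txt, n_tags = 6, 4, 10
image_vec = rng.normal(size=d_img)   # stands in for the final image feature expression
text_vec = rng.normal(size=d_txt)    # stands in for the final text information expression

fused = bilinear_fuse(image_vec, text_vec)   # shape (24,)
probs = classify(fused, rng.normal(size=(d_img * d_txt, n_tags)), np.zeros(n_tags))
```

The fused dimension is the product of the two input dimensions, which is why practical bilinear models often compress or pool the outer product before classification.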

Description

Technical field

[0001] The present invention relates to a recommendation method and device, in particular to a method and device for constructing a multi-modal tag recommendation model with a multi-level attention mechanism.

Background

[0002] Multi-modal label recommendation has been a very popular research direction in the fields of artificial intelligence and recommendation systems in recent years. Its purpose is to use multi-modal information to improve the accuracy of tag recommendation. Multi-modal tag recommendation has a wide range of research needs and application scenarios in both industry and academia. Realizing low-cost, high-performance multi-modal tag recommendation with the help of artificial intelligence and deep learning technology would be of great significance to both academia and industry.

[0003] Traditional recommendation algorithms only consider single-modal (image or text) information. In recent years, with the developm...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06K9/62, G06N3/04, G06N3/08, G06F16/9535, G06F16/958

CPC: G06N3/08, G06F16/9535, G06F16/958, G06N3/044, G06N3/045, G06F18/241, G06F18/25, G06F18/214

Inventor: 李展, 徐宝胜, 王凯凯, 田晓杰, 赵国英, 章勇勤, 王珺, 李斌, 杨溪, 彭进业

Owner: NORTHWEST UNIV(CN)
