Multi-modal label recommendation model construction method and device of multi-level attention mechanism

A construction method and attention technology, applied to neural learning methods, biological neural network models, neural architectures, and related fields. It addresses the poor recommendation performance caused by insufficient feature mining, which arises when the spatial correlation of image features and the time-series correlation of text features are not considered. It achieves the effects of supporting rich scenarios, suiting real production environments, and having a simple structure.

Active Publication Date: 2020-07-28
NORTHWEST UNIV(CN)

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a multi-modal label recommendation model construction method and device with a multi-level attention mechanism. It solves the problem that prior-art recommendation methods and devices consider neither the spatial correlation of image features nor the time-series correlation of text features, which leads to insufficient feature mining and, in turn, poor recommendation performance.



Examples


Embodiment 1

[0058] This embodiment provides a multi-modal label recommendation model construction method with a multi-level attention mechanism. The model constructed by this method introduces multi-modal bilinear fusion and a hierarchical attention mechanism, which both fully integrates the multi-modal features and mines more effective information from them, thereby improving recommendation performance.

[0059] The recommendation model provided by the present invention includes attention layers at two different levels for calculating attention factors. The low-level attention layers (the second attention layer and the third attention layer) operate on the single-modality image features and text features respectively, and the high-level attention layer (the first attention layer) operates on the fused multi-modal features. The two-level attention ...
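As a rough illustration of what one attention layer of this kind does, the sketch below computes an attention factor for each feature unit (an image region or a text time step) and reweights the original features by it. The scoring vector `w`, the function names, and the softmax normalization are assumptions made for illustration; the source does not specify the form of the attention network.

```python
import math

def softmax(scores):
    """Normalize raw scores into attention factors that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(features, w):
    """Reweight each feature vector by its attention factor.

    features: list of vectors, one per image region or text time step
    w: hypothetical learned scoring vector (an assumption, not from the source)
    """
    scores = [sum(x * wi for x, wi in zip(f, w)) for f in features]
    factors = softmax(scores)
    weighted = [[a * x for x in f] for a, f in zip(factors, features)]
    return factors, weighted

regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
factors, weighted = attend(regions, w=[0.5, 0.5])
```

Under these assumptions, the same `attend` function could serve as a low-level layer (applied to image regions or text time steps) or as the high-level layer (applied to the fused features); only the input changes.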

Embodiment 2

[0126] A multi-modal label recommendation method based on a multi-level attention mechanism. The recommendation method is performed in the following steps:

[0127] Step A: obtain a set of data to be recommended, the data including image data and text data;

[0128] Step B: input the data to be recommended into the multi-modal label recommendation model obtained by the multi-modal label recommendation model construction method of the multi-level attention mechanism in Embodiment 1, and obtain the recommendation result.
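The final part of Step B, turning model output into a recommendation, might look like the minimal sketch below. It assumes the classification layer yields one score per candidate label and that the recommendation is the top-k labels by score; the source states neither how many labels are returned nor how they are selected. The label names and scores are made up for illustration.

```python
def recommend_labels(scores, labels, k=3):
    """Return the k labels with the highest scores, best first."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    return [label for _, label in ranked[:k]]

# Hypothetical classifier output for one image-text pair.
labels = ["#travel", "#food", "#music", "#sports"]
scores = [0.10, 0.55, 0.05, 0.30]
top = recommend_labels(scores, labels, k=2)
```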

[0129] In this embodiment, the provided label recommendation method is tested. The experiment uses the Linux 7.2.1511 operating system, an Intel(R) Xeon(R) E5-2643 CPU, 251 GB of memory, and two GeForce GTX 1080 Ti graphics cards with 11 GB of video memory each. The deep learning framework is Keras version 2.0. The dataset is a Tweet dataset from real users, which contains 334,989 pictures and corre...

Embodiment 3

[0146] This embodiment provides a multi-modal label recommendation model construction device with a multi-level attention mechanism. The device includes a data acquisition module, a data preprocessing module, and a network construction module;

[0147] The data acquisition module is used to obtain multiple sets of data and the labels corresponding to each set of data, yielding a data set and a label set;

[0148] Each set of data includes image data and text data;

[0149] The data preprocessing module is used to preprocess the data set to obtain the preprocessed data set;

[0150] The preprocessing includes unifying the size of the image data;

[0151] The network construction module is used to train the neural network with the preprocessed data set as input and the label set as reference output;

[0152] The neural network includes a feature extraction module, a feature fusion layer, a first attention layer, and a classification layer connected in series;

[0153] The feature extraction m...
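The size unification named in paragraph [0150] could be done in many ways. The sketch below uses nearest-neighbor resampling on an image stored as a nested list of pixel values. The interpolation method, the function name, and the target size are all assumptions, since the source only says that image sizes are unified.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2-D image (list of rows) to out_h x out_w by nearest neighbor."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

small = [[1, 2],
         [3, 4]]
resized = resize_nearest(small, 4, 4)
```

In practice an image library would handle this step; the point here is only that every image leaves preprocessing with the same height and width, so the feature extraction module receives inputs of a fixed shape.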


Abstract

The invention discloses a multi-modal label recommendation model construction method and device with a multi-level attention mechanism. The method comprises: extracting features of the image, performing bilinear fusion on the image features by using an outer product, obtaining an attention factor for each region in the image through an attention network layer, and taking the element-wise product of the attention factors and the original features to obtain the final image feature expression; performing word embedding on the text, extracting text features by using a Bi-LSTM network, and then multiplying the text features by the output of an attention network layer to obtain the final text information expression; then fusing the image features and the text features through a bilinear fusion layer and inputting the fused features into a high-level attention layer to obtain the final joint feature expression; and finally sending the final joint feature expression into a classification layer for label classification and recommendation. For multi-modal information processing, combining the hierarchical attention mechanism in this way improves recommendation accuracy.
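The outer-product fusion mentioned in the abstract can be sketched as follows. This is a minimal illustration that assumes each modality has already been reduced to a single feature vector and that the outer product is simply flattened; any learned projection or pooling the actual bilinear fusion layer may apply is not described in the source and is omitted here.

```python
def bilinear_fuse(image_vec, text_vec):
    """Flattened outer product: one entry per (image, text) feature pair."""
    return [i * t for i in image_vec for t in text_vec]

# Toy feature vectors (made-up values for illustration).
fused = bilinear_fuse([1.0, 2.0], [3.0, 4.0, 5.0])
```

The fused vector has length `len(image_vec) * len(text_vec)`, so every image feature interacts with every text feature before the high-level attention layer sees the result.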

Description

Technical field

[0001] The present invention relates to a recommendation method and device, in particular to a method and device for constructing a multi-modal tag recommendation model with a multi-level attention mechanism.

Background technique

[0002] Multimodal label recommendation is a very popular research direction in the fields of artificial intelligence and recommendation systems in recent years. Its purpose is to use multi-modal information to improve the accuracy of tag recommendation. Multi-modal tag recommendation has a wide range of research needs and application scenarios in both industry and academia. If it is possible to realize low-cost and high-performance multi-modal tag recommendation with the help of artificial intelligence and deep learning technology, it will be of great significance to both academia and industry.

[0003] Traditional recommendation algorithms only consider single-modal (image or text) information. In recent years, with the developm...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N3/04, G06N3/08, G06F16/9535, G06F16/958
CPC: G06N3/08, G06F16/9535, G06F16/958, G06N3/044, G06N3/045, G06F18/241, G06F18/25, G06F18/214
Inventor: 李展, 徐宝胜, 王凯凯, 田晓杰, 赵国英, 章勇勤, 王珺, 李斌, 杨溪, 彭进业
Owner: NORTHWEST UNIV (CN)