
Diversified recommendation method and system based on reinforcement learning and storage medium

A reinforcement-learning-based recommendation technology, applied in neural learning methods, biological neural network models, instruments, etc. It addresses the difficulty of defining a general scoring formula, of obtaining sufficient training samples, and of reaching a global optimum, and aims to maximize long-term benefit.

Pending Publication Date: 2022-01-28
CHINA TOBACCO ZHEJIANG IND

AI Technical Summary

Problems solved by technology

Although the scoring-formula approach is simple, a general scoring formula is very hard to define: its parameters usually have to be tuned by hand for each usage environment, and a global optimum is hard to reach.
Supervised learning methods have also been used for diversified recommendation, but sufficient training samples are very hard to obtain, and those that are obtained may differ substantially from the samples seen in actual operation. In addition, diversity evaluation metrics cannot be used directly to guide the training process.



Examples


Embodiment approach

[0053] In a preferred implementation, step S1 specifically includes:

[0054] Input the labeled training sample set, which contains supervised samples. Determine the algorithm parameters, including the recommendation list length T, the exploration-probability decay coefficient ξ, and the supervision loss function coefficients λ and τ, and initialize each of them.
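
The parameters named in step S1 can be gathered in a small configuration object. The following is a minimal sketch only: the class and field names, every default value, and the multiplicative decay rule are assumptions made for illustration and are not taken from the patent text.

```python
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    """Parameters named in step S1. All defaults are illustrative assumptions."""
    list_length: int = 10        # T: length of the recommendation list
    explore_decay: float = 0.99  # xi: exploration-probability decay coefficient
    sup_lambda: float = 0.5      # lambda: supervision loss coefficient
    sup_tau: float = 0.1         # tau: supervision loss coefficient
    explore_prob: float = 1.0    # starting exploration probability (assumed)

    def __post_init__(self) -> None:
        # Basic sanity checks so that a bad setting fails early.
        if self.list_length <= 0:
            raise ValueError("list_length (T) must be positive")
        if not 0.0 < self.explore_decay <= 1.0:
            raise ValueError("explore_decay (xi) must lie in (0, 1]")
        if not 0.0 <= self.explore_prob <= 1.0:
            raise ValueError("explore_prob must lie in [0, 1]")


def decay_exploration(cfg: TrainingConfig) -> float:
    """Assumed schedule: multiply the exploration probability by xi."""
    cfg.explore_prob *= cfg.explore_decay
    return cfg.explore_prob
```

Calling `decay_exploration` once per training round would shrink the exploration probability geometrically; the source text does not state the actual schedule, so a different rule may apply.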

[0055] In a preferred implementation, the training samples in step S1 are obtained as follows:

[0056] A recommendation list is generated with an LSTM as follows:

[0057] a) Input a user's interest feature vector and the candidate item set, and initialize the LSTM hidden state and the decision sequence;

[0058] b) Feed the user interest vector to the LSTM as its state;

[0059] c) Process the candidate items one by one and calculate the selection probability of each item. When the maximum selection probability is less than the exploration probability, the random colle...
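
The selection rule in step c) can be sketched roughly as follows. Because the source text is cut off, the exploration branch here (picking a remaining item uniformly at random) is an assumption, as are the softmax normalization and all function names. `score_fn` is a hypothetical stand-in for the LSTM output and is not an LSTM itself.

```python
import math
import random
from typing import Callable, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into selection probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def build_list(
    candidates: Sequence[int],
    score_fn: Callable[[List[int], int], float],
    list_length: int,
    explore_prob: float,
    rng: random.Random,
) -> List[int]:
    """Pick up to list_length items, one per step, with optional exploration."""
    remaining = list(candidates)
    chosen: List[int] = []
    while remaining and len(chosen) < list_length:
        # score_fn stands in for the model: it sees the items chosen so far
        # and one candidate, and returns an unnormalized score.
        probs = softmax([score_fn(chosen, c) for c in remaining])
        best = max(range(len(remaining)), key=probs.__getitem__)
        if probs[best] < explore_prob:
            # Assumed exploration branch: uniform random pick.
            idx = rng.randrange(len(remaining))
        else:
            idx = best
        chosen.append(remaining.pop(idx))
    return chosen
```

With `explore_prob` set to 0 the sketch reduces to greedy selection; larger values make random picks more frequent whenever the model is not confident about any single item.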


Abstract

The invention provides a diversified recommendation method based on reinforcement learning, comprising the following steps: S1, obtaining training samples, and determining and initializing the network parameters; S2, executing the policy to generate actions; S3, evaluating and optimizing the policy; S4, supervising the loss through the critic network; and S5, updating the exploration probability. During long-term operation, rewards drive the search for the optimal recommendation sequence: through trial and error, good recommendation actions earn high rewards, and the method ultimately learns the optimal recommendation list for each state so as to maximize long-term return.
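
The order of steps S1 to S5 can be illustrated with a small control-flow skeleton. This is a sketch under stated assumptions: the abstract does not say how the steps iterate, so running S2 to S5 once per episode is an assumption, and every callable passed in is a hypothetical placeholder rather than part of the patented method.

```python
from typing import Callable, List


def run_training(
    episodes: int,
    init: Callable[[], dict],
    act: Callable[[dict], list],
    evaluate_and_optimize: Callable[[dict, list], float],
    supervise: Callable[[dict], float],
    update_exploration: Callable[[dict], None],
) -> List[float]:
    """Call the five steps in order and return the per-episode rewards."""
    state = init()                                   # S1: samples and parameters
    rewards: List[float] = []
    for _ in range(episodes):
        actions = act(state)                         # S2: policy generates actions
        reward = evaluate_and_optimize(state, actions)  # S3: evaluate and optimize
        supervise(state)                             # S4: supervision loss
        update_exploration(state)                    # S5: update exploration prob.
        rewards.append(reward)
    return rewards
```

Passing the steps in as callables keeps the skeleton independent of any particular network or loss, which the abstract does not specify.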

Description

Technical field

[0001] The invention relates to the field of e-commerce item recommendation, in particular to a reinforcement-learning-based diversified recommendation method, system and storage medium.

Background

[0002] Today's e-commerce platforms make wide use of recommendation systems, which predict a user's preference for certain items by analyzing the user's historical behavior and recommend a group of items to that user, thereby filtering information among a massive number of items. Early recommendation methods are mainly content-based recommendation or collaborative filtering. There are many specific implementations: some use the similarity between users, some use the similarity between items, and some match user characteristics with product characteristics. The early methods have significant limitations, and a large number of improved algorithms have since been proposed, and the evaluation of...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/9535, G06F16/9536, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/9535, G06F16/9536, G06N3/08, G06N3/044, G06F18/241, G06F18/2415
Inventor: 高扬华, 楼卫东, 陆海良, 郁钢
Owner: CHINA TOBACCO ZHEJIANG IND