A multi-user sharing-oriented multimedia network video recommendation method

A network video recommendation method applied in the field of big data multimedia network video applications. It addresses problems such as user resentment and reduced recommendation accuracy, and aims to improve utilization, realize incremental updates, and increase computing speed and the utilization of computing resources.

Pending Publication Date: 2021-10-01
NANJING UNIV OF POSTS & TELECOMM

AI-Extracted Technical Summary

Problems solved by technology

However, random or excessive exploration may lead to a decrease in the accuracy of the ...

Method used

Step 5: based on the time-invariant LinUCB algorithm, establish the item quality model based on the separated multi-user association information according to the separated multi-user mixed behavior log records; the item quality model is used to guarantee program quality;
[0117] In this embodiment, as shown in FIG. 3, the online recommendation module proposed by the present invention consists of four parts: feature extraction, user interest mining model, item quality model, and the final cross-weighted integration strategy. The specific process is as follows: The feature extraction process mainly involves two important features, namely the one-hot encoding of the program theme and the one-hot encoding of the program itself, which are the input features of the user interest mining model and the item quality model respectively. Then, according to the personal information of the target user, the time-varying LinUCB algorithm is used to construct a user interest mining model (see step 4 below). Using the time-invariant LinUCB algorithm, an item quality model based on separated multi-user association information is established (see step 5 below). Finally, the present invention integrates the item quality model into the user interest model in a cross-weighted manner (see step 6 below), which helps the online recommendation module reduce the risk during the exploration process.
[0170] On the right side of the formula, the first half improves the degree of utilization of the known interests of the target user u, and the second half realizes the guaranteed exploration of the unknown interests of the target user u. The normalized term is obtained by normalizing $p^{v}_{u,t}$. The normalization process enables the recommendation system to adaptively adjust the proportion of exploration, so as to achieve a personalized balance between exploration and utilization. Finally, the attention score $s_{u,t}$ of the target user u is used to enhance the adaptability to the interests of the target user u and ensure the accuracy of the recommendation system.
[0218] As shown in Figure 8a, compared with LinUCB-1, LinUCB-3 has a great performance improvement in terms of accuracy, which shows that the present invention proposes to calculate and control the exploration ratio through the attention mechanism, which improves significantly the accuracy of t...

Abstract

The invention discloses a multi-user sharing-oriented multimedia network video recommendation method, which comprises the following steps: firstly, constructing multi-user characteristics by utilizing collected program information in a multi-user sharing environment, and constructing a leading user label according to the similarity of the program characteristics and the continuity of user watching behaviors, so that separation of multi-user mixed logs is realized; performing periodic multi-user identification prediction of future sessions; secondly, building a user interest mining model based on a time-varying LinUCB algorithm to learn interest changes of a user for each program theme, and enhancing the personalized ability and efficiency of a recommendation system from three aspects of parallel calculation, adaptive control of an exploration coefficient and incremental updating based on LSTM; and finally, establishing an article quality model based on a non-time-varying LinUCB algorithm to further ensure the program quality, and integrating the two algorithms into a final recommendation system model by adopting a cross weighting strategy to form a final program recommendation list. The novelty and accuracy of the recommendation result are ensured.

Application Domain

Digital data information retrieval, Special data processing applications

Technology Topic

Personalization, Engineering +7


Examples


Example Embodiment

[0089] The specific embodiments of the present invention will be further described in detail below in conjunction with the accompanying drawings.
[0090] In this embodiment, the data set comes from a certain operator's IPTV set-top boxes. Viewing records of 1100 users over a period of three months were selected from the IPTV video system, involving a total of 498,309 log records and 2830 programs.
[0091] As shown in Figure 1, the present invention provides a multi-user sharing-oriented multimedia network video recommendation method, which comprises the following steps:
[0092] Step 1: Collect the multi-user mixed behavior log record data of multiple users watching online video, and process this data through data cleaning, data integration and data resampling.
[0093] (1-1) IPTV set-top boxes are used to collect multi-user mixed behavior log record data of multi-user watching online video at an interval of 5 minutes. The collected multi-user mixed behavior log record data includes the following data fields: collection time collect_time, user ID user_id, program name program_name, program ID program_id, service start time start_time and service end time end_time.
[0094] (1-2) Data cleaning: For two or more completely repeated records in the user behavior log records of the same user ID, only the first user behavior log record is kept, and the rest of the user behavior log records are deleted;
[0095] (1-3) Data integration: Merge the continuous user behavior log records of each user;
[0096] (1-4) Data resampling: process the time data in units of hours, split user behavior log records that span more than one hour at the hour boundaries, and obtain the following fields: the start time after resampling resampled_start_time, the end time after resampling resampled_end_time, and the watched time watched_time, where the watched time is the difference between the resampled start time and the resampled end time.
[0097] Step 2: Crawl the text description information of all programs in the multi-user mixed behavior log record data and perform text information processing on it to construct program topic labels; then use the program topic labels together with the processed multi-user mixed behavior log records to construct multi-user feature labels.
[0098] (2-1) Crawl the text description information of all programs in the multi-user mixed behavior log record data. The text description information includes the following fields: total program duration program_full_time, program profile program_description, program director program_director, program country program_country and program type program_type.
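Steps (1-2) to (1-4) can be illustrated with a minimal Python sketch. The record layout (dictionaries with `user_id`, `program_id`, `start_time`, `end_time`) and the function names are assumptions made for illustration, and the merge rule (same user, same program, no gap between records) is one possible reading of "continuous" rather than the patent's exact criterion.

```python
from datetime import datetime, timedelta

def clean(records):
    """(1-2) Keep only the first of any fully identical records."""
    seen, out = set(), []
    for r in records:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

def integrate(records):
    """(1-3) Merge back-to-back records of the same user and program (assumed rule)."""
    out = []
    for r in sorted(records, key=lambda r: (r["user_id"], r["start_time"])):
        last = out[-1] if out else None
        if (last is not None
                and last["user_id"] == r["user_id"]
                and last["program_id"] == r["program_id"]
                and last["end_time"] >= r["start_time"]):
            last["end_time"] = max(last["end_time"], r["end_time"])
        else:
            out.append(dict(r))
    return out

def resample_hourly(record):
    """(1-4) Split one record at hour boundaries and compute watched_time."""
    pieces, start = [], record["start_time"]
    while start < record["end_time"]:
        next_hour = start.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
        end = min(next_hour, record["end_time"])
        pieces.append({**record,
                       "resampled_start_time": start,
                       "resampled_end_time": end,
                       "watched_time": end - start})
        start = end
    return pieces
```

A record running from 09:50 to 10:20 is split into a 10-minute piece in the 09:00 hour and a 20-minute piece in the 10:00 hour.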
[0099] (2-2) Integrate the text description information of all programs crawled into a program file, and perform text segmentation and text information processing for removing stop words;
[0100] (2-3) Use the topic classification function of the latent Dirichlet allocation (LDA) model, learning the LDA model with the Gibbs sampling algorithm. The program text description information processed in step (2-2) is used as the input of the LDA model. According to the output of the model, the topic with the highest probability value in the topic distribution of each program document is selected as the program topic label of that program, which realizes program topic classification. The specific method of using the LDA model to obtain the program topic label of each program is as follows:
[0101] A. In the initial stage, each word in the program document is randomly assigned a topic. Count the number of word segments appearing under each topic z and the number of words appearing in topic z under each document m.
[0102] B. Excluding the topic assignment of the current word, estimate the probability distribution of the current word belonging to each topic z according to the topic assignment of all other words, and then sample a new topic for the word according to this probability distribution.
[0103] C. Use the same method to continuously update the topic of the next word until the topic distribution of each program document and the word distribution under each topic converge. The algorithm then stops and outputs these two sets of estimated parameters. In the process of model training, the number of topics in the LDA model is set to K=45, so the topic distribution of each program document is a 1×45-dimensional LDA vector, and the parameter in each dimension indicates the probability of belonging to the corresponding topic.
[0104] D. Select the topic with the highest probability value in the topic distribution of each program file, and use it as the program topic label of the program to realize program topic classification.
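Steps A to D follow the usual collapsed Gibbs sampling scheme for LDA, which can be sketched in plain Python as below. The hyperparameters `alpha` and `beta`, the fixed iteration count used in place of a convergence test, and the function names are assumptions for illustration; the text above fixes only K=45.

```python
import random

def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic distributions."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Step A: random initial topic per word, plus the two count tables.
    z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    topic_word = [dict() for _ in range(n_topics)]   # count of word w under topic k
    topic_total = [0] * n_topics                     # total words under topic k
    doc_topic = [[0] * n_topics for _ in docs]       # words of topic k in document m
    for m, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[m][i]
            topic_word[k][w] = topic_word[k].get(w, 0) + 1
            topic_total[k] += 1
            doc_topic[m][k] += 1
    for _ in range(n_iters):
        for m, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[m][i]
                # Step B: exclude the current assignment ...
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                doc_topic[m][k] -= 1
                # ... and sample a new topic given all other assignments.
                weights = [
                    (doc_topic[m][j] + alpha)
                    * (topic_word[j].get(w, 0) + beta)
                    / (topic_total[j] + vocab_size * beta)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[m][i] = k
                topic_word[k][w] = topic_word[k].get(w, 0) + 1
                topic_total[k] += 1
                doc_topic[m][k] += 1
    # Step C output: smoothed topic distribution of each document.
    theta = []
    for m, doc in enumerate(docs):
        denom = len(doc) + n_topics * alpha
        theta.append([(doc_topic[m][k] + alpha) / denom for k in range(n_topics)])
    return theta

def topic_label(theta_row):
    """Step D: index of the most probable topic in one document's distribution."""
    return max(range(len(theta_row)), key=theta_row.__getitem__)
```

Each row of `theta` plays the role of the 1×K LDA vector of one program document, and `topic_label` gives its program topic label.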
[0105] (2-4) From the collected multi-user mixed behavior log record data, identify the model of the device used to watch online video in each log record, extract the log records belonging to the same device, and construct a corresponding multi-user feature label for each log record. As shown in Figure 2, the multi-user feature labels include the first user label, the second user label, the dominant user label and the interest span label. In this step, to reduce the complexity of the multi-user environment, the time step is set to one hour, the viewing sequence within one time step is defined as a session, and the session is the unit of user switching, so each session lasts one hour. The specific steps for constructing a multi-user feature label for each log record are as follows:
[0106] a. In the initial stage, the program topic label from step (2-3) is taken as the initial label of the user, that is, the preliminary user identity label, called the first user label;
[0107] b. Reorganize the user identity labels according to the continuity of log records, merging the first user labels of continuous log records into one second user label. Two log records are continuous when the time interval between them is less than 2 minutes. Within a period of continuous log records, the first user label with the largest number of log records or the longest viewing time is taken as the user for that period; this is the second user label;
[0108] c. Considering that a user may have several separate runs of continuous log records, set a dominant user label for each session. The second user label with the largest number of log records or the longest viewing time in a session is marked as the dominant user label of the session, and it represents the dominant user of the session. In the subsequent modeling process, the dominant user of the session is the target user of the multimedia network video recommendation system;
[0109] d. According to the log records of a single session, count the number of categories of the program topics watched by the dominant user in the session, and mark the category numbers of the program topics as the user's interest spanning degree label.
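The labelling steps a to d can be sketched for a single session as follows. The record layout (`topic`, `start`, `end`) and function names are hypothetical, and the sketch picks labels by record count only; the text also allows choosing by longest viewing time, which is not implemented here.

```python
from collections import Counter
from datetime import datetime, timedelta

def second_labels(records, max_gap=timedelta(minutes=2)):
    """Step b: split a session into continuous runs and label each run."""
    runs, current = [], []
    for r in sorted(records, key=lambda r: r["start"]):
        # A gap of 2 minutes or more ends the current continuous run.
        if current and r["start"] - current[-1]["end"] >= max_gap:
            runs.append(current)
            current = []
        current.append(r)
    if current:
        runs.append(current)
    labelled = []
    for run in runs:
        # The first user label (topic) with the most records names the run.
        label = Counter(r["topic"] for r in run).most_common(1)[0][0]
        labelled.append((label, run))
    return labelled

def dominant_label_and_span(records):
    """Steps c and d: dominant user label of the session and its interest span."""
    labelled = second_labels(records)
    counts = Counter()
    for label, run in labelled:
        counts[label] += len(run)
    dominant = counts.most_common(1)[0][0]
    topics = {r["topic"] for label, run in labelled if label == dominant for r in run}
    return dominant, len(topics)
```

For a session with three near-contiguous records on topics 3, 3 and 7 followed, after a 15-minute pause, by one record on topic 5, the dominant label is 3 and the interest span is 2.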
[0110] Step 3: Execute the offline periodic multi-user identification prediction method, which predicts the target user who will send a request to the recommendation system in the future. According to the multi-user feature label of the target user, extract the behavior log records of the target user from the processed multi-user mixed behavior log record data to obtain the user behavior log record set of the target user, which realizes the separation of the multi-user mixed behavior log records.
[0111] (3-1) Collect the multi-user mixed behavior log record data of the latest M sessions of a device used to watch online video, and extract time features and sliding window features. The time features include the hour, the day of the week and whether it is a weekend; the sliding window feature is the dominant user label within the sliding window.
[0112] The time features and the sliding window features are described in more detail as follows. For the time features, three pieces of time information (hour, day of the week, and whether it is a weekend) are added to the features of a specific date. For the sliding window features, the sliding window method effectively reflects the medium- and long-term trend of the time series. Since the time step is set to 1 hour in the experiment, the sliding windows are mainly 1 hour and 2 hours. In addition, to widen the range of perceived information and remember long-term changes of the dominant user, the sliding window also includes the dominant user information from one day earlier, namely 24 hours and 25 hours.
[0113] (3-2) Time features and sliding window features are used as the input of the time series classification prediction model, and the dominant user labels of M sessions are used as the output, and the XGBoost algorithm is used to train the time series classification prediction model, and M=3×7×24 is set.
[0114] (3-3) Use the trained time series classification prediction model to predict the dominant user label at each time step in the next N hours. In order to obtain enough training information to ensure the accuracy of the prediction results, let N<
[0115] (3-4) After time slides forward by N hours, repeat steps (3-1) to (3-3), executing the multi-user prediction method with a period of N hours. That is, every N hours, the multi-user mixed behavior log records of the most recent M sessions are used to predict the dominant user label at each time step in the next N hours, to judge who will send a request to the recommendation system in the future while adapting to changes in the multi-user composition.
[0116] (3-5) Based on the dominant user label among the multi-user feature labels constructed in step (2-4), extract the user behavior log records of the target user u from the multi-user mixed behavior log record data to obtain, at time step t, the set $M_{u,t}$ of user behavior log records of online videos watched by the target user u. This realizes the separation of the multi-user mixed behavior log records and provides the identity label and log records of the target user u to the recommendation system.
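The feature construction of step (3-1) can be sketched as below; training the XGBoost classifier itself is omitted. The function name, the dictionary keyed by session start time, and the reading of the "week" feature as day of the week are assumptions for illustration.

```python
from datetime import datetime, timedelta

# Sliding-window offsets named in the text: 1 h, 2 h, and one day earlier (24 h, 25 h).
LAGS_HOURS = (1, 2, 24, 25)

def build_features(session_start, dominant_by_hour):
    """Feature row for the one-hour session that starts at `session_start`.

    `dominant_by_hour` maps the start time of past sessions to their
    dominant user label (hypothetical layout).
    """
    row = {
        "hour": session_start.hour,
        "weekday": session_start.weekday(),            # 0 = Monday
        "is_weekend": int(session_start.weekday() >= 5),
    }
    for lag in LAGS_HOURS:
        past = session_start - timedelta(hours=lag)
        # None when no dominant user is known for that past session.
        row[f"dominant_lag_{lag}h"] = dominant_by_hour.get(past)
    return row
```

Rows built this way for the latest M sessions, paired with each session's dominant user label, would form the training set of the time series classification model.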
[0117] In this embodiment, as shown in Figure 3, the online recommendation module proposed by the present invention consists of four parts: feature extraction, user interest mining model, item quality model, and the final cross-weighted integration strategy. The specific process is as follows: The feature extraction process mainly involves two important features, namely the one-hot encoding of the program theme and the one-hot encoding of the program itself, which are the input features of the user interest mining model and the item quality model respectively. Then, according to the personal information of the target user, the time-varying LinUCB algorithm is used to construct a user interest mining model (see step 4 below). Using the time-invariant LinUCB algorithm, an item quality model based on separated multi-user association information is established (see step 5 below). Finally, the present invention integrates the item quality model into the user interest model in a cross-weighted manner (see step 6 below), which helps the online recommendation module reduce the risk during the exploration process.
[0118] Step 4: Based on the time-varying LinUCB algorithm, a user interest mining model is established according to the user behavior log record set of the target user extracted in step 3, and the user interest mining model is used to explore the potential interest of the user;
[0119] (4-1) From the fields obtained by data resampling in step 1, the fields in the program description information crawled in step 2, the constructed program topic labels and the user's interest span label, generate the parameters required by the user interest mining model. The required parameters include the one-hot encoding of the program theme, the reward value obtained by each program and the interest span of the user's viewing sequence in a session. The parameters are generated as follows:
[0120] (1) One-hot encode the program topic label to obtain the one-hot encoding of the program theme;
[0121] (2) The reward value obtained by each program is represented by the ratio of the viewing time to the total program time;
[0122] (3) Use the user's interest spanning degree tag to indicate the interest spanning degree of the user's viewing sequence in a session.
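The first two parameters of step (4-1) reduce to two small helpers, sketched here. The function names are hypothetical, and capping the reward at 1 is an assumption that the text does not state.

```python
def one_hot(index, size):
    """One-hot encoding of a theme (or program) index."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec

def reward(watched_seconds, full_seconds):
    """Reward of a program: viewing time divided by total program time."""
    # Capping at 1.0 is an assumption, guarding against re-watched segments.
    return min(watched_seconds / full_seconds, 1.0)
```

For example, watching 900 seconds of a 3600-second program gives a reward of 0.25.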
[0123] (4-2) Use a parallel matrix, instead of the multiple serial vectors of traditional LinUCB, to calculate the topic reward vector. This vector is the coefficient to be learned and consists of d elements, where the k-th element is the parameter of the k-th program theme; its dimension is d×1. It is calculated as follows:
[0124]
[0125]
[0126] where $D_{u,t}$ is a matrix with dimension $m_{u,t} \times d$ whose $m_{u,t}$ rows are one-hot theme encodings, and $D_{u,t}^{T}$ is the transpose of $D_{u,t}$. At time step t, the user behavior log records of the online videos watched by the target user u form the set $M_{u,t}$, and $m_{u,t}$ is the number of user behavior log records in $M_{u,t}$. The j-th row of $D_{u,t}$ is the one-hot encoding of the theme of the program in the j-th user behavior log record of $M_{u,t}$. $c_{u,t}$ is a reward vector with dimension $m_{u,t} \times 1$ composed of $m_{u,t}$ reward values $r_{t,a}$, whose j-th element is the reward value obtained by the j-th program in $M_{u,t}$. $A_{u,t}$ is a diagonal matrix with dimension $d \times d$; each diagonal element is the cumulative number of times, before time step t, that the target user u watched programs of the corresponding theme in $M_{u,t}$. $b_{u,t}$ is a vector with dimension $d \times 1$ whose elements are the cumulative rewards obtained by each program theme. The initial values of $A_{u,t}$ and $b_{u,t}$ are $I_d$ and $0_d$.
[0127] (4-3) Still adopting the parallel-matrix idea, calculate the expected feedback revenue $E[r_{u,t} \mid X_t]$ of the target user u at time step t:
[0128]
[0129] where $r_{u,t}$ is a reward vector with dimension $n_t \times 1$ composed of $n_t$ reward values $r_{t,a}$. At time step t, all programs form the candidate set $C_t$, and $n_t$ is the length of $C_t$. The i-th element of $r_{u,t}$ is the feedback revenue of the i-th program in $C_t$. $X_t$ is a matrix with dimension $n_t \times d$ whose i-th row is the one-hot encoding of the topic of the i-th program in $C_t$.
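Because the formulas of steps (4-2) and (4-3) are not reproduced in this text, the sketch below assumes the standard LinUCB ridge-regression form: $A = I_d + D^{T}D$, $b = D^{T}c$, coefficient vector $A^{-1}b$, and expected reward equal to the candidate matrix times that vector. Since the rows are one-hot and $A$ is diagonal, $A$ is stored as a list of its diagonal entries. All function names are hypothetical.

```python
def accumulate(D, c, d):
    """Diagonal of A = I_d + D^T D and vector b = D^T c for one-hot rows of D."""
    A = [1.0] * d          # identity initial value I_d
    b = [0.0] * d          # zero initial value 0_d
    for x, r in zip(D, c):
        for k in range(d):
            A[k] += x[k] * x[k]   # diagonal of D^T D: view counts per theme
            b[k] += x[k] * r      # cumulative reward per theme
    return A, b

def theta_hat(A, b):
    """Coefficient vector A^{-1} b; trivial because A is diagonal."""
    return [bk / ak for ak, bk in zip(A, b)]

def expected_reward(X, theta):
    """Expected feedback revenue of every candidate at once: X theta."""
    return [sum(xk * tk for xk, tk in zip(x, theta)) for x in X]
```

With two viewing records on theme 0 (rewards 0.5 and 1.0) and one on theme 1 (reward 0.2), the estimated theme coefficients are 0.5 and 0.1, and a candidate on theme 0 has expected reward 0.5.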
[0130] (4-4) Use the attention mechanism to calculate the parameter $\alpha_{u,t}$ that controls the exploration ratio in the LinUCB algorithm.
[0131] ① Calculate the attention score vector $s_{u,t}$ of the target user u for each program:
[0132]
[0133] where $s_{u,t}$ is a vector with dimension $n_t \times 1$ whose row elements are the attention scores of the target user u for each program. The first matrix consists of $n_t$ vectors $d_a$ and has dimension $n_t \times d$; its row element $d_{a_i}$ is the LDA vector of the i-th program in the program set $C_t$ (that is, the LDA vector output by the LDA model in step (2-3)), with dimension $d \times 1$. The second matrix consists of $m_{u,t}$ vectors $d_a$ and has dimension $m_{u,t} \times d$; its row element $d_{a_j}$ is the LDA vector of the j-th program in the set $M_{u,t}$.
[0134] In this embodiment, as shown in Figure 4, each row vector of the similarity matrix gives the similarity weights between one program in the candidate pool $C_t$ and each program in the set $M_{u,t}$. The user reward vector $c_{u,t}$ is then used to weight and sum the elements of each row, which yields the user's attention score for each program.
[0135] ② Calculate the personalized parameter $\alpha_{u,t}$ that dynamically controls the ratio of exploration to utilization:
[0136]
[0137] where $\alpha_{u,t}$ is a vector with dimension $n_t \times 1$, $\delta_{u,t}$ is the interest span of the target user u's viewing sequence at time step t, and $m_{u,t}$ is the number of user behavior log records in the set $M_{u,t}$, so the first half of the formula reflects the target user u's current personalized need for exploration.
[0138] In the traditional LinUCB algorithm, the estimated revenue of programs of the same type is identical, and differences between programs within the same theme are not considered. In this embodiment, $s_{u,t}$ not only reflects the attention of the target user u to each program, but also distinguishes between programs of the same type. Therefore, the parameter $\alpha_{u,t}$ can track the interest changes of the target user u and use the user's attention to each program to adjust the exploration ratio adaptively, and it can also identify differences between programs within the same topic, so that recommendations can be targeted more accurately at specific programs.
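The attention score of part ① can be sketched from its verbal description: similarity weights between each candidate program and each watched program, then a reward-weighted sum per candidate. The sketch assumes the similarity is a plain dot product of LDA vectors with no further normalization, which the text does not confirm; the formula for $\alpha_{u,t}$ in part ② is not reproduced here and is not guessed at. The function name is hypothetical.

```python
def attention_scores(cand_lda, hist_lda, c):
    """Attention score of the target user for each candidate program.

    cand_lda: LDA vectors of the n_t candidates in C_t
    hist_lda: LDA vectors of the m programs in M_{u,t}
    c:        reward values of those m watched programs
    """
    scores = []
    for d_i in cand_lda:
        # Similarity of this candidate to every watched program (assumed dot product).
        sims = [sum(a * b for a, b in zip(d_i, d_j)) for d_j in hist_lda]
        # Weight the similarities by the rewards and sum them.
        scores.append(sum(w * r for w, r in zip(sims, c)))
    return scores
```

A candidate whose topic mix resembles highly rewarded past programs therefore receives a higher score than one resembling programs the user abandoned quickly.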
[0139] (4-5) According to the UCB criterion, calculate, at time step t and for the target user u, the estimated revenue $p^{v}_{u,t}$ that all programs in the candidate set obtain from their theme categories:
[0140]
[0141] where $p^{v}_{u,t}$ is a vector with dimension $n_t \times 1$, each row element of which is the estimated revenue of one program due to its theme at time step t. The diagonal term takes the diagonal elements of the matrix as a vector with dimension $n_t \times 1$.
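A minimal sketch of a UCB estimate of this kind is given below. It assumes the standard LinUCB form, expected reward plus an exploration bonus $\alpha_i \sqrt{x_i^{T} A^{-1} x_i}$, with a per-program exploration coefficient as in step (4-4); the exact formula of this step is not reproduced in the text. $A$ is again stored as its diagonal, and the function name is hypothetical.

```python
import math

def ucb_scores(X, A, b, alpha):
    """Estimated revenue per candidate: mean estimate plus exploration bonus."""
    theta = [bk / ak for ak, bk in zip(A, b)]
    scores = []
    for x, a_i in zip(X, alpha):
        mean = sum(xk * tk for xk, tk in zip(x, theta))
        # x^T A^{-1} x for diagonal A: one diagonal element of X A^{-1} X^T.
        width = math.sqrt(sum(xk * xk / ak for xk, ak in zip(x, A)))
        scores.append(mean + a_i * width)
    return scores
```

Themes that were watched rarely have small diagonal entries in $A$, so their bonus term is large and they are explored more often.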
[0142] Step 5: Based on the time-invariant LinUCB algorithm, an item quality model based on the separated multi-user association information is established according to the separated multi-user mixed behavior log records, and the item quality model is used to ensure the program quality;
[0143] (5-1) Divide the program set into two categories: programs that the target user u has watched and programs that the target user u has not watched. The quality of the programs that the target user u has watched is determined by the target user u alone; the quality of the programs that the target user u has not watched is determined jointly by the user behavior log records of other users who have watched them and by the target user u's own topic preference.
[0144] (5-2) Supplementing the parameters required in the item quality model: performing feature encoding on the program ID to obtain the one-hot encoding of the program itself.
[0145] (5-3) According to the user behavior log records of the target user u itself, the time-invariant LinUCB algorithm is used to learn the quality of the programs that the target user u has watched. The specific process is as follows:
[0146] [1] Calculate the reward weight vector of the programs that the target user u has watched, whose k-th element is the reward parameter of the k-th program in the program set $C_t$ and whose dimension is $n_t \times 1$. The calculation formula is:
[0147]
[0148] where $A'_{u,t}$ is a diagonal matrix with dimension $n_t \times n_t$ that records the cumulative number of times the target user u watched each program before time step t, and $b'_{u,t}$ is the cumulative reward value of each program.
[0149] [2] Combined with the LinUCB criterion, calculate the quality representation $p'_{u,t}$ of all programs, obtained from the user behavior logs generated by the target user u before time step t:
[0150]
[0151] where the encoding matrix has dimension $n_t \times n_t$ and its i-th row is the one-hot encoding of the i-th program itself in the program set $C_t$.
[0152] [3] According to the target user u's own user behavior log records, calculate the score $p^{iv}_{u,t}$ of the target user u for each watched program at time step t:
[0153] $p^{iv}_{u,t} = w_{u,t} \odot p'_{u,t}$
[0154]
[0155] where $w_{u,t}$ is the weight vector composed of the weighting factors $w_{u,t,a}$; $w_{u,t,a} = 1$ means that the target user u has watched program a, and $w_{u,t,a} = 0$ means that the target user u has not watched program a.
[0156] (5-4) According to the user behavior log records of the separated multi-user set U, use the time-invariant LinUCB algorithm to learn the quality of programs that the target user u has not watched. In this step, U denotes the set of all users other than the target user u. The specific process of using the time-invariant LinUCB algorithm to learn the quality of programs that the target user u has not watched is as follows:
[0157] i. Calculate the reward weight vector of the programs that the target user u has not watched (programs that have been watched by the multi-user set U):
[0158]
[0159] where $A'_{U,t}$ is a diagonal matrix with dimension $n_t \times n_t$ that records the cumulative number of times the multi-user set U watched each program before time step t, and $b'_{U,t}$ is the cumulative reward value of each program.
[0160] ii. Combined with the LinUCB criterion, calculate the average quality score $p'_{U,t}$ of all the programs watched by the multi-user set U before time step t:
[0161]
[0162] iii. Using the weight vector $1 - w_{u,t}$, obtain the score $p^{iv}_{U,t}$ of the target user u for each unwatched program at time step t:
[0163] $p^{iv}_{U,t} = (1 - w_{u,t}) \odot p'_{U,t}$,
[0164] where the weight vector $1 - w_{u,t}$ sets to 0 the entries of $p^{iv}_{U,t}$ that correspond to programs the target user u has already watched.
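The two element-wise products $w_{u,t} \odot p'_{u,t}$ and $(1 - w_{u,t}) \odot p'_{U,t}$ act as complementary masks over the candidate programs. A short sketch, with a hypothetical function name and plain lists standing in for the vectors:

```python
def split_quality(w, p_own, p_others):
    """Masked quality scores for watched and unwatched programs.

    w:        1 if the target user has watched the program, else 0
    p_own:    quality learned from the target user's own logs (p'_{u,t})
    p_others: quality learned from the other users' logs (p'_{U,t})
    """
    p_watched = [wi * pi for wi, pi in zip(w, p_own)]
    p_unwatched = [(1 - wi) * pi for wi, pi in zip(w, p_others)]
    return p_watched, p_unwatched
```

Every program receives a nonzero score from exactly one of the two sources, so adding the two masked vectors gives one quality score per program.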
[0165] In this embodiment, as shown in Figure 5, in order to make comprehensive use of the user's interest changes and the program ratings during exploration, the present invention adopts a double-layer LinUCB cross-weighting method that fully integrates the scores given to each program by the time-varying LinUCB algorithm and the time-invariant LinUCB algorithm. The specific process is described in step 6.
[0166] Step 6: Use the cross-weighting method to combine the scoring results of the user interest mining model and the item quality model for each program to obtain a weighted score, and form a recommendation list based on the weighted score.
[0167] (6-1) Cross-weight the quality ratings of the programs that the target user u has and has not watched with the user's interest changes, and obtain the estimated revenue vector $p_{u,t}$ of the target user u for each program at time step t; $p_{u,t}$ is the weighted score:
[0168]
[0169] where the normalized term is the normalized form of $p^{v}_{u,t}$.
[0170] On the right side of the formula, the first half improves the utilization of the known interests of the target user u, and the second half realizes guaranteed exploration of the unknown interests of the target user u. The normalized term is obtained by normalizing $p^{v}_{u,t}$; normalization lets the recommendation system adaptively adjust the proportion of exploration and so achieve a personalized balance between exploration and utilization. Finally, the attention score $s_{u,t}$ of the target user u is used to enhance adaptability to the interests of the target user u and ensure the accuracy of the recommendation system.
[0171] (6-2) According to the $p_{u,t}$ scores, select the L programs with the highest estimated revenue to form the final recommendation list, that is:
[0172]
[0173] where $\mathrm{list}[a_t]$ is the final recommendation list; the selection operator picks, from the program set $A_t$ composed of all programs, the L programs with the largest values of $p_{u,t,a}$, which are the L programs recommended to the target user u; and $p_{u,t,a}$ is the estimated revenue value of the target user u for program a at time step t, which forms the row elements of $p_{u,t}$.
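The top-L selection of step (6-2) can be sketched with the standard library. The dictionary layout mapping program IDs to weighted scores and the function name are assumptions for illustration; how the weighted scores themselves are formed is left to the cross-weighting formula of step (6-1).

```python
import heapq

def top_l(scores, L):
    """Program IDs of the L largest weighted scores, best first."""
    return heapq.nlargest(L, scores, key=scores.get)
```

For scores `{"a": 0.2, "b": 0.9, "c": 0.5}` and L = 2, the list is `["b", "c"]`.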
[0174] Step 7: Update the parameters in the user interest mining model and item quality model in real time for use in the multimedia network video recommendation system in the next time step.
[0175] (7-1) Obtain the latest data from the new user behavior log records of the target user u, which are formed when the target user u watches programs in the recommendation list. The latest data include: the one-hot encoding matrix $D_{u,t}$ of the program theme; the one-hot encoding matrices $D'_{u,t}$ and $D'_{U,t}$ of the program itself; and the user reward vectors $c_{u,t}$ and $c_{U,t}$.
[0176] In this embodiment, as shown in Figure 6, for $A_{u,t}$ and $b_{u,t}$ in the user interest model, the present invention proposes an incremental update mechanism based on the LSTM memory module, exploring the possibility of combining the long short-term memory of the LSTM with the incremental update of LinUCB. Since the recommendation system needs output based on the cell state, the present invention introduces only the forget gate and the memory gate into the incremental update process and discards the output gate of the LSTM. The specific process is described in step (7-2).
[0177] (7-2) Combine the long short-term memory of the LSTM with the incremental update of LinUCB to update the parameters of the user interest mining model. These parameters are the matrix $A_{u,t}$, whose diagonal elements are the cumulative numbers of times the target user u watched programs of each theme in $M_{u,t}$, and the cumulative reward vector $b_{u,t}$ obtained by each program theme. The initial values of $A_{u,t}$ and $b_{u,t}$ are $I_d$ and $0_d$.
[0178] a) Set the weight of the LSTM memory gate and correct it dynamically as the time interval changes: the memory-gate weight $i_{u,t}$ is computed as an exponential with base e whose exponent is a function of the time interval:
[0179]
[0180] where, for the target user u, $T_{u,t}$ is the hour corresponding to the actual time point represented by time step t, and $T_{u,t-1}$ is the hour corresponding to the actual time point represented by the previous time step t-1.
[0181] b) Add a "peephole connection" to the forget gate so that the gate can see the cell state, and set the forget gate function:
[0182] $f_{u,t} = \tanh(T_{u,t} - T_{u,t-1})$,
[0183] c) At time step t, the incremental updates of $A_{u,t}$ and $b_{u,t}$ are:
[0184]
[0185]
[0186] where
[0187] $A'_{u,t-1} = (1 - f_{u,t}) A_{u,t-1}$,
[0188]
[0189] $b'_{u,t-1} = (1 - f_{u,t}) b_{u,t-1}$,
[0190]
[0191] where $D_{u,t}$ is a matrix with dimension $m_{u,t} \times d$ whose row vectors are the one-hot encodings of the program themes, $D_{u,t}^{T}$ is the transpose of $D_{u,t}$, and $c_{u,t}$ is a reward vector with dimension $m_{u,t} \times 1$.
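The gated update can be sketched as follows. The forget gate $f_{u,t} = \tanh(T_{u,t} - T_{u,t-1})$ and the decayed states $(1 - f_{u,t})A_{u,t-1}$, $(1 - f_{u,t})b_{u,t-1}$ come from the text; the additive terms scaled by the memory-gate weight, $i_{u,t} D_{u,t}^{T} D_{u,t}$ and $i_{u,t} D_{u,t}^{T} c_{u,t}$, are an assumption because those formulas are not reproduced here. The memory-gate weight is passed in as `i_gate` rather than computed, since its exact exponent is also not reproduced. $A$ is stored as its diagonal, and the function name is hypothetical.

```python
import math

def gated_update(A_prev, b_prev, D, c, hours_elapsed, i_gate):
    """One LSTM-style incremental update of the diagonal of A and of b."""
    f = math.tanh(hours_elapsed)              # forget gate from the time gap
    A = [(1.0 - f) * a for a in A_prev]       # decayed view counts
    b = [(1.0 - f) * v for v in b_prev]       # decayed cumulative rewards
    for x, r in zip(D, c):
        for k, xk in enumerate(x):
            A[k] += i_gate * xk * xk          # assumed: i * diag(D^T D)
            b[k] += i_gate * xk * r           # assumed: i * D^T c
    return A, b
```

The longer the gap since the previous time step, the closer the forget gate is to 1 and the less the old counts and rewards contribute, so stale interests fade while recent viewing dominates.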
[0192] (7-3) Update the parameters of the item quality model. These include the matrix $A'_{u,t}$, whose diagonal elements are the cumulative numbers of times the target user u watched each program in $C_t$, and the cumulative reward vector $b'_{u,t}$ obtained by each program; and the matrix $A'_{U,t}$, whose diagonal elements are the average cumulative numbers of times the multi-user set U watched each program in $C_t$, and the average cumulative reward vector $b'_{U,t}$ obtained by each program. Because program quality is fixed, the incremental updates of $A'_{u,t}$, $b'_{u,t}$ and $A'_{U,t}$, $b'_{U,t}$ follow the sample-average principle. The specific incremental update process is as follows:
[0193] 1) Update the parameters $A'_{u,t}$ and $b'_{u,t}$ used to learn the quality of the programs that the target user u has watched. Starting from their initial values, $A'_{u,t}$ and $b'_{u,t}$ are iterated as follows:
[0194]
[0195]
[0196] where $D'_{u,t}$ is a matrix with dimension $m_{u,t} \times n_t$ whose row vectors are the one-hot encodings of the programs themselves, and $c_{u,t}$ is a reward vector with dimension $m_{u,t} \times 1$.
[0197] 2) Update the parameters $A'_{U,t}$ and $b'_{U,t}$ used to learn the quality of the programs that the target user u has not watched. Starting from their initial values, $A'_{U,t}$ and $b'_{U,t}$ are iterated as follows:
[0198]
[0199]
[0200] where $D'_{U,t}$ is a matrix with dimension $m_{U,t} \times n_t$ whose row vectors are the one-hot encodings of the programs themselves, $m_{U,t}$ is the number of records in the set $M_{U,t}$ of user behavior log records of online videos watched by the multi-user set U at time step t, and $c_{U,t}$ is a reward vector with dimension $m_{U,t} \times 1$.
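The iteration formulas of step (7-3) are not reproduced in this text. As a point of reference only, the generic sample-average rule from bandit learning, which keeps a running mean without storing past rewards, looks as follows; it is a sketch of the principle named above, not the patent's exact update of $A'$ and $b'$, and the function name is hypothetical.

```python
def sample_average_update(counts, means, program_idx, reward):
    """Incrementally update the running mean reward of one program."""
    counts[program_idx] += 1
    # new_mean = old_mean + (reward - old_mean) / n
    means[program_idx] += (reward - means[program_idx]) / counts[program_idx]
```

Unlike the gated update of the user interest model, every observation carries equal weight here, which matches the stated premise that program quality does not change over time.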
[0201] (7-4) Use the updated parameters to learn the user interest model and the item quality model, and perform online recommendation in the next time step.
[0202] In this embodiment, the time step is set in hours. The recommender system is updated only once per time step. Although the target user may send requests to the recommender system at various points within a time step, the system returns the same recommendation list throughout that time step.
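The once-per-time-step behavior described above can be sketched as a small cache keyed by user and time step. This is an illustrative sketch only: the class name `HourlyRecommender`, the callback `build_list`, and the choice of a one-hour step (3600 seconds) are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

SECONDS_PER_STEP = 3600  # ASSUMPTION: one time step equals one hour


class HourlyRecommender:
    """Serve one recommendation list per user per time step."""

    def __init__(self, build_list: Callable[[str, int], List[str]]):
        # build_list(user_id, step) is a hypothetical callback that runs
        # the (expensive) model update and returns a ranked program list.
        self._build_list = build_list
        self._cache: Dict[Tuple[str, int], List[str]] = {}

    def recommend(self, user_id: str, timestamp_s: float) -> List[str]:
        step = int(timestamp_s // SECONDS_PER_STEP)
        key = (user_id, step)
        if key not in self._cache:
            # First request in this time step: build the list once.
            self._cache[key] = self._build_list(user_id, step)
        # Later requests in the same step reuse the same list.
        return self._cache[key]
```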
[0203] As shown in Figure 7, based on the above method, the present invention also discloses an integrated recommendation system framework for multiple users, comprising an offline multi-user identification and prediction module and an online recommendation system module, specifically:
[0204] In the cold-start case (within the first M hours), only the online recommendation system module is started, in order to collect multi-user mixed behavior logs. Once enough user information has been obtained, the multi-user identification and prediction module is executed with a period of N hours so that the log information of the target user can be provided to the recommendation system module. This module provides the recommender system with the identity labels and log records of the target users. At each subsequent time step, the online recommendation module predicts the dominant user for the next N sessions based on the mixed log records of the last M sessions. The user behavior log records of the target user u are extracted from the multi-user mixed behavior log records by means of the multi-user feature tags constructed in the present invention.
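The scheduling of the two modules can be sketched as follows. The sketch assumes that the identification module first runs at hour M and then every N hours after that; the text does not state the phase of the period, so this alignment, the function name, and the module labels are assumptions.

```python
from typing import List


def modules_to_run(hour: int, cold_start_hours: int, period_hours: int) -> List[str]:
    """Return the modules that run at a given hour since system start.

    cold_start_hours corresponds to M and period_hours to N in the text.
    """
    if hour < 0 or cold_start_hours < 0 or period_hours <= 0:
        raise ValueError("hour and M must be non-negative and N must be positive")
    # The online recommendation module runs at every time step.
    modules = ["online_recommendation"]
    # ASSUMPTION: identification runs at hour M, M + N, M + 2N, ...
    past_cold_start = hour >= cold_start_hours
    if past_cold_start and (hour - cold_start_hours) % period_hours == 0:
        modules.insert(0, "multi_user_identification")
    return modules
```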
[0205] The online recommendation system module includes the user interest mining model and the item quality model. The user interest mining model mainly learns the trend of the target user's interest in program themes in order to control the exploration mechanism; the separated log records of the target user u help the online recommendation module build this model. In addition, the correlation between the separated user behavior log records of all target users helps the recommendation module build a personalized item quality model, so as to locate the target user's preference for specific programs. The item quality model can be divided into two parts: one part calculates the quality of the programs that the target user has watched, and the other part calculates the quality of the programs that the target user has not watched.
[0206] The experimental method of this embodiment will be further described below.
[0207] In this embodiment, the performance indicators used to evaluate the scheme proposed by the present invention are precision (Precision), recall (Recall), MAP (Mean Average Precision) and novelty (Novelty). The meanings of these four indicators are as follows, where N is the number of programs selected from the recommendation results:
[0208] Precision (Precision@N): the proportion of successfully recommended programs among the programs actually recommended.
[0209] Recall (Recall@N): the proportion of successfully recommended programs among the programs actually watched by the user.
[0210] MAP (Map@N): takes into account the order in which programs are arranged in the recommendation list. The higher the successfully recommended programs are ranked, the higher the value.
[0211] Novelty (Novelty@N): describes the average difference between the new programs in the recommendation list and the programs known to the user, taking N=10. The larger the value, the wider the range of information that the recommender system can provide to target users.
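The first three indicators can be sketched in Python as follows. This is a minimal illustration, assuming that a recommended program counts as "successfully recommended" when the user actually watched it and that the recommendation list contains no duplicates. The normalization of average precision by min(number of watched programs, N) is one common convention and is an assumption here; the text does not specify it. Novelty is omitted because its exact formula is not given.

```python
from typing import Sequence, Set


def precision_at_n(recommended: Sequence[str], watched: Set[str], n: int) -> float:
    """Share of the top-N recommended programs that the user watched."""
    top = list(recommended[:n])
    if not top:
        return 0.0
    hits = sum(1 for program in top if program in watched)
    return hits / len(top)


def recall_at_n(recommended: Sequence[str], watched: Set[str], n: int) -> float:
    """Share of the watched programs that appear in the top-N list."""
    if not watched:
        return 0.0
    hits = sum(1 for program in recommended[:n] if program in watched)
    return hits / len(watched)


def average_precision_at_n(recommended: Sequence[str], watched: Set[str], n: int) -> float:
    """Rank-aware score for one user; MAP@N is its mean over users."""
    top = list(recommended[:n])
    if not top or not watched:
        return 0.0
    hits = 0
    total = 0.0
    for rank, program in enumerate(top, start=1):
        if program in watched:
            hits += 1
            total += hits / rank  # precision at this hit's rank
    # ASSUMPTION: normalize by the best achievable number of hits.
    return total / min(len(watched), n)
```

Averaging `average_precision_at_n` over all evaluated users or sessions gives Map@N.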
[0212] First, in this embodiment, the effectiveness of the time-varying LinUCB algorithm in the user interest mining model is verified through preliminary experiments, and the performance of the time-varying LinUCB algorithm is compared with three baseline LinUCB algorithms, as follows:
[0213] LinUCB-1: The traditional LinUCB algorithm.
[0214] LinUCB-2: Introduce the LSTM-based incremental update proposed by the present invention into the traditional LinUCB.
[0215] LinUCB-3: Introduce the personalized adaptive exploration scheme proposed by the present invention into the traditional LinUCB.
[0216] Improved-LinUCB: The improved algorithm proposed by the present invention, which introduces a personalized adaptive exploration scheme and LSTM-based incremental update in the traditional LinUCB.
[0217] The recommendation results of Improved-LinUCB are evaluated in terms of precision (Precision@N), recall (Recall@N), MAP (Map@N) and novelty (Novelty@N).
[0218] As shown in Figure 8a, compared with LinUCB-1, the precision of LinUCB-3 is greatly improved, which shows that calculating and controlling the exploration ratio through the attention mechanism proposed by the present invention can significantly improve the precision of the recommendation system. In addition, the LSTM-based incremental update proposed by the present invention also improves the precision of the recommendation results to a certain extent.
[0219] Figures 8b and 8c show that, compared with the traditional algorithm, the time-varying LinUCB proposed by the present invention improves both recall and MAP. This further shows that the recommendation scheme proposed by the present invention learns the user's interests better, thereby recommending more suitable programs and improving the ranking of the recommendation results. In addition, compared with LinUCB-1 and LinUCB-2, LinUCB-3 improves very significantly in both recall and MAP. This result further shows that adjusting the exploration ratio through the attention mechanism can significantly improve the quality of the recommendation results.
[0220] Figure 8d shows the novelty of the recommendation results of the present invention; the novelty of all recommendation results is higher than 0.96. The novelty results are consistent with the trends observed for precision, recall and MAP. Compared with LinUCB-1, the LinUCB-3 algorithm improves novelty considerably, whereas the LinUCB-2 algorithm shows a decline. Thus the personalized adaptive exploration strategy that the present invention introduces into the incremental update process by means of the attention mechanism maintains and improves the diversity of the recommendation results, while the LSTM-style incremental update weakens this diversity to a certain extent. Overall, however, the results of the improved recommendation algorithm show an increase in the diversity of the recommendation results.
[0221] Furthermore, we use different combinations of the two modules introduced above, the multi-user identification module and the recommendation system module, to evaluate the multi-user sharing-oriented multimedia network video recommendation system proposed by the present invention. Specifically, the multi-user identification module includes three schemes: periodic identification, fixed identification and no identification. The schemes of the online recommendation module are divided into the following three groups, A, B and C:
[0222] Group A recommendation schemes: LinUCB (cold-start type)
[0223] A1: A single time-varying LinUCB algorithm that only considers the user's interest in known program themes. We use A1 to verify the importance of the multi-user association information used by the time-invariant LinUCB.
[0224] A2: The recommendation scheme proposed by the present invention without the cross-weighted integration strategy. We use A2 to verify the importance of the cross weights. In A2, the final estimated reward is therefore calculated as $s_{u,t} \odot (p^{v}_{u,t} \odot (p^{iv}_{u,t} + p^{iv}_{U,t}))$.
[0225] A3: The recommendation scheme proposed by the present invention without personalized parameters (see (4-4)). We use A3 to verify the effect of the attention mechanism on the recommender system.
[0226] A4: The recommendation scheme proposed by the present invention without the LSTM-based incremental update (see (7-2)). We use A4 to verify the importance of the LSTM memory network.
[0227] Improved: the online recommendation scheme proposed by the present invention.
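The estimated reward used by the A2 variant is a chain of element-wise (Hadamard) products, which can be illustrated numerically as follows. The sketch assumes that all four quantities are vectors of the same length, with one entry per candidate program; the function name and the example values are made up for illustration and do not come from the experiments.

```python
import numpy as np


def a2_estimated_reward(s_u, p_v_u, p_iv_u, p_iv_U):
    """Compute s ⊙ (p_v ⊙ (p_iv_u + p_iv_U)) element-wise.

    ASSUMPTION: every argument is a 1-D array with one entry per
    candidate program, so the products are well defined.
    """
    s_u = np.asarray(s_u, dtype=float)
    p_v_u = np.asarray(p_v_u, dtype=float)
    quality = np.asarray(p_iv_u, dtype=float) + np.asarray(p_iv_U, dtype=float)
    return s_u * (p_v_u * quality)


# Illustrative values for three candidate programs.
reward = a2_estimated_reward(
    s_u=[1.0, 0.0, 1.0],
    p_v_u=[0.5, 0.2, 0.4],
    p_iv_u=[0.2, 0.1, 0.0],
    p_iv_U=[0.2, 0.3, 0.5],
)
best_program = int(np.argmax(reward))
```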
[0228] Group B recommendation schemes: collaborative filtering algorithms (warm-start type)
[0229] B1: User-based collaborative filtering. It is mainly based on the user's historical records to find user groups similar to the target user, and finally uses the interests of nearby users to generate the final recommendation results for the target user.
[0230] B2: Item-based collaborative filtering. The main principle is that the target user may like some programs similar to those he has watched. The similarity between programs is calculated by analyzing the user's log records.
[0231] B3: Content-based collaborative filtering. Its main principle is similar to B2, except that the similarity between users is calculated by analyzing the feature vector of the program (that is, the output of the LDA model).
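As an illustration of the item-based baseline B2, the following Python sketch computes cosine similarity between programs from a binary user-by-program viewing matrix and scores unseen programs for one user. It is a generic textbook sketch, not the configuration used in the experiments; the function names, the use of cosine similarity, and the binary viewing matrix are assumptions.

```python
import numpy as np


def item_cosine_similarity(R: np.ndarray) -> np.ndarray:
    """Cosine similarity between the columns (programs) of R."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for unseen programs
    R_normalized = R / norms
    S = R_normalized.T @ R_normalized
    np.fill_diagonal(S, 0.0)  # a program should not recommend itself
    return S


def recommend_items(R: np.ndarray, user: int, k: int) -> list:
    """Top-k unseen programs for one user, ranked by summed similarity."""
    S = item_cosine_similarity(R)
    scores = S @ R[user]
    scores[R[user] > 0] = -np.inf  # exclude programs already watched
    order = np.argsort(-scores)
    return [int(i) for i in order[:k] if np.isfinite(scores[i])]
```

Here `R[i, j]` is 1.0 if user `i` watched program `j` and 0.0 otherwise. The user-based variant B1 follows the same pattern with similarities computed between rows instead of columns.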
[0232] Group C recommendation schemes: recommendation schemes based on deep neural networks (warm-start type)
[0233] GRU4Rec: Use the RNN algorithm to model the user's behavior sequence to predict the items that the target user may be interested in next.
[0234] SR-GNN: Use the GNN algorithm with an attention mechanism to model the user's behavior sequence and predict the items that the target user may be interested in next.
[0235] The recommendation system of the present invention can be compared and analyzed in various aspects by using the above schemes. We have recorded in detail the performance comparison results of the recommendation system scheme proposed by the present invention and the three groups of schemes A, B, and C. The session numbers of algorithms in group A, group B and group C are 41842, 27028 and 7167 respectively. In this embodiment, we use Precision@N, Recall@N, and Map@N three performance indicators to evaluate the performance of the recommendation system.
[0236] The comparative analysis of the experimental results of the three groups is as follows:
[0237] Group A contains the cold-start schemes. Comparing the recommendation scheme of the present invention with the results of A1 shows that the time-invariant LinUCB plays a very important role in the recommendation system, which demonstrates that the social information of the multiple users separated from one account/device can indeed be fully used to ensure the quality of each program. The results of A2 verify that the cross-weighting strategy, by integrating the time-invariant LinUCB into the time-varying LinUCB, improves the accuracy of the recommendation system and reduces the risk of exploration. Compared with A3, the scheme of the present invention improves Precision@5, Recall@5 and Map@5, which indicates that the attention mechanism in the time-varying LinUCB helps the recommender system accurately understand users' interests and how they change. In addition, compared with A4, the memory ability of the LSTM yields a slight improvement in Precision@5, Recall@5 and Map@5, and can also speed up the convergence of the recommendation system in the cold-start phase.
[0238] When comparing with the warm-start schemes, for uniformity, we only analyze the last week of the dataset. Table 1 shows that the proposed recommendation scheme outperforms all collaborative filtering schemes in Group B during the warm-start phase. In Group C, SR-GNN has the best performance, and the recommendation scheme proposed by the present invention improves on it by 20.5%, 2.9% and 4.8% in Precision@5, Recall@5 and Map@5 respectively, which demonstrates that the recommendation scheme of the present invention reduces the risk of exploration well.
[0239] Table 1 Precision@N, Recall@N, Map@N of different recommendation schemes under different multi-user identification schemes
[0240]
[0241]
[0242] The above is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any changes or substitutions that a person skilled in the art can readily conceive within the technical scope disclosed in the present invention shall be covered by the protection scope of the present invention.
