A training method of a recommendation model, a recommendation method, and related devices

By using an encoder and decoder to generate and calibrate the overall value vector in the recommendation model, and by determining the loss function for parameter updates, the method addresses the problem in existing technologies that sample bias is not corrected in a timely manner, so that recommended content is more accurate.

CN116522996B · Active · Publication Date: 2026-06-26 · ALIPAY (HANGZHOU) INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: ALIPAY (HANGZHOU) INFORMATION TECH CO LTD
Filing Date: 2023-04-03
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

Existing recommendation models cannot correct and calibrate sample bias in a timely manner during training, resulting in inaccurate recommendations.

Method used

By acquiring behavioral sequence samples labeled with real results, an encoder and decoder are used for training to generate an overall value vector and perform calibration. The loss function is then determined to update the model parameters until convergence.

Benefits of technology

It improves the prediction accuracy of the recommendation model and solves the problem of inaccurate recommendations caused by the inability to correct sample bias in a timely manner.

Abstract

Embodiments of the present application disclose a recommendation model training method, a recommendation method and related devices. A user behavior sequence sample marked with a real result is input into a recommendation model to obtain an overall value vector. Then, the overall value vector is calibrated according to a pricing value vector to obtain a calibrated value vector. The calibrated value vector and previous user behavior features are input into the recommendation model again to obtain a prediction result. The calibrated value vector can help the recommendation model obtain a more accurate prediction result. A loss function is determined based on the behavior features, the prediction result and the real result. The recommendation model is updated based on the loss function until the loss function converges.

Description

Technical Field

[0001] The embodiments of this application relate to the field of computer technology, and in particular to a method for training a recommendation model, a recommendation method, and related apparatus.

Background Technology

[0002] In existing technologies, the recommendation models used are generally neural network models such as MLP. The training of such models relies solely on vector samples. Various feature data are converted into vectors and concatenated together before being fed into the model for learning and training. This makes it impossible to correct and calibrate deviations in the samples in a timely manner, resulting in inaccurate recommendations.

Summary of the Invention

[0003] Embodiments of this application provide a training method for a recommendation model, a recommendation method, and related apparatus. The technical solution is as follows:

[0004] In a first aspect, embodiments of this application provide a method for training a recommendation model. The method includes: acquiring behavioral sequence samples, each behavioral sequence sample being a sequence of behavioral features labeled with a real result, the behavioral feature sequence being a sequence composed of multiple behavioral features arranged chronologically; inputting the behavioral sequence samples into a recommendation model to obtain an overall value vector; obtaining a calibration value vector based on the pricing value vector and the overall value vector; inputting the calibration value vector and the behavioral sequence samples into the recommendation model to obtain a prediction result; determining a loss function based on the behavioral features, the prediction result, and the real result; and updating the parameters of the recommendation model based on the loss function until the loss function converges.

[0005] In a second aspect, embodiments of this application provide a recommendation method, which includes: obtaining a sequence of behavioral features to be recommended; and inputting the sequence of behavioral features to be recommended into a recommendation model to obtain a prediction result, wherein the recommendation model is the recommendation model described above.

[0006] In a third aspect, embodiments of this application provide a training apparatus for a recommendation model, comprising: an acquisition module for acquiring behavioral sequence samples, each behavioral sequence sample being a behavioral feature sequence labeled with a true result, the behavioral feature sequence being a sequence composed of multiple behavioral features arranged in time; an encoding module for inputting the behavioral feature sequences into a recommendation model to obtain an overall value vector; a calibration module for obtaining a calibration value vector based on the pricing value vector and the overall value vector; a decoding module for inputting the calibration value vector and the behavioral sequence samples into the recommendation model to obtain a prediction result; a loss module for determining a loss function based on the behavioral features and the prediction result; and an update module for updating the parameters of the recommendation model based on the loss function until the loss function converges.

[0007] In a fourth aspect, embodiments of this application provide a recommendation device, the recommendation device comprising: an acquisition module for acquiring a sequence of behavioral features to be recommended; and an input module for inputting the sequence of behavioral features to be recommended into a recommendation model to obtain a prediction result, wherein the recommendation model is the recommendation model described above.

[0008] In a fifth aspect, embodiments of this application provide a computer storage medium storing a plurality of instructions adapted to be loaded by a processor to execute the above-described method steps.

[0009] In a sixth aspect, embodiments of this application provide an electronic device that may include: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor to execute the above-described method steps.

[0010] The beneficial effects of the technical solutions provided in some embodiments of this application include at least the following:

[0011] In one or more embodiments of this application, an overall value vector is obtained by inputting user behavior sequence samples labeled with real results into the recommendation model. This overall value vector is then calibrated based on the pricing value vector to obtain a calibrated value vector, achieving timely correction and calibration of sample biases. The calibrated value vector, along with the user behavior features used before, is then fed back into the recommendation model to obtain prediction results. The calibrated value vector helps the recommendation model obtain more accurate prediction results. Based on the aforementioned behavioral features, prediction results, and real results, a loss function is determined. The parameters of the recommendation model are updated based on this loss function until the loss function converges. After the recommendation model is trained, the recommendations obtained using this model are more accurate than those obtained using traditional methods, solving the problem of inaccurate recommendations caused by the inability to correct and calibrate sample biases in a timely manner.

Brief Description of the Drawings

[0012] To illustrate the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application. Those skilled in the art can obtain other drawings based on these drawings without creative effort.

[0013] Figure 1 is a schematic diagram of a training system for a recommendation model provided in this specification.

[0014] Figure 2 is a flowchart of a training method for a recommendation model provided in this specification.

[0015] Figure 3 is a schematic diagram of the network structure of the recommendation model in the embodiment corresponding to Figure 2.

[0016] Figure 4 is a flowchart of a specific implementation of step S100 in the training method of the recommendation model shown in the embodiment corresponding to Figure 2.

[0017] Figure 5 is a flowchart of a specific implementation of step S130 in the training method of the recommendation model shown in the embodiment corresponding to Figure 4.

[0018] Figure 6 is a flowchart of a specific implementation of step S200 in the training method of the recommendation model shown in the embodiment corresponding to Figure 2.

[0019] Figure 7 is a flowchart of a specific implementation of step S210 in the training method of the recommendation model shown in the embodiment corresponding to Figure 6.

[0020] Figure 8 is a flowchart of a recommendation method provided in this specification.

[0021] Figure 9 is a flowchart of a specific implementation of step S900 in the recommendation method shown in the embodiment corresponding to Figure 8.

[0022] Figure 10 is a flowchart of a specific implementation of step S930 in the recommendation method shown in the embodiment corresponding to Figure 9.

[0023] Figure 11 is a schematic structural diagram of a training device for a recommendation model provided in this specification.

[0024] Figure 12 is a schematic diagram of a recommendation device provided in this specification.

[0025] Figure 13 is a schematic structural diagram of an electronic device provided in this specification.

[0026] Figure 14 is a schematic diagram of the operating system and user space provided in this specification.

[0027] Figure 15 is an architecture diagram of the Android operating system in Figure 14.

[0028] Figure 16 is an architecture diagram of the iOS operating system in Figure 14.

Detailed Description

[0029] The technical solutions in the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of this application without creative effort are within the scope of protection of the embodiments of this application.

[0030] In the description of the embodiments of this application, it should be understood that the terms "first," "second," etc., are used for descriptive purposes only and should not be construed as indicating or implying relative importance. In the description of the embodiments of this application, it should be noted that, unless otherwise expressly specified and limited, "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but may optionally include steps or units not listed, or may optionally include other steps or units inherent to these processes, methods, products, or devices. Those skilled in the art can understand the specific meaning of the above terms in the embodiments of this application based on the specific circumstances. Furthermore, in the description of the embodiments of this application, unless otherwise stated, "multiple" refers to two or more. "And / or" describes the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A alone, A and B simultaneously, and B alone. The character " / " generally indicates that the preceding and following related objects are in an "or" relationship.

[0031] The embodiments of this application will be described in detail below with reference to specific examples.

[0032] Refer to Figure 1, which is a schematic diagram of a training system for a recommendation model provided as an embodiment of this application. As shown in Figure 1, the training system for the recommendation model may include at least a client cluster and a service platform 100.

[0033] The client cluster may include at least one client. As shown in Figure 1, it specifically includes client 1 corresponding to user 1, client 2 corresponding to user 2, ..., client n corresponding to user n, where n is an integer greater than 0.

[0034] Each client in a client cluster can be an electronic device with communication capabilities, including but not limited to: wearable devices, handheld devices, personal computers, tablets, in-vehicle devices, smartphones, computing devices, or other processing devices connected to a wireless modem. Electronic devices may have different names in different networks, such as: user equipment, access terminal, user unit, user station, mobile station, remote station, remote terminal, mobile device, user terminal, terminal, wireless communication equipment, user agent or user device, cellular phone, cordless phone, personal digital assistant (PDA), and electronic devices in 5G networks or future evolved networks.

[0035] The service platform 100 can be a standalone server device, such as a rack-mount, blade, tower, or cabinet-type server device, or a workstation, mainframe, or other hardware device with strong computing power; or it can be a server cluster composed of multiple servers. The servers in the server cluster can be organized symmetrically, so that each server is functionally and hierarchically equivalent in the transaction chain and each server can provide services independently. Providing services independently means that a server does not require the assistance of other servers.

[0036] In one or more embodiments of this application, the service platform 100 can establish a communication connection with at least one client in the client cluster, and complete the data interaction during the training process of the recommendation model based on the communication connection, such as online transaction data interaction. The transaction data includes, but is not limited to, various types of behavioral feature data interaction, and the specific transaction service type is determined based on the actual application situation.

[0037] For example, the service platform 100 can use the recommendation model obtained by the training method of the recommendation model in the embodiments of this application to perform content recommendation to the client; or, the service platform 100 can obtain training data and user behavior data from the client.

[0038] It should be noted that the service platform 100 establishes a communication connection with at least one client in the client cluster via a network for interactive communication. This network can be a wireless network or a wired network. Wireless networks include, but are not limited to, cellular networks, wireless LANs, infrared networks, or Bluetooth networks. Wired networks include, but are not limited to, Ethernet, universal serial bus (USB), or controller area networks. In one or more embodiments of the specification, technologies and / or formats including Hyper Text Markup Language (HTML), Extensible Markup Language (XML), etc., are used to represent data exchanged over the network (such as target compressed packets). Furthermore, conventional encryption technologies such as Secure Socket Layer (SSL), Transport Layer Security (TLS), Virtual Private Network (VPN), and Internet Protocol Security (IPsec) can be used to encrypt all or some links. In other embodiments, customized and / or dedicated data communication technologies can be used to replace or supplement the aforementioned data communication technologies.

[0039] The training system embodiments of the recommendation model provided in this application and the training methods of the recommendation model in one or more embodiments belong to the same concept. The execution entity corresponding to the training method of the recommendation model involved in one or more embodiments of the specification can be the aforementioned service platform 100; the execution entity corresponding to the training method of the recommendation model involved in one or more embodiments of the specification can also be the electronic device corresponding to the client, specifically determined based on the actual application environment. The implementation process of the training system embodiments of the recommendation model can be detailed in the following method embodiments, and will not be repeated here.

[0040] Based on the scenario diagram shown in Figure 1, the training method of the recommendation model provided by one or more embodiments of this application is described in detail below.

[0041] Refer to Figure 2, which is a flowchart of a method for training a recommendation model provided by one or more embodiments of this application. This method can be implemented using a computer program and can run on a training device for recommendation models based on the von Neumann architecture. The computer program can be integrated into an application or run as a standalone utility application. The training device for the recommendation model can be a service platform.

[0042] Specifically, the training method of the recommendation model includes:

[0043] Step S000: Obtain behavioral sequence samples. Each behavioral sequence sample is a behavioral feature sequence labeled with a real result. The behavioral feature sequence is a sequence composed of multiple behavioral features arranged in time.

[0044] Step S100: Input the behavior sequence sample into the recommendation model to obtain the overall value vector.

[0045] Step S200: Obtain the calibration value vector based on the pricing value vector and the overall value vector.

[0046] Step S300: Input the calibration value vector and the behavior sequence sample into the recommendation model to obtain the prediction result.

[0047] Step S400: Determine the loss function based on the behavioral features, the prediction results, and the actual results.

[0048] Step S500: Update the parameters of the recommendation model based on the loss function until the loss function converges.

[0049] In the embodiments of this application, an overall value vector is obtained by inputting user behavior sequence samples labeled with real results into the recommendation model. This overall value vector is then calibrated based on the pricing value vector to obtain a calibrated value vector, achieving timely correction and calibration of sample biases. The calibrated value vector, along with the user behavior features used before, is then fed back into the recommendation model to obtain prediction results. The calibrated value vector helps the recommendation model obtain more accurate predictions. Based on the aforementioned behavioral features, prediction results, and real results, a loss function is determined. The parameters of the recommendation model are then updated based on this loss function until it converges. After the recommendation model is trained, the recommendations obtained using this model are more accurate than those obtained using traditional methods, solving the problem of inaccurate recommendations caused by the inability to correct and calibrate sample biases in a timely manner.

[0050] In one embodiment of this application, the recommendation model includes an encoder and a decoder; step S100 specifically includes: inputting the behavior sequence sample into the encoder to obtain the overall value vector; step S300 specifically includes: inputting the calibration value vector and the behavior sequence sample into the decoder to obtain the prediction result.

[0051] In practice, the recommendation model can be trained using only one neural network model or by jointly training multiple neural network models.

[0052] When joint training is required, as shown in Figure 3, the recommendation model can include an encoder and a decoder, which are trained using the same neural network model.

[0053] The specific steps are as follows: Obtain behavioral sequence samples, each of which is a behavioral feature sequence labeled with a true result; input the behavioral sequence samples into the encoder to obtain the overall value vector; obtain the calibration value vector based on the pricing value vector and the overall value vector; input the calibration value vector and the behavioral sequence samples into the decoder to obtain the prediction result; determine the loss function based on the behavioral features, the prediction result, and the true result; update the parameters of the encoder and decoder based on the loss function until the loss function converges. After the model is trained, the decoder can be extracted and deployed as a practical recommendation model.
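The joint encoder/decoder training steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the patented implementation: the one-parameter "encoder", the logistic "decoder", the averaging calibration, the squared-error loss (which here ignores the behavioral features), and the finite-difference update are all hypothetical stand-ins, as are the names `encode`, `calibrate`, `decode`, `loss`, and `train`.

```python
import math
from typing import Dict, List, Tuple

# (behavioral features, pricing value vector, real result); a toy layout.
Sample = Tuple[List[float], List[float], float]
Params = Dict[str, float]


def encode(params: Params, feats: List[float]) -> List[float]:
    """Encoder stand-in: one target value per behavioral feature."""
    return [params["enc"] * x for x in feats]


def calibrate(overall: List[float], pricing: List[float]) -> List[float]:
    """Assumed calibration: average each target value with its price."""
    return [0.5 * (o + p) for o, p in zip(overall, pricing)]


def decode(params: Params, calibrated: List[float], feats: List[float]) -> float:
    """Decoder stand-in: logistic score from calibrated values and features."""
    z = params["dec"] * sum(c * x for c, x in zip(calibrated, feats)) + params["bias"]
    return 1.0 / (1.0 + math.exp(-z))


def loss(params: Params, samples: List[Sample]) -> float:
    """Mean squared error between prediction and real result."""
    total = 0.0
    for feats, pricing, real in samples:
        pred = decode(params, calibrate(encode(params, feats), pricing), feats)
        total += (pred - real) ** 2
    return total / len(samples)


def train(samples, lr=0.5, tol=1e-9, max_steps=2000, eps=1e-6):
    """Update encoder and decoder parameters until the loss stops changing."""
    params = {"enc": 0.1, "dec": 0.1, "bias": 0.0}
    history = [loss(params, samples)]
    for _ in range(max_steps):
        grads = {}
        for name in params:                       # forward-difference gradient
            bumped = dict(params)
            bumped[name] += eps
            grads[name] = (loss(bumped, samples) - history[-1]) / eps
        for name in params:
            params[name] -= lr * grads[name]
        history.append(loss(params, samples))
        if abs(history[-2] - history[-1]) < tol:  # convergence check
            break
    return params, history


samples = [([1.0, 2.0], [2.0, 1.0], 1.0), ([0.5, 0.2], [1.0, 1.0], 0.0)]
params, history = train(samples)
```

Here `history` records the loss after each update, so comparing `history[0]` with `history[-1]` shows whether training reduced the loss. A real system would replace the finite-difference update with backpropagation in a deep learning framework.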

[0054] When a single neural network model is trained, the encoder and the decoder can both be understood simply as that neural network model. When the encoder and decoder networks are trained jointly, the parameter updates applied to the encoder and to the decoder can differ, which makes the updates more targeted and makes model training more precise and efficient.

[0055] In step S000, the behavioral features may include user features, recommendation item features, and pre-trained features. User features include basic user information, user behavior, and other user-related basic information. Recommendation item features include basic features and aggregate features of the objects that interact with the user. Pre-trained features are obtained by processing user behavior time sequences. User behavior time sequences are time series formed by arranging user behaviors in chronological order; they only contain user behaviors and do not include features of the behavioral objects.

[0056] Specifically, the way to obtain pre-trained features is to input the time sequence of user behavior into the pre-trained model, and the pre-trained model outputs the pre-trained features.

[0057] The training method for the pre-trained model specifically includes the following steps: obtaining a set of user behavior time-series samples, each of which is pre-labeled with corresponding pre-trained features; inputting the data of each user behavior time-series sample into the pre-trained model to obtain the pre-trained features output by the model; if the pre-trained features obtained for a sample are inconsistent with the features pre-labeled for that sample, adjusting the coefficients of the pre-trained model until they are consistent; and ending the training when, for all user behavior time-series samples, the features output by the pre-trained model are consistent with the pre-labeled features.

[0058] Understandably, the behavioral features input into the recommendation model can include only user features and recommendation item features, or they can include user features, recommendation item features, and pre-trained features.
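As an illustration of the sample structure described for step S000, the following sketch models a behavioral feature with its three feature groups, where the pre-trained features are optional. The class and field names (`BehaviorFeature`, `BehaviorSequenceSample`, `timestamp`, `behavior_type`, and so on) are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BehaviorFeature:
    """One behavioral feature in a sequence (all field names are illustrative)."""
    timestamp: float                    # when the behavior occurred
    behavior_type: str                  # e.g. "click" or "browse"
    user_features: List[float]          # basic user information and user behavior
    item_features: List[float]          # basic and aggregate features of the object
    pretrained_features: Optional[List[float]] = None  # optional group

    def as_vector(self) -> List[float]:
        """Concatenate the feature groups that are present."""
        vec = list(self.user_features) + list(self.item_features)
        if self.pretrained_features is not None:
            vec += list(self.pretrained_features)
        return vec


@dataclass
class BehaviorSequenceSample:
    """A behavioral feature sequence labeled with a real result."""
    features: List[BehaviorFeature]
    real_result: float

    def __post_init__(self) -> None:
        # Keep the sequence arranged chronologically.
        self.features.sort(key=lambda f: f.timestamp)
```

A sample built from features in any order ends up ordered by `timestamp`, and `as_vector()` simply omits the pre-trained group when it is absent.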

[0059] In step S100, the specific processing flow of the recommendation model can be referred to in the following embodiment.

[0060] Specifically, in some embodiments, for the specific implementation of step S100, refer to Figure 4. Figure 4 is a detailed description of step S100 in the training method of the recommendation model shown in the embodiment corresponding to Figure 2. In this training method, the recommendation model includes multiple behavior towers, each of which corresponds to a different type of behavioral feature. Step S100 may include the following steps:

[0061] Step S110: Input the behavioral features into the recommendation model one by one in sequence.

[0062] Step S120: For each input behavioral feature, the corresponding behavioral tower is invoked according to the type of the behavioral feature.

[0063] Step S130: Input each of the behavioral features into the corresponding behavioral tower to obtain the corresponding target value.

[0064] Step S140: Determine the overall value vector based on each target value.

[0065] In the embodiments of this application, the recommendation model includes multiple behavior towers, and the number of behavior towers is fixed. Each behavior tower corresponds to a type of behavioral feature, and the types corresponding to each behavior tower are different.

[0066] When behavioral features are input into the recommendation model one by one in sequence, for each input behavioral feature, the corresponding behavior tower is invoked according to the type of the behavioral feature, and the behavioral feature is input into that behavior tower to obtain the corresponding target value. After the target value corresponding to each behavioral feature is obtained, the overall value vector can be obtained.

[0067] The aforementioned behavior tower is essentially a simple neural network, such as a DNN. Its main function is to integrate and process the user behavior features input from the feature layer, including user features, recommendation item features, and pre-trained features, forming a vector that fuses all of these features. Compared to multiple features concatenated together in the feature layer, this vector has a smaller dimension, making it easier to calculate the target value and easier to input into downstream temporal networks.

[0068] The embodiments in this specification employ behavior towers, which can effectively solve various problems caused by the varying encoding lengths of different user behavior feature data. By arranging user behavior feature data into behavior feature sequences, the corresponding behavior tower is invoked once for each input behavior feature to obtain the behavior data. The number of behavior tower invocations corresponds to the number of behavior features in a behavior feature sequence, enabling the recommendation model to adapt to behavior feature sequences of various lengths. This solves the various problems caused by processing user feature data of different lengths in the past. There is no need to truncate excessively long user feature data, thus avoiding information loss, nor is there a need to supplement excessively short user feature data, thus avoiding storage waste.

[0069] In step S110, the specific way to input each behavioral feature into the recommendation model in chronological order can be to first arrange each behavioral feature in chronological order, then input the earliest behavioral feature into the recommendation model first, and then input the subsequent behavioral features into the recommendation model in the order they are arranged.

[0070] In another embodiment, the earliest remaining behavioral feature can be repeatedly extracted from the set of all behavioral features and input into the recommendation model, until all behavioral features have been input into the recommendation model.
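The two ways of feeding behavioral features in chronological order can be sketched as follows. This is only an illustration: the `(timestamp, label)` tuple layout and the function names `feed_sorted` and `feed_by_extraction` are assumptions made for the example.

```python
import heapq
from typing import Iterator, List, Tuple

# Each behavioral feature is represented here as (timestamp, label); this
# tuple layout is an assumption made only for the example.
Feature = Tuple[float, str]


def feed_sorted(features: List[Feature]) -> Iterator[Feature]:
    """Arrange all features chronologically first, then yield them in order."""
    for feature in sorted(features, key=lambda f: f[0]):
        yield feature


def feed_by_extraction(features: List[Feature]) -> Iterator[Feature]:
    """Repeatedly extract the earliest remaining feature until none are left."""
    heap = list(features)
    heapq.heapify(heap)  # min-heap; tuples compare by timestamp first
    while heap:
        yield heapq.heappop(heap)


events = [(3.0, "pay"), (1.0, "browse"), (2.0, "click")]
```

For distinct timestamps both generators yield the same order; the extraction variant avoids sorting the whole set up front, which can matter when features arrive incrementally.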

[0071] In step S120, when the behavioral features are input into the recommendation model one by one in sequence, for each input behavioral feature, the corresponding behavioral tower is called according to the type of the behavioral feature, and the behavioral feature is input into the behavioral tower.

[0072] For example, in one embodiment there are three behavior towers: behavior tower A, behavior tower S, and behavior tower D.

[0073] In this model, behavior tower A corresponds to behavior features of type 'a', behavior tower S corresponds to behavior features of type 's', and behavior tower D corresponds to behavior features of type 'd'. For a sequence of behavior features whose types are, in order, 'asadd', when this sequence is input into the recommendation model, the behavior towers are invoked in the following order: first behavior tower A, then behavior tower S, then the previously invoked behavior tower A, then behavior tower D, and finally the previously invoked behavior tower D.
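The 'asadd' example can be reproduced with a small dispatch table. In this sketch the three towers are plain Python functions standing in for small neural networks; their bodies, and the names `TOWERS` and `run_towers`, are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical towers: each maps a feature vector to a target value. Real
# towers would be small neural networks; plain functions are used here.
def tower_a(x: List[float]) -> float:
    return sum(x)


def tower_s(x: List[float]) -> float:
    return max(x)


def tower_d(x: List[float]) -> float:
    return min(x)


# One tower per feature type; the number of towers is fixed.
TOWERS: Dict[str, Tuple[str, Callable[[List[float]], float]]] = {
    "a": ("A", tower_a),
    "s": ("S", tower_s),
    "d": ("D", tower_d),
}


def run_towers(sequence: List[Tuple[str, List[float]]]):
    """Invoke one tower per input feature, chosen by the feature's type."""
    invoked, targets = [], []
    for feature_type, vector in sequence:
        name, tower = TOWERS[feature_type]
        invoked.append(name)
        targets.append(tower(vector))
    return invoked, targets


sequence = [(t, [1.0, 2.0]) for t in "asadd"]
invoked, targets = run_towers(sequence)
print(invoked)  # ['A', 'S', 'A', 'D', 'D']
```

Because one tower is invoked per feature, the number of invocations equals the sequence length, so sequences of any length are handled without truncation or padding.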

[0074] In step S130, the behavior tower performs integrated calculations on multiple user features, recommendation features, and pre-trained features that make up a behavior feature, and finally obtains the target value.

[0075] Specifically, in some embodiments, for the specific implementation of step S130, refer to Figure 5. Figure 5 is a detailed description of step S130 in the training method of the recommendation model shown in the embodiment corresponding to Figure 4. Step S130 may include the following steps:

[0076] Step S132: Input each behavioral feature into the corresponding behavioral tower to obtain the corresponding behavioral value and behavioral probability.

[0077] Step S134: Determine the corresponding target value based on the behavioral value and behavioral probability of each behavioral feature.

[0078] In the embodiments of this application, the behavior tower first performs integrated calculations on the behavior features to obtain the behavior value and behavior probability corresponding to the behavior feature, and then obtains the target value based on the behavior value and behavior probability.

[0079] In step S132, the behavior value is a weight matrix, which represents the value generated in this recommendation scenario. The behavior probability is a vector, which represents the probability of performing the next behavior.

[0080] Specifically, in some embodiments, the specific implementation of step S132 can be found in the following embodiment, which is a detailed description of step S132 in the training method of the recommendation model shown in the embodiment corresponding to Figure 5. In this embodiment, the behavior towers are arranged in sequence, and step S132 may include the following steps:

[0081] The behavioral features and preset probabilities are input into the corresponding behavior tower to obtain the corresponding behavior value and behavior probability.

[0082] The behavior probability is used as the preset probability input for the next behavior tower until all behavior towers have obtained behavior value and behavior probability.

[0083] In this embodiment, the input to a behavior tower includes not only behavioral features but also a preset probability. The preset probability of the first behavior tower is 1 by default, and the preset probability of every other behavior tower is the behavior probability output by the preceding behavior tower. That is, after a behavior tower obtains a behavior value and a behavior probability through integrated calculation, the behavior probability it outputs is input to the next adjacent behavior tower. This behavior probability represents the probability that a next behavior will occur after the behavior represented by the behavioral feature input into that behavior tower has occurred. Inputting it into the next behavior tower is equivalent to adding a probability weight to the calculation of the next behavior tower, to ensure the accuracy of the calculation result.
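The chaining of preset probabilities can be sketched as below. This is an illustration under assumptions: `toy_tower` is a hypothetical stand-in whose formulas are placeholders (the source does not give the tower's internals), and behavior values are treated as scalars for simplicity.

```python
import math
from typing import List, Tuple


def toy_tower(feature: List[float], preset_prob: float) -> Tuple[float, float]:
    """Hypothetical stand-in for a behavior tower.

    Returns (behavior_value, behavior_probability). The formulas are
    placeholders; a real tower would be a trained network.
    """
    value = preset_prob * sum(feature)
    prob = 1.0 / (1.0 + math.exp(-value))  # squash into (0, 1)
    return value, prob


def run_chain(features: List[List[float]]) -> List[Tuple[float, float]]:
    """Feed each tower's output probability to the next tower as its preset."""
    outputs = []
    preset = 1.0  # the first tower's preset probability is 1 by default
    for feature in features:
        value, prob = toy_tower(feature, preset)
        outputs.append((value, prob))
        preset = prob  # becomes the next tower's preset probability
    return outputs
```

With the input `[[0.0], [2.0]]`, the first tower receives preset 1 and outputs probability 0.5, so the second tower is evaluated with preset 0.5.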

[0084] In step S134, the behavior probability represents the probability of the next behavior occurring and is produced by the behavior tower that processed the previous behavior. Because the behavior value of the next behavior depends on the previous behavior, the behavior probability is introduced as a weight, and a weighted calculation yields the weighted behavior value, that is, the target value.

[0085] In one embodiment, the target value is calculated as the behavior value multiplied by the behavior probability.

[0086] In step S140, the target values corresponding to all behavioral features in the behavioral sequence samples are aggregated together to obtain the overall value vector. Each dimension (or component) of the overall value vector is the target value corresponding to a behavioral feature.
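As a small worked sketch of steps S134 and S140, the code below multiplies each behavior value by its behavior probability and collects the results into a vector. The function names are hypothetical, and the behavior value is treated as a scalar for simplicity, whereas the description above refers to it as a weight matrix.

```python
from typing import List, Sequence, Tuple


def target_value(behavior_value: float, behavior_prob: float) -> float:
    """Target value as behavior value multiplied by behavior probability."""
    return behavior_value * behavior_prob


def overall_value_vector(tower_outputs: Sequence[Tuple[float, float]]) -> List[float]:
    """One component per behavioral feature, in sequence order."""
    return [target_value(value, prob) for value, prob in tower_outputs]


# Illustrative (behavior value, behavior probability) pairs.
print(overall_value_vector([(2.0, 0.5), (4.0, 0.25)]))
```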

[0087] In step S200, the overall value vector is calibrated by the pricing value vector, thereby enabling timely correction and calibration of deviations in the sample.

[0088] Specifically, in some embodiments, the specific implementation of step S200 is shown in Figure 6. Figure 6 is a detailed description of step S200 in the training method of the recommendation model shown in the embodiment corresponding to Figure 2, and includes the following steps:

[0089] Step S210: The pricing value vector is determined using a pricing calibration algorithm.

[0090] Step S220: Calibrate the overall value vector according to the pricing value vector to obtain the calibrated value vector.

[0091] In this embodiment, a pricing value vector is first obtained using a pricing calibration algorithm, and then the overall value vector is calibrated using the pricing value vector. The pricing calibration algorithm can be executed by a pricing calibration module, which is used to determine a uniform price for each behavior and calibrate the corresponding target value based on this uniform price.

[0092] Unified pricing is based on extensive analysis of user behavior data and algorithm backtesting to determine the specific value of each behavior. For example, the value of a click is often greater than that of browsing. By pricing each behavior, the importance of different behaviors to users can be determined from big data. Weighted recommendations based on the importance of behaviors can more accurately suggest products that users are most interested in at any given moment. Unified pricing also addresses the problem of inaccurate judgments caused by the scarcity of data on inactive users. The richness of real user behavior data varies greatly; compared to highly active users, inactive users have very sparse or missing behavioral data, and some data quality cannot fully reflect the value of the behavior. Models trained on such data are not easy to train effectively, and their performance fluctuates significantly. However, by using unified pricing based on big data to correct the behavioral value of inactive users, the model can be made as close as possible to the true value of behavior, making it more suitable for model training. In practice, models trained in this way provide more accurate predictions for inactive users compared to traditional models.

[0093] In step S210, the pricing value vector can be directly determined based on the user pricing calibration algorithm. The specific steps of the pricing algorithm can be found in the following embodiment:

[0094] Specifically, in some embodiments, the specific implementation of step S210 is shown in Figure 7. Figure 7 is a detailed description of step S210 in the training method of the recommendation model shown in the embodiment corresponding to Figure 6, and includes the following steps:

[0095] Step S212: Determine the pricing value corresponding to each of the behavioral characteristics.

[0096] Step S214: Determine the pricing value vector based on the pricing value corresponding to each behavioral feature.

[0097] In this embodiment, the pricing value corresponding to each behavioral feature included in the overall value vector is first determined, and then the pricing values corresponding to each behavioral feature are aggregated together to obtain the pricing value vector.

[0098] In step S212, the pricing value corresponding to each behavioral feature can be determined by querying a table that maps behavior to value. This table is derived from the analysis of a large amount of user behavior data and algorithm backtesting. It should be a multi-row, two-column table (one column for behavior and the other for value), and this table can be converted into a standard calibration vector.
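The table lookup can be illustrated with a minimal sketch. The table contents, behavior names, and function name below are hypothetical; real pricing values would come from the behavior data analysis and backtesting mentioned above.

```python
from typing import Dict, List, Sequence

# Hypothetical two-column table: behavior -> pricing value (illustrative numbers).
PRICING_TABLE: Dict[str, float] = {
    "browse": 0.1,
    "click": 0.3,
    "purchase": 1.0,
}


def pricing_value_vector(behaviors: Sequence[str],
                         table: Dict[str, float]) -> List[float]:
    """Look up each behavior in sequence order; unknown behaviors raise KeyError."""
    return [table[behavior] for behavior in behaviors]


print(pricing_value_vector(["browse", "click", "purchase"], PRICING_TABLE))
```

Keeping the lookup order identical to the order of the behavioral features means that each component of the resulting vector lines up with the matching component of the overall value vector.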

[0099] In other embodiments, the pricing value corresponding to each behavior can also be determined by relevant calculation formulas.

[0100] In other embodiments, the corresponding pricing value can also be determined using a pricing model.

[0101] Specifically, in some embodiments, behavioral sequence samples can be input into a pricing model, and the pricing model can output the pricing value corresponding to each behavioral feature in the behavioral sequence samples.

[0102] The training method for this pricing model specifically includes: acquiring a behavioral sequence sample set, wherein the behavioral sequence sample set contains multiple behavioral sequence samples, and each behavioral sequence sample is labeled with a pricing value corresponding to each behavioral feature in the behavioral sequence sample; inputting the behavioral sequence samples in the behavioral sequence sample set into the pricing model to obtain the pricing value corresponding to each behavioral feature in the behavioral sequence sample output by the pricing model; if, after inputting behavioral sequence samples of no more than a predetermined proportion into the pricing model, the pricing value corresponding to each behavioral feature in the obtained behavioral sequence sample is inconsistent with the pricing value corresponding to each behavioral feature in the labeled behavioral sequence sample, then the coefficients of the pricing model are adjusted; if, after inputting behavioral sequence samples of more than a predetermined proportion into the pricing model, the pricing value corresponding to each behavioral feature in the obtained behavioral sequence sample is consistent with the pricing value corresponding to each behavioral feature in the labeled behavioral sequence sample, then the training ends.
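The stopping rule in this training procedure can be sketched as a proportion check. The function names and the numeric tolerance used to decide whether predicted and labeled pricing values are "consistent" are assumptions made for illustration; the text does not define how consistency is measured.

```python
from typing import Sequence


def sample_is_consistent(predicted: Sequence[float],
                         labeled: Sequence[float],
                         tol: float = 1e-6) -> bool:
    """True if every predicted pricing value matches its label within tol."""
    return (len(predicted) == len(labeled)
            and all(abs(p - q) <= tol for p, q in zip(predicted, labeled)))


def training_finished(predictions: Sequence[Sequence[float]],
                      labels: Sequence[Sequence[float]],
                      min_proportion: float,
                      tol: float = 1e-6) -> bool:
    """Stop once more than min_proportion of the samples are consistent."""
    consistent = sum(sample_is_consistent(p, q, tol)
                     for p, q in zip(predictions, labels))
    return consistent / len(labels) > min_proportion


predicted = [[0.3, 0.1], [1.0], [0.5]]
labeled = [[0.3, 0.1], [1.0], [0.2]]
print(training_finished(predicted, labeled, min_proportion=0.5))
```

When the check returns False, the coefficients of the pricing model would be adjusted and the samples evaluated again.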

[0103] It should be noted that the pricing value corresponding to each behavioral feature in each behavioral sequence sample labeled on the above-mentioned behavioral sequence sample is obtained based on the analysis of a large amount of user behavior data and algorithm backtesting, and is the result of big data calculation.

[0104] In step S214, the pricing values corresponding to each behavioral feature are aggregated together to form a multi-dimensional vector, which is the pricing value vector. Each dimension (or component) corresponds to the pricing value of a behavioral feature.

[0105] In step S220, the overall value vector is calibrated according to the pricing value vector to obtain the calibrated value vector. This can be done in various ways, for example with basic calibration algorithms such as analysis of variance or isotonic (order-preserving) regression applied to the two vectors. For details, please refer to the following embodiments.

[0106] In this embodiment, the overall value vector is calibrated so that the difference between the overall value vector and the pricing value vector falls within a certain threshold. The model is then, in effect, guided in a unified way by the pricing algorithm based on user behavior (acting as a command center), which prevents significant value deviations.

[0107] Specifically, in some embodiments, step S220 can be implemented as described in the following embodiment, which is a detailed description of step S220 in the training method of the recommendation model shown in the embodiment corresponding to Figure 6. In this training method, each dimension of the overall value vector corresponds to a target value, and each dimension of the pricing value vector corresponds to the pricing value of a behavioral feature. Step S220 may include the following steps:

[0108] Determine the degree of difference between the pricing value vector and the overall value vector;

[0109] If the degree of difference is greater than a predetermined degree of difference threshold, the overall value vector is adjusted to reduce the difference between the target value corresponding to at least one behavioral feature and the pricing value corresponding to that behavioral feature, thereby obtaining a calibrated value vector.

[0110] In the embodiments of this application, the degree of difference between the pricing value vector and the overall value vector is first determined, and then the overall value vector is adjusted according to the degree of difference.

[0111] Specifically, the degree of difference can be determined with basic calibration algorithms such as analysis of variance or isotonic (order-preserving) regression. The degree of difference corresponds to the variance in analysis of variance and to the loss function in isotonic regression.

[0112] For example, when calibrating through analysis of variance, the degree of difference is the difference between the mean of the pricing value vector and the mean of the overall value vector, and the predetermined threshold for the degree of difference is the least significant difference, which, in its standard two-sample form, is calculated using the following formulas:

[0113] $$\mathrm{LSD} = t \sqrt{S^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$

[0114] $$S^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$$

[0115] where $\mathrm{LSD}$ is the least significant difference, $S^2$ is the joint variance of the pricing value vector and the overall value vector, $n_1$ is the number of dimensions (i.e., the number of components) of the pricing value vector, $S_1^2$ is the variance of the pricing value vector, $n_2$ is the number of dimensions (i.e., the number of components) of the overall value vector, $S_2^2$ is the variance of the overall value vector, and $t$ is the t-test coefficient. When the difference between the mean of the pricing value vector and the mean of the overall value vector exceeds the predetermined threshold for the degree of difference, the overall value vector needs to be adjusted.
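As a rough illustration of the analysis-of-variance check, the sketch below computes a least significant difference in its textbook two-sample form, using a pooled variance and a t coefficient, and compares it with the difference between the two means. The function names, the use of the sample variance from Python's `statistics` module, and the t coefficient of 2.776 (the two-sided 5% critical value for 4 degrees of freedom) are assumptions chosen for the example.

```python
import math
from statistics import mean, variance
from typing import Sequence


def pooled_variance(a: Sequence[float], b: Sequence[float]) -> float:
    """Joint variance of two vectors, weighted by their degrees of freedom."""
    n1, n2 = len(a), len(b)
    return ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)


def least_significant_difference(a: Sequence[float], b: Sequence[float],
                                 t_coeff: float) -> float:
    """Textbook two-sample least significant difference."""
    n1, n2 = len(a), len(b)
    return t_coeff * math.sqrt(pooled_variance(a, b) * (1 / n1 + 1 / n2))


def needs_adjustment(pricing: Sequence[float], overall: Sequence[float],
                     t_coeff: float) -> bool:
    """True when the mean difference exceeds the least significant difference."""
    diff = abs(mean(pricing) - mean(overall))
    return diff > least_significant_difference(pricing, overall, t_coeff)


pricing = [1.0, 2.0, 3.0]
overall = [11.0, 12.0, 13.0]
print(needs_adjustment(pricing, overall, t_coeff=2.776))
```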

[0116] For example, when calibrating using isotonic (order-preserving) regression, the degree of difference is the loss function of the isotonic regression, which, in its standard weighted least-squares form, is:

[0117] $$L = \sum_{i=1}^{n} w_i \left( p_i - v_i \right)^2$$

[0118] where $L$ is the loss function, $w_i$ is the sample weight, $p_i$ is the pricing value, $v_i$ is the target value, and $n$ is the number of behavioral features.

[0119] In isotonic regression calibration, the threshold for the degree of difference is a threshold on the loss function, which can be set according to actual needs, for example 0.1. The overall value vector can be adjusted as in the following embodiments. In one embodiment, when the degree of difference is greater than the predetermined threshold, each target value in the overall value vector is moved toward its corresponding pricing value by a predetermined adjustment range, until the degree of difference is less than the predetermined threshold. The pricing value corresponding to a target value is the pricing value of the behavioral feature to which that target value belongs. In another embodiment, when the degree of difference is greater than the predetermined threshold, the target value in the overall value vector with the largest difference from its corresponding pricing value is adjusted by a predetermined adjustment range, until the degree of difference is less than the predetermined threshold. In yet another embodiment, when the degree of difference is greater than the predetermined threshold, the target values in the overall value vector are set to their corresponding pricing values one at a time, starting with the target value that differs most from its pricing value, until the degree of difference is less than the predetermined threshold. The result is equivalent to unified guidance by the pricing algorithm based on user behavior (acting as a command center), which prevents significant value deviations.
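Two of the adjustment embodiments above can be sketched as follows. The sketch assumes that the degree of difference is a weighted sum of squared differences between pricing values and target values; that choice, the function names, the step size, and the iteration cap are all illustrative assumptions rather than details given in the text.

```python
from typing import List, Optional, Sequence


def degree_of_difference(pricing: Sequence[float], targets: Sequence[float],
                         weights: Optional[Sequence[float]] = None) -> float:
    """Assumed loss: weighted sum of squared pricing/target differences."""
    if weights is None:
        weights = [1.0] * len(pricing)
    return sum(w * (p - t) ** 2 for w, p, t in zip(weights, pricing, targets))


def calibrate_stepwise(targets: Sequence[float], pricing: Sequence[float],
                       threshold: float = 0.1, step: float = 0.05,
                       max_iters: int = 10_000) -> List[float]:
    """Move every target value toward its pricing value by at most `step` per pass."""
    calibrated = list(targets)
    for _ in range(max_iters):
        if degree_of_difference(pricing, calibrated) < threshold:
            break
        calibrated = [t + max(-step, min(step, p - t))
                      for t, p in zip(calibrated, pricing)]
    return calibrated


def calibrate_snap_largest(targets: Sequence[float], pricing: Sequence[float],
                           threshold: float = 0.1) -> List[float]:
    """Set the worst-matching target value to its pricing value, one at a time."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    calibrated = list(targets)
    while degree_of_difference(pricing, calibrated) >= threshold:
        worst = max(range(len(calibrated)),
                    key=lambda k: abs(calibrated[k] - pricing[k]))
        calibrated[worst] = pricing[worst]
    return calibrated


print(calibrate_stepwise([1.0, 0.0], [0.0, 0.0]))
print(calibrate_snap_largest([1.0, 0.2, 0.0], [0.0, 0.0, 0.0]))
```

The stepwise variant changes all components gradually, while the snapping variant touches as few components as possible; which is preferable would depend on how much the uncalibrated target values should be trusted.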

[0120] In step S300, the behavior sequence samples are input into the recommendation model again. This time the calibrated value vector is also input, serving as the input to the first behavior tower, so as to correct the output of each behavior tower and bring it closer to the true value. It should be noted that the output of step S300 includes the predicted probability of occurrence of each behavior feature in the behavior feature sequence. Therefore, in step S400, the loss function is computed from the behavior features, the prediction result, and the true result. The loss function can be a cross-entropy, in which, for each behavior feature, the true label indicates whether that behavior actually occurred.

[0121] The loss function, in the standard binary cross-entropy form, is calculated as follows:

[0122] $$L = \frac{1}{N} \sum_{i} L_i$$

[0123] $$L_i = -\left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]$$

[0124] where $L$ is the loss function, $x_i$ is a behavioral feature, $y_i$ is the real label corresponding to behavioral feature $x_i$, $L_i$ is the loss of behavioral feature $x_i$ in the behavioral sequence sample, and $p_i$ is the probability that behavioral feature $x_i$ is predicted as the positive class. The $p_i$ of all behavioral features in the behavioral sequence sample together constitute the prediction result. $N$ is the number of input behavioral sequence samples.
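A minimal sketch of a cross-entropy loss of the kind described here is given below. It assumes binary labels (1 if the behavior occurred, 0 otherwise), averages the per-feature losses, and clips probabilities with a small `eps` to avoid taking the logarithm of zero; the clipping and the function names are assumptions added for numerical safety and illustration.

```python
import math
from typing import Sequence


def feature_loss(label: int, prob: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one behavioral feature."""
    prob = min(max(prob, eps), 1.0 - eps)  # keep log() finite
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def cross_entropy(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean of the per-feature losses."""
    return sum(feature_loss(y, p) for y, p in zip(labels, probs)) / len(labels)


# Illustrative labels and predicted probabilities.
print(cross_entropy([1, 0, 1], [0.9, 0.2, 0.6]))
```

A lower value indicates that the predicted probabilities agree more closely with the real labels, which is the quantity the parameter update then tries to reduce.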

[0125] In step S500, the parameters of the recommendation model are updated based on the loss function, for example through backpropagation and gradient descent. Training stops when the loss function converges or falls below a predetermined loss threshold.

[0126] Please refer to Figure 8, which provides a flowchart of a recommendation method according to one or more embodiments of this application. This method can be implemented by a computer program and can run on a recommendation device based on the von Neumann architecture. The computer program can be integrated into an application or run as a standalone utility application. The recommendation device can be a service platform.

[0127] Specifically, the recommendation method includes:

[0128] Step S800: Obtain the behavioral feature sequence to be recommended.

[0129] Step S900: Input the behavioral feature sequence to be recommended into the recommendation model to obtain the prediction result. The recommendation model is the recommendation model described above.

[0130] In the embodiments of this application, the behavioral feature sequence to be recommended is first obtained, and then the behavioral feature sequence to be recommended is input into the recommendation model trained through the above embodiments to obtain the prediction result. The recommendation content obtained using the above recommendation model is more accurate than the recommendation content obtained using traditional models, solving the problem that the bias of the samples cannot be corrected and calibrated in a timely manner, resulting in inaccurate recommendation content.

[0131] In step S800, the behavioral feature sequence to be recommended, that is, the behavioral feature sequence of the user to be recommended, is a feature sequence in which all behavioral features of the user to be recommended are arranged in chronological order over a period of time, which can reflect the user's preferences to a certain extent.

[0132] In step S900, the recommendation model is the recommendation model trained through the above embodiments, and the output of the recommendation model is the prediction result.

[0133] Specifically, in some embodiments, the specific implementation of step S900 is shown in Figure 9. Figure 9 is a detailed description of step S900 in the recommendation method shown in the embodiment corresponding to Figure 8. In this recommendation method, the recommendation model includes multiple behavior towers, each corresponding to a different type of behavioral feature. Step S900 may include the following steps:

[0134] Step S910: Input the behavioral features into the recommendation model one by one in sequence.

[0135] Step S920: For each input behavioral feature, the corresponding behavioral tower is invoked according to the type of the behavioral feature.

[0136] Step S930: Input each of the behavioral features into the corresponding behavioral tower to obtain the prediction result.

[0137] In the embodiments of this application, the recommendation model still processes behavioral features through a behavior tower. That is, in the training steps described above, when updating parameters, the main focus is on updating the parameters of the behavior tower. Each behavioral feature is input into its corresponding behavior tower to obtain the prediction result.
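Steps S910 to S930 amount to dispatching each behavioral feature to the tower registered for its type. The sketch below shows that dispatch with a dictionary; the feature representation, the type names, and the toy towers are hypothetical placeholders for trained behavior towers.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Feature = Tuple[str, Sequence[float]]        # (behavior type, feature vector)
Tower = Callable[[Sequence[float]], float]   # returns a behavior probability


def predict_with_towers(features: Sequence[Feature],
                        towers: Dict[str, Tower]) -> List[float]:
    """Process the features one by one, in sequence order."""
    outputs = []
    for behavior_type, vector in features:   # S910: input features in order
        tower = towers[behavior_type]        # S920: pick the tower by type
        outputs.append(tower(vector))        # S930: run the feature through it
    return outputs


# Toy towers standing in for trained ones.
toy_towers: Dict[str, Tower] = {
    "click": lambda v: sum(v),
    "browse": lambda v: max(v),
}
print(predict_with_towers([("browse", [0.1, 0.4]), ("click", [0.25, 0.25])],
                          toy_towers))
```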

[0138] In step S930, the specific method for obtaining the prediction result from the behavior tower can be referred to in the following embodiment.

[0139] Specifically, in some embodiments, the specific implementation of step S930 is shown in Figure 10. Figure 10 is a detailed description of step S930 in the recommendation method shown in the embodiment corresponding to Figure 9. Step S930 in the recommendation method may include the following steps:

[0140] Step S932: Input each of the behavioral features into the corresponding behavioral tower to obtain the corresponding behavioral probability.

[0141] Step S934: Normalize the probability of each behavior to obtain the prediction result.

[0142] In this embodiment, each behavioral feature is first input into the corresponding behavioral tower to obtain the corresponding behavioral probability. Then, each behavioral probability is normalized to obtain the prediction result.

[0143] In step S932, similar to the training step, you only need to input the behavioral features into the corresponding behavioral tower in sequence to obtain the corresponding behavioral probabilities.

[0144] In step S934, after obtaining the behavior probability, the behavior probability is subjected to softmax normalization so that the sum of all behavior probabilities is 1, that is, the probability of each recommendation item is obtained. The probability of the recommendation item can be understood as the degree of recommendation of the recommendation item.
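The softmax normalization mentioned above can be written in a few lines. This is a generic, numerically stable softmax (the maximum is subtracted before exponentiation), shown as a sketch; the function name and the example inputs are illustrative.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Normalize scores so that the results are positive and sum to 1."""
    peak = max(scores)                        # subtract the max for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


behavior_probs = [0.2, 0.9, 0.4]              # illustrative tower outputs
item_probs = softmax(behavior_probs)
print(item_probs)
```

Softmax preserves the ordering of its inputs, so the item with the highest behavior probability also receives the highest normalized probability.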

[0145] It should be noted that, as described in the training method above, a behavioral feature includes user features and recommendation item features. User features include basic user information, user behavior, and other basic information related to the user. Recommendation item features include basic features and aggregate features of the objects that interact with the user. The objects that interact with the user are the recommendation items, often referred to as materials; in specific applications, they can include one or more of products, copy, cards, buttons, etc. Therefore, the behavioral probability output by a behavior tower is not only the probability of the behavior occurring, but also the probability of the user interacting with the recommendation item. The higher that probability, the more interested the user is in the recommendation item, and the more suitable the item is to be recommended to the user.

[0146] Specifically, in some embodiments, step S934 can be implemented as described in the following embodiment, which is a detailed description of step S934 in the recommendation method shown in the embodiment corresponding to Figure 10. Step S934 in the recommendation method may include the following steps:

[0147] The probability of each behavior is normalized to obtain the probability of the corresponding recommendation item.

[0148] The recommendation items are sorted according to their probabilities to obtain the prediction result.

[0149] In this embodiment, after obtaining the probability of the recommended items, each recommended item is sorted according to its probability to form a prediction result, so as to more intuitively display the recommended items that the user is interested in.
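Sorting the recommendation items by probability can be sketched as follows. The item names and probabilities are made up for the example, and the descending order is an assumption based on the goal of showing the items the user is most interested in first.

```python
from typing import Dict, List, Tuple


def rank_items(item_probs: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (item, probability) pairs, most probable first."""
    return sorted(item_probs.items(), key=lambda pair: pair[1], reverse=True)


ranking = rank_items({"card": 0.2, "product": 0.5, "button": 0.3})
print(ranking)
```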

[0150] The training apparatus for the recommendation model provided in the embodiments of this application is described in detail below with reference to Figure 11. It should be noted that the training apparatus for the recommendation model shown in Figure 11 is used to execute the methods of the embodiments shown in Figures 1-7 of this application. For ease of explanation, only the parts relevant to the embodiments of this application are shown; for specific technical details not disclosed, please refer to the embodiments shown in Figures 1-7 of this application.

[0151] Please refer to Figure 11, which illustrates the structure of a training device for a recommendation model according to an embodiment of this application. The training device 11 for the recommendation model can be implemented as all or part of a user terminal through software, hardware, or a combination of both. According to some embodiments, the training device 11 for the recommendation model includes an acquisition module 111, an encoding module 112, a calibration module 113, a decoding module 114, a loss module 115, and an update module 116. Specifically, the acquisition module 111 is used to acquire behavioral sequence samples, each of which is a sequence of behavioral features labeled with a true result, and the behavioral feature sequence is a sequence composed of multiple behavioral features arranged chronologically. The encoding module 112 is used to input the behavioral sequence samples into the recommendation model to obtain an overall value vector. The calibration module 113 is used to obtain a calibration value vector based on the pricing value vector and the overall value vector. The decoding module 114 is used to input the calibration value vector and the behavioral sequence samples into the recommendation model to obtain a prediction result. The loss module 115 is used to determine a loss function based on the behavioral features, the prediction result, and the true result. The update module 116 is used to update the parameters of the recommendation model based on the loss function until the loss function converges.
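The way these modules hand data to one another can be sketched as one training step composed of callables. Everything in the sketch is a placeholder: the function and parameter names are hypothetical, each callable stands in for the corresponding module, and the parameter update performed by the update module 116 is left out.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def training_step(sample: Sequence[float],
                  true_result: Sequence[float],
                  encode: Callable[[Sequence[float]], Vector],
                  calibrate: Callable[[Vector], Vector],
                  decode: Callable[[Vector, Sequence[float]], Vector],
                  loss_fn: Callable[[Sequence[float], Vector, Sequence[float]], float],
                  ) -> float:
    """Run one sample through encode -> calibrate -> decode -> loss."""
    overall = encode(sample)                  # role of encoding module 112
    calibrated = calibrate(overall)           # role of calibration module 113
    prediction = decode(calibrated, sample)   # role of decoding module 114
    return loss_fn(sample, prediction, true_result)  # role of loss module 115


# Toy callables, only to show the data flow.
loss = training_step(
    sample=[1.0, 2.0],
    true_result=[2.0, 4.0],
    encode=lambda s: [2 * x for x in s],
    calibrate=lambda v: [x / 2 for x in v],
    decode=lambda c, s: [a + b for a, b in zip(c, s)],
    loss_fn=lambda s, p, y: sum((a - b) ** 2 for a, b in zip(p, y)),
)
print(loss)
```

Passing the modules in as callables keeps the division of functions flexible, in line with the note that the functional modules can be reassigned as needed.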

[0152] In one embodiment, the recommendation model includes an encoder and a decoder. The encoding module 112 specifically includes an encoder module, used to input the behavior sequence samples into the encoder to obtain an overall value vector. The decoding module 114 specifically includes a decoder module, used to input the calibration value vector and the behavior sequence samples into the decoder to obtain a prediction result.

[0153] In one embodiment, the recommendation model includes multiple behavior towers, each behavior tower corresponding to a different type of behavior feature. The encoding module 112 specifically includes: a first input submodule, used to input the behavior features one by one into the recommendation model in sequence; a first invocation submodule, used to invoke the corresponding behavior tower according to the type of the behavior feature for each input behavior feature; a value determination submodule, used to input each behavior feature into the corresponding behavior tower to obtain the corresponding target value; and a vector determination submodule, used to determine the overall value vector based on each target value.

[0154] In one embodiment, the value determination submodule specifically includes: a value probability unit, used to input each of the behavioral features into the corresponding behavior tower to obtain the corresponding behavioral value and behavioral probability; and a target value unit, used to determine the corresponding target value based on the behavioral value and behavioral probability of each of the behavioral features.

[0155] In one embodiment, the behavior towers are arranged in sequence, and the value probability unit is specifically used to perform: inputting the behavior feature and preset probability into the corresponding behavior tower to obtain the corresponding behavior value and behavior probability; using the behavior probability as the preset probability input for the next behavior tower, until all behavior towers have obtained behavior value and behavior probability.

[0156] In one embodiment, the calibration module 113 specifically includes: a pricing submodule, used to determine the pricing value vector using a pricing calibration algorithm; and a calibration submodule, used to calibrate the overall value vector according to the pricing value vector to obtain a calibration value vector.

[0157] In one embodiment, the pricing submodule specifically includes: a pricing value unit, used to determine the pricing value corresponding to each of the behavioral features; and a user pricing unit, used to determine the pricing value vector based on the pricing value corresponding to each of the behavioral features.

[0158] In one embodiment, each dimension of the overall value vector corresponds to a target value, and each dimension of the pricing value vector corresponds to a pricing value corresponding to a behavioral feature. The calibration submodule specifically includes: a difference degree unit, used to determine the difference degree between the pricing value vector and the overall value vector; and a difference adjustment unit, used to adjust the overall value vector if the difference degree is greater than a predetermined difference degree threshold, so as to reduce the difference between the target value corresponding to at least one behavioral feature and the pricing value corresponding to that behavioral feature, thereby obtaining a calibration value vector.

[0159] It should be noted that the training device for the recommendation model provided in the above embodiments is only illustrated by the division of the above functional modules when executing the recommendation model training method. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the training device for the recommendation model and the training method embodiment provided in the above embodiments belong to the same concept, and the implementation process is detailed in the method embodiment, which will not be repeated here.

[0160] The recommendation apparatus provided in the embodiments of this application is described in detail below with reference to Figure 12. It should be noted that the recommendation apparatus shown in Figure 12 is used to execute the methods of the embodiments shown in Figures 8-10 of this application. For ease of explanation, only the parts relevant to the embodiments of this application are shown; for specific technical details not disclosed, please refer to the embodiments shown in Figures 8-10 of this application.

[0161] Please refer to Figure 12, which illustrates the structure of a recommendation device according to an embodiment of this application. The recommendation device 12 can be implemented as all or part of a user terminal through software, hardware, or a combination of both. According to some embodiments, the recommendation device 12 includes an acquisition module 121 and an input module 122. Specifically, the acquisition module 121 is used to acquire a sequence of behavioral features to be recommended, and the input module 122 is used to input the sequence of behavioral features to be recommended into a recommendation model to obtain a prediction result, wherein the recommendation model is the one described above.

[0162] In one embodiment, the model includes multiple behavior towers, each behavior tower corresponding to a different type of behavior feature. The acquisition module 121 specifically includes: a second input submodule, used to input the behavior features one by one into the recommendation model in sequence; a second invocation submodule, used to invoke the corresponding behavior tower according to the type of the behavior feature for each input behavior feature; and a prediction result submodule, used to input each behavior feature into the corresponding behavior tower to obtain a prediction result.

[0163] In one embodiment, the prediction result submodule specifically includes: a behavior probability unit, used to input each behavior feature into the corresponding behavior tower to obtain the corresponding behavior probability; and a prediction result unit, used to normalize each behavior probability to obtain a prediction result.

[0164] It should be noted that the recommendation device provided in the above embodiments is only illustrated by the division of the above functional modules when executing the recommendation method. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the recommendation device and recommendation method embodiments provided in the above embodiments belong to the same concept, and the implementation process can be found in the method embodiments, which will not be repeated here.

[0165] The sequence numbers of the embodiments in this application are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.

[0166] In the embodiments of this application, an overall value vector is obtained by arranging various user behavioral features in sequence and inputting them into the recommendation model. This overall value vector is then calibrated based on the pricing value vector to obtain a calibrated value vector, achieving timely correction and calibration of sample biases. The calibrated value vector, along with the previous user behavioral features, is then input into the recommendation model again to obtain a prediction result. The calibrated value vector helps the recommendation model obtain a more accurate prediction result. A loss function is determined based on the behavioral features, the prediction result, and the real result, and the parameters of the recommendation model are updated based on this loss function until it converges. After the recommendation model is trained, the recommendations obtained using this model are more accurate than those obtained using traditional methods, solving the problem of inaccurate recommendations caused by the inability to promptly correct and calibrate sample biases.

[0167] Embodiments of this application also provide a computer storage medium. The computer storage medium can store multiple instructions adapted to be loaded by a processor to execute the training method and the recommendation method of the recommendation model in the embodiments shown in Figures 1-10. For the specific execution process, refer to the description of the embodiments shown in Figures 1-10, which is not repeated here.

[0168] Embodiments of this application also provide a computer program product storing at least one instruction, which is loaded by a processor to execute the training method and the recommendation method of the recommendation model in the embodiments shown in Figures 1-10. For the specific execution process, refer to the description of the embodiments shown in Figures 1-10, which is not repeated here.

[0169] Please refer to Figure 13, which shows a structural block diagram of an electronic device provided in an exemplary embodiment of this application. The electronic device in the embodiments of this application may include one or more components such as a processor 110, a memory 120, an input device 130, an output device 140, and a bus 150. The processor 110, memory 120, input device 130, and output device 140 may be connected via the bus 150.

[0170] Processor 110 may include one or more processing cores. Processor 110 connects to various parts of the electronic device using various interfaces and lines, and performs various functions and processes data of electronic device 100 by running or executing instructions, programs, code sets, or instruction sets stored in memory 120, and by calling data stored in memory 120. In one embodiment, processor 110 may be implemented using at least one hardware form of digital signal processing (DSP), field-programmable gate array (FPGA), or programmable logic array (PLA). Processor 110 may integrate one or more of a central processing unit (CPU), graphics processing unit (GPU), and modem. The CPU primarily handles the operating system, user interface, and applications; the GPU is responsible for rendering and drawing the displayed content; and the modem handles wireless communication. It is understood that the modem may also not be integrated into processor 110, but may be implemented separately using a communication chip.

[0171] The memory 120 may include random access memory (RAM) or read-only memory (ROM). In one embodiment, the memory 120 includes a non-transitory computer-readable storage medium. The memory 120 can be used to store instructions, programs, code, code sets, or instruction sets. The memory 120 may include a program storage area and a data storage area. The program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as touch functionality, sound playback functionality, image playback functionality, etc.), instructions for implementing the various method embodiments described below, etc. The operating system may be the Android system, including systems deeply developed based on the Android system, the iOS system developed by Apple Inc., including systems deeply developed based on the iOS system, or other systems. The data storage area may also store data created by the electronic device during use, such as phonebook data, audio and video data, chat log data, etc.

[0172] As shown in Figure 14, the memory 120 can be divided into operating system space and user space. The operating system runs in the operating system space, while native and third-party applications run in the user space. To ensure that different third-party applications can achieve good running performance, the operating system allocates corresponding system resources for each application. However, different application scenarios within the same third-party application have different requirements for system resources. For example, in local resource loading scenarios, third-party applications have high requirements for disk read speed; in animation rendering scenarios, third-party applications have high requirements for GPU performance. Since the operating system and third-party applications are independent of each other, the operating system often cannot promptly perceive the current application scenario of a third-party application, and therefore cannot adapt system resources to that specific scenario.

[0173] In order for the operating system to distinguish the specific application scenarios of third-party applications, it is necessary to establish data communication between the third-party applications and the operating system. This would allow the operating system to obtain the current scenario information of the third-party applications at any time, and then perform targeted system resource adaptation based on the current scenario.

[0174] Taking the Android operating system as an example, the programs and data stored in memory 120 are shown in Figure 15. The memory 120 can store the Linux kernel layer 320, the system runtime library layer 340, the application framework layer 360, and the application layer 380. The Linux kernel layer 320, system runtime library layer 340, and application framework layer 360 belong to the operating system space, while the application layer 380 belongs to the user space. The Linux kernel layer 320 provides low-level drivers for various hardware components of the electronic device, such as display drivers, audio drivers, camera drivers, Bluetooth drivers, Wi-Fi drivers, and power management. The system runtime library layer 340 provides support for key features of the Android system through several C/C++ libraries. For example, the SQLite library provides database support, the OpenGL ES library provides 3D graphics support, and the WebKit library provides browser kernel support. The system runtime library layer 340 also provides the Android runtime library, which mainly provides core libraries that allow developers to write Android applications using the Java language. The application framework layer 360 provides various APIs that may be used when building applications, such as activity management, window management, view management, notification management, content provider, package management, call management, resource management, and location management; developers can also use these APIs to build their own applications. At least one application runs in the application layer 380. These applications can be native applications that come with the operating system, such as contacts, SMS, clock, and camera apps; or third-party applications developed by third-party developers, such as games, instant messaging, and photo editing apps.

[0175] Taking the iOS operating system as an example, the programs and data stored in memory 120 are shown in Figure 16. The iOS system includes: Core OS layer 420, Core Services layer 440, Media layer 460, and Cocoa Touch layer 480. Core OS layer 420 includes the operating system kernel, drivers, and low-level program frameworks. These low-level program frameworks provide hardware-level functionality for use by the program frameworks located in Core Services layer 440. Core Services layer 440 provides system services and/or program frameworks required by applications, such as the Foundation framework, account framework, advertising framework, data storage framework, network connectivity framework, geolocation framework, motion framework, etc. Media layer 460 provides applications with audiovisual interfaces, such as interfaces related to graphics and images, audio technology, video technology, and wireless playback (AirPlay). Cocoa Touch layer 480 provides various commonly used interface-related frameworks for application development and is responsible for user touch interaction on electronic devices. Examples include local notification services, remote push services, advertising frameworks, game tool frameworks, message user interface (UI) frameworks, the UIKit user interface framework, map frameworks, and so on.

[0176] The frameworks shown in Figure 16 include, but are not limited to, the Foundation framework in Core Services layer 440 and the UIKit framework in Cocoa Touch layer 480. The Foundation framework provides many basic object classes and data types, offers the most basic system services to all applications, and is independent of the UI. The UIKit framework, on the other hand, provides a basic UI class library for creating touch-based user interfaces. iOS applications can build their UI on the UIKit framework, which supplies the application infrastructure for constructing user interfaces, drawing, handling user interaction events, responding to gestures, and so on.

[0177] The methods and principles for implementing data communication between third-party applications and the operating system in the iOS system can be referenced from the Android system, and the embodiments of this application will not be described in detail here.

[0178] The input device 130 is used to receive input instructions or data, and includes, but is not limited to, a keyboard, mouse, camera, microphone, or touch device. The output device 140 is used to output instructions or data, and includes, but is not limited to, a display device and a speaker. In one example, the input device 130 and the output device 140 can be combined into a touch screen, which is used to receive touch operations from the user using a finger, stylus, or any suitable object on or near it, and to display the user interface of various applications. The touch screen is usually located on the front panel of the electronic device. The touch screen can be designed as a full-screen, curved screen, or irregularly shaped screen. The touch screen can also be designed as a combination of a full-screen and a curved screen, or a combination of an irregularly shaped screen and a curved screen; the embodiments of this application do not limit this.

[0179] In addition, those skilled in the art will understand that the structure of the electronic device shown in the above figures does not constitute a limitation on the electronic device. The electronic device may include more or fewer components than shown, or combine certain components, or have different component arrangements. For example, the electronic device may also include radio frequency circuits, input units, sensors, audio circuits, wireless fidelity (WiFi) modules, power supplies, Bluetooth modules, etc., which will not be described in detail here.

[0180] In the embodiments of this application, the entity executing each step can be the electronic device described above. In one embodiment, the entity executing each step is the operating system of the electronic device. The operating system can be Android, iOS, or other operating systems, and the embodiments of this application do not limit this.

[0181] The electronic device in the embodiments of this application may also be equipped with a display device. The display device can be any device capable of display functions, such as a cathode ray tube (CRT) display, light-emitting diode (LED) display, electronic ink screen, liquid crystal display (LCD), plasma display panel (PDP), etc. Users can use the display device on the electronic device 101 to view displayed text, images, videos, and other information. The electronic device may be a smartphone, tablet computer, gaming device, AR (Augmented Reality) device, automobile, data storage device, audio playback device, video playback device, laptop, desktop computing device, or a wearable device such as an electronic watch, electronic glasses, electronic helmet, electronic bracelet, electronic necklace, or electronic clothing.

[0182] In the electronic device shown in Figure 13, which can be a terminal, the processor 110 can be used to call the network optimization application stored in the memory 120 and specifically perform the following operations:

[0183] Obtain behavioral sequence samples, each of which is a behavioral feature sequence labeled with a real result, and the behavioral feature sequence is a sequence composed of multiple behavioral features arranged in time;

[0184] The behavioral sequence samples are input into the recommendation model to obtain the overall value vector;

[0185] Based on the pricing value vector and the overall value vector, the calibration value vector is obtained;

[0186] The calibration value vector and the behavior sequence sample are input into the recommendation model to obtain the prediction result;

[0187] Based on the behavioral characteristics, the prediction results, and the actual results, determine the loss function;

[0188] The recommendation model is updated with parameters based on the loss function until the loss function converges.
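The operations in [0183] to [0188] can be read as a single control flow: encode, calibrate, decode, compute the loss, update, and repeat until the loss stops changing. The following is a minimal sketch of that flow only. The one-parameter toy model, the halfway calibration rule, the squared-error loss, the finite-difference gradient, and all function names and numeric settings are assumptions made for illustration; the application does not specify any of them.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]  # (behavior feature sequence, labeled real result)


def encode(features: Sequence[float], w: float) -> List[float]:
    """Toy encoder (assumption): one overall-value entry per behavior feature."""
    return [w * x for x in features]


def calibrate(overall: Sequence[float], pricing: Sequence[float]) -> List[float]:
    """Toy calibration (assumption): move each entry halfway toward its pricing value."""
    return [0.5 * (o + p) for o, p in zip(overall, pricing)]


def decode(calibrated: Sequence[float], features: Sequence[float], w: float) -> float:
    """Toy decoder (assumption): prediction from the calibrated vector plus the features."""
    return sum(c + w * x for c, x in zip(calibrated, features))


def epoch_loss(w: float, samples: Sequence[Sample], pricing: Sequence[float]) -> float:
    """Squared error summed over all samples; a stand-in for the unspecified loss."""
    total = 0.0
    for features, real in samples:
        overall = encode(features, w)              # overall value vector
        calibrated = calibrate(overall, pricing)   # calibration value vector
        prediction = decode(calibrated, features, w)
        total += (prediction - real) ** 2
    return total


def train(
    samples: Sequence[Sample],
    pricing: Sequence[float],
    w: float = 0.0,
    lr: float = 0.01,
    tol: float = 1e-10,
    max_epochs: int = 5000,
    eps: float = 1e-6,
) -> Tuple[float, float]:
    """Update the single parameter w until the loss change falls below tol."""
    prev = float("inf")
    loss = epoch_loss(w, samples, pricing)
    for _ in range(max_epochs):
        if abs(prev - loss) < tol:  # convergence check on the loss
            break
        # Central finite difference stands in for backpropagation.
        grad = (epoch_loss(w + eps, samples, pricing)
                - epoch_loss(w - eps, samples, pricing)) / (2 * eps)
        w -= lr * grad
        prev, loss = loss, epoch_loss(w, samples, pricing)
    return w, loss


if __name__ == "__main__":
    data = [([1.0, 2.0], 10.0), ([2.0, 1.0], 10.0)]
    print(train(data, pricing=[1.0, 1.0]))
```

With this toy data the prediction reduces to 4.5·w + 1, so the loop drives w toward 2.0. A real model would replace the scalar parameter and the finite-difference step with its own parameters and optimizer.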

[0189] In one embodiment, the recommendation model includes an encoder and a decoder; when the processor 110 performs the operation of inputting the behavior sequence sample into the recommendation model to obtain the overall value vector, it specifically performs the following operations: inputting the behavior sequence sample into the encoder to obtain the overall value vector; when the processor 110 performs the operation of inputting the calibration value vector and the behavior sequence sample into the recommendation model to obtain the prediction result, it specifically performs the following operations: inputting the calibration value vector and the behavior sequence sample into the decoder to obtain the prediction result.

[0190] In one embodiment, the recommendation model includes multiple behavior towers, each behavior tower corresponding to a different type of behavior feature. When the processor 110 inputs the behavior sequence samples into the recommendation model to obtain the overall value vector, it specifically performs the following operations: inputting the behavior features into the recommendation model one by one in sequence; calling the corresponding behavior tower according to the type of each input behavior feature; inputting each behavior feature into the corresponding behavior tower to obtain the corresponding target value; and determining the overall value vector based on each target value.

[0191] In one embodiment, when the processor 110 performs the operation of inputting each behavioral feature into the corresponding behavioral tower to obtain the corresponding target value, it specifically performs the following operations: inputting each behavioral feature into the corresponding behavioral tower to obtain the corresponding behavioral value and behavioral probability; and determining the corresponding target value based on the behavioral value and behavioral probability of each behavioral feature.

[0192] In one embodiment, the behavior towers are arranged in sequence. When the processor 110 executes the operation of inputting each behavior feature into the corresponding behavior tower to obtain the corresponding behavior value and behavior probability, it specifically performs the following operations: inputting the behavior feature and preset probability into the corresponding behavior tower to obtain the corresponding behavior value and behavior probability; using the behavior probability as the preset probability input for the next behavior tower, until all behavior towers have obtained behavior value and behavior probability.
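The tower mechanism in [0190] to [0192] can be sketched as follows: each behavior feature is routed to the tower for its type, each tower returns a behavior value and a behavior probability, and that probability becomes the preset probability of the next tower. This is only an illustrative sketch. The `(type, value)` feature encoding, the linear toy towers built by `make_tower`, the initial preset probability of 1.0, and the choice of target value = behavior value × behavior probability are all assumptions; the application only states that the target value is determined "based on" the two quantities.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Feature = Tuple[str, float]  # (behavior type, feature value); hypothetical encoding
# A tower maps (feature value, preset probability) -> (behavior value, behavior probability).
Tower = Callable[[float, float], Tuple[float, float]]


def make_tower(value_weight: float, rate: float) -> Tower:
    """Build a toy tower (assumption): linear value, probability scaled by `rate`."""
    def tower(feature: float, preset_prob: float) -> Tuple[float, float]:
        behavior_value = value_weight * feature
        behavior_prob = preset_prob * rate  # conditioned on the previous tower
        return behavior_value, behavior_prob
    return tower


def overall_value_vector(
    sequence: Sequence[Feature],
    towers: Dict[str, Tower],
    initial_prob: float = 1.0,
) -> List[float]:
    """Feed features one by one through their towers, chaining the probabilities."""
    preset_prob = initial_prob
    targets: List[float] = []
    for behavior_type, feature in sequence:
        tower = towers[behavior_type]           # call the tower matching the feature type
        value, prob = tower(feature, preset_prob)
        targets.append(value * prob)            # assumption: target = value * probability
        preset_prob = prob                      # becomes the next tower's preset probability
    return targets


if __name__ == "__main__":
    towers = {"click": make_tower(1.0, 0.5), "purchase": make_tower(10.0, 0.2)}
    print(overall_value_vector([("click", 2.0), ("purchase", 3.0)], towers))
```

Chaining the probabilities this way means a later behavior (for example, a purchase) is weighted by the likelihood of the earlier behaviors that lead to it.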

[0193] In one embodiment, when the processor 110 executes the operation of obtaining a calibration value vector based on the pricing value vector and the overall value vector, it specifically performs the following operations: determining the pricing value vector using a pricing calibration algorithm; calibrating the overall value vector based on the pricing value vector to obtain the calibration value vector.

[0194] In one embodiment, when the processor 110 executes the pricing value vector determination using a pricing calibration algorithm, it specifically performs the following operations: determining the pricing value corresponding to each of the behavioral features; and determining the pricing value vector based on the pricing value corresponding to each of the behavioral features.

[0195] In one embodiment, each dimension of the overall value vector corresponds to a target value, and each dimension of the pricing value vector corresponds to a pricing value corresponding to a behavioral feature. When the processor 110 calibrates the overall value vector according to the pricing value vector to obtain a calibration value vector, it specifically performs the following operations: determining the degree of difference between the pricing value vector and the overall value vector; if the degree of difference is greater than a predetermined degree of difference threshold, adjusting the overall value vector to reduce the difference between the target value corresponding to at least one behavioral feature and the pricing value corresponding to that behavioral feature, thereby obtaining the calibration value vector.
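The threshold-gated calibration in [0195] can be illustrated with a short sketch. The application does not define how the degree of difference is measured or how the overall value vector is adjusted, so the Euclidean distance used here, the clamping rule, and the parameter names `diff_threshold` and `max_gap` are assumptions for illustration only.

```python
import math
from typing import List, Sequence


def calibrate(
    overall: Sequence[float],
    pricing: Sequence[float],
    diff_threshold: float,
    max_gap: float,
) -> List[float]:
    """Return a calibration value vector from the overall and pricing value vectors.

    Assumptions: the degree of difference is the Euclidean distance, and the
    adjustment clamps each target value to within `max_gap` of its pricing value.
    """
    if len(overall) != len(pricing):
        raise ValueError("vectors must have one entry per behavior feature")

    degree = math.sqrt(sum((o - p) ** 2 for o, p in zip(overall, pricing)))
    if degree <= diff_threshold:
        return list(overall)  # close enough: no adjustment needed

    # Pull each target value into [p - max_gap, p + max_gap].
    return [min(max(o, p - max_gap), p + max_gap) for o, p in zip(overall, pricing)]


if __name__ == "__main__":
    print(calibrate([1.0, 5.0, 2.0], [1.0, 2.0, 2.5], diff_threshold=1.0, max_gap=0.5))
```

Entries that already lie within `max_gap` of their pricing value are left unchanged, so only the dimensions that drifted are corrected.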

[0196] In one embodiment, the processor 110 may also execute the model application steps, namely, the steps of executing the recommendation method. When executing the recommendation method, the following operations are specifically performed: obtaining the behavioral feature sequence to be recommended; inputting the behavioral feature sequence to be recommended into the recommendation model to obtain the prediction result, wherein the recommendation model is the recommendation model described above.

[0197] In one embodiment, the model includes multiple behavior towers, each behavior tower corresponding to a different type of behavior feature. When the processor 110 executes the operation of inputting a sequence of behavior features into the recommendation model to obtain a prediction result, it specifically performs the following operations: inputting the behavior features into the recommendation model one by one in sequence; calling the corresponding behavior tower according to the type of each input behavior feature; and inputting each behavior feature into the corresponding behavior tower to obtain a prediction result.

[0198] In one embodiment, when the processor 110 performs the operation of inputting each of the behavioral features into the corresponding behavioral tower to obtain a prediction result, it specifically performs the following operations: inputting each of the behavioral features into the corresponding behavioral tower to obtain the corresponding behavioral probability; and normalizing each behavioral probability to obtain a prediction result.
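The normalization step in [0198] is not specified further. One simple reading, assumed here purely for illustration, is to divide each behavior probability by the sum of all of them so that the resulting prediction entries sum to 1; the function name `normalize` is hypothetical.

```python
from typing import List, Sequence


def normalize(behavior_probs: Sequence[float]) -> List[float]:
    """Scale behavior probabilities so they sum to 1 (assumed normalization)."""
    total = sum(behavior_probs)
    if total <= 0:
        raise ValueError("behavior probabilities must have a positive sum")
    return [p / total for p in behavior_probs]


if __name__ == "__main__":
    print(normalize([0.25, 0.75, 1.0]))
```

Other normalizations, such as a softmax over tower outputs, would fit the same description equally well.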

[0199] In the embodiments of this application, an overall value vector is obtained by arranging the user's behavioral features in sequence and inputting them into the recommendation model. This overall value vector is then calibrated based on the pricing value vector to obtain a calibrated value vector, so that sample bias is corrected and calibrated in a timely manner. The calibrated value vector, together with the same user behavioral features, is then input into the recommendation model again to obtain a prediction result; the calibrated value vector helps the recommendation model produce a more accurate prediction. A loss function is determined based on the behavioral features, the prediction result, and the real result, and the parameters of the recommendation model are updated based on this loss function until it converges. Once trained, the recommendation model produces more accurate recommendations than traditional methods, solving the problem of inaccurate recommendations caused by the inability to correct and calibrate sample bias in a timely manner.

[0200] Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The program can be stored in a computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. The storage medium can be a magnetic disk, optical disk, read-only memory, or random access memory, etc.

[0201] It should be noted that the information (including but not limited to user device information, user personal information, etc.), data (including but not limited to data used for analysis, stored data, displayed data, etc.), and signals involved in the embodiments of this application are all authorized by the user or fully authorized by all parties, and the collection, use, and processing of related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the object characteristics, interactive behavior characteristics, and user information involved in the embodiments of this application are all obtained under full authorization.

[0202] The above-disclosed embodiments are merely preferred embodiments of this application and should not be construed as limiting the scope of the claims of this application. Therefore, any equivalent variations made in accordance with the claims of this application shall still fall within the scope of the embodiments of this application.

Claims

1. A method for training a recommendation model, the method comprising: Obtain behavioral sequence samples, each of which is a behavioral feature sequence labeled with a real result, and the behavioral feature sequence is a sequence composed of multiple behavioral features arranged in time; Input the behavioral sequence samples into the recommendation model to obtain the overall value vector; Based on the pricing value vector and the overall value vector, the calibration value vector is obtained; The calibration value vector and the behavior sequence sample are input into the recommendation model to obtain the prediction result; Based on the behavioral characteristics, the prediction results, and the actual results, determine the loss function; The recommendation model is updated based on the loss function until the loss function converges. In this context, each dimension of the overall value vector corresponds to a target value, and each dimension of the pricing value vector corresponds to a pricing value corresponding to a behavioral feature. The degree of difference between the pricing value vector and the overall value vector is determined. If the degree of difference is greater than a predetermined degree of difference threshold, the overall value vector is adjusted so that the difference between the target value corresponding to at least one behavioral feature and the pricing value corresponding to that behavioral feature is less than a set threshold, thereby obtaining a calibration value vector.

2. The training method for the recommendation model as described in claim 1, wherein the recommendation model includes an encoder and a decoder; The step of inputting the behavioral sequence samples into the recommendation model to obtain the overall value vector specifically includes: The behavioral sequence samples are input into the encoder to obtain the overall value vector; The step of inputting the calibration value vector and the behavior sequence sample into the recommendation model to obtain the prediction result specifically includes: The calibration value vector and the behavior sequence sample are input into the decoder to obtain the prediction result.

3. The training method for the recommendation model as described in claim 1 or 2, wherein the recommendation model comprises multiple behavior towers, each behavior tower corresponding to a different type of behavioral feature, and the step of inputting the behavior sequence samples into the recommendation model to obtain the overall value vector specifically includes: The behavioral features are input into the recommendation model one by one in sequence; For each input behavioral feature, the corresponding behavior tower is invoked based on the type of the behavioral feature; Each behavioral feature is input into the corresponding behavior tower to obtain the corresponding target value; The overall value vector is determined based on each target value.

4. The training method for the recommendation model as described in claim 3, wherein inputting each behavioral feature into the corresponding behavior tower to obtain the corresponding target value specifically includes: Each behavioral feature is input into the corresponding behavior tower to obtain the corresponding behavioral value and behavioral probability; The corresponding target value is determined based on the behavioral value and behavioral probability of each behavioral feature.

5. The training method for the recommendation model as described in claim 4, wherein the behavior towers are arranged in sequence, and the step of inputting each behavioral feature into the corresponding behavior tower to obtain the corresponding behavioral value and behavioral probability specifically includes: The behavioral feature and a preset probability are input into the corresponding behavior tower to obtain the corresponding behavioral value and behavioral probability; The behavioral probability is used as the preset probability input for the next behavior tower until all behavior towers have obtained a behavioral value and a behavioral probability.

6. The training method for the recommendation model as described in claim 1, wherein obtaining the calibration value vector based on the pricing value vector and the overall value vector specifically includes: The pricing value vector is determined using a pricing calibration algorithm; The overall value vector is calibrated based on the pricing value vector to obtain the calibrated value vector.

7. The training method for the recommendation model as described in claim 6, wherein determining the pricing value vector using a pricing calibration algorithm specifically includes: Determine the pricing value corresponding to each of the aforementioned behavioral characteristics; The pricing value vector is determined based on the pricing value corresponding to each of the aforementioned behavioral characteristics.

8. A recommendation method, the recommendation method comprising: Obtain the behavioral feature sequence to be recommended; The behavioral feature sequence to be recommended is input into the recommendation model to obtain the prediction result. The recommendation model is trained by the training method described in any one of claims 1 to 7.

9. The recommendation method as described in claim 8, wherein the model comprises multiple behavior towers, each behavior tower corresponding to different types of behavioral features, and the step of inputting the behavioral feature sequence into the recommendation model to obtain the prediction result specifically includes: The behavioral features are input into the recommendation model one by one in sequence; For each input behavioral feature, the corresponding behavioral tower is invoked based on the type of the behavioral feature; Each behavioral feature is input into the corresponding behavioral tower to obtain the prediction result.

10. The recommendation method as described in claim 9, wherein inputting each behavioral feature into the corresponding behavioral tower to obtain a prediction result specifically includes: Each behavioral feature is input into the corresponding behavioral tower to obtain the corresponding behavioral probability; The probability of each behavior is normalized to obtain the prediction result.

11. A training apparatus for a recommendation model, the training apparatus for the recommendation model comprising: The acquisition module is used to acquire behavioral sequence samples, each of which is a behavioral feature sequence labeled with a real result, and the behavioral feature sequence is a sequence composed of multiple behavioral features arranged in time. The encoding module is used to input the behavior sequence samples into the recommendation model to obtain the overall value vector; The calibration module is used to obtain a calibration value vector based on the pricing value vector and the overall value vector; The decoding module is used to input the calibration value vector and the behavioral feature sequence into the recommendation model to obtain the prediction result; The loss module is used to determine the loss function based on the behavioral features, the prediction results, and the actual results; The update module is used to update the parameters of the recommendation model based on the loss function until the loss function converges. In this system, each dimension of the overall value vector corresponds to a target value, and each dimension of the pricing value vector corresponds to a pricing value corresponding to a behavioral feature. The calibration module specifically includes: a difference degree unit, used to determine the difference degree between the pricing value vector and the overall value vector; and a difference adjustment unit, used to adjust the overall value vector if the difference degree is greater than a predetermined difference degree threshold, so that the difference between the target value corresponding to at least one behavioral feature and the pricing value corresponding to that behavioral feature is less than a set threshold, thereby obtaining a calibrated value vector.

12. A recommendation device, the recommendation device comprising: The acquisition module is used to acquire the sequence of behavioral features to be recommended; The input module is used to input the behavioral feature sequence to be recommended into the recommendation model to obtain the prediction result, wherein the recommendation model is trained by the training method of any one of claims 1 to 8.

13. A computer storage medium storing a plurality of instructions adapted for loading by a processor and executing the method steps of any one of claims 1 to 10.

14. A computer program product storing at least one instruction, said at least one instruction being loaded by a processor and executing the method steps of any one of claims 1 to 10.

15. An electronic device comprising: A processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor to execute the method steps as claimed in any one of claims 1 to 10.