A method and apparatus for information push
By combining feature extraction and fusion of online and offline user data, and using a multi-task learning network to predict user conversion rates, the problem of insufficient prediction accuracy in existing technologies is solved, achieving higher accuracy in information push and user conversion rates.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Applications (China)
- Current Assignee / Owner: BEIJING JINGDONG YUANSHENG TECH CO LTD
- Filing Date: 2024-12-17
- Publication Date: 2026-06-19
AI Technical Summary
In existing technologies, methods that rely solely on mining user order data to predict user conversion rates generally have limited accuracy and are subject to certain errors.
By combining online and offline user data, feature extraction is performed to obtain first and second feature vectors. These features are then fused using a multi-task learning network to predict the probability of a user performing each task, and information is pushed based on this probability.
This improves the accuracy of information push and enhances user conversion rates and user experience.
Figure CN122240909A_ABST
Description
Technical Field
[0001] This invention relates to the field of computer technology, and in particular to a method and apparatus for information push.
Background Technology
[0002] With the development of internet technology, many other business sectors have adopted it to achieve online business growth. Internet advertising is a prime example of the integration of internet technology and advertising, enabling numerous companies to acquire new users and improve conversion rates. Building on this, online logistics advertising represents a new business model that combines traditional logistics with internet advertising technology. By displaying logistics ads to target users on a recommendation system platform with a large user base, users are guided to complete online logistics service forms. A courier then picks up the package and delivers it to the recipient within a specified timeframe using the logistics network, thus converting the user into a customer. To improve the effectiveness of advertising, it is often necessary to predict user conversion rates.
[0003] To predict conversion rates, user order data is typically mined and traditional machine learning is applied. However, methods that rely solely on order data draw on a limited data set, so their predictions generally have limited accuracy and are prone to error.
Summary of the Invention
[0004] In view of this, embodiments of the present invention provide a method and apparatus for information push, which can predict the user conversion rate of multiple tasks and push information based on the prediction results to improve user experience.
[0005] To achieve the above objectives, according to one aspect of the present invention, an information push method is provided, comprising:
[0006] Feature extraction is performed on user online data to obtain the first feature vector;
[0007] User interest and preference features are extracted from offline user behavior data to obtain the second feature vector;
[0008] Feature fusion is performed based on the first and second feature vectors to obtain fused features;
[0009] The fused features are input into the multi-task learning network to obtain the execution probability of each user for each task, and information is pushed based on the execution probability of each user for each task. The multi-task learning network is obtained by joint training of multiple tasks based on historical user online data and historical user offline behavior data. The multiple tasks are generated based on multiple ordered behavioral sequences.
[0010] Optionally, user interest and preference features are extracted from user offline behavior data to obtain a second feature vector, including:
[0011] Based on the execution order of user offline behavior, the user offline behavior data is sorted to obtain the offline behavior sequence. The user offline behavior data includes user identifier, address information of user offline behavior, and distance between the address information of user offline behavior and preset address information.
[0012] The second feature vector is obtained by extracting user interest and preference features based on offline behavior sequences.
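The ordering step above can be sketched as follows. This is a minimal illustration only, assuming each offline record carries a timestamp that reflects execution order; the class name `OfflineBehavior`, its field names, and the function `to_sequence` are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OfflineBehavior:
    """One offline behavior record (field names are illustrative assumptions)."""
    user_id: str        # user identifier
    area_id: str        # address/area information of the offline behavior
    distance_km: float  # distance between this address and the preset address
    timestamp: int      # assumed to reflect the execution order


def to_sequence(records: List[OfflineBehavior], user_id: str) -> List[OfflineBehavior]:
    """Return one user's offline behaviors sorted by execution order."""
    own = [r for r in records if r.user_id == user_id]
    return sorted(own, key=lambda r: r.timestamp)


records = [
    OfflineBehavior("u1", "area_7", 12.5, 300),
    OfflineBehavior("u2", "area_9", 3.0, 150),
    OfflineBehavior("u1", "area_2321", 0.0, 100),
]
sequence = to_sequence(records, "u1")
```

The resulting list is what a sequence model would consume; other users' records are filtered out before sorting.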
[0013] Optionally, before sorting the user's offline behavior data based on the execution order of the user's offline behavior to obtain the offline behavior sequence, the method further includes:
[0014] Based on the address information in the user's historical offline behavior data, the address information with the highest frequency is determined as the preset address information.
[0015] Optionally, multi-task joint training includes:
[0016] Feature extraction is performed on historical user online data to obtain the historical first feature vector;
[0017] Feature extraction is performed on historical user offline behavior data to obtain the historical second feature vector;
[0018] Based on the historical first feature vector and the historical second feature vector, feature fusion is performed to obtain historical fused features;
[0019] Multi-task joint training of a neural network is performed based on the historical fused features to obtain the multi-task learning network.
[0020] Optionally, the online user data includes user profile data, user order data, and user online behavior data. Feature extraction is performed on the online user data to obtain a first feature vector, including:
[0021] Data preprocessing and feature extraction are performed on user profile data, user order data, and user online behavior data;
[0022] The extracted features are encoded, and the high-dimensional feature vector is transformed into a low-dimensional feature vector through embedding processing to obtain the first feature vector.
[0023] Optionally, before extracting features from user online data to obtain the first feature vector, the method further includes:
[0024] Encode the user identity identifiers in the user profile data;
[0025] Perform interval mapping on user attribute information in the user profile data.
[0026] Optionally, before extracting features from user offline behavior data to obtain the second feature vector, the method further includes:
[0027] Based on a preset coordinate system, the address information in the user's offline behavior data is mapped.
[0028] According to another aspect of the present invention, an information push device is provided, comprising:
[0029] The feature extraction module is used to extract features from online user data to obtain a first feature vector; and to extract user interest and preference features from offline user behavior data to obtain a second feature vector.
[0030] The feature fusion module is used to fuse features based on the first feature vector and the second feature vector to obtain fused features;
[0031] The input module is used to input the fused features into the multi-task learning network to obtain the execution probability of the user for each task, and to push information based on the execution probability of the user for each task. The multi-task learning network is obtained by joint training of multiple tasks based on historical user online data and historical user offline behavior data. The multiple tasks are generated based on multiple ordered behavioral sequences.
[0032] According to another aspect of the present invention, an electronic device is provided, comprising: one or more processors; and a storage device for storing one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors implement the information push method provided in the embodiments of the present invention.
[0033] According to another aspect of the present invention, a computer-readable medium is provided having a computer program stored thereon, which, when executed by a processor, implements the information push method provided in the embodiments of the present invention.
[0034] According to another aspect of the present invention, a computer program product is provided, including a computer program that, when executed by a processor, implements the information push method provided in the embodiments of the present invention.
[0035] One embodiment of the above invention has the following advantages or beneficial effects: it can extract features based on user online data and user offline behavior data respectively, fuse the extracted features, input the obtained fused features into a multi-task learning network, obtain the output user prediction probability for multiple tasks, and push information based on the prediction probability, which can improve the accuracy of information push, help improve user conversion rate, and improve user experience.
[0036] Further effects of the above optional implementations are described below in conjunction with specific embodiments.
Description of the Drawings
[0037] The accompanying drawings are provided to better understand the invention and are not intended to unduly limit the scope of the invention. Wherein:
[0038] Figure 1 is a schematic diagram illustrating the main steps of an information push method according to an embodiment of the present invention;
[0039] Figure 2 is a schematic diagram of the main modules of an information push device according to an embodiment of the present invention;
[0040] Figure 3 is an exemplary system architecture diagram in which embodiments of the present invention can be applied;
[0041] Figure 4 is a schematic diagram of the structure of a computer system suitable for implementing terminal devices or servers of the present invention.
Detailed Implementation
[0042] The following description, in conjunction with the accompanying drawings, illustrates exemplary embodiments of the present invention, including various details to aid understanding. These details should be considered merely exemplary. Therefore, those skilled in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the invention. Similarly, for clarity and brevity, descriptions of well-known functions and structures are omitted in the following description.
[0043] It should be noted that the technical solutions disclosed in this invention, regarding the collection, updating, analysis, processing, use, transmission, and storage of user personal information, all comply with relevant laws and regulations, are used for legitimate purposes, and do not violate public order and good morals. Necessary measures are taken to prevent unauthorized access to user personal information data and to safeguard user personal information security, network security, and national security.
[0044] It should be noted that the collection, use, storage, sharing and transfer of user personal information involved in the technical solution of the present invention all comply with the provisions of relevant laws and regulations, and require notification to users and obtaining their consent or authorization. When applicable, user personal information is subjected to de-identification and / or anonymization and / or encryption technical processing.
[0045] Figure 1 is a schematic diagram illustrating the main steps of an information push method according to an embodiment of the present invention. As shown in Figure 1, the method includes:
[0046] Step S101: Extract features from the user's online data to obtain the first feature vector.
[0047] Acquire user online data, including user profile data, user order data, and user online behavior data. Extract features from the user's online data through a feature extraction network to obtain the first feature vector.
[0048] The user profile data includes user IDs and user attributes such as age, gender, and occupation. User order data is statistical data on orders placed over a period of time, including the number of orders, the number of receiving cities, the number of receiving regions, and the number of complaints received. User online behavior data includes category IDs, brand IDs, category group IDs, and product IDs, which can be described by statistically analyzing historical log data or through models. Specifically, statistics can be used to track the brand IDs purchased by users over a period of time, while training a scoring model can score whether a user would purchase a particular brand's products, thus obtaining a characteristic description of the user's perception of the brand's value.
[0049] Step S102: Extract user interest and preference features from user offline behavior data to obtain the second feature vector.
[0050] Acquire user offline behavior data, including delivery area information and timestamp information, specifically the timestamp of the moment the user receives the package. Extract user interest and preference features from the user offline behavior data to obtain a second feature vector.
[0051] User offline behavior data essentially records the user's interactions with multiple delivery areas, which imply the user's logistics service preferences: the more areas a user's offline behavior involves, the more likely the user is to use logistics services; when goods need to move between areas, users are more inclined to use logistics services if the geographical distance between the areas is large; and regional characteristics reflect the user group in the region, which provides supplementary information for analyzing users' logistics service interests and preferences.
[0052] Step S103: Perform feature fusion based on the first feature vector and the second feature vector to obtain fused features.
[0053] The extracted first and second feature vectors are fused to obtain the fused feature, which is a comprehensive description of the user's online and offline behavior data.
[0054] Step S104: Input the fused features into the multi-task learning network to obtain the execution probability of the user for each task, and push information based on the execution probability of the user for each task. The multi-task learning network is obtained by joint training of multiple tasks based on historical user online data and historical user offline behavior data. The multiple tasks are generated based on multiple ordered behavior sequences.
[0055] The obtained fused features are input into a pre-trained multi-task learning network to obtain the output execution probability (conversion rate) of the user for each task in the multi-task network. Furthermore, based on the output execution probabilities, the user's task execution based on the pushed information can be analyzed, and the pushed information can be adjusted. For example, if the user's conversion rate for the current pushed information is low, it can be replaced with other pushed information.
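The adjustment described above (replacing pushed information whose predicted conversion rate is low) could look roughly like the sketch below. The task names, the threshold value, and the function `choose_push` are assumptions made for illustration; the source does not specify a selection rule.

```python
from typing import Dict

# Hypothetical task names following the ordered behavior sequence in the text.
TASKS = ("click", "fill_info", "submit_request", "convert")


def choose_push(predictions: Dict[str, Dict[str, float]],
                current: str,
                task: str = "convert",
                threshold: float = 0.05) -> str:
    """Keep the current push item unless its predicted probability for `task`
    is below `threshold`; otherwise switch to the highest-scoring candidate."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    if predictions[current][task] >= threshold:
        return current
    return max(predictions, key=lambda item: predictions[item][task])


# Predicted execution probabilities per candidate push item (made-up numbers).
predictions = {
    "ad_a": {"click": 0.20, "fill_info": 0.08, "submit_request": 0.03, "convert": 0.01},
    "ad_b": {"click": 0.15, "fill_info": 0.10, "submit_request": 0.09, "convert": 0.08},
}
selected = choose_push(predictions, current="ad_a")
```

Here `ad_a` falls below the assumed threshold for the conversion task, so the sketch switches to `ad_b`.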
[0056] Here, the multi-task learning network is obtained by jointly training multiple tasks on user data, including online data and offline behavior data, and it outputs the predicted conversion rate of a user for every task. "Multi-task" refers to an ordered sequence of user behaviors comprising five tasks: exposure, click, filling in information, submitting a request, and successful collection. The corresponding predicted execution probabilities are the probabilities that the user clicks, fills in information, submits a request, and converts successfully.
[0057] The information push method provided by the present invention can extract features based on user online data and user offline behavior data respectively, fuse the extracted features, input the obtained fused features into a multi-task learning network, obtain the output user prediction probability for multiple tasks, and push information based on the prediction probability, which can improve the accuracy of information push, help improve user conversion rate, and improve user experience.
[0058] Optionally, extracting user interest and preference features from the user's offline behavior data to obtain a second feature vector includes: sorting the user's offline behavior data based on the execution order of the user's offline behavior to obtain an offline behavior sequence, where the user's offline behavior data includes the user identifier, the address information of the user's offline behavior, and the distance between that address information and preset address information; and extracting user interest and preference features based on the offline behavior sequence to obtain the second feature vector. When extracting user interest and preference features, the execution order of the user's offline behavior needs to be considered, so the user's offline behaviors a are arranged in order of execution to obtain the offline behavior sequence. Each offline behavior a can be further expressed by three elements: the user; the area in which the user engaged in the offline behavior; and the geographical distance between the area of the current offline behavior and the area that appears most frequently in the user's history. Sequence information better reflects changes in user interests and the frequency of interaction with different regions, so the interest preference extraction network takes the user's historical interaction behavior sequence as input to capture changes in user interests, which helps to obtain more accurate prediction results later. During feature extraction, the features of a and r are first embedded (dimensionality reduction):
[0059]
[0060]
[0061] Here, the quantities in the formulas above are, in order: the offline behavior feature of the i-th user after dimensionality reduction; the offline behavior feature of the i-th user before dimensionality reduction; the embedding-layer weights for user offline behavior features; the region feature of the offline behavior of the u-th user after dimensionality reduction; the region feature of the offline behavior of the u-th user before dimensionality reduction; and the embedding-layer weights for the region features of user offline behavior.
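As a hedged sketch only, the embedding step defined in paragraph [0061] reads as a linear projection by embedding-layer weights, which could be written as below. The symbol names ($a_i$, $r_u$, $W_a$, $W_r$, $e^{a}_{i}$, $e^{r}_{u}$) are assumptions chosen for illustration; the source formulas may use different notation or include additional terms.

```latex
% Assumed notation: a_i / r_u are the features before dimensionality reduction,
% W_a / W_r the embedding-layer weights, e^a_i / e^r_u the reduced features.
e^{a}_{i} = W_{a}\, a_{i}, \qquad e^{r}_{u} = W_{r}\, r_{u}
```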
[0062] The obtained behavioral sequence features are then input into a recurrent neural network composed of multiple GRUs (Gated Recurrent Units) to obtain the output hidden state:
[0063]
[0064] Finally, the results of the two preceding steps are input into the fully connected layer to obtain the second feature vector. That is, the user interest preference vector is computed by applying the weights and bias of the fully connected layer to the sequence results, with tanh as the activation function:
[0065]
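The GRU and fully connected steps can be illustrated with the small pure-Python sketch below. It uses one common GRU formulation (update gate z, reset gate r, candidate state n, new state (1 - z) * n + z * h) with gate biases omitted, random placeholder weights, and hypothetical dimensions; feeding only the final hidden state into the fully connected layer is also an assumption. It is not the network described in the source, only an instance of the same kind of computation.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Matrix-vector product for plain Python lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyGRU:
    """Minimal GRU with random placeholder weights (gate biases omitted)."""

    def __init__(self, input_dim: int, hidden_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, n = candidate state.
        self.W = {g: rand_matrix(hidden_dim, input_dim, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrn"}

    def step(self, x: Vector, h: Vector) -> Vector:
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]  # reset gate applied to the old state
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def run(self, sequence: List[Vector]) -> Vector:
        """Process the embedded behavior sequence and return the last hidden state."""
        h = [0.0] * self.hidden_dim
        for x in sequence:
            h = self.step(x, h)
        return h


def interest_vector(h: Vector, w_fc: Matrix, b_fc: Vector) -> Vector:
    """Fully connected layer with tanh activation, as named in the text."""
    return [math.tanh(s + b) for s, b in zip(matvec(w_fc, h), b_fc)]


gru = TinyGRU(input_dim=3, hidden_dim=4)
hidden = gru.run([[0.1, 0.2, 0.3], [0.0, 1.0, 0.5]])
second_feature = interest_vector(hidden, [[0.1] * 4, [-0.2] * 4], [0.0, 0.1])
```

In practice a deep learning framework would supply the GRU layer and learn the weights; the sketch only makes the data flow from sequence to interest vector concrete.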
[0066] Optionally, before sorting the user's offline behavior data based on the execution order of the user's offline behavior to obtain the offline behavior sequence, the method further includes: determining the most frequently occurring address information as preset address information based on the address information in the user's historical offline behavior data. By statistically analyzing the addresses in the user's historical offline behavior data, the most frequently occurring address information is determined and used as the preset address information.
[0067] Alternatively, the address information that appears most frequently within a period of time close to the current time can be selected as the preset address information.
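Both ways of choosing the preset address (most frequent overall, or most frequent within a recent window) can be sketched with a frequency count. The record layout `(address, unix_timestamp)`, the function name `preset_address`, and the `since` cutoff parameter are illustrative assumptions.

```python
from collections import Counter
from typing import List, Optional, Tuple

Record = Tuple[str, int]  # (address, unix_timestamp); layout is an assumption


def preset_address(records: List[Record], since: Optional[int] = None) -> Optional[str]:
    """Return the most frequent address, optionally counting only records
    whose timestamp is at or after `since`. Returns None if nothing matches."""
    addresses = [addr for addr, ts in records if since is None or ts >= since]
    if not addresses:
        return None
    return Counter(addresses).most_common(1)[0][0]


history = [
    ("area_2321", 100),
    ("area_2321", 200),
    ("area_2321", 300),
    ("area_7", 900),
    ("area_7", 950),
]
overall = preset_address(history)            # most frequent over all history
recent = preset_address(history, since=800)  # most frequent in the recent window
```

The two results can differ, which is why the recent-window variant may better track a user who has moved.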
[0068] Optionally, multi-task joint training includes: extracting features from historical user online data to obtain a historical first feature vector; extracting features from historical user offline behavior data to obtain a historical second feature vector; fusing features based on the historical first feature vector and the historical second feature vector to obtain historical fused features; and performing multi-task joint training on the neural network based on the historical fused features to obtain a multi-task learning network.
[0069] The model treats the multiple sequential steps in the user conversion process as multiple tasks and trains all tasks jointly, ultimately outputting prediction results for all of them. Specifically, the model mainly includes an input and embedding layer, a progressive network layer, a conditional attention layer, and an output layer.
[0070] The input and embedding layer primarily processes the raw data. First, user profile data, online behavior data, and user order data are preprocessed, relevant features are extracted, and the features are one-hot encoded. Next, the high-dimensional sparse vectors obtained from one-hot encoding are transformed into low-dimensional dense vectors by the embedding layer. Finally, these dense vectors are fused with the user interest preference representation vector extracted from the user's offline behavior sequence:
[0071]
[0072] Where e represents the result of feature extraction and dimensionality reduction from online data, and I represents the result of feature extraction and dimensionality reduction from offline data.
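One simple way to fuse the two results is concatenation, sketched below. Concatenation is an assumption made here for illustration; the source does not state which fusion operator is used, and the function name `fuse` is hypothetical.

```python
from typing import List


def fuse(e: List[float], interest: List[float]) -> List[float]:
    """Fuse the online feature vector `e` and the offline interest vector
    by concatenation (an assumed fusion operator)."""
    return list(e) + list(interest)


online_vec = [0.12, -0.40, 0.33]   # stands in for e (online data)
offline_vec = [0.05, 0.91]         # stands in for I (offline data)
fused = fuse(online_vec, offline_vec)
```

The fused vector keeps both parts intact, so later layers can weight online and offline signals independently.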
[0073] The purpose of progressive network layers in multi-task learning is to perform early predictions for each specific task in shallow networks, based on the bottom shared representation of the input and the output of the embedding layer. The output is an early prediction vector representation for that specific task. Since the early prediction results contain some information about the final prediction results for that specific task, they can provide supplementary information for subsequent steps to extract the vector representation for that specific task.
[0074]
[0075]
[0076] Here, the result of the formulas above is the early prediction vector, from which early predicted values of the execution probability of each task can be calculated.
[0077] The purpose of Conditional Attention Networks is to adaptively extract task-specific vector representations from the early predicted vector representations of the progressive network layer outputs, drawing on bottom-shared information and positive feedback from previous steps, and providing positive feedback for subsequent steps. Specifically, it performs an initial prediction based on the fused features of the input and adjusts the model's parameters for optimization based on the initial prediction results and labeled data. The inputs to a Conditional Attention Network are the early predictions from the progressive network layer and the bottom-shared representation of the inputs and the outputs of the Embedding layer.
[0078]
[0079]
[0080]
[0081]
[0082] The above describes the computation process of the conditional attention mechanism. The idea is to obtain an attention map through mapping and masking, and then multiply it with the early predicted conditional information to obtain the conditional attention map, so as to output the final result.
[0083] The output layer primarily makes predictions based on the task-specific vector representations extracted by the conditional attention module, and outputs the final prediction results for all tasks through the corresponding activation functions:
[0084]
[0085] Optionally, the online user data includes user profile data, user order data, and user online behavior data. Feature extraction is performed on the online user data to obtain a first feature vector. This includes: preprocessing the user profile data, user order data, and user online behavior data, and then extracting features; encoding the extracted features, and transforming the high-dimensional feature vector into a low-dimensional feature vector through embedding processing to obtain the first feature vector. Before feature extraction, data preprocessing is required, including data cleaning. Feature extraction is then performed on the processed data, followed by one-hot encoding. The encoded high-dimensional sparse vector is then transformed into a low-dimensional dense vector through an embedding layer to obtain the final first feature vector.
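The one-hot encoding and embedding steps can be sketched as follows. The brand IDs are made up, the embedding table holds random placeholder values rather than learned weights, and the function names are hypothetical; the embedding size of 6 is borrowed from the application example later in the description.

```python
import random
from typing import Dict, List, Sequence


def build_vocab(values: Sequence[str]) -> Dict[str, int]:
    """Assign each distinct categorical value an index in order of first appearance."""
    vocab: Dict[str, int] = {}
    for value in values:
        vocab.setdefault(value, len(vocab))
    return vocab


def one_hot(value: str, vocab: Dict[str, int]) -> List[float]:
    """High-dimensional sparse vector with a single 1.0 at the value's index."""
    vec = [0.0] * len(vocab)
    vec[vocab[value]] = 1.0
    return vec


def embed(one_hot_vec: List[float], table: List[List[float]]) -> List[float]:
    """Multiply a one-hot vector by the embedding table (vocab_size x dim).
    For a one-hot input this is the same as selecting one row of the table."""
    dim = len(table[0])
    return [sum(x * row[j] for x, row in zip(one_hot_vec, table)) for j in range(dim)]


rng = random.Random(0)
brands = ["brand_17", "brand_42", "brand_17", "brand_99"]  # hypothetical brand IDs
vocab = build_vocab(brands)
table = [[rng.uniform(-0.1, 0.1) for _ in range(6)] for _ in vocab]  # embedding size 6
sparse = one_hot("brand_42", vocab)
dense = embed(sparse, table)
```

With a real vocabulary of many thousands of IDs, the sparse vector is mostly zeros while the dense vector stays at the fixed embedding size.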
[0086] Optionally, before extracting features from online user data to obtain the first feature vector, the method further includes: encoding the user identity identifier in the user profile data; and performing interval mapping on the user attribute information in the user profile data. To ensure the security of user data, the user identity identifier (ID) in the user profile data can be encoded, i.e., using a preset encoding rule to convert the user ID into a corresponding string. Furthermore, interval mapping can also be performed on user attribute information. For example, user age can be divided into four age intervals: A, B, C, and D, where A represents 0-20, B represents 20-40, etc. Based on the user's specific age, the corresponding age interval is determined. For example, a user who is 28 is mapped to age interval B, thus concealing the user's age. Other attribute information can be handled similarly.
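The ID encoding and interval mapping described above might be implemented along these lines. The salted SHA-256 rule stands in for the unspecified "preset encoding rule", the bounds of intervals C and D are assumed (the text gives only A and B), and boundary ages are assigned to the upper interval by assumption.

```python
import bisect
import hashlib

# Upper bounds of intervals A, B, C. The text gives A = 0-20 and B = 20-40;
# the bounds for C and D are assumptions for illustration.
AGE_BOUNDS = [20, 40, 60]
AGE_LABELS = ["A", "B", "C", "D"]


def encode_user_id(user_id: str, salt: str = "demo-salt") -> str:
    """Convert a user ID into an opaque string (assumed rule: salted SHA-256)."""
    digest = hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()
    return digest[:16]


def age_interval(age: int) -> str:
    """Map a concrete age to its interval label; a boundary age such as 20
    is placed in the upper interval (half-open intervals, by assumption)."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return AGE_LABELS[bisect.bisect_right(AGE_BOUNDS, age)]


encoded = encode_user_id("user_001")
label = age_interval(28)
```

A 28-year-old user maps to interval B, matching the example in the text, and the encoded ID no longer reveals the original identifier.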
[0087] Optionally, before extracting features from the user's offline behavior data to obtain the second feature vector, the method further includes: mapping the address information in the user's offline behavior data based on a preset coordinate system. To protect user privacy, GIS coordinates can be used to replace the user's specific address information. For example, if the user's address is No. XX, Tongzhou District, Beijing, it can be replaced with GIS coordinates (lat, lng).
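Once addresses are replaced by coordinates, the distance between an offline behavior location and the preset address can be computed directly from the coordinates. The sketch below uses the haversine great-circle formula as one possible choice; the source does not say which distance measure is used, and the lookup table `ADDRESS_TO_COORD` with its keys and values is purely hypothetical.

```python
import math
from typing import Dict, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Hypothetical mapping from an opaque address key to (lat, lng) coordinates.
ADDRESS_TO_COORD: Dict[str, Tuple[float, float]] = {
    "addr_home": (39.90, 116.65),
    "addr_pickup": (39.95, 116.30),
}


def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lng) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


home = ADDRESS_TO_COORD["addr_home"]
pickup = ADDRESS_TO_COORD["addr_pickup"]
distance = haversine_km(home[0], home[1], pickup[0], pickup[1])
```

The resulting value could serve as the distance field of an offline behavior record without exposing the underlying street address.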
[0088] Alternatively, multi-task joint training can also be performed solely based on user online data to obtain a multi-task joint model that makes predictions based on user online data.
[0089] Optionally, Table 1 gives an example of user data, including data types, some feature names, and example feature values. For example, user offline behavior data includes receiving area ID: 2321 and timestamp information: 1673082950.
[0090]
[0091] The information push method provided by the embodiments of the present invention can combine a first feature vector extracted based on user online data and a second feature vector extracted based on user offline behavior data to obtain a fused feature. This fused feature is then input into a multi-task learning network trained based on historical data to predict the execution probability of each task in the multi-task process, thereby improving the accuracy of the prediction results and contributing to the improvement of the accuracy of subsequent information push. Furthermore, through feature encoding and mapping operations, the security of user information can be effectively guaranteed.
[0092] Optionally, an application example of the information push method based on this application is as follows: Using the full set of 51.8M data records, training, validation, and test sets are constructed in chronological order at a ratio of 6:1:3. The embedding size is set to 6, and the MLP network is designed with 3 layers, with dimensions of 128, 64, and 32 respectively. The activation function of the network layers is ReLU. To alleviate overfitting, dropout parameters are set to 0.1, 0.3, and 0.3. The Adam optimizer is applied with a batch size of 2000 and a learning rate of 1e-3. AUC is chosen as the model evaluation metric. Since practical applications are more concerned with whether users submit requests and whether conversions are successful, only the latter two tasks need to be considered.
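The chronological 6:1:3 split and the AUC metric mentioned above can be sketched as follows. The record layout `(timestamp, payload)` and both function names are assumptions; the AUC is computed by direct pairwise comparison, which is fine for small examples but too slow for a data set of the size quoted in the text.

```python
from typing import List, Sequence, Tuple

Record = Tuple[int, str]  # (timestamp, payload); layout is an assumption


def chronological_split(records: Sequence[Record],
                        ratios: Tuple[int, int, int] = (6, 1, 3)
                        ) -> Tuple[List[Record], List[Record], List[Record]]:
    """Sort by timestamp, then cut into train/validation/test by the given ratio."""
    ordered = sorted(records, key=lambda r: r[0])
    n, total = len(ordered), sum(ratios)
    train_end = n * ratios[0] // total
    valid_end = n * (ratios[0] + ratios[1]) // total
    return ordered[:train_end], ordered[train_end:valid_end], ordered[valid_end:]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


records = [(ts, f"r{ts}") for ts in [5, 3, 9, 1, 7, 2, 8, 4, 10, 6]]
train, valid, test = chronological_split(records)
score = auc([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.3])
```

Splitting by time rather than at random keeps every test record later than every training record, which mirrors how the model would be used on future traffic.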
[0093] Figure 2 is a schematic diagram of the main modules of the information push device 200 according to an embodiment of the present invention. As shown in Figure 2, the device includes:
[0094] The feature extraction module 201 is used to extract features from online user data to obtain a first feature vector; and to extract user interest and preference features from offline user behavior data to obtain a second feature vector.
[0095] The feature fusion module 202 is used to perform feature fusion based on the first feature vector and the second feature vector to obtain fused features;
[0096] The input module 203 is used to input the fused features into the multi-task learning network to obtain the execution probability of the user for each task, and to push information based on the execution probability of the user for each task. The multi-task learning network is obtained by joint training of multiple tasks based on historical user online data and historical user offline behavior data. The multiple tasks are generated based on multiple ordered behavioral sequences.
[0097] The information push device provided in the embodiments of the present invention can extract features based on user online data and user offline behavior data respectively, fuse the extracted features, input the obtained fused features into a multi-task learning network, obtain the output user prediction probability for multiple tasks, and push information based on the prediction probability, which can improve the accuracy of information push, help improve user conversion rate, and improve user experience.
[0098] Optionally, the feature extraction module 201 is also used for:
[0099] Based on the execution order of user offline behavior, the user offline behavior data is sorted to obtain the offline behavior sequence. The user offline behavior data includes user identifier, address information of user offline behavior, and distance between the address information of user offline behavior and preset address information.
[0100] The second feature vector is obtained by extracting user interest and preference features based on offline behavior sequences.
[0101] Optionally, the device further includes:
[0102] The determination module 204 is used to determine the most frequently occurring address information as the preset address information based on the address information in the user's historical offline behavior data.
[0103] Optionally, multi-task joint training includes:
[0104] Feature extraction is performed on historical user online data to obtain the historical first feature vector;
[0105] Feature extraction is performed on historical user offline behavior data to obtain the historical second feature vector;
[0106] Based on the historical first feature vector and the historical second feature vector, feature fusion is performed to obtain historical fused features;
[0107] Multi-task joint training of a neural network is performed based on the historical fused features to obtain the multi-task learning network.
[0108] Optionally, the feature extraction module 201 is also used for:
[0109] Data preprocessing and feature extraction are performed on user profile data, user order data, and user online behavior data;
[0110] The extracted features are encoded, and the high-dimensional feature vector is transformed into a low-dimensional feature vector through embedding processing to obtain the first feature vector.
[0111] Optionally, the device further includes:
[0112] The processing module 205 is used to encode the user identity identifier in the user profile data and perform interval mapping processing on the user attribute information in the user profile data.
[0113] Optionally, the device further includes:
[0114] The processing module 205 is also used to map the address information in the user's offline behavior data based on a preset coordinate system.
[0115] The information push device provided by the embodiments of the present invention can combine a first feature vector extracted based on user online data and a second feature vector extracted based on user offline behavior data to obtain a fused feature, which is then input into a multi-task learning network trained based on historical data to obtain the execution probability of each task in the multi-task prediction, thereby improving the accuracy of the prediction results and contributing to the improvement of the accuracy of subsequent information push. Furthermore, through feature encoding and mapping operations, the security of user information can be effectively guaranteed.
[0116] Figure 3 shows an exemplary system architecture 300 in which the information push method or information push apparatus of the present invention can be applied.
[0117] As shown in Figure 3, system architecture 300 may include terminal devices 301, 302, and 303, a network 304, and a server 305. Network 304 serves as the medium for providing communication links between terminal devices 301, 302, and 303 and server 305. Network 304 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
[0118] Users can use terminal devices 301, 302, and 303 to interact with server 305 via network 304 to receive or send messages, etc. Various communication client applications can be installed on terminal devices 301, 302, and 303, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients, social media platform software, etc. (for example only).
[0119] Terminal devices 301, 302, and 303 can be various electronic devices with displays and web browsing capabilities, including but not limited to smartphones, tablets, laptops, and desktop computers.
[0120] Server 305 can be a server that provides various services, such as a backend management server that supports shopping websites browsed by users using terminal devices 301, 302, and 303 (for example only). The backend management server can analyze and process data such as received push requests, and feed back the processing results (such as target push information, for example only) to the terminal device.
[0121] It should be noted that the information push method provided in the embodiments of the present invention is generally executed by server 305, and correspondingly, the information push device is generally set in server 305.
[0122] It should be understood that the numbers of terminal devices, networks, and servers shown in Figure 3 are merely illustrative. Depending on implementation needs, any number of terminal devices, networks, and servers can be included.
[0123] Referring now to Figure 4, a schematic diagram of the structure of a computer system 400 suitable for implementing the terminal devices or servers of embodiments of the present invention is shown. The terminal device or server shown in Figure 4 is merely an example and should not impose any limitation on the functionality and scope of use of the embodiments of the present invention.
[0124] As shown in Figure 4, the computer system 400 includes a central processing unit (CPU) 401, which can perform various appropriate actions and processes based on programs stored in read-only memory (ROM) 402 or programs loaded from storage section 408 into random access memory (RAM) 403. The RAM 403 also stores various programs and data required for the operation of the system 400. The CPU 401, ROM 402, and RAM 403 are interconnected via a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
[0125] The following components are connected to the I/O interface 405: an input section 406 including a keyboard, mouse, etc.; an output section 407 including a cathode ray tube (CRT), liquid crystal display (LCD), etc., and speakers, etc.; a storage section 408 including a hard disk, etc.; and a communication section 409 including a network interface card such as a LAN card, modem, etc. The communication section 409 performs communication processing via a network such as the Internet. A drive 410 is also connected to the I/O interface 405 as needed. Removable media 411, such as a magnetic disk, optical disk, magneto-optical disk, semiconductor memory, etc., are mounted on the drive 410 as needed so that computer programs read from them can be installed into storage section 408 as needed.
[0126] In particular, according to the embodiments disclosed in this invention, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, embodiments disclosed in this invention include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via communication section 409, and/or installed from removable medium 411. When the computer program is executed by central processing unit (CPU) 401, it performs the functions defined above in the system of this invention.
[0127] It should be noted that the computer-readable medium shown in this invention can be a computer-readable signal medium or a computer-readable storage medium, or any combination thereof. A computer-readable storage medium can be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof. In this invention, a computer-readable storage medium can be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In this invention, a computer-readable signal medium can include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such propagated data signals can take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted using any suitable medium, including but not limited to: wireless, wired, optical fiber, RF, etc., or any suitable combination thereof.
[0128] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in a block diagram or flowchart, and combinations of blocks in a block diagram or flowchart, may be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.
[0129] The units or modules described in the embodiments of the present invention can be implemented in software or hardware. The described units or modules can also be housed in a processor; for example, a processor can be described as including a feature extraction module, a feature fusion module, and an input module. The names of these units or modules do not necessarily limit the specific unit or module itself; for example, a feature fusion module can also be described as "a module that performs feature fusion based on a first feature vector and a second feature vector to obtain fused features."
[0130] In another aspect, the present invention also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist independently without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to perform the following:
[0131] Feature extraction is performed on user online data to obtain the first feature vector;
[0132] User interest and preference features are extracted from offline user behavior data to obtain the second feature vector;
[0133] Feature fusion is performed based on the first and second feature vectors to obtain fused features;
[0134] The fused features are input into the multi-task learning network to obtain the execution probability of each user for each task, and information is pushed based on the execution probability of each user for each task. The multi-task learning network is obtained by joint training of multiple tasks based on historical user online data and historical user offline behavior data. The multiple tasks are generated based on multiple ordered behavioral sequences.
[0135] According to the technical solution of the present invention, feature extraction can be performed separately on user online data and user offline behavior data, and the extracted features can be fused. The fused features are then input into a multi-task learning network to obtain the user's predicted probability for each of the multiple tasks. Information can be pushed based on the predicted probabilities, which can improve the accuracy of information push, help improve the user conversion rate, and improve user experience.
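The final step, pushing information based on the predicted probabilities, could be sketched as follows. The source does not state the selection rule; this example assumes a simple threshold on one task's probability followed by ranking. The function name, the task names, the threshold, and the sample probabilities are all hypothetical.

```python
def select_push_targets(exec_probs, task, threshold=0.5, top_k=None):
    # exec_probs maps user id -> {task name -> predicted execution probability}.
    # Keep users whose probability for `task` reaches the threshold,
    # ordered from most to least likely; optionally keep only the top_k.
    ranked = sorted(
        ((uid, probs[task]) for uid, probs in exec_probs.items()
         if probs[task] >= threshold),
        key=lambda item: item[1],
        reverse=True,
    )
    return [uid for uid, _ in ranked[:top_k]]


# Hypothetical network outputs for an assumed ordered sequence: click, then order.
exec_probs = {
    "u1": {"click": 0.90, "order": 0.70},
    "u2": {"click": 0.40, "order": 0.10},
    "u3": {"click": 0.80, "order": 0.55},
}
targets = select_push_targets(exec_probs, task="order", threshold=0.5)
```

Here `targets` contains `u1` and `u3`, in that order, since only those two users reach the assumed threshold for the `order` task.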
[0136] The specific embodiments described above do not constitute a limitation on the scope of protection of this invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can occur depending on design requirements and other factors. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this invention should be included within the scope of protection of this invention.
Claims
1. A method for information push, characterized in that it comprises: performing feature extraction on user online data to obtain a first feature vector; extracting user interest and preference features from user offline behavior data to obtain a second feature vector; performing feature fusion based on the first feature vector and the second feature vector to obtain fused features; and inputting the fused features into a multi-task learning network to obtain the execution probability of each user for each task, and pushing information based on the execution probability of each user for each task; wherein the multi-task learning network is obtained by multi-task joint training based on historical user online data and historical user offline behavior data, and the multiple tasks are generated based on multiple ordered behavioral sequences.
2. The method according to claim 1, characterized in that the step of extracting user interest and preference features from user offline behavior data to obtain a second feature vector includes: sorting the user offline behavior data based on the execution order of the user offline behavior to obtain an offline behavior sequence, wherein the user offline behavior data includes a user identifier, address information of the user offline behavior, and the distance between the address information of the user offline behavior and preset address information; and extracting user interest and preference features based on the offline behavior sequence to obtain the second feature vector.
3. The method according to claim 2, characterized in that, before sorting the user offline behavior data based on the execution order of the user offline behavior to obtain the offline behavior sequence, the method further includes: determining, based on the address information in the user's historical offline behavior data, the address information with the highest frequency as the preset address information.
4. The method according to any one of claims 1-3, characterized in that the multi-task joint training includes: performing feature extraction on historical user online data to obtain a first historical feature vector; performing feature extraction on historical user offline behavior data to obtain a second historical feature vector; performing feature fusion based on the first historical feature vector and the second historical feature vector to obtain historical fused features; and performing multi-task joint training of a neural network based on the historical fused features to obtain the multi-task learning network.
5. The method according to any one of claims 1-3, characterized in that the user online data includes user profile data, user order data, and user online behavior data, and the step of performing feature extraction on the user online data to obtain a first feature vector includes: performing data preprocessing and feature extraction on the user profile data, the user order data, and the user online behavior data; and encoding the extracted features and transforming the high-dimensional feature vector into a low-dimensional feature vector through embedding processing to obtain the first feature vector.
6. The method according to claim 5, characterized in that, before performing feature extraction on the user online data to obtain the first feature vector, the method further includes: encoding the user identity identifier in the user profile data; and performing interval mapping processing on the user attribute information in the user profile data.
7. The method according to claim 6, characterized in that, before extracting features from the user offline behavior data to obtain the second feature vector, the method further includes: mapping the address information in the user offline behavior data based on a preset coordinate system.
8. An information push device, characterized in that it comprises: a feature extraction module, used to perform feature extraction on user online data to obtain a first feature vector, and to extract user interest and preference features from user offline behavior data to obtain a second feature vector; a feature fusion module, used to perform feature fusion based on the first feature vector and the second feature vector to obtain fused features; and an input module, used to input the fused features into a multi-task learning network to obtain the execution probability of the user for each task, and to push information based on the execution probability of the user for each task; wherein the multi-task learning network is obtained by multi-task joint training based on historical user online data and historical user offline behavior data, and the multiple tasks are generated based on multiple ordered behavioral sequences.
9. An electronic device, characterized in that it comprises: one or more processors; and a storage device for storing one or more programs, wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-7.
10. A computer-readable medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, the method according to any one of claims 1-7 is implemented.
11. A computer program product, comprising a computer program, characterized in that, when the computer program is executed by a processor, the method according to any one of claims 1-7 is implemented.
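For illustration only, the offline behavior preprocessing recited in claims 2 and 3 might look like the sketch below. It assumes that addresses are (latitude, longitude) pairs, that the execution order is given by a timestamp field, and that distance is measured with the haversine formula; none of these choices is stated in the claims, and all names are hypothetical.

```python
import math
from collections import Counter


def haversine_km(a, b):
    # Great-circle distance between two (lat, lon) points in kilometers.
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def preset_address(historical_addresses):
    # Claim 3: the most frequent historical address becomes the preset address.
    return Counter(historical_addresses).most_common(1)[0][0]


def build_offline_sequence(records, preset):
    # Claim 2: sort by execution order, keeping the user identifier, the
    # address, and the distance from that address to the preset address.
    ordered = sorted(records, key=lambda r: r["timestamp"])
    return [
        {"user_id": r["user_id"], "address": r["address"],
         "distance_km": haversine_km(r["address"], preset)}
        for r in ordered
    ]


# Hypothetical historical addresses and current offline behavior records.
home = (39.90, 116.40)
depot = (39.95, 116.30)
preset = preset_address([home, depot, home])
sequence = build_offline_sequence(
    [
        {"user_id": "u1", "timestamp": 20, "address": depot},
        {"user_id": "u1", "timestamp": 10, "address": home},
    ],
    preset,
)
```

The resulting sequence starts with the earlier record, whose distance to the preset address is zero because it is the preset address itself.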