A model training method, device, storage medium and electronic device
The parameters of the encoding layer in the risk identification model are adjusted so that the distance between the two features of the same training sample is minimized and the distance between a training sample's feature and the features of other training samples is maximized. This addresses the accuracy problem of user risk identification in the existing technology and yields more accurate risk identification.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ALIPAY (HANGZHOU) INFORMATION TECH CO LTD
- Filing Date
- 2023-09-06
- Publication Date
- 2026-06-19
Description
Technical Field
[0001] This specification relates to the field of computer technology, and in particular to a method, apparatus, storage medium, and electronic device for model training.
Background Technology
[0002] With the development of information technology, there are more and more service providers supporting users in carrying out their business, such as those providing payment functions. At the same time, the security of privacy data has also received widespread attention.
[0003] Currently, users may engage in unauthorized activities while using the functions provided by service providers. Therefore, service providers need to identify the risks associated with user-executed transactions to determine if any risks exist. For example, when a user uses a payment tool provided by the service provider, the service provider needs to identify the risks associated with the transaction and determine if the transaction itself is risky. Therefore, how to train a model to determine whether user-executed transactions pose a risk is a very important issue.
[0004] Based on this, this specification provides a method for model training.
Summary of the Invention
[0005] This specification provides a method, apparatus, storage medium, and electronic device for model training, in order to partially solve the aforementioned problems existing in the prior art.
[0006] The following technical solution is adopted in this specification:
[0007] This specification provides a method for model training, including:
[0008] The historical transaction events of each user are determined as training samples.
[0009] For each training sample, the data of a specified type in that training sample is designated as the specified data.
[0010] The specified data corresponding to the training sample is input into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample; all types of data in the training sample are input into the second encoding layer of the risk identification model to be trained to determine the second feature of the training sample.
[0011] With the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples, the model parameters of at least the second encoding layer in the risk identification model to be trained are adjusted.
[0012] Optionally, the specified data is at least two of the following: transaction device identifier, transaction item name, and transaction tool identifier;
[0013] The data of a specified type in the training sample is identified as the specified data, specifically including:
[0014] Determine the data of each specified type in the training sample as the specified data.
[0015] The specified data corresponding to the training sample is input into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample, specifically including:
[0016] The specified data corresponding to the training sample are concatenated to obtain the text corresponding to the training sample.
[0017] The text corresponding to the training sample is input into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample.
[0018] Optionally, the risk identification model to be trained further includes an identification layer;
[0019] With the objective of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples, the model parameters of at least the second encoding layer in the risk identification model to be trained are adjusted, specifically including:
[0020] With the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples, the model parameters of at least the second encoding layer in the risk identification model to be trained are adjusted.
[0021] The risk status corresponding to each user's historical transaction events is used as the label for each training sample.
[0022] The second feature of each training sample is input into the identification layer of the risk identification model to be trained, and the identification result corresponding to each training sample is determined.
[0023] With the goal of minimizing the difference between each identification result and each label, the model parameters of at least the second encoding layer are adjusted.
[0024] Optionally, the method further includes:
[0025] Identify the transaction events of the user to be identified;
[0026] The transaction event is input into the trained second encoding layer to determine the second feature of the transaction event;
[0027] The second feature is input into the trained identification layer to determine the risk status of the user to be identified;
[0028] Based on the aforementioned risk situation, risk control measures are implemented for the users to be identified.
[0029] Optionally, the method further includes:
[0030] Identify the transaction events of the user to be identified;
[0031] The transaction event is input into the trained second encoding layer to determine the second feature of the transaction event;
[0032] In a pre-built retrieval library, a third feature is identified whose distance from the second feature is within a specified range;
[0033] Determine the risk profile of the user corresponding to the third feature, and use it as the risk profile of the user to be identified;
[0034] Risk control is performed on the users to be identified based on their risk profile.
[0035] Optionally, the method further includes:
[0036] Identify the transaction events of the user to be identified;
[0037] The transaction event is input into the second encoding layer of the trained risk identification model to determine the second feature of the transaction event;
[0038] In a pre-built retrieval library, a third feature is identified whose distance from the second feature is within a specified range;
[0039] Determine the risk profile of the user corresponding to the third feature, and use it as the risk profile of the user to be identified;
[0040] Risk control is performed on the users to be identified based on their risk profile.
[0041] Optionally, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples, the model parameters of at least the second encoding layer in the risk identification model to be trained are adjusted, specifically including:
[0042] The first feature of the training sample and the second feature of the training sample are used as a first combination, and the first feature of the training sample is used as a second combination with the second features of other training samples besides the training sample.
[0043] With the goal of minimizing the distance between features within each first combination and maximizing the distance between features within each second combination, the model parameters of at least the second encoding layer in the risk identification model to be trained are adjusted.
[0044] This specification provides a model training apparatus, comprising:
[0045] The first determination module is used to determine the transaction events of each user in history as training samples.
[0046] The second determining module is used to determine, for each training sample, the data of a specified type in that training sample as the specified data;
[0047] The feature extraction module is used to input the specified data corresponding to the training sample into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample; and to input all types of data in the training sample into the second encoding layer of the risk identification model to be trained to determine the second feature of the training sample.
[0048] The training module is used to adjust the model parameters of at least the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples.
[0049] Optionally, the specified data is at least two of the following: transaction device identifier, transaction item name, and transaction tool identifier;
[0050] The second determining module is specifically used to determine the data of each specified type in the training sample as the specified data.
[0051] The feature extraction module is specifically used to concatenate the specified data corresponding to the training sample to obtain the text corresponding to the training sample; and input the text corresponding to the training sample into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample.
[0052] Optionally, the risk identification model to be trained further includes an identification layer;
[0053] The training module is specifically used to: adjust the model parameters of at least the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples; use the risk status corresponding to the transaction events of each user in history as the label corresponding to each training sample; input the second features of each training sample into the identification layer of the risk identification model to be trained to determine the identification result corresponding to each training sample; and adjust the model parameters of at least the second encoding layer with the goal of minimizing the difference between each identification result and each label.
[0054] Optionally, the device further includes:
[0055] An application module is used to determine the transaction events of a user to be identified; input the transaction events into a trained second encoding layer to determine a second feature of the transaction events; input the second feature into a trained identification layer to determine the risk status of the user to be identified; and perform risk control on the user to be identified based on the risk status.
[0056] Optionally, the device further includes:
[0057] An application module is used to determine the transaction events of a user to be identified; input the transaction events into a trained second encoding layer to determine the second feature of the transaction events; determine a third feature in a pre-built retrieval library whose distance from the second feature is within a specified range; determine the risk status of the user corresponding to the third feature and use it as the risk status of the user to be identified; and perform risk control on the user to be identified based on the risk status of the user to be identified.
[0058] Optionally, the device further includes:
[0059] An application module is used to determine the transaction events of the user to be identified; input the transaction events into the second encoding layer of the trained risk identification model to determine the second feature of the transaction events; determine the third feature in a pre-built retrieval library whose distance from the second feature is within a specified range; determine the risk status of the user corresponding to the third feature and use it as the risk status of the user to be identified; and perform risk control on the user to be identified based on the risk status of the user to be identified.
[0060] Optionally, the training module is specifically used to combine the first feature of the training sample with the second feature of the training sample as a first combination, and combine the first feature of the training sample with the second features of other training samples besides the training sample as a second combination; with the goal of minimizing the distance between features in each first combination and maximizing the distance between features in each second combination, at least the model parameters of the second encoding layer in the risk identification model to be trained are adjusted.
[0061] This specification provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described method for training the model.
[0062] This specification provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the model training method described above.
[0063] The above-mentioned technical solutions adopted in this specification can achieve the following beneficial effects:
[0064] The model training method provided in this specification determines historical transaction events of each user as training samples. Then, for each training sample, a specified type of data within that training sample is designated as specified data. This specified data is input into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample. Furthermore, all types of data from the training sample are input into the second encoding layer of the risk identification model to be trained to determine the second feature of the training sample. Finally, with the goal of minimizing the distance between the first and second features of the training sample and maximizing the distance between the first feature of the training sample and the second features of all other training samples, at least the model parameters of the second encoding layer of the risk identification model to be trained are adjusted.
[0065] As can be seen from the above method, this method first determines the transaction events of each user in history as training samples. For each training sample, a specified type of data in that training sample is designated as specified data. This specified data is then input into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample. Similarly, all types of data in the training sample are input into the second encoding layer of the risk identification model to be trained to determine the second feature of the training sample. Then, with the goal of minimizing the distance between the first and second features of the training sample and maximizing the distance between the first feature of the training sample and the second features of all other training samples, the model parameters of at least the second encoding layer in the risk identification model to be trained are adjusted. The first features corresponding to each training sample guide the second encoding layer to extract features from each training sample, enabling the extracted features to better represent the risks associated with users. Based on the extracted features, user risk identification is performed, resulting in more accurate identification results. The subsequently trained risk identification model can then perform risk identification on the users to be identified, improving the accuracy of the identification results.
Description of the Drawings
[0066] The accompanying drawings, which are included to provide a further understanding of this specification and form part of this specification, illustrate exemplary embodiments and their descriptions, serving to explain this specification and do not constitute an undue limitation thereof.
[0067] In the drawings:
[0068] Figure 1 is a flowchart illustrating a model training method provided in this specification;
[0069] Figure 2 is a schematic diagram illustrating the application of a risk identification model provided in this specification;
[0070] Figure 3 is a schematic diagram illustrating the application of another risk identification model provided in this specification;
[0071] Figure 4 is a schematic diagram of a model training device provided in this specification;
[0072] Figure 5 is a schematic diagram of an electronic device corresponding to Figure 1 provided in this specification.
Detailed Implementation
[0073] To make the objectives, technical solutions, and advantages of this specification clearer, the technical solutions of this specification will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this specification, and not all of them. Based on the embodiments in this specification, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this specification.
[0074] The embodiments of this specification provide a method, apparatus, storage medium, and electronic device for model training. The technical solutions provided by the embodiments of this specification are described in detail below with reference to the accompanying drawings.
[0075] Figure 1 is a flowchart illustrating a model training method provided in this specification, which specifically includes the following steps:
[0076] S100: Identify the transaction events of each user in history as training samples.
[0077] In this specification, the device used to train the model determines each historical user transaction event as a training sample. This device can be a server used for training the model, or a device such as a mobile phone or personal computer (PC) capable of executing the scheme described in this specification. For ease of explanation, the following description uses a server as the executing entity. Historical user transaction events are transaction data that occurred when a user made payments using payment tools provided by the service provider. This transaction data typically includes three types of data: continuous data, categorical data, and identifier data (i.e., ID data). Categorical data contains data with fixed categories, such as the user's province. Identifier data contains data without fixed categories or whose categories cannot be enumerated, such as the identifier of the transaction device used by the user, the name of the transaction item, and the identifier of the transaction tool used by the user. This transaction tool can be the payment tool used for the transaction.
[0078] Therefore, transaction data can include information such as the transaction initiator, the transaction recipient, and the transaction amount. The transaction initiator information refers to the information of the user who made the payment using a payment tool. This user information includes the user's name, the identifier of the transaction device used by the user, the name of the transaction device, the identifier of the transaction tool used by the user, and the name of the item being traded. The transaction recipient information includes the information of the user who received the payment.
[0079] S102: For each training sample, determine the data of a specified type in that training sample as the specified data.
[0080] For each training sample, the server can determine the data of the specified type within that sample as the specified data. If the specified type is the identifier type (i.e., ID type), then the specified data is identifier-type data. Since a transaction event contains many kinds of identifier data, the specified data can be at least one of the following: transaction device identifier, transaction item name, or transaction tool identifier. The specified data determined by the server can be represented as text in key-value JSON format. For example, if the server determines the specified data from training sample 1 to be transaction device identifier 1, then the specified data represented in key-value JSON format would be {transaction device identifier: 1}.
[0081] In addition, when the specified data is not just one type of identifier, that is, when the specified data is at least two of the following: transaction device identifier, transaction item name, and transaction tool identifier, the server can determine that each specified type of data in the training sample is the specified data.
[0082] S104: Input the specified data corresponding to the training sample into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample; input all types of data in the training sample into the second encoding layer of the risk identification model to be trained to determine the second feature of the training sample.
[0083] S106: With the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples, adjust the model parameters of at least the second encoding layer in the risk identification model to be trained.
[0084] The server inputs the specified data corresponding to the training sample into the first encoding layer of the risk identification model to be trained, determining the first feature of the training sample. Simultaneously, it inputs all types of data from the training sample into the second encoding layer of the risk identification model to be trained, determining the second feature of the training sample. Then, with the goal of minimizing the distance between the first and second features of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples, the model parameters of at least the second encoding layer of the risk identification model to be trained are adjusted. The risk identification model to be trained includes a first encoding layer and a second encoding layer. The first encoding layer is used to extract features from identifier-type data, and the second encoding layer is used to extract features from transaction data (i.e., all types of data in the training sample). This transaction data includes all types of data, not only identifier-type data but also continuous type data and categorical type data. The specific types of data included are not specifically limited in this specification.
[0085] In this specification, when training at least the second encoding layer of the risk identification model to be trained, the server can employ a contrastive learning training method. For each training sample, the first feature and second feature form a positive sample pair, and the first feature of each training sample forms a negative sample pair with the second features of other training samples. The model parameters of at least the second encoding layer of the risk identification model to be trained are adjusted with the goal of minimizing the distance between features within a positive sample pair and maximizing the distance between features within a negative sample pair. Based on this, when adjusting the model parameters of at least the second encoding layer of the risk identification model to be trained with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples, the server can use the first feature and the second feature of the training sample as a first combination, and the first feature of the training sample and the second features of other training samples as a second combination. Then, the model parameters of at least the second encoding layer of the risk identification model to be trained are adjusted with the goal of minimizing the distance between features within each first combination and maximizing the distance between features within each second combination.
[0086] For example, the training samples used to train the risk identification model are samples 1 to 3. The first feature corresponding to sample 1 is feature A, and the second feature corresponding to sample 1 is feature a. The first feature corresponding to sample 2 is feature B, and the second feature corresponding to sample 2 is feature b. The first feature corresponding to sample 3 is feature C, and the second feature corresponding to sample 3 is feature c. Then, there are three positive sample pairs: feature A and feature a, feature B and feature b, and feature C and feature c. There are six negative sample pairs: feature A and feature b, feature A and feature c, feature B and feature a, feature B and feature c, feature C and feature a, and feature C and feature b. With the goal of minimizing the distance between features within each positive sample pair (i.e., between feature A and feature a, between feature B and feature b, and between feature C and feature c) and maximizing the distance between features within each negative sample pair (i.e., between feature A and feature b, between feature A and feature c, between feature B and feature a, between feature B and feature c, between feature C and feature a, and between feature C and feature b), the model parameters of at least the second encoding layer in the risk identification model to be trained are adjusted.
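The pairing in the example above can be reproduced with a short sketch. The function name `build_pairs` and the use of letter labels in place of real feature vectors are illustrative assumptions, not part of the specification.

```python
from itertools import product


def build_pairs(first_features, second_features):
    """Split all (first, second) feature combinations into positive and negative pairs.

    A pair is positive when both features come from the same training sample
    (same index) and negative otherwise.
    """
    positives, negatives = [], []
    indexed_first = enumerate(first_features)
    indexed_second = list(enumerate(second_features))
    for (i, first), (j, second) in product(indexed_first, indexed_second):
        if i == j:
            positives.append((first, second))
        else:
            negatives.append((first, second))
    return positives, negatives


# Samples 1 to 3: first features A, B, C and second features a, b, c.
first = ["A", "B", "C"]
second = ["a", "b", "c"]
positives, negatives = build_pairs(first, second)
print(positives)
print(negatives)
```

With three training samples this yields the three positive pairs and six negative pairs listed in the example.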
[0087] In this specification, users at risk often exhibit clustering in their identifier types. This means that the identifier types in the transaction data of most at-risk users are quite similar. For example, multiple users may use the same transaction device, or multiple users may have the same recipient in their transaction data (i.e., multiple users transfer funds to the same transaction instrument). Therefore, the transaction instrument identifiers for these users will be identical. Based on this, when training the risk identification model, the goal is to minimize the distance between the first feature obtained from specified data in the training samples and the second feature obtained from all types of data in that training sample, and to maximize the distance between the first feature and the second feature obtained from all types of data in other training samples. At least the model parameters of the second encoding layer in the risk identification model to be trained are adjusted so that the distance between the first and second features of each training sample becomes increasingly closer, increasing their similarity, while the distance between the first feature of each training sample and the second features of other training samples becomes increasingly farther, decreasing their similarity. By using the first feature of each training sample as a supervision signal, the second encoding layer in the risk identification model to be trained is guided to extract features from the training samples. This allows the extracted features to better characterize the risks associated with users, and the risk identification of users is performed based on the extracted features, resulting in more accurate identification results.
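The specification does not name a concrete loss function for this objective. As a minimal sketch, one common choice in contrastive learning is an InfoNCE-style loss over cosine similarities; the choice of cosine similarity, the `temperature` parameter, and all function names below are assumptions made for illustration only.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def contrastive_loss(first, second, temperature=0.1):
    """InfoNCE-style loss (an assumed choice, not taken from the specification).

    Row i of `first` and row i of `second` form a positive pair; row i of
    `first` with every other row of `second` forms a negative pair.
    """
    first = l2_normalize(first)
    second = l2_normalize(second)
    logits = first @ second.T / temperature            # shape (n, n)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


# Toy features: when each second feature matches its own first feature the
# loss is low; when the second features are shuffled across samples it is high.
aligned_first = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
aligned_second = aligned_first.copy()
shuffled_second = aligned_first[[1, 2, 0]]

loss_aligned = contrastive_loss(aligned_first, aligned_second)
loss_shuffled = contrastive_loss(aligned_first, shuffled_second)
print(loss_aligned, loss_shuffled)
```

Lowering this loss pulls the first and second features of the same sample together and pushes the first feature away from the second features of other samples, which matches the stated training goal.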
[0088] Furthermore, this specification primarily utilizes the second encoding layer to extract features from all types of data in the training samples, obtaining the second features. Subsequent applications also employ the second encoding layer to extract features from the transaction events of the user to be identified, thereby determining whether the user poses a risk based on the extracted features. The first encoding layer is mainly used to extract the first features from specified data in the training samples, and these first features guide the second encoding layer in feature extraction. Therefore, in step S106 above, when the server trains the model with the goal of minimizing the distance between the first and second features of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples, it must at least adjust the model parameters of the second encoding layer in the risk identification model to be trained, so that the second encoding layer can better extract features under the guidance of the first features. In subsequent applications, the trained second encoding layer is used to extract features from the transaction events of the user to be identified, allowing the extracted features to better characterize the risk posed by the user to be identified.
[0089] Based on this, the aforementioned first encoding layer can be a pre-trained encoding layer. Specifically, the server can pre-train the first encoding layer based on text data. This text data can be a general text dataset, and its format can be KV or JSON text; this specification does not impose specific limitations. Furthermore, the aforementioned first encoding layer can be any existing text encoder. Therefore, when the first encoding layer is a pre-trained encoding layer, in step S106 above, the server can adjust only the model parameters of the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples.
[0090] Alternatively, the aforementioned first encoding layer can be an untrained encoding layer, which is trained together with the second encoding layer in step S106. In that case, in step S106, the server can adjust the model parameters of both the first encoding layer and the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples.
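For the pre-trained case, where only the second encoding layer is adjusted, a toy training loop might look like the following. This is a sketch under strong assumptions: the second encoding layer is reduced to a single linear map `second_weights`, the frozen first features are random stand-ins for the output of a pre-trained text encoder, and the InfoNCE-style loss on dot-product similarity is an assumed choice rather than the loss used in the specification.

```python
import numpy as np


def info_nce(first, second, temperature=1.0):
    """Return the loss and its gradient with respect to the second features."""
    n = first.shape[0]
    logits = first @ second.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = float(-np.mean(np.log(probs[np.arange(n), np.arange(n)])))
    grad_logits = (probs - np.eye(n)) / n               # d loss / d logits
    grad_second = grad_logits.T @ first / temperature   # d loss / d second
    return loss, grad_second


rng = np.random.default_rng(0)
all_type_inputs = rng.normal(size=(4, 5))       # hypothetical numeric view of 4 samples
first_features = rng.normal(size=(4, 3))        # frozen output of the first encoding layer
second_weights = rng.normal(size=(5, 3)) * 0.1  # trainable second encoding layer

frozen_copy = first_features.copy()
initial_loss, _ = info_nce(first_features, all_type_inputs @ second_weights)

for _ in range(100):
    second_features = all_type_inputs @ second_weights
    _, grad_second = info_nce(first_features, second_features)
    # Only the second encoding layer is updated; first_features is never touched.
    second_weights -= 0.01 * (all_type_inputs.T @ grad_second)

final_loss, _ = info_nce(first_features, all_type_inputs @ second_weights)
print(initial_loss, final_loss)
```

In the joint-training variant, the first encoding layer's parameters would receive gradient updates in the same loop instead of being held fixed.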
[0091] In this specification, since the specified data can be at least two of the following: transaction device identifier, transaction item name, and transaction tool identifier, the server can determine each specified data in step S102 above. Therefore, when inputting the specified data corresponding to the training sample into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample, the server can concatenate the specified data corresponding to the training sample to obtain the text corresponding to the training sample. Then, the text corresponding to the training sample is input into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample. Each specified data is text in KV JSON format. For example, if the server determines the specified data from training sample 1 as transaction device identifier 1, transaction item name name 2, and transaction tool identifier 1, then concatenating the specified data yields the text corresponding to the training sample, which is {transaction device identifier: 1, transaction item name: name 2, transaction tool identifier: 1}. This text is then input into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample.
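The concatenation step can be sketched as follows. The field names (`transaction_device_id`, `transaction_item_name`, `transaction_tool_id`, `amount`, `province`) and the sample values are hypothetical; the specification only states that the specified data is combined into key-value JSON text.

```python
import json

# Hypothetical keys for the identifier-type fields treated as specified data.
SPECIFIED_KEYS = ("transaction_device_id", "transaction_item_name", "transaction_tool_id")


def build_specified_text(sample, keys=SPECIFIED_KEYS):
    """Pick the specified identifier fields from a sample and join them as JSON text."""
    specified = {key: sample[key] for key in keys if key in sample}
    return json.dumps(specified, ensure_ascii=False)


# Training sample 1 with identifier, continuous, and categorical data.
sample_1 = {
    "transaction_device_id": "1",
    "transaction_item_name": "name 2",
    "transaction_tool_id": "1",
    "amount": 25.0,          # continuous data, not part of the specified data
    "province": "Zhejiang",  # categorical data, not part of the specified data
}

text = build_specified_text(sample_1)
print(text)
```

The resulting text would go to the first encoding layer, while the full `sample_1` record would go to the second encoding layer.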
[0092] As can be seen from the above method, when training the model, the server first determines the historical transaction events of each user as training samples. Then, for each training sample, a specified type of data within that training sample is designated as specified data. This specified data is input into the first encoding layer of the risk identification model to be trained, determining the first feature of the training sample. Similarly, all types of data from the training sample are input into the second encoding layer of the risk identification model to be trained, determining the second feature of the training sample. Subsequently, with the goal of minimizing the distance between the first and second features of the training sample and maximizing the distance between the first feature of the training sample and the second features of all other training samples, the model parameters of at least the second encoding layer in the risk identification model to be trained are adjusted. The first features corresponding to each training sample guide the second encoding layer in extracting features from each training sample, enabling the extracted features to better represent the risks associated with users. Based on these extracted features, user risk identification is performed, resulting in more accurate identification results. The subsequently trained risk identification model can then perform risk identification on the users to be identified, improving the accuracy of the identification results.
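The training goal summarized above can be illustrated with a small sketch. This specification states only the goal (a small distance between the first and second features of the same training sample, a large distance to the second features of other training samples); the cosine distance, the softmax-style loss, and the `temperature` parameter below are assumptions chosen for illustration, not the loss defined by this specification.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def contrastive_loss(first_features, second_features, temperature=0.1):
    """Average, over samples i, of a softmax cross-entropy that is small when
    first_features[i] is close to second_features[i] and far from every
    second_features[j] with j != i."""
    n = len(first_features)
    total = 0.0
    for i in range(n):
        # A smaller distance gives a larger logit.
        logits = [
            -cosine_distance(first_features[i], second_features[j]) / temperature
            for j in range(n)
        ]
        m = max(logits)  # subtract the maximum for numerical stability
        log_norm = m + math.log(sum(math.exp(v - m) for v in logits))
        total += log_norm - logits[i]
    return total / n
```

In practice the model parameters of at least the second encoding layer would be adjusted by gradient descent on such a loss; the sketch only evaluates the loss value for given features.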
[0093] In this specification, the training samples in step S100 above can be labeled or unlabeled samples. The server can adjust the model parameters of at least the second encoding layer in the risk identification model to be trained (i.e., train the second encoding layer) based solely on the training samples, and upon completion of training, apply the trained second encoding layer to perform risk identification on the user to be identified. Figure 2 is a schematic diagram of the application of a risk identification model provided in this specification. As shown in Figure 2, after training the second encoding layer of the risk identification model to be trained, the server determines the transaction events of the user to be identified, inputs these events into the second encoding layer of the trained risk identification model, and determines the second feature of the transaction events. Then, in a pre-built retrieval database, the server determines a third feature whose distance from the second feature is within a specified range. The risk profile of the user corresponding to this third feature is then used as the risk profile of the user to be identified. Based on the risk profile of the user to be identified, risk control is implemented for that user.
[0094] The retrieval database is pre-built on the server and includes features extracted from each user's transaction events. Users corresponding to the features in the database can be classified as either risky or risk-free. Furthermore, whether a user is risky can be determined by risk control personnel based on the user's transaction events; that is, by whether the risk control personnel have marked the user as risky. Specifically, when a transaction event is risky, the risk control personnel can mark it as risky, and the user corresponding to that transaction event is then considered risky. When a transaction event is not risky, the risk control personnel can mark it as risk-free, and the user corresponding to that transaction event is then considered risk-free. Of course, whether a user is risky can also be determined by any existing risk control identification system based on the user's transaction events; this specification does not impose specific limitations. Similarly, the specific type of risk a risky user may possess can also be marked by risk control personnel or determined by any existing risk control identification system; this specification does not specify a particular type of risk.
[0095] The specified range mentioned above is a pre-set value. When the distance between a certain feature in the retrieval database and the second feature is within the specified range, it indicates that the feature is similar to the second feature, and the risk status of the user corresponding to this feature can be used as the risk status of the user to be identified. In other words, when the feature is similar to the second feature and the user corresponding to the feature is at risk, the user to be identified corresponding to the second feature may also be at risk. The aforementioned risk status can be either risky or risk-free, or it can be any one of several types of risk or risk-free; this specification does not impose specific limitations.
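The retrieval step described above can be sketched as a simple linear scan. The Euclidean distance, the list-of-pairs layout of the retrieval database, and the choice of the closest feature when several fall within the specified range are assumptions made for illustration; this specification only requires that the third feature lie within the specified range of the second feature.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lookup_risk_status(second_feature, retrieval_db, specified_range):
    """Return the risk status attached to the closest stored feature whose
    distance from second_feature is within specified_range, or None if no
    stored feature is that close."""
    best_status = None
    best_dist = specified_range
    for third_feature, status in retrieval_db:
        dist = euclidean(second_feature, third_feature)
        if dist <= best_dist:
            best_status, best_dist = status, dist
    return best_status


# Hypothetical retrieval database: (feature, risk status of the corresponding user).
retrieval_db = [([0.0, 0.0], "risk-free"), ([1.0, 1.0], "risky")]
status = lookup_risk_status([0.9, 1.0], retrieval_db, specified_range=0.5)
```

A production retrieval database would typically use an approximate nearest-neighbor index rather than a linear scan; the scan is used here only to keep the sketch self-contained.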
[0096] In the aforementioned risk control process based on the risk profile of the user to be identified, the server can determine a risk control strategy based on the user's risk profile and then implement the risk control accordingly. The strategy is determined based on the user's risk profile. When a user poses a high risk, a stricter strategy is required. For example, if user 1 is identified as high-risk, the strategy could be to restrict user 1's use of payment tools. Conversely, when a user poses a low risk or no risk at all, a more lenient strategy is required. For instance, if user 1 is identified as having relatively low risk, the strategy could be to limit the number of times user 1 can use payment tools each month.
[0097] When determining risk control strategies for a user based on their risk profile, the server can send the user's risk profile to risk control personnel. The risk control personnel can then formulate a strategy based on the received risk profile and return it. Alternatively, the server can determine the corresponding strategy from a pre-built strategy library based on the user's risk profile.
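Determining a strategy from a pre-built strategy library can be sketched as a keyed lookup. The library contents and the fallback of forwarding the risk profile to risk control personnel when no entry matches are hypothetical; the first two entries merely echo the examples given above for user 1.

```python
# Hypothetical strategy library keyed by risk profile.
STRATEGY_LIBRARY = {
    "high risk": "restrict use of payment tools",
    "low risk": "limit the number of payment tool uses per month",
    "risk-free": "no restriction",  # assumed entry, not stated in the text
}


def choose_strategy(risk_profile, library=STRATEGY_LIBRARY,
                    fallback="send risk profile to risk control personnel"):
    """Return the strategy stored for risk_profile, or the fallback when the
    library has no matching entry."""
    return library.get(risk_profile, fallback)
```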
[0098] In this specification, to better identify user risks, the server can fine-tune the risk identification model to be trained based on the labels corresponding to the training samples, so that the trained risk identification model can better identify risks. Therefore, the risk identification model to be trained also includes a recognition layer. In step S106 above, the server can first adjust the model parameters of at least the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples. Then, the risk situation corresponding to each user's historical transaction events is used as the label for each training sample. The second features of each training sample are input into the recognition layer of the risk identification model to be trained to determine the recognition result corresponding to each training sample. With the goal of minimizing the difference between each recognition result and each label, the model parameters of at least the second encoding layer are adjusted. That is, the server first trains at least the second encoding layer of the risk identification model to be trained based on the training samples alone, and then trains at least the second encoding layer again based on the training samples and their corresponding labels. The risk situation corresponding to each user's historical transaction events can be one of risky and risk-free, or one of various types of risk and risk-free. The risk situation can be determined by risk control personnel or by any existing risk control identification system. This specification does not impose specific limitations.
[0099] When the second features of each training sample are input into the recognition layer of the risk recognition model to be trained, the second features of each training sample input into the recognition layer are obtained by inputting each training sample into the trained second encoding layer. The trained second encoding layer is obtained by training with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples.
[0100] In this specification, the risk identification model can be trained in two phases. In the first training phase, the server trains at least the second encoding layer of the risk identification model for each training sample, aiming to minimize the distance between the first feature and the second feature of that training sample, and maximize the distance between the first feature and the second features of all other training samples. In the second training phase, following the first, the server trains at least the second encoding layer of the risk identification model based on each training sample (i.e., the samples used to train the second encoding layer) and the annotations of each training sample.
[0101] In the second training phase described above, the second encoding layer is fine-tuned using the annotations and recognition results of each training sample. This allows the second encoding layer to better extract risk-related features from the training samples, thereby better identifying the risks present in the training samples. Therefore, the recognition layer can be a network layer pre-trained by the server, and in the second training phase, only the second encoding layer is trained. This recognition layer can be pre-trained by the server based on historical transaction events; of course, it can also be any existing network layer that identifies whether a transaction event is risky, and this specification does not impose specific limitations. Therefore, in the second training phase, when adjusting the model parameters of at least the second encoding layer with the goal of minimizing the difference between each recognition result and each annotation, the server can adjust only the model parameters of the second encoding layer.
[0102] In addition, the aforementioned recognition layer can also be an untrained network layer. This recognition layer can be trained together with the second encoding layer in the second training phase. Therefore, in the second training phase, when adjusting the model parameters of at least the second encoding layer with the goal of minimizing the difference between each recognition result and each label, the server can adjust both the model parameters of the second encoding layer and the model parameters of the recognition layer.
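The choice of which layers are adjusted in each training phase, as described in the preceding paragraphs, can be summarized in a short sketch. The function name, the layer names, and the boolean flags are hypothetical; the sketch only encodes the cases stated in the text (a pre-trained layer is left unchanged, an untrained layer is trained together with the second encoding layer).

```python
def trainable_layers(phase, first_encoder_pretrained=True, recognition_pretrained=True):
    """List the layers whose model parameters are adjusted in a training phase.

    Phase 1 uses the feature-distance goal; phase 2 uses the goal of minimizing
    the difference between recognition results and labels."""
    layers = ["second_encoding_layer"]  # adjusted in both phases
    if phase == 1:
        if not first_encoder_pretrained:
            layers.append("first_encoding_layer")
    elif phase == 2:
        if not recognition_pretrained:
            layers.append("recognition_layer")
    else:
        raise ValueError("phase must be 1 or 2")
    return layers
```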
[0103] In this specification, during the second training phase, after adjusting the model parameters of at least the second encoding layer in the risk identification model to be trained, the server can also determine some new labeled training samples. Based on the newly determined training samples and their labels, the server can then adjust the model parameters of at least the second encoding layer in the risk identification model to be trained. Therefore, the training samples determined in step S100 can be unlabeled training samples. Specifically, in step S106, the server can first adjust the model parameters of at least the second encoding layer in the risk identification model to be trained, aiming to minimize the distance between the first feature and the second feature of the training sample and maximize the distance between the first feature and the second features of other training samples. Then, the server can determine historical transaction events of each user as labeled samples and use the risk status corresponding to each transaction event as the label for each labeled sample. Next, the labeled samples are input into the second encoding layer to determine the identification results. Then, the server adjusts the model parameters of at least the second encoding layer, aiming to minimize the difference between each identification result and each label.
[0104] Based on this, after training the second encoding layer, the server can use only the trained risk identification model's second encoding layer, combined with a pre-built retrieval library, to perform risk identification on the user to be identified. Specifically, the server can determine the user's transaction events, input these events into the trained second encoding layer, and determine the second feature of the transaction events. Then, in the pre-built retrieval library, it determines the third feature whose distance from the second feature is within a specified range, and then determines the user's risk profile corresponding to the third feature, which is used as the user's risk profile to be identified. Based on the user's risk profile, risk control is then implemented for the user to be identified.
[0105] Furthermore, the aforementioned recognition layer can be a pre-trained network layer or a network layer trained together with the second encoding layer. Therefore, after training the second encoding layer, or both the second encoding layer and the recognition layer, the server can also use the second encoding layer and the recognition layer of the trained risk identification model to perform risk identification on the user to be identified. Figure 3 is a schematic diagram of the application of another risk identification model provided in this specification. Specifically, as shown in Figure 3, the server can determine the transaction events of the user to be identified, input the transaction events into the trained second encoding layer, and determine the second feature of the transaction events. Then, the second feature is input into the trained recognition layer to determine the risk profile of the user to be identified, and risk control is implemented for the user based on the risk profile.
[0106] In this specification, each training sample can have multiple labels. For example, a user may face multiple types of risks, or a transaction event may involve multiple types of risks. Therefore, the training samples corresponding to a user's transaction event or the transaction event can have multiple labels, representing multiple types of risks. Thus, when adjusting the model parameters of the second encoding layer with the goal of minimizing the difference between each recognition result and each label, the server can adjust the model parameters of the second encoding layer for each training sample and for each label of that training sample, using the recognition result corresponding to that training sample and that label. This allows the second encoding layer to learn how to extract features of various types of risks, enabling better identification of different types of risks in subsequent steps.
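For training samples with multiple labels, the difference between recognition results and labels can be accumulated label by label. The sketch below assumes that the recognition layer outputs one probability per risk type and measures the difference with binary cross-entropy; both choices are assumptions for illustration, since this specification does not fix how the difference is computed.

```python
import math


def multi_label_loss(predicted, labels, eps=1e-12):
    """Mean over training samples of the binary cross-entropy summed over
    that sample's labels.

    predicted[i][k] is the assumed probability that sample i carries risk
    type k; labels[i][k] is 1 if it does and 0 otherwise."""
    total = 0.0
    for probs, targets in zip(predicted, labels):
        for p, t in zip(probs, targets):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(predicted)
```

Because every label of every training sample contributes its own term, adjusting the second encoding layer to reduce this value encourages features that separate each type of risk.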
[0107] The above describes a model training method provided by one or more embodiments of this specification. Based on the same idea, this specification also provides a corresponding model training apparatus, as shown in Figure 4.
[0108] Figure 4 is a schematic diagram of a model training device provided in this specification, which specifically includes:
[0109] The first determination module 200 is used to determine the transaction events of each user in history as training samples.
[0110] The second determining module 202 is used to determine, for each training sample, the data of a specified type in the training sample as the specified data;
[0111] The feature extraction module 204 is used to input the specified data corresponding to the training sample into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample; and to input all types of data in the training sample into the second encoding layer of the risk identification model to be trained to determine the second feature of the training sample.
[0112] Training module 206 is used to adjust the model parameters of at least the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second feature of other training samples.
[0113] Optionally, the specified data is at least two of the following: transaction device identifier, transaction item name, and transaction tool identifier;
[0114] The second determining module 202 is specifically used to determine the data of each specified type in the training sample as each specified data.
[0115] The feature extraction module 204 is specifically used to concatenate the specified data corresponding to the training sample to obtain the text corresponding to the training sample; input the text corresponding to the training sample into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample.
[0116] Optionally, the risk identification model to be trained further includes an identification layer;
[0117] The training module 206 is specifically used to: adjust the model parameters of at least the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second features of other training samples; use the risk status corresponding to the transaction events of each user in history as the label corresponding to each training sample; input the second features of each training sample into the identification layer of the risk identification model to be trained to determine the identification result corresponding to each training sample; and adjust the model parameters of at least the second encoding layer with the goal of minimizing the difference between each identification result and each label.
[0118] Optionally, the device further includes:
[0119] Application module 208 is used to determine the transaction events of the user to be identified; input the transaction events into the trained second encoding layer to determine the second feature of the transaction events; input the second feature into the trained recognition layer to determine the risk status of the user to be identified; and perform risk control on the user to be identified based on the risk status.
[0120] Optionally, the device further includes:
[0121] Application module 208 is used to determine the transaction events of the user to be identified; input the transaction events into the trained second encoding layer to determine the second feature of the transaction events; determine the third feature in a pre-built retrieval library whose distance from the second feature is within a specified range; determine the risk status of the user corresponding to the third feature and use it as the risk status of the user to be identified; and perform risk control on the user to be identified based on the risk status of the user to be identified.
[0122] Optionally, the device further includes:
[0123] Application module 208 is used to determine the transaction events of the user to be identified; input the transaction events into the second encoding layer of the trained risk identification model to determine the second feature of the transaction events; determine the third feature in a pre-built retrieval library whose distance from the second feature is within a specified range; determine the risk status of the user corresponding to the third feature and use it as the risk status of the user to be identified; and perform risk control on the user to be identified based on the risk status of the user to be identified.
[0124] Optionally, the training module 206 is specifically used to: take the first feature of the training sample and the second feature of the training sample as a first combination, and take the first feature of the training sample and the second features of other training samples besides the training sample as a second combination; with the goal of minimizing the distance between features in each first combination and maximizing the distance between features in each second combination, adjust at least the model parameters of the second encoding layer in the risk identification model to be trained.
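The first and second combinations used by the training module can be enumerated as in the following sketch. The function name and the list-based representation of features are hypothetical; the sketch assumes the features are given in the same sample order in both lists.

```python
def build_combinations(first_features, second_features):
    """Pair features as described for the training module.

    First combinations: the first and second feature of the same sample.
    Second combinations: the first feature of a sample with the second
    feature of every other sample."""
    n = len(first_features)
    first_combinations = [(first_features[i], second_features[i]) for i in range(n)]
    second_combinations = [
        (first_features[i], second_features[j])
        for i in range(n)
        for j in range(n)
        if i != j
    ]
    return first_combinations, second_combinations
```

For a batch of n training samples this yields n first combinations, whose feature distances are to be minimized, and n(n-1) second combinations, whose feature distances are to be maximized.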
[0125] This specification also provides a computer-readable storage medium storing a computer program that can be used to execute the model training method shown in Figure 1 above.
[0126] This specification also provides a schematic structural diagram of the electronic device, shown in Figure 5. As shown in Figure 5, at the hardware level, the electronic device includes a processor, internal bus, network interface, memory, and non-volatile memory, and may also include other hardware required for business operations. The processor reads the corresponding computer program from the non-volatile memory into memory and then runs it to implement the model training method shown in Figure 1 above. Of course, in addition to software implementation, this specification does not exclude other implementation methods, such as logic devices or a combination of hardware and software. In other words, the execution subject of the following processing flow is not limited to individual logic units, but can also be hardware or logic devices.
[0127] In the 1990s, improvements to a technology could be clearly distinguished as either hardware improvements (e.g., improvements to the circuit structure of diodes, transistors, switches, etc.) or software improvements (improvements to the methodology). However, with technological advancements, many methodological improvements today can be considered direct improvements to the hardware circuit structure. Designers almost always obtain the corresponding hardware circuit structure by programming the improved methodology into the hardware circuit. Therefore, it cannot be said that a methodological improvement cannot be implemented using hardware physical modules. For example, a Programmable Logic Device (PLD) (such as a Field Programmable Gate Array (FPGA)) is such an integrated circuit whose logic function is determined by the user programming the device. Designers can program and "integrate" a digital system onto a PLD themselves, without needing chip manufacturers to design and manufacture dedicated integrated circuit chips. Furthermore, nowadays, instead of manually manufacturing integrated circuit chips, this programming is mostly implemented using "logic compiler" software. Similar to the software compiler used in program development, the original code before compilation must be written in a specific programming language, called a Hardware Description Language (HDL). There are many HDLs, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language). Currently, the most commonly used are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. 
Those skilled in the art should understand that by simply performing some logic programming on the method flow using one of these hardware description languages and programming it into an integrated circuit, the hardware circuit implementing the logical method flow can be easily obtained.
[0128] The controller can be implemented in any suitable manner. For example, it can take the form of a microprocessor or processor and a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, application-specific integrated circuits (ASICs), programmable logic controllers, and embedded microcontrollers. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of the memory. Those skilled in the art will also recognize that, in addition to implementing the controller in purely computer-readable program code form, the same functionality can be achieved by logically programming the method steps to make the controller take the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, and embedded microcontrollers. Therefore, such a controller can be considered a hardware component, and the means included therein for implementing various functions can also be considered as structures within the hardware component. Alternatively, the means for implementing various functions can be considered as both software modules implementing the method and structures within the hardware component.
[0129] The systems, devices, modules, or units described in the above embodiments can be implemented by computer chips or entities, or by products with certain functions. A typical implementation device is a computer. Specifically, a computer can be, for example, a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or any combination of these devices.
[0130] For ease of description, the above devices are described in terms of function, divided into various units. Of course, in implementing this specification, the functions of each unit can be implemented in one or more software and / or hardware.
[0131] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0132] This invention is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and / or block of the flowchart illustrations and / or block diagrams, and combinations of flows and / or blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0133] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including an instruction device, which implements the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0134] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0135] In a typical configuration, a computing device includes one or more processors (CPU), input / output interfaces, network interfaces, and memory.
[0136] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM) and / or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.
[0137] Computer-readable media includes both permanent and non-permanent, removable and non-removable media that can store information using any method or technology. Information can be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.
[0138] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0139] Those skilled in the art will understand that the embodiments of this specification can be provided as methods, systems, or computer program products. Therefore, this specification may take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this specification may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0140] This specification can be described in the general context of computer-executable instructions that are executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform a specific task or implement a specific abstract data type. This specification can also be practiced in distributed computing environments, where tasks are performed by remote processing devices connected via a communication network. In distributed computing environments, program modules can reside in local and remote computer storage media, including storage devices.
[0141] The various embodiments in this specification are described in a progressive manner. Similar or identical parts between embodiments can be referred to interchangeably. Each embodiment focuses on describing the differences from other embodiments. In particular, the system embodiments are basically similar to the method embodiments, so the description is relatively simple; relevant parts can be referred to the descriptions in the method embodiments.
[0142] The above description is merely an embodiment of this specification and is not intended to limit this specification. Various modifications and variations can be made to this specification by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this specification should be included within the scope of the claims of this specification.
Claims
1. A method for model training, comprising: determining historical transaction events of each user as training samples; for each training sample, determining data of a specified type in the training sample as specified data; inputting the specified data corresponding to the training sample into a first encoding layer of a risk identification model to be trained to determine a first feature of the training sample; inputting all types of data in the training sample into a second encoding layer of the risk identification model to be trained to determine a second feature of the training sample; and adjusting model parameters of at least the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second feature of each other training sample.
2. The method of claim 1, wherein the specified data comprises at least two of the following: a transaction device identifier, a transaction item name, and a transaction tool identifier; determining the data of a specified type in the training sample as the specified data specifically comprises: determining the data of each specified type in the training sample as the specified data; and inputting the specified data corresponding to the training sample into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample specifically comprises: concatenating the specified data corresponding to the training sample to obtain text corresponding to the training sample; and inputting the text corresponding to the training sample into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample.
3. The method of claim 1, wherein the risk identification model to be trained further comprises an identification layer; and adjusting the model parameters of at least the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second feature of each other training sample, specifically comprises: adjusting the model parameters of at least the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second feature of each other training sample; taking the risk status corresponding to the historical transaction events of each user as the label of each training sample; inputting the second feature of each training sample into the identification layer of the risk identification model to be trained to determine the identification result corresponding to each training sample; and adjusting the model parameters of at least the second encoding layer with the goal of minimizing the difference between each identification result and the corresponding label.
4. The method of claim 3, further comprising: determining a transaction event of a user to be identified; inputting the transaction event into the trained second encoding layer to determine a second feature of the transaction event; inputting the second feature into the trained identification layer to determine the risk status of the user to be identified; and performing risk control on the user to be identified based on the risk status.
5. The method of claim 3, further comprising: determining a transaction event of a user to be identified; inputting the transaction event into the trained second encoding layer to determine a second feature of the transaction event; determining, in a pre-built retrieval library, a third feature whose distance from the second feature is within a specified range; determining the risk status of the user corresponding to the third feature and using it as the risk status of the user to be identified; and performing risk control on the user to be identified based on the risk status of the user to be identified.
6. The method of claim 1, further comprising: determining a transaction event of a user to be identified; inputting the transaction event into the second encoding layer of the trained risk identification model to determine a second feature of the transaction event; determining, in a pre-built retrieval library, a third feature whose distance from the second feature is within a specified range; determining the risk status of the user corresponding to the third feature and using it as the risk status of the user to be identified; and performing risk control on the user to be identified based on the risk status of the user to be identified.
7. The method of claim 1, wherein adjusting the model parameters of at least the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second feature of each other training sample, specifically comprises: taking the first feature of the training sample and the second feature of the training sample as a first combination, and taking the first feature of the training sample and the second feature of each other training sample as a second combination; and adjusting the model parameters of at least the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the features within each first combination and maximizing the distance between the features within each second combination.
8. An apparatus for model training, comprising: a first determining module, configured to determine historical transaction events of each user as training samples; a second determining module, configured to determine, for each training sample, data of a specified type in the training sample as specified data; a feature extraction module, configured to input the specified data corresponding to the training sample into a first encoding layer of a risk identification model to be trained to determine a first feature of the training sample, and to input all types of data in the training sample into a second encoding layer of the risk identification model to be trained to determine a second feature of the training sample; and a training module, configured to adjust model parameters of at least the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second feature of each other training sample.
9. The apparatus of claim 8, wherein the specified data comprises at least two of the following: a transaction device identifier, a transaction item name, and a transaction tool identifier; the second determining module is specifically configured to determine the data of each specified type in the training sample as the specified data; and the feature extraction module is specifically configured to concatenate the specified data corresponding to the training sample to obtain text corresponding to the training sample, and to input the text corresponding to the training sample into the first encoding layer of the risk identification model to be trained to determine the first feature of the training sample.
10. The apparatus of claim 8, wherein the risk identification model to be trained further comprises an identification layer; and the training module is specifically configured to: adjust the model parameters of at least the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the first feature and the second feature of the training sample and maximizing the distance between the first feature of the training sample and the second feature of each other training sample; take the risk status corresponding to the historical transaction events of each user as the label of each training sample; input the second feature of each training sample into the identification layer of the risk identification model to be trained to determine the identification result corresponding to each training sample; and adjust the model parameters of at least the second encoding layer with the goal of minimizing the difference between each identification result and the corresponding label.
11. The apparatus of claim 10, further comprising: an application module, configured to: determine a transaction event of a user to be identified; input the transaction event into the trained second encoding layer to determine a second feature of the transaction event; input the second feature into the trained identification layer to determine the risk status of the user to be identified; and perform risk control on the user to be identified based on the risk status.
12. The apparatus of claim 10, further comprising: an application module, configured to: determine a transaction event of a user to be identified; input the transaction event into the trained second encoding layer to determine a second feature of the transaction event; determine, in a pre-built retrieval library, a third feature whose distance from the second feature is within a specified range; determine the risk status of the user corresponding to the third feature and use it as the risk status of the user to be identified; and perform risk control on the user to be identified based on the risk status of the user to be identified.
13. The apparatus of claim 8, further comprising: an application module, configured to: determine a transaction event of a user to be identified; input the transaction event into the second encoding layer of the trained risk identification model to determine a second feature of the transaction event; determine, in a pre-built retrieval library, a third feature whose distance from the second feature is within a specified range; determine the risk status of the user corresponding to the third feature and use it as the risk status of the user to be identified; and perform risk control on the user to be identified based on the risk status of the user to be identified.
14. The apparatus of claim 8, wherein the training module is specifically configured to: take the first feature of the training sample and the second feature of the training sample as a first combination, and take the first feature of the training sample and the second feature of each other training sample as a second combination; and adjust the model parameters of at least the second encoding layer in the risk identification model to be trained, with the goal of minimizing the distance between the features within each first combination and maximizing the distance between the features within each second combination.
15. A computer-readable storage medium storing a computer program that, when executed by a processor, implements the method described in any one of claims 1 to 7.
16. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the method described in any one of claims 1 to 7.
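By way of non-limiting illustration, the training objective of claims 1 and 7 and the retrieval step of claims 5 and 6 might be sketched as follows. The claims do not fix a loss formula, a distance measure, or a rule for choosing among several third features within the specified range, so the softmax-style contrastive loss, the Euclidean distance, the nearest-match rule, and all names below (`euclidean`, `contrastive_loss`, `risk_from_library`, `temperature`, `max_distance`) are assumptions made only for this sketch. The features are plain lists of numbers standing in for the outputs of the first and second encoding layers.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(first_feats, second_feats, temperature=1.0):
    """Softmax-style contrastive loss over a batch (an assumed formula).

    For sample i, (first_feats[i], second_feats[i]) is a "first combination"
    whose distance should shrink, and (first_feats[i], second_feats[j]) with
    j != i is a "second combination" whose distance should grow.
    """
    n = len(first_feats)
    total = 0.0
    for i in range(n):
        # Negative distance acts as a similarity score: closer pairs score higher.
        logits = [-euclidean(first_feats[i], second_feats[j]) / temperature
                  for j in range(n)]
        # Numerically stable log-sum-exp over all candidate second features.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(v - m) for v in logits))
        # Negative log-probability of picking the sample's own second feature.
        total += log_denominator - logits[i]
    return total / n


def risk_from_library(second_feat, library, max_distance):
    """Return the risk status stored with the nearest third feature that lies
    within max_distance of second_feat, or None if no entry is in range.

    `library` is a list of (third_feature, risk_status) pairs; picking the
    nearest in-range entry is an assumption of this sketch.
    """
    best = None
    for third_feat, risk_status in library:
        d = euclidean(second_feat, third_feat)
        if d <= max_distance and (best is None or d < best[0]):
            best = (d, risk_status)
    return None if best is None else best[1]


# Toy data: each first feature lies close to its own sample's second feature.
first = [[0.0, 0.0], [5.0, 5.0]]
aligned_second = [[0.1, 0.0], [5.0, 5.1]]
swapped_second = [[5.0, 5.1], [0.1, 0.0]]
aligned_loss = contrastive_loss(first, aligned_second)
swapped_loss = contrastive_loss(first, swapped_second)

# Hypothetical retrieval library built from features of users with known risk status.
library = [([0.0, 0.0], "low"), ([5.0, 5.0], "high")]
matched = risk_from_library([4.8, 5.1], library, max_distance=1.0)
unmatched = risk_from_library([2.5, 2.5], library, max_distance=1.0)
```

In this sketch the loss is smaller when each first feature lies closest to the second feature of the same sample than when the pairing is swapped, which is the behaviour the stated goal calls for. In practice the loss would be minimized by gradient-based adjustment of at least the second encoding layer's parameters.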