Risk prediction method and device based on user behavior data
By constructing associated feature data and fusing the outputs of two supervised prediction models, the method addresses the low accuracy of user behavior analysis in existing technologies, achieving more efficient risk identification and a better balance between user experience and security.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA UNITED NETWORK COMM GRP CO LTD
- Filing Date
- 2022-12-30
- Publication Date
- 2026-06-19
AI Technical Summary
Existing risk identification methods based on user profiles and other behavioral analyses suffer from low accuracy and are unable to flexibly adapt to various breaching techniques, making it difficult to balance user experience and security.
By acquiring authentication feature data from multiple dimensions, constructing associated feature data using the Pearson correlation coefficient, and combining the first and second supervised prediction models for feature fusion, the prediction accuracy is improved.
It improves the accuracy and comprehensiveness of user behavior risk prediction, better describes user behavior characteristics, and enhances the flexibility and accuracy of risk identification.
Figure CN116011640B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of computer technology, and specifically relates to a risk prediction method and apparatus based on user behavior data.
Background Technology
[0002] Currently, from the national level to enterprises, security and efficiency are receiving increasing attention, and identity management in the security field is something everyone faces daily. Security measures used in identity management include commonly used QR code verification, fingerprint recognition verification, dynamic facial recognition verification, ID card / phone number verification, and password verification using alphanumeric input. As can be seen, there are many existing verification methods. This abundance stems from two main reasons: firstly, each verification method and its application scenarios present a constant challenge, with technologies constantly being compromised; secondly, the verification methods themselves have imperfections. For example, strong passwords may be difficult for users to remember, while weak passwords are easily cracked. In the field of facial recognition, for instance, using head-shaking liveness detection results in a poor user experience, while static detection allows fake photos and videos to slip through. In short, user experience, security, and cost are often mutually exclusive during the verification process.
[0003] To address these issues, behavioral analysis technologies such as user profiling have emerged. These technologies improve user experience without interfering with user behavior while simultaneously enhancing the security and accuracy of identity verification. Currently, the most practical and feasible method in the industry is rule-based analysis of user behavior. However, this method has limitations: it is experience-based, meaning it matches user behavior rigidly, potentially classifying users as security threats even when none exist. Furthermore, with the proliferation of various breaching techniques, experience-based rules cannot flexibly adapt to new security threats.
[0004] This shows that traditional risk identification methods based on user profiles and other behavioral analyses have the drawback of low accuracy.
Summary of the Invention
[0005] This invention proposes a risk prediction method and apparatus based on user behavior data to solve the problem of low accuracy in traditional risk identification methods based on user profiles and other behavioral analyses.
[0006] Firstly, this disclosure provides a risk prediction method based on user behavior data, including:
[0007] Acquire user behavior data of the target user; wherein, the user behavior data includes authentication feature data of multiple dimensions;
[0008] Based on the correlation between authentication feature data from multiple dimensions, authentication feature data from at least two dimensions are combined into associated feature data.
[0009] The authentication feature data of the multiple dimensions and the association feature data are input into the first supervised prediction model to obtain the first prediction result output by the first supervised prediction model.
[0010] The authentication feature data of the multiple dimensions and the association feature data are input into the second supervised prediction model to obtain the second prediction result output by the second supervised prediction model.
[0011] The first prediction result and the second prediction result are fused together to obtain a fused prediction result, and the behavioral risk level of the target user is predicted based on the fused prediction result.
[0012] Secondly, this disclosure provides a risk prediction device based on user behavior data, including:
[0013] The data acquisition module is adapted to acquire user behavior data of the target user; wherein, the user behavior data includes authentication feature data of multiple dimensions;
[0014] The combination module is suitable for combining authentication feature data of at least two dimensions into associated feature data based on the correlation between authentication feature data of multiple dimensions.
[0015] The first result acquisition module is adapted to input the authentication feature data of the multiple dimensions and the associated feature data into the first supervised prediction model to obtain the first prediction result output by the first supervised prediction model.
[0016] The second result acquisition module is adapted to input the authentication feature data of the multiple dimensions and the association feature data into the second supervised prediction model to obtain the second prediction result output by the second supervised prediction model;
[0017] The fusion prediction module is adapted to perform fusion processing on the first prediction result and the second prediction result to obtain a fusion prediction result, and predict the behavioral risk level of the target user based on the fusion prediction result.
[0018] Thirdly, this disclosure provides an electronic device, including:
[0019] At least one processor; and a memory communicatively connected to said at least one processor;
[0020] The memory stores one or more computer programs that can be executed by the at least one processor, and the one or more computer programs are executed by the at least one processor to enable the at least one processor to perform the risk prediction method based on user behavior data as described above.
[0021] Fourthly, this disclosure provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the risk prediction method based on user behavior data as described above.
[0022] According to the risk prediction method and apparatus based on user behavior data proposed in this invention, authentication feature data of at least two dimensions can be combined into associated feature data based on the correlation between authentication feature data of multiple dimensions. Since the associated feature data contains at least two highly correlated authentication feature data, the correlation between the authentication feature data can be strengthened, thereby more accurately describing the user's behavioral characteristics and improving the accuracy of subsequent predictions. Furthermore, this method employs a combination of a first supervised prediction model and a second supervised prediction model. By fusing the first prediction result output by the first supervised prediction model and the second prediction result output by the second supervised prediction model, the accuracy and comprehensiveness of the final fused prediction result can be improved, thereby further enhancing the accuracy of risk prediction.
[0023] It should be understood that the description in this section is not intended to identify key or essential features of the embodiments of this disclosure, nor is it intended to limit the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description.
Attached Figure Description
[0024] Various other advantages and benefits will become apparent to those skilled in the art upon reading the following detailed description of preferred embodiments. The accompanying drawings are for illustrative purposes only and are not intended to limit the invention. Furthermore, the same reference numerals denote the same parts throughout the drawings. In the drawings:
[0025] Figure 1 shows a flowchart of a risk prediction method based on user behavior data provided in Embodiment 1 of the present invention.
[0026] Figure 2 shows a flowchart of a specific example of a risk prediction method based on user behavior data.
[0027] Figure 3 shows a schematic diagram of a risk prediction device based on user behavior data provided in Embodiment 2 of the present invention.
[0028] Figure 4 shows a schematic diagram of the structure of an electronic device provided in Embodiment 3 of the present invention.
Detailed Implementation
[0029] To enable those skilled in the art to better understand the technical solutions of this disclosure, exemplary embodiments of this disclosure are described below with reference to the accompanying drawings, including various details of the embodiments of this disclosure to aid understanding. These should be considered merely exemplary. Therefore, those skilled in the art should recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of this disclosure. Similarly, for clarity and conciseness, descriptions of well-known functions and structures are omitted in the following description.
[0030] Where there is no conflict, the various embodiments of this disclosure and the features thereof in the embodiments may be combined with each other.
[0031] As used herein, the term “and / or” includes any and all combinations of one or more related enumerated entries.
[0032] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit this disclosure. As used herein, the singular forms “a” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that when the terms “comprising” and / or “made of” are used in this specification, they specify the presence of the stated features, integers, steps, operations, elements, and / or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and / or groups thereof. Words such as “connected” or “linked” are not limited to physical or mechanical connections but can include electrical connections, whether direct or indirect.
[0033] Unless otherwise specified, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art. It will also be understood that terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the relevant art and this disclosure, and will not be interpreted as having an idealized or overly formal meaning, unless expressly so defined herein.
[0034] Embodiment 1
[0035] Figure 1 shows a flowchart of a risk prediction method based on user behavior data provided in Embodiment 1 of the present invention. As shown in Figure 1, the method includes:
[0036] Step S110: Obtain user behavior data of the target user; wherein, the user behavior data contains authentication feature data of multiple dimensions.
[0037] In this context, "target users" refers to users whose behavioral risk level is to be assessed. User behavior data is used to characterize the operational behaviors of target users, specifically including various types such as login-related behaviors, browsing-related behaviors, and click-related behaviors. Correspondingly, login-related behaviors correspond to authentication feature data for the login dimension, browsing-related behaviors correspond to authentication feature data for the browsing dimension, and click-related behaviors correspond to authentication feature data for the click dimension. Therefore, user behavior data contains authentication feature data across multiple dimensions, with each dimension describing a specific type of behavioral characteristic related to user authentication from a particular perspective.
[0038] For example, in a specific instance, it's necessary to obtain user authentication behavior data (i.e., user behavior data), specifically including: authentication time, username, IP address, location, browser version number, browser type, etc. Optionally, authentication risk data can also be further obtained, specifically including: IP threat level, risk identifier, etc. Authentication risk data is used to predict the user's risk profile.
[0039] Step S120: Based on the correlation between authentication feature data of multiple dimensions, combine authentication feature data of at least two dimensions into associated feature data.
[0040] In developing this invention, the inventors discovered that using isolated authentication feature data across multiple dimensions is often insufficient to reflect the interrelationships and interactions between these features, leading to incomplete and inaccurate feature data in the input model. To address this issue, this embodiment further explores the relationships between authentication feature data across multiple dimensions, combining closely related features into a holistic set of associated feature data to comprehensively and accurately reflect the characteristics of the feature data. Specifically, the relationships between authentication feature data across multiple dimensions reflect the degree of connection between two or more authentication feature data. Correspondingly, authentication feature data with high degrees of connection are combined to obtain associated feature data, thereby comprehensively describing user behavior characteristics through this associated feature data.
[0041] In one optional implementation, when combining authentication feature data of at least two dimensions into associated feature data based on the correlation between authentication feature data of multiple dimensions, this is achieved as follows: First, calculate the Pearson correlation coefficient between any two dimensions of authentication feature data; the Pearson correlation coefficient is used to characterize the degree of correlation between any two dimensions of authentication feature data. For example, the larger the Pearson correlation coefficient, the more correlated the corresponding two dimensions of authentication feature data are; conversely, the smaller the Pearson correlation coefficient, the less correlated the corresponding two dimensions of authentication feature data are. Then, combine the authentication feature data of two dimensions with a Pearson correlation coefficient greater than a preset threshold into associated feature data. Thus, this method can construct new features (i.e., associated feature data) based on feature correlation. Specifically, the Pearson correlation coefficient can be used to construct a correlation coefficient matrix for each feature, thereby combining highly correlated features into new strong features. Therefore, associated feature data can also be called enhanced associated feature data, used to combine two originally independent authentication feature data into a single, interconnected holistic feature data.
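The pairing rule in this paragraph can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: the feature names, the sample values, and the 0.8 threshold are assumptions, and the source does not say how non-numeric fields such as IP address or browser type are encoded before the coefficient is computed.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant column as uncorrelated
    return cov / sqrt(var_x * var_y)


def correlated_pairs(features, threshold):
    """Return the dimension pairs whose coefficient exceeds the preset threshold."""
    pairs = []
    for a, b in combinations(sorted(features), 2):
        r = pearson(features[a], features[b])
        if r > threshold:
            pairs.append((a, b, r))
    return pairs


# Hypothetical numeric encodings of three authentication dimensions.
features = {
    "login_count": [1, 2, 3, 4, 5],
    "login_hour": [2, 4, 6, 8, 10],
    "browser_version": [5, 1, 4, 2, 3],
}
pairs = correlated_pairs(features, threshold=0.8)
```

With this toy data only the `login_count` / `login_hour` pair passes the threshold. The comparison uses the signed coefficient, as the text states "greater than a preset threshold"; whether strongly negative correlations should also qualify (for example by comparing the absolute value) is not specified in the source.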
[0042] In one optional implementation, the authentication feature data of the two dimensions whose Pearson correlation coefficient is greater than the preset threshold includes authentication feature data of the authentication login dimension and authentication feature data of the login time dimension, and the associated feature data includes authentication login sequence data. Correspondingly, combining the authentication feature data of the two dimensions into associated feature data specifically means combining the authentication feature data of the authentication login dimension and of the login time dimension into authentication login sequence data, which characterizes the time intervals between multiple authentication login operations of the same user. Specifically, a user login sequence can be constructed as the authentication login sequence data. Since risk authentication and time are often very strongly correlated, this embodiment adds the user's historical features to the current verification to obtain an authentication login time series for that user. From this time series, time interval features of the user's logins can be derived, such as the average login interval and the median login interval, thereby comprehensively reflecting the characteristics of the user's login times.
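As an illustration of the interval features named in this paragraph, the sketch below derives the average and median login interval from one user's login timestamps. The function name, the dictionary keys, the choice of seconds as the unit, and the sample history are assumptions made for the example.

```python
from datetime import datetime
from statistics import mean, median


def login_interval_features(timestamps):
    """Derive interval features from one user's authentication login times."""
    ordered = sorted(timestamps)
    # Gaps in seconds between consecutive logins of the same user.
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        # A single login gives no interval; None marks the feature as missing.
        return {"mean_interval_s": None, "median_interval_s": None}
    return {"mean_interval_s": mean(gaps), "median_interval_s": median(gaps)}


# Hypothetical login history for one user, with the current login last.
history = [
    datetime(2022, 12, 1, 9, 0),
    datetime(2022, 12, 1, 10, 0),
    datetime(2022, 12, 1, 10, 30),
    datetime(2022, 12, 1, 13, 30),
]
interval_features = login_interval_features(history)
```

Here the gaps are 3600 s, 1800 s, and 10800 s, so the mean interval is 5400 s and the median is 3600 s. How a first-ever login (no history) should be represented is not addressed in the source; returning `None` is one possible convention.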
[0043] Step S130: Input the authentication feature data and associated feature data of multiple dimensions into the first supervised prediction model, and obtain the first prediction result output by the first supervised prediction model.
[0044] In this embodiment, the authentication feature data of multiple dimensions and the associated feature data together serve as the user behavior feature data of the target user. They are input into both the first supervised prediction model and the second supervised prediction model, so that each model performs risk prediction separately.
[0045] The first and second supervised prediction models can be implemented using different types of models. Furthermore, both models are used to perform risk prediction based on the input feature data. Therefore, by utilizing the first and second supervised prediction models, the accuracy and comprehensiveness of the predictions can be improved.
[0046] In one optional implementation, the first supervised prediction model is an XGB model and further includes a first XGB model and a second XGB model. In this case, the first prediction result output by the first supervised prediction model is obtained by performing feature fusion processing on the first XGB prediction result output by the first XGB model and the second XGB prediction result output by the second XGB model.
[0047] Step S140: Input the authentication feature data and association feature data of multiple dimensions into the second supervised prediction model, and obtain the second prediction result output by the second supervised prediction model.
[0048] In one optional implementation, the second supervised prediction model is an LGB model and further includes a first LGB model and a second LGB model. In this case, the second prediction result output by the second supervised prediction model is obtained by performing weighted fusion processing on the first LGB prediction result output by the first LGB model and the second LGB prediction result output by the second LGB model.
[0049] To improve the accuracy of each model, the parameters of each model are further optimized through random grid parameter tuning. Accordingly, the model parameters in the first XGB model, the second XGB model, the first LGB model, and / or the second LGB model are set as follows: First, a first parameter range group is obtained by randomly seeding parameters; then, an elasticity coefficient is used to narrow the value range of the first parameter range group to obtain a second parameter range group; wherein the parameter range of the second parameter range group is smaller than that of the first parameter range group; finally, a set of parameters is selected as the model parameters within the parameter range of the second parameter range group. This method optimizes the model parameters of each model and improves prediction accuracy.
[0050] Step S150: Perform fusion processing on the first prediction result and the second prediction result to obtain the fused prediction result, and predict the behavioral risk level of the target user based on the fused prediction result.
[0051] Specifically, the first and second prediction results are combined to obtain a fused prediction result, which is then used to predict the behavioral risk level of the target user. Since the fused prediction result is determined jointly by the first and second prediction results, it combines the advantages of both models, thereby improving the accuracy of the prediction.
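One way to picture step S150 is a weighted average of the two model outputs followed by a mapping to a risk level. The sketch below assumes both prediction results are risk probabilities in [0, 1]; the weight, the cut-off values, and the three level names are illustrative assumptions, since this paragraph does not fix a fusion formula or a level scheme.

```python
def fuse_predictions(first, second, weight_first=0.5):
    """Weighted average of two risk probabilities in [0, 1]."""
    if not 0.0 <= weight_first <= 1.0:
        raise ValueError("weight_first must be in [0, 1]")
    return weight_first * first + (1.0 - weight_first) * second


def risk_level(score, medium=0.4, high=0.7):
    """Map a fused score to a coarse level; the cut-offs are illustrative."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


# Hypothetical outputs of the first and second supervised prediction models.
fused = fuse_predictions(0.9, 0.7, weight_first=0.5)
level = risk_level(fused)
```

With equal weights the fused score is about 0.8, which the example cut-offs classify as "high". In practice the weight and cut-offs would presumably be chosen on validation data.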
[0052] Optionally, the first supervised prediction model and the second supervised prediction model can be trained using the user behavior data and authentication risk data of the sample users; wherein, the authentication risk data of the sample users is used to characterize the risk level of the user behavior data of the sample users.
[0053] Therefore, in this embodiment, authentication feature data of at least two dimensions can be combined into associated feature data based on the correlation between authentication feature data of multiple dimensions. Since the associated feature data contains at least two highly correlated authentication feature data, the correlation between the authentication feature data can be strengthened, thereby more accurately describing the user's behavioral characteristics and improving the accuracy of subsequent predictions. Furthermore, this method combines a first supervised prediction model and a second supervised prediction model. By fusing the first prediction result output by the first supervised prediction model and the second prediction result output by the second supervised prediction model, the accuracy and comprehensiveness of the final fused prediction result can be improved, further enhancing the accuracy of risk prediction.
[0054] For ease of understanding, Figure 2 shows a flowchart of a specific example. As shown in Figure 2, this risk prediction method specifically includes the following steps:
[0055] Step 1: Obtain data.
[0056] Specifically, it obtains user authentication behavior data, including authentication time, username, IP address, location, browser version number, and browser type. Additionally, it can obtain authentication risk data, including IP threat level and risk indicator.
[0057] Step 2: Model feature construction.
[0058] Specifically, new features are constructed based on feature correlation: Pearson correlation coefficients are used to construct correlation coefficient matrices for each feature, and features with strong correlations are combined into new strong features. For example, constructing user login sequences: risk authentication and time often have a very strong correlation. Here, by adding historical features to the current user's current authentication, we can obtain the authentication login time series for a specific user. Based on the login time series, we can obtain the time interval features of user logins, such as the average interval time, the median interval login time, and other feature information.
[0059] Step 3: Constructing a multi-supervised model.
[0060] Specifically, this example uses two supervised models, XGB and LGB, as illustrations. In addition to XGB and LGB supervised models, algorithms such as SVM and CNN can also be used.
[0061] Basic model construction: First, the model features constructed from user authentication behavior data and authentication risk data are input into two types of supervised models, XGB and LGB, to obtain two cold-start basic training models of each type, resulting in four basic models: XGB model 1, XGB model 2, LGB model 1, and LGB model 2.
[0062] Step 4: Random grid parameter tuning.
[0063] Specifically, random grid elasticity coefficient tuning is used for four basic models: XGB model 1, XGB model 2, LGB model 1, and LGB model 2. The process involves randomly seeding parameters to obtain a set of favorable parameter ranges. An elasticity coefficient 'a' is then used to narrow down the parameter range. Within this favorable range, the parameters are further optimized to obtain an optimal set of parameters. For example, if a favorable seed parameter is 1.3, a favorable parameter range is obtained as [-1.3a, 1.3a]. Further optimization within this new range yields better parameter combinations, significantly reducing training overhead and allowing for the acquisition of optimal parameters for each basic model.
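The two-stage tuning described here can be sketched as a random search followed by a refinement inside a narrowed range. This is a sketch under stated assumptions: a single scalar parameter, a toy objective standing in for a validation score, and hypothetical function names. The narrowed interval [-s·a, s·a] is taken literally from the worked example in the text (seed 1.3 gives [-1.3a, 1.3a]); the source does not state whether the interval is instead meant to be centred on the seed.

```python
import random


def narrowed_range(seed_value, a):
    """Range [-seed*a, seed*a], following the worked example in the text."""
    bound = abs(seed_value) * a
    return (-bound, bound)


def random_search(objective, low, high, n_trials, rng):
    """Return the sampled value in [low, high] with the highest objective score."""
    candidates = [rng.uniform(low, high) for _ in range(n_trials)]
    return max(candidates, key=objective)


def tune(objective, wide_range, a, n_seeds=20, n_refine=50, seed=0):
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Stage 1: random seeding over the wide (first) parameter range.
    best_seed = random_search(objective, *wide_range, n_seeds, rng)
    # Stage 2: narrow the range with the elasticity coefficient, then refine.
    low, high = narrowed_range(best_seed, a)
    refined = random_search(objective, low, high, n_refine, rng)
    # Keep whichever of the two stages scored better.
    return max((best_seed, refined), key=objective), (low, high)


# Hypothetical stand-in for a validation score; it peaks at 1.0.
def toy_score(value):
    return -(value - 1.0) ** 2


best, (low, high) = tune(toy_score, wide_range=(-5.0, 5.0), a=0.9)
```

For a real XGB or LGB model the objective would be a cross-validated metric and each hyperparameter would get its own seed and narrowed range; those details are not given in the source.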
[0064] Step 5: Feature fusion of results.
[0065] Specifically, this includes both feature fusion and result fusion.
[0066] Feature fusion: XGB model 1 and XGB model 2 each produce XGB prediction results under their optimal parameters. A single-model fusion method is then used: the prediction results of XGB model 1 and XGB model 2, which have different parameters, are treated as features and fused with the original model features to obtain the XGB_Feature_Fusion feature. This fused feature is used as the input feature of a new LGB model to obtain the prediction result LGB_XGB.
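The feature fusion step resembles what is commonly called stacking: base-model predictions are appended to the original features before a downstream model is trained. The sketch below shows only the feature construction. The two functions standing in for trained XGB models, and the sample rows, are hypothetical placeholders rather than real XGBoost calls.

```python
def feature_fusion(rows, base_models):
    """Append each base model's prediction to the original feature row."""
    fused = []
    for row in rows:
        stacked = [model(row) for model in base_models]
        fused.append(list(row) + stacked)
    return fused


# Hypothetical stand-ins for two trained XGB models with different parameters;
# each maps a feature row to a risk score in [0, 1].
def xgb_model_1(row):
    return min(1.0, 0.10 * row[0])


def xgb_model_2(row):
    return min(1.0, 0.05 * row[0] + 0.2 * row[1])


rows = [[2.0, 1.0], [8.0, 0.0]]
xgb_feature_fusion = feature_fusion(rows, [xgb_model_1, xgb_model_2])
```

Each fused row keeps its two original features and gains two prediction columns; these rows would then be the input of the new LGB model that produces LGB_XGB. The source does not say whether the base predictions are generated out-of-fold, which is a common precaution against label leakage in stacking.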
[0067] Result fusion: LGB model 1 and LGB model 2 each produce LGB prediction results under their optimal parameters. These results are combined using a uniform weighted fusion method to obtain the fusion result LGB_Fusion.
[0068] Step 6: Fusion of results from multiple model structures.
[0069] From the above feature fusion and result fusion steps, we finally obtained the prediction result LGB_XGB after feature fusion and the prediction result LGB_Fusion after model result fusion. Then, we use a weighted fusion method to fuse these two final results.
[0070] Step 7: Obtain the final result.
[0071] By fusing the results from the above multi-model structures, we can ultimately obtain a prompt indicating whether the user's authentication behavior is risky or abnormal, and we can also determine the specific type of abnormality for policy formulation.
[0072] Security is the primary gateway for business operations across all industries, and security issues are the most fundamental to guard against. However, since rule-based analysis of user behavior cannot dynamically adapt to various breaching techniques, this application proposes a risk anomaly assessment method based on constructing a user authentication behavior feature model. The method uses user authentication behavior data, authentication log data, and risk log data as its data foundation. It builds a user authentication behavior feature model, constructs a risk anomaly assessment and prediction model from it, and uses that model to determine whether the current user authentication behavior poses a risk. The method employs a supervised algorithm on top of basic user authentication data: it constructs "strong data features" from user authentication behavior data, tunes parameters with the "random grid parameter method", and uses a "single-model fusion of multiple models" method in which prediction results are fused, as features, into models of the same type and of different types with different parameters. This constructs a risk anomaly prediction model to determine whether the current user's authentication behavior poses a risk.
[0073] This proposal suggests a risk anomaly assessment method that constructs a user authentication behavior feature model. Based on user authentication behavior data, authentication log data, and risk log data, this method builds a user authentication behavior feature model and then uses this model to determine whether the current user authentication behavior poses a risk. The method employs a supervised algorithm, using basic user authentication data as a foundation. It constructs "strong data features" based on user authentication behavior data, uses "result feature fusion" and "random grid tuning" for parameter optimization, and uses a "single-model fusion of multiple models" approach to fuse the prediction results as features with similar and dissimilar models with different parameters. This process constructs a risk anomaly prediction model to determine whether the current user's authentication behavior poses a risk.
[0074] In summary, this scheme has the following characteristics:
[0075] 1. Utilize the construction of user authentication behavior feature models for risk anomaly prediction;
[0076] 2. Construct a set of optimal parameter intervals using the elasticity coefficients;
[0077] 3. Improve prediction accuracy through a dual fusion channel of feature fusion and result fusion;
[0078] 4. Construct strong features using feature similarity;
[0079] 5. Construct feature information such as average interval time and median interval login time using the user's historical login sequence;
[0080] 6. Utilize multiple models to fuse prediction results;
[0081] 7. Convert model prediction results into features for models of a different type to optimize model results;
[0082] 8. Find local optimal parameters using a random seed grid.
[0083] Embodiment 2
[0084] Figure 3 shows a schematic diagram of a risk prediction device based on user behavior data provided in Embodiment 2 of the present invention. Referring to Figure 3, the device includes:
[0085] The data acquisition module 31 is adapted to acquire user behavior data of the target user; wherein the user behavior data includes authentication feature data of multiple dimensions;
[0086] The combination module 32 is adapted to combine authentication feature data of at least two dimensions into associated feature data based on the correlation between authentication feature data of multiple dimensions.
[0087] The first result acquisition module 33 is adapted to input the authentication feature data of the multiple dimensions and the associated feature data into the first supervised prediction model to obtain the first prediction result output by the first supervised prediction model.
[0088] The second result acquisition module 34 is adapted to input the authentication feature data of the multiple dimensions and the association feature data into the second supervised prediction model to obtain the second prediction result output by the second supervised prediction model;
[0089] The fusion prediction module 35 is adapted to perform fusion processing on the first prediction result and the second prediction result to obtain a fusion prediction result, and predict the behavioral risk level of the target user based on the fusion prediction result.
[0090] Optionally, the combination module is specifically adapted to:
[0091] Calculate the Pearson correlation coefficient between authentication feature data of any two dimensions; wherein the Pearson correlation coefficient is used to characterize the magnitude of the correlation between authentication feature data of any two dimensions;
[0092] The authentication feature data of two dimensions with a Pearson correlation coefficient greater than a preset threshold are combined into associated feature data.
[0093] Optionally, the authentication feature data of the two dimensions with a Pearson correlation coefficient greater than a preset threshold includes: authentication feature data of the authentication login dimension and authentication feature data of the login time dimension; and the associated feature data includes: authentication login sequence data;
[0094] The combination module is specifically adapted to:
[0095] The authentication feature data of the authentication login dimension and the authentication feature data of the login time dimension are combined into authentication login sequence data;
[0096] The authentication login sequence data is used to characterize the time interval between multiple authentication login operations of the same user.
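One way to derive such sequence data is to group login events by user and take the differences between consecutive login times. The sketch below assumes that each event is a `(user_id, timestamp_in_seconds)` tuple; this record layout and the function name `login_intervals` are hypothetical and not taken from the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def login_intervals(events: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Map each user to the time gaps between consecutive authentication logins."""
    by_user: Dict[str, List[float]] = defaultdict(list)
    for user_id, timestamp in events:
        by_user[user_id].append(timestamp)

    intervals: Dict[str, List[float]] = {}
    for user_id, stamps in by_user.items():
        stamps.sort()  # events may arrive out of order
        intervals[user_id] = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    return intervals
```

A user with a single login yields an empty list, which a downstream model would need to handle, for example with a default value.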
[0097] Optionally, the first supervised prediction model is an XGB model, and the first supervised prediction model further includes: a first XGB model and a second XGB model; then the first prediction result output by the first supervised prediction model is obtained in the following way:
[0098] Feature fusion processing is performed on the first XGB prediction result output by the first XGB model and the second XGB prediction result output by the second XGB model to obtain the first prediction result output by the first supervised prediction model.
[0099] Optionally, the second supervised prediction model is an LGB model, and the second supervised prediction model further includes: a first LGB model and a second LGB model;
[0100] The second prediction result output by the second supervised prediction model is obtained in the following way:
[0101] The first LGB prediction result output by the first LGB model and the second LGB prediction result output by the second LGB model are weighted and fused to obtain the second prediction result output by the second supervised prediction model.
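The fusion of two sub-model outputs can be sketched as a weighted average over per-sample scores. The text states that the LGB results are "weighted and fused" but does not give the weights, and it does not specify the operation behind the "feature fusion processing" of the XGB results; applying the same weighted average to both, and the default weight of 0.5, are assumptions of this sketch.

```python
from typing import List, Sequence


def weighted_fusion(
    first: Sequence[float], second: Sequence[float], weight_first: float = 0.5
) -> List[float]:
    """Fuse two per-sample score lists as w * first + (1 - w) * second."""
    if len(first) != len(second):
        raise ValueError("both prediction results must cover the same samples")
    if not 0.0 <= weight_first <= 1.0:
        raise ValueError("weight_first must lie in [0, 1]")
    return [weight_first * a + (1.0 - weight_first) * b for a, b in zip(first, second)]


# Hypothetical scores from the two sub-models of each supervised model.
first_prediction = weighted_fusion([0.2, 0.8], [0.4, 0.6])         # two XGB sub-models
second_prediction = weighted_fusion([0.1, 0.9], [0.3, 0.7], 0.75)  # two LGB sub-models
```

In practice the weight would typically be chosen on validation data rather than fixed in advance.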
[0102] Optionally, the model parameters in the first XGB model, the second XGB model, the first LGB model, and / or the second LGB model are set in the following manner:
[0103] The first parameter range group is obtained by randomly seeding parameters;
[0104] Using an elasticity coefficient, the value range of the first parameter range group is narrowed to obtain a second parameter range group; wherein the parameter range of the second parameter range group is smaller than the parameter range of the first parameter range group.
[0105] Within the parameter range of the second parameter range group, select a set of parameters as the model parameters.
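The narrowing step can be illustrated as follows. The text does not say how the first parameter range group is seeded or how the elasticity coefficient is applied, so this sketch makes two assumptions: the first group is given as `(low, high)` ranges, and a coefficient between 0 and 1 shrinks each range symmetrically around its midpoint. The parameter names `learning_rate` and `max_depth` are illustrative only.

```python
import random
from typing import Dict, Tuple

Range = Tuple[float, float]


def narrow_ranges(first_group: Dict[str, Range], elasticity: float) -> Dict[str, Range]:
    """Shrink every range around its midpoint by the elasticity coefficient."""
    if not 0.0 < elasticity < 1.0:
        raise ValueError("elasticity must lie strictly between 0 and 1")
    second_group = {}
    for name, (low, high) in first_group.items():
        mid = (low + high) / 2.0
        half_width = (high - low) / 2.0 * elasticity
        second_group[name] = (mid - half_width, mid + half_width)
    return second_group


def sample_params(ranges: Dict[str, Range], rng: random.Random) -> Dict[str, float]:
    """Select one set of parameters uniformly from within the given ranges."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


first_group = {"learning_rate": (0.0, 0.4), "max_depth": (2.0, 10.0)}
second_group = narrow_ranges(first_group, elasticity=0.5)
params = sample_params(second_group, random.Random(0))
```

An integer-valued parameter such as a tree depth would need rounding after sampling. Centering the narrowed range on the best configuration found so far, rather than on the midpoint, is another plausible reading of the text.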
[0106] Optionally, the first supervised prediction model and the second supervised prediction model are trained using the user behavior data and authentication risk data of the sample users; wherein, the authentication risk data of the sample users is used to characterize the risk level of the user behavior data of the sample users.
[0107] Example 3
[0108] Figure 4 illustrates the structure of an electronic device according to Embodiment 3 of the present invention. The specific embodiments of the present invention do not limit the specific implementation of the electronic device. Referring to Figure 4, the electronic device includes:
[0109] At least one processor 401; a memory 402 communicatively connected to the at least one processor 401; a communication interface 403; and a communication bus 404.
[0110] Wherein:
[0111] The processor 401, memory 402, and communication interface 403 communicate with each other through the communication bus 404.
[0112] Communication interface 403 is used to communicate with other network elements such as clients or other servers.
[0113] The memory 402 stores one or more computer programs 405 that can be executed by the at least one processor 401. When executed by the at least one processor 401, the one or more computer programs 405 enable the at least one processor 401 to perform the corresponding operations described in the above embodiment of the risk prediction method based on user behavior data.
[0114] Example 4
[0115] Embodiment 4 of this application provides a non-volatile computer storage medium storing at least one executable instruction, which can be used to execute the risk prediction method based on user behavior data of any of the above method embodiments. Specifically, the executable instruction can be used to cause a processor to perform the corresponding operations in the above method embodiments.
[0116] Those skilled in the art will understand that all or some of the steps, systems, and apparatuses disclosed above, and their functional modules / units, can be implemented as software, firmware, hardware, or suitable combinations thereof. In hardware implementations, the division between functional modules / units mentioned above does not necessarily correspond to the division of physical components; for example, a physical component may have multiple functions, or a function or step may be performed collaboratively by several physical components. Some or all physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit (ASIC). Such software can be distributed on a computer-readable storage medium, which may include computer storage media (or non-transitory media) and communication media (or transient media).
[0117] As is known to those skilled in the art, the term computer storage medium includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable program instructions, data structures, program modules, or other data). Computer storage media includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), static random access memory (SRAM), flash memory or other memory technologies, portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical disc storage, magnetic cartridges, magnetic tape, disk storage or other magnetic storage devices, or any other medium that can be used to store desired information and is accessible to a computer. Furthermore, it is known to those skilled in the art that communication media typically contain computer-readable program instructions, data structures, program modules, or other data in modulated data signals such as carrier waves or other transmission mechanisms, and may include any information delivery medium.
[0118] The computer-readable program instructions described herein can be downloaded from computer-readable storage media to various computing / processing devices, or downloaded via a network, such as the Internet, local area network, wide area network, and / or wireless network, to an external computer or external storage device. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and / or edge servers. A network adapter card or network interface in each computing / processing device receives the computer-readable program instructions from the network and forwards them to the computer-readable storage media in the respective computing / processing device.
[0119] Computer program instructions used to perform the operations of this disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, status setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, etc., and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer via any type of network—including a local area network (LAN) or a wide area network (WAN)—or may be connected to an external computer (e.g., via the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized by utilizing the status information of the computer-readable program instructions to implement various aspects of this disclosure.
[0120] The computer program product described herein can be implemented specifically through hardware, software, or a combination thereof. In one alternative embodiment, the computer program product is specifically embodied in a computer storage medium; in another alternative embodiment, the computer program product is specifically embodied in a software product, such as a software development kit (SDK), etc.
[0121] Various aspects of this disclosure are described herein with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this disclosure. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer-readable program instructions.
[0122] These computer-readable program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine such that, when executed by the processor of the computer or other programmable data processing apparatus, they create means for implementing the functions / actions specified in one or more blocks of the flowchart and / or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium that causes a computer, programmable data processing apparatus, and / or other device to operate in a particular manner; thus, the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing aspects of the functions / actions specified in one or more blocks of the flowchart and / or block diagram.
[0123] Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable data processing apparatus, or other device perform the functions / actions specified in one or more blocks of the flowchart and / or block diagram.
[0124] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of an instruction containing one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions marked in the blocks may occur in a different order than those shown in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.
[0125] Example embodiments have been disclosed herein, and while specific terminology has been used, it is for illustrative purposes only and should be construed as such, and is not intended to be limiting. In some instances, it will be apparent to those skilled in the art that features, characteristics, and / or elements described in connection with particular embodiments may be used alone, or in combination with features, characteristics, and / or elements described in connection with other embodiments, unless otherwise expressly indicated. Therefore, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of this disclosure as set forth by the appended claims.
Claims
1. A risk prediction method based on user behavior data, characterized in that it comprises:
acquiring user behavior data of a target user, wherein the user behavior data includes authentication feature data of multiple dimensions;
combining, based on the correlation between the authentication feature data of the multiple dimensions, authentication feature data of at least two dimensions into associated feature data, wherein the associated feature data is the holistic data of the at least two dimensions of authentication feature data;
inputting the authentication feature data of the multiple dimensions and the associated feature data into a first supervised prediction model to obtain a first prediction result output by the first supervised prediction model;
inputting the authentication feature data of the multiple dimensions and the associated feature data into a second supervised prediction model to obtain a second prediction result output by the second supervised prediction model, wherein the first supervised prediction model and the second supervised prediction model are different types of models; and
fusing the first prediction result and the second prediction result to obtain a fused prediction result, and predicting the behavioral risk level of the target user based on the fused prediction result.
2. The method according to claim 1, characterized in that the step of combining authentication feature data of at least two dimensions into associated feature data based on the correlation between the authentication feature data of the multiple dimensions includes: calculating the Pearson correlation coefficient between authentication feature data of any two dimensions, wherein the Pearson correlation coefficient is used to characterize the magnitude of the correlation between the authentication feature data of the two dimensions; and combining the authentication feature data of two dimensions whose Pearson correlation coefficient is greater than a preset threshold into associated feature data.
3. The method according to claim 2, characterized in that the authentication feature data of the two dimensions whose Pearson correlation coefficient is greater than the preset threshold includes authentication feature data of the authentication login dimension and authentication feature data of the login time dimension, and the associated feature data includes authentication login sequence data; the step of combining the authentication feature data of two dimensions whose Pearson correlation coefficient is greater than the preset threshold into associated feature data includes: combining the authentication feature data of the authentication login dimension and the authentication feature data of the login time dimension into the authentication login sequence data, wherein the authentication login sequence data is used to characterize the time interval between multiple authentication login operations of the same user.
4. The method according to claim 1, characterized in that the first supervised prediction model is an XGB model and further includes a first XGB model and a second XGB model; the first prediction result output by the first supervised prediction model is obtained in the following way: feature fusion processing is performed on the first XGB prediction result output by the first XGB model and the second XGB prediction result output by the second XGB model to obtain the first prediction result output by the first supervised prediction model.
5. The method according to claim 4, characterized in that the second supervised prediction model is an LGB model and further includes a first LGB model and a second LGB model; the second prediction result output by the second supervised prediction model is obtained in the following way: the first LGB prediction result output by the first LGB model and the second LGB prediction result output by the second LGB model are weighted and fused to obtain the second prediction result output by the second supervised prediction model.
6. The method according to claim 5, characterized in that the model parameters in the first XGB model, the second XGB model, the first LGB model, and / or the second LGB model are set in the following manner: a first parameter range group is obtained by randomly seeding parameters; using an elasticity coefficient, the value range of the first parameter range group is narrowed to obtain a second parameter range group, wherein the parameter range of the second parameter range group is smaller than the parameter range of the first parameter range group; and a set of parameters is selected within the parameter range of the second parameter range group as the model parameters.
7. The method according to any one of claims 1-6, characterized in that the first supervised prediction model and the second supervised prediction model are trained using the user behavior data and authentication risk data of sample users, wherein the authentication risk data of the sample users is used to characterize the risk level of the user behavior data of the sample users.
8. A risk prediction device based on user behavior data, characterized in that it comprises:
a data acquisition module adapted to acquire user behavior data of a target user, wherein the user behavior data includes authentication feature data of multiple dimensions;
a combination module adapted to combine authentication feature data of at least two dimensions into associated feature data based on the correlation between the authentication feature data of the multiple dimensions, wherein the associated feature data is the holistic data of the at least two dimensions of authentication feature data;
a first result acquisition module adapted to input the authentication feature data of the multiple dimensions and the associated feature data into a first supervised prediction model to obtain a first prediction result output by the first supervised prediction model;
a second result acquisition module adapted to input the authentication feature data of the multiple dimensions and the associated feature data into a second supervised prediction model to obtain a second prediction result output by the second supervised prediction model, wherein the first supervised prediction model and the second supervised prediction model are different types of models; and
a fusion prediction module adapted to perform fusion processing on the first prediction result and the second prediction result to obtain a fusion prediction result, and to predict the behavioral risk level of the target user based on the fusion prediction result.
9. An electronic device, characterized in that it comprises: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores one or more computer programs that can be executed by the at least one processor, and the one or more computer programs, when executed by the at least one processor, enable the at least one processor to perform the method as described in any one of claims 1-7.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the method as described in any one of claims 1-7.