System and method for credit washing risk detection

A multi-faceted framework using supervised and unsupervised models effectively detects credit washing fraud by generating risk scores from attribute-based rules and clustering, ensuring secure credit score integrity and resource access control.

WO2026136266A1 · PCT designated stage · Publication Date: 2026-06-25 · EQUIFAX INC

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
EQUIFAX INC
Filing Date
2025-12-15
Publication Date
2026-06-25

AI Technical Summary

Technical Problem

Credit washing fraud poses a significant risk to institutions due to the accessibility and scalability of manipulating credit scores through transaction disputes, leading to improper score boosts and financial losses.

Method used

A multi-faceted credit washing detection framework utilizing supervised and unsupervised machine learning models to generate risk scores by evaluating attribute-based rules and clustering techniques, combining outputs to assess an entity's credit washing risk.

Benefits of technology

Accurately identifies credit washing activities, enabling controlled access to computing resources and preventing fraudulent score manipulation, thereby protecting institutional interests.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure US2025059695_25062026_PF_FP_ABST
Patent Text Reader

Abstract

In one example, a method is described that includes receiving candidate data associated with an entity, where the data includes both a first set of entity attributes related to fraud and a second set of attributes associated with entity dispute patterns. The method includes generating, by a first model, a risk score for the entity by evaluating the extent to which a threshold number of attribute-based rules are triggered within the first set of fraud-related attributes. The method includes generating, by a second model, a separate risk score for the entity by clustering the second set of dispute-related attributes. The method includes generating an aggregate risk score by combining the outputs of the first and second models to provide an assessment of the entity's risk profile.

Description

Attorney Docket No. 096923-1529769EFX-203WO

SYSTEM AND METHOD FOR CREDIT WASHING RISK DETECTION

Cross-Reference to Related Applications

[0001] This application claims priority to U.S. Provisional Patent Application Nos. 63/737,190, filed on December 20, 2024, and 63/734,409, filed on December 16, 2024, each entitled "SYSTEM AND METHOD FOR CREDIT WASHING RISK DETECTION," the entirety of which is hereby incorporated by reference for all purposes.

Technical Field

[0002] The present disclosure relates generally to artificial intelligence. More specifically, but not by way of limitation, this disclosure relates to machine learning using supervised and unsupervised models for detecting a risk of an entity attempting to perform unauthorized actions including credit washing.

Background

[0003] Credit washing refers to a process in which, after a trade is opened following a transaction, a dispute is submitted to delete or “wash” the trade from a user’s credit file. Washing the trade can lead to an increase in the disputer’s credit score. The dispute process is necessarily available so that users can properly dispute transactions. However, fraudsters can manipulate the dispute process to improperly boost their credit scores by cycling through opening and defaulting on additional trades and subsequently washing the discharged trades from their credit files. Such credit washing and dispute manipulation can lead to significant losses for various institutions relying on accurate credit scores. The risks of credit washing are exacerbated by: 1) the right to dispute transactions making credit washing tactics accessible to fraudsters; 2) the credit washing process being highly repeatable; and 3) the process being scalable with other methods of defrauding innocent parties.

Summary

[0004] Various aspects of the present disclosure provide systems and methods for executing supervised and unsupervised models to detect credit washing risk. In one example, a method is described that includes receiving candidate data associated with an entity, where the data includes both a first set of entity attributes related to fraud and a second set of attributes associated with entity dispute patterns. The method includes generating, by a first model, a risk score for the entity by evaluating the extent to which a threshold number of attribute-based rules are triggered within the first set of fraud-related attributes. The method includes generating, by a second model, a separate risk score for the entity by clustering the second set of dispute-related attributes. The method includes generating an aggregate risk score by combining the outputs of the first and second models to provide an assessment of the entity’s risk profile.

[0005] In another example, a system includes a processing device and a memory device in which instructions executable by the processing device are stored for causing the processing device to perform operations. The operations include receiving candidate data associated with an entity, where the data includes both a first set of entity attributes related to fraud and a second set of attributes associated with entity dispute patterns. The operations include generating, by a first model, a risk score for the entity by evaluating the extent to which a threshold number of attribute-based rules are triggered within the first set of fraud-related attributes. The operations include generating, by a second model, a separate risk score for the entity by clustering the second set of dispute-related attributes. The operations include generating an aggregate risk score by combining the outputs of the first and second models to provide an assessment of the entity’s risk profile.

[0006] In yet another example, a non-transitory computer-readable storage medium has program code that is executable by a processor device to cause a computing device to perform operations. The operations include receiving candidate data associated with an entity, where the data includes both a first set of entity attributes related to fraud and a second set of attributes associated with entity dispute patterns. The operations include generating, by a first model, a risk score for the entity by evaluating the extent to which a threshold number of attribute-based rules are triggered within the first set of fraud-related attributes. The operations include generating, by a second model, a separate risk score for the entity by clustering the second set of dispute-related attributes. The operations include generating an aggregate risk score by combining the outputs of the first and second models to provide an assessment of the entity’s risk profile.

[0007] This summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used in isolation to determine the scope of the claimed subject matter. The subject matter should be understood by reference to appropriate portions of the entire specification, any or all drawings, and each claim.

[0008] The foregoing, together with the other features and examples, will become more apparent upon referring to the following specification, claims, and accompanying drawings.

Brief Description of the Drawings

[0009] FIG. 1 is a block diagram of an operating environment capable of implementing a credit washing risk scoring system according to certain examples.

[0010] FIG. 2 is a flow diagram of a process for implementing a credit washing risk score computing system according to certain examples.

[0011] FIG. 3 is a flow diagram depicting an example set of operations for training and applying a first model including a set of automatically generated rules, according to certain aspects of the present disclosure.

[0012] FIG. 4 is a flow diagram depicting an example set of operations for training and applying a second model which applies unsupervised clustering techniques to determine risk scores, according to certain aspects of the present disclosure.

[0013] FIG. 5 is a block diagram illustrating an example of a computing device that can be used to implement the credit washing risk scoring according to certain examples.

Detailed Description

[0014] Certain aspects and examples of the present disclosure relate to machine learning techniques using supervised and unsupervised models for detecting a risk of an entity attempting to perform unauthorized actions including credit washing. Credit washing can refer to a process of manipulating transaction dispute processes to delete or “wash” poor credit performance from a user’s credit history. Users have a right to dispute and correct errors in credit reports through a process called credit dispute. Credit disputes are a normal and important part of credit maintenance operations. However, persons can abuse the credit dispute process with specific intent to erase bad trades from their credit reports to boost credit scores despite having no valid basis for initiating the dispute. Credit washing thus represents a form of fraud, where users can boost credit scores without a proper basis. Supervised and unsupervised models refer to machine-learning models which can be trained to generate statistical predictions and scores indicating confidence based on particular configurations. As described here, such models can be trained and instructed to assess the likelihood that users are engaging in credit washing.

[0015] One characteristic of credit washing is that the damage done per person is dependent on the fraudster and can vary person to person. Fraudsters can engage in credit washing to improperly secure various loans such as personal loans, car loans, and the like without intent to pay on the loans. For example, the fraudsters can apply for multiple credit cards and max those out without any intent to pay. Thus, the very definition of credit washing fraud can be multi-faceted, requiring a multi-faceted credit washing detection framework to overcome the challenges related to credit washing.

[0016] The framework for the described, multi-faceted credit-washing detection system includes multiple algorithmic structures, each trained, programmed, or otherwise configured to assess the risk that a specific entity on a specified date is likely to be engaging in credit washing activities. As a first structure, the analytical prowess of supervised learning models may be applied. In the supervised learning structure, labeled experiments across varied scopes and temporal ranges are used to discern underlying patterns in data used to assign credit washing risk scores. As a second structure, the pattern recognition capabilities of unsupervised learning, particularly through clustering algorithms, may be employed. In the unsupervised learning structure, algorithms such as Kernel Density Estimation (“KDE”) and K-means clustering may be employed to detect potential dispute pattern anomalies further indicative of fraudulent activity. As a third potential structure, which in some examples is incorporated into the first supervised learning structure, insightful attributes may be implemented to assign credit-washing risk, where the insightful attributes represent a specified collection of attributes historically determined relevant to determining credit washing risk.

[0017] Rather than using the output of a typical machine learning model framework (e.g., xgboost) as the final credit washing risk score prediction, the predictive power of the multiple models described above is employed to determine credit washing risk scores. As a result, no model object needs to be deployed to score a new transaction. Instead, to determine the risk level of a transaction, attributes can be generated, where the attributes are evaluated against a list of prebuilt rules.

[0018] The present application relates to systems and methods for credit washing detection. The systems and methods described herein provide the ability to receive data associated with an entity, where the data includes a set of one or more attributes describing the entity, and based on that data, generate a credit washing risk score predictive of the likelihood that the entity is engaging in an act of credit washing. The credit washing risk score is generated based on a set of first, second, and optionally third risk scores, where each of the first, second, and third risk scores is generated from a respective model.

[0019] For instance, a first model for generating the first risk score can rely on a set of manually generated rules or rules generated from supervised model structures. The number of rules triggered based on the entity attribute data can determine the first risk score. A second model can apply unsupervised clustering techniques to generate the second risk score. The clustering techniques can apply a double-clustering procedure where a first clustering technique is applied based on data retrieved from a set of entities. Attributes may be generated from the first clustering and subsequently used to generate a second clustering, where the second clustering is used to generate the second risk score. Optionally, a third model can generate the third risk score, where the third risk score, like the first risk score, is based on a set of one or more rules triggered. Unlike the rules of the first model, the third model rules, referred to as insightful attributes, may be afforded special weight in generating the third risk score, which is subsequently used to generate the credit washing risk score.

[0020] In some examples, more or fewer structures may be employed to determine credit washing risk. For instance, the first supervised learning structure may be used to generate a first risk score, and the second unsupervised structure may be used to generate a second risk score, where the first risk score and the second risk score are combined to generate the final output risk score, the final output risk score being indicative that an entity is likely to engage in credit washing activity during dispute at a given time. In other examples, the third structure employing the insightful attributes may additionally or alternatively be employed to generate a third risk score combined with the first risk score and the second risk score to generate the final output risk score. Various other implementations of the three described structures, in addition to other structures used to generate risk scores and a final output risk score are contemplated according to other examples of the current disclosure.
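As a minimal illustrative sketch of how the constituent risk scores might be combined, the following Python function takes a weighted mean of the first, second, and optional third risk scores. The disclosure does not specify the combination rule at this point, so the weighted mean, the assumption that each score lies in [0, 1], the weight values, and the function name `aggregate_risk_score` are all hypothetical.

```python
from typing import Optional, Sequence


def aggregate_risk_score(
    first: float,
    second: float,
    third: Optional[float] = None,
    weights: Sequence[float] = (0.5, 0.3, 0.2),  # hypothetical weights, not from the source
) -> float:
    """Combine constituent risk scores (each assumed to lie in [0, 1]).

    The weighted mean is an assumed combination rule chosen for illustration.
    When the optional third score is absent, only the first two weights are
    used and the result is renormalized by their sum.
    """
    scores = [first, second] + ([third] if third is not None else [])
    for score in scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError("each risk score is assumed to lie in [0, 1]")
    used_weights = list(weights[: len(scores)])
    total_weight = sum(used_weights)
    return sum(w * s for w, s in zip(used_weights, scores)) / total_weight
```

Renormalizing by the weights actually used keeps the aggregate on the same [0, 1] scale whether two or three structures contribute, which is one way to accommodate the "more or fewer structures" variants described above.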

[0021] Certain aspects provide improvements to controlling access to computing resources. For example, the credit washing risk scores can be used by a credit washing risk scoring system to more accurately and efficiently control access to computing resources such as an interactive computing environment that can provide computing resources, such as computational processing power, computer memory, and the like, to the entity. In particular, techniques relating to generating constituent first and second risk scores according to respective models can more accurately determine credit washing risk scores. Based on the credit washing risk score, the credit washing risk scoring system can more accurately and efficiently control access to the computing resources.

[0022] Referring now to the drawings, FIG. 1 is a block diagram of an operating environment 100 capable of implementing a credit washing risk scoring system according to certain examples. In the operating environment 100, a credit washing risk scoring system 130 builds and trains models that can be used to predict a risk that a candidate entity is likely to be performing credit washing. The credit washing risk scoring system 130 can further apply one or more algorithms involving the several models to determine a credit washing risk score. FIG. 1 depicts examples of hardware components of the credit washing risk scoring system 130, according to some aspects. The credit washing risk scoring system 130 is a specialized computing system that may be used for processing large amounts of data using a large number of computer processing cycles. The credit washing risk scoring system 130 can include a model training server 110 for building and training machine learning models 120 including supervised models and unsupervised models (e.g., clustering models). The credit washing risk scoring system 130 can further include a credit washing risk scoring server 118 for determining a credit washing risk score corresponding to a candidate entity based on entity data 124 (including data corresponding to the candidate entity and several other entities) and attribute data 142.

[0023] The model training server 110 can include one or more processing devices that execute program code, such as a model training application 112. The program code is stored on a non-transitory computer-readable medium. The model training application 112 can execute one or more processes to train a machine learning model for predicting a risk that a candidate entity is likely to be performing credit washing based on entity data 124, attribute data 142, and model training samples 126.

[0024] In some aspects, the model training application 112 can build and train a machine learning model 120 utilizing model training samples 126 (e.g., including training entity data and training attributes). The model training samples 126 can be stored in one or more network-attached storage units on which various repositories, databases, or other structures are stored. An example of these data structures is the data repository 122.

[0025] Network-attached storage units may store a variety of different types of data organized in a variety of different ways and from a variety of different sources. For example, the network-attached storage unit may include storage other than primary storage located within the model training server 110 that is directly accessible by processors located therein. In some aspects, the network-attached storage unit may include secondary, tertiary, or auxiliary storage, such as large hard drives, servers, virtual memory, among other types. Storage devices may include portable or non-portable storage devices, optical storage devices, and various other mediums capable of storing and containing data. A machine-readable storage medium or computer-readable storage medium may include a non-transitory medium in which data can be stored and that does not include carrier waves or transitory electronic signals. Examples of a non-transitory medium may include, for example, a magnetic disk or tape, optical storage media such as a compact disk or digital versatile disk, flash memory, memory, or memory devices.

[0026] The credit washing risk scoring server 118 can include one or more processing devices that execute program code, such as a credit washing risk scoring application 114. The program code is stored on a non-transitory computer-readable medium. The credit washing risk scoring application 114 can execute one or more processes to utilize the machine learning model 120 trained by the model training application 112 based on entity data 124 to determine a credit washing risk score. For example, the credit washing risk scoring application 114 can apply one or more algorithms and models configured to determine separate credit washing risk scores that can be combined to form a final credit washing risk score.

[0027] The credit washing risk score can be utilized to make decisions about the entity. For example, the credit washing risk score of the entity can be used to determine whether a risk associated with granting the entity access to resources, such as computational resources, access to an interactive computing environment, and the like, is high, for example, higher than a threshold risk value. If the risk is high, the entity may be denied access to the resources. For instance, the risk may be related to the credit washing risk score or other risk score of the entity and if the credit washing risk score or other risk score, based on the credit washing risk scoring system determinations, is too high, then the entity may be denied the resources. In another example, the resources may include cloud computing resources, such as online virtual machine instances, or online storage resources. The credit washing risk scoring server 118 can use the credit washing risk score to predict the risk for the entity. Depending on the predicted risk, the entity may be granted or denied access to the online resources.
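The threshold comparison described in this paragraph can be sketched as follows. This is an illustrative example only: the threshold value, the [0, 1] score scale, and the names `RISK_THRESHOLD`, `AccessDecision`, and `decide_access` are assumptions introduced here and do not appear in the disclosure.

```python
from dataclasses import dataclass

# Hypothetical threshold risk value; the source does not give a number.
RISK_THRESHOLD = 0.7


@dataclass(frozen=True)
class AccessDecision:
    entity_id: str
    granted: bool
    reason: str


def decide_access(
    entity_id: str, risk_score: float, threshold: float = RISK_THRESHOLD
) -> AccessDecision:
    """Deny access when the risk score is higher than the threshold risk value."""
    if risk_score > threshold:
        reason = f"risk {risk_score:.2f} exceeds threshold {threshold:.2f}"
        return AccessDecision(entity_id, granted=False, reason=reason)
    return AccessDecision(entity_id, granted=True, reason="risk within threshold")
```

In this sketch a score exactly equal to the threshold is still granted access, matching the "higher than a threshold risk value" wording; a deployment could equally choose an inclusive comparison.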

[0028] Furthermore, the credit washing risk scoring system 130 can communicate with various other computing systems, such as client computing systems 104. For example, client computing systems 104 may send entity queries to the credit washing risk scoring server 118 to determine a credit washing risk score for a candidate entity or may send signals to the credit washing risk scoring server 118 that control or otherwise influence different aspects of the credit washing risk scoring system 130. The client computing systems 104 may also interact with user computing systems 106 via one or more public data networks 108 to facilitate interactions between users of the user computing systems 106 and interactive computing environments provided by the client computing systems 104.

[0029] Each client computing system 104 may include one or more third-party devices, such as individual servers or groups of servers operating in a distributed manner. A client computing system 104 can include any computing device or group of computing devices operated by a seller, lender, or other providers of products or services. The client computing system 104 can include one or more server devices. The one or more server devices can include or can otherwise access one or more non-transitory computer-readable media. The client computing system 104 can also execute instructions that provide an interactive computing environment accessible to user computing systems 106. Examples of the interactive computing environment include a mobile application specific to a particular client computing system 104, a web-based application accessible via a mobile device, etc. The executable instructions are stored in one or more non-transitory computer-readable media.

[0030] The client computing system 104 can further include one or more processing devices that are capable of providing the interactive computing environment to perform operations described herein. The interactive computing environment can include executable instructions stored in one or more non-transitory computer-readable media. The instructions providing the interactive computing environment can configure one or more processing devices to perform operations described herein. In some aspects, the executable instructions for the interactive computing environment can include instructions that provide one or more graphical interfaces. The graphical interfaces are used by a user computing system 106 to access various functions of the interactive computing environment. For instance, the interactive computing environment may transmit data to and receive data from a user computing system 106 to shift between different states of the interactive computing environment, where the different states allow one or more electronic transactions between the user computing system 106 and the client computing system 104 to be performed.

[0031] In some examples, a client computing system 104 may have other computing resources associated therewith (not shown in FIG. 1), such as server computers hosting and managing virtual machine instances for providing cloud computing services, server computers hosting and managing online storage resources for users, server computers for providing database services, and others. The interaction between the user computing system 106 and the client computing system 104 may be performed through graphical user interfaces presented by the client computing system 104 to the user computing system 106, or through application programming interface (“API”) calls or web service calls.

[0032] A user computing system 106 can include any computing device or other communication device operated by a user, such as a consumer or a customer. The user computing system 106 can include one or more computing devices, such as laptops, smartphones, and other personal computing devices. A user computing system 106 can include executable instructions stored in one or more non-transitory computer-readable media. The user computing system 106 can also include one or more processing devices that are capable of executing program code to perform operations described herein. In various examples, the user computing system 106 can allow a user to access certain online services from a client computing system 104 or other computing resources, to engage in mobile commerce with a client computing system 104, to obtain controlled access to electronic content hosted by the client computing system 104, etc.

[0033] For instance, the user can use the user computing system 106 to engage in one or more electronic transactions with a client computing system 104 via an interactive computing environment. The one or more transactions can be determined based on the credit washing risk score output by the credit washing risk scoring system 130 in response to receiving a transaction request from the user computing system 106. In some examples, the credit washing risk scoring system 130 can receive entity data 124 and attribute data 142 from the user computing system 106.

[0034] In some examples, the credit washing risk scoring system 130 can cause the user computing system 106, the client computing system 104, or a combination thereof to execute one or more actions in accordance with the determined credit washing risk score. For instance, as described in the example above, the credit washing risk scoring system 130 can communicate with a user computing system to automatically cause one or more components of the user computing system 106 to reject a transaction based on a determined credit washing risk score.

[0035] Each communication within the operating environment 100 may occur over one or more data networks, such as a public data network 108, a network 116 such as a private data network, or some combination thereof. A data network may include one or more of a variety of different types of networks, including a wireless network, a wired network, or a combination of a wired and wireless network. Examples of suitable networks include the Internet, a personal area network, a local area network (“LAN”), a wide area network (“WAN”), or a wireless local area network (“WLAN”). A wireless network may include a wireless interface or a combination of wireless interfaces. A wired network may include a wired interface. The wired or wireless networks may be implemented using routers, access points, bridges, gateways, or the like, to connect devices in the data network.

[0036] The number of devices depicted in FIG. 1 is provided for illustrative purposes. Different numbers of devices may be used. For example, while certain devices or systems are shown as single devices in FIG. 1, multiple devices may instead be used to implement these devices or systems. Similarly, devices or systems that are shown as separate, such as the model training server 110 and the credit washing risk scoring server 118, may be instead implemented in a single device or system.

[0037] FIG. 2 is a flow diagram of a process 200 for implementing a credit washing risk score computing system (e.g., the credit washing risk scoring system 130) according to certain examples. For illustrative purposes, the process 200 is described with reference to implementations described above with respect to one or more examples described herein. Other implementations, however, are possible. In some aspects, the operations in FIG. 2 may be implemented in program code that is executed by one or more computing devices such as the credit washing risk scoring server 118 depicted in FIG. 1. In some aspects of the present disclosure, one or more operations shown in FIG. 2 may be omitted or performed in a different order. Similarly, additional operations not shown in FIG. 2 may be performed.

[0038] At block 202, the process 200 includes receiving candidate data associated with an entity, the candidate data including a first set of entity attributes and a second set of entity attributes. The entity, also referred to as a consumer, can include data representative of a person having a credit score. Each entity analyzed may or may not be a credit washer, and thus the process 200 begins by receiving data associated with the candidate entity (i.e., the entity under review for credit washing risk assessment) to determine an associated credit washing risk score. The associated data can include client identifying data (“CID”) representative of the entity. The associated data can also include a set of one or more entity attributes describing characteristics of the entity. Entity attributes are broadly defined and can include any identifying behavior of an entity such as dispute data, tradeline data, default data, credit score data and the like. Additional entity attributes are described according to the various examples discussed below. The entity data is described as including a first set of entity attributes and a second set of entity attributes. The first set of entity attributes and the second set of entity attributes may overlap (i.e., the same entity attributes are included in the first set and second set), may be identical (i.e., complete one-to-one overlap), or may have no overlap between entity attributes.

[0039] The first set of entity attributes can include attributes indicative of anomalous behavior patterns, themselves indicative of fraud, and can be accessed from various databases such as dispute databases and trade databases. The first set of entity attributes can relate to various categories including volume (e.g., the number of disputes, inquiries, and reports alleging fraud), variety (e.g., the number of institutions and industries where fraud claims were made), recency (e.g., the number of days since the last dispute claiming fraud), and the like.
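To make the volume, variety, and recency categories concrete, the following sketch derives one attribute of each kind from a list of dispute records. The record layout (`Dispute`), the attribute names, and the use of -1 to mean "no fraud dispute on record" are assumptions made for illustration; the disclosure does not define a schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Dispute:
    """Hypothetical dispute record; field names are illustrative only."""
    filed_on: date
    institution: str
    claims_fraud: bool


def first_set_attributes(disputes: List[Dispute], as_of: date) -> Dict[str, int]:
    """Derive one volume, one variety, and one recency attribute."""
    # Only disputes alleging fraud and filed on or before the scoring date count.
    fraud = [d for d in disputes if d.claims_fraud and d.filed_on <= as_of]
    if fraud:
        days_since = (as_of - max(d.filed_on for d in fraud)).days
    else:
        days_since = -1  # assumed sentinel: no fraud dispute on record
    return {
        "fraud_dispute_count": len(fraud),                               # volume
        "fraud_institution_count": len({d.institution for d in fraud}),  # variety
        "days_since_last_fraud_dispute": days_since,                     # recency
    }
```

Computing the attributes relative to an explicit `as_of` date reflects the description of assessing a specific entity on a specified date.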

[0040] The second set of entity attributes can include data related to dispute patterns, including fraudulent tradeline patterns or inquiry patterns. The dispute patterns relate to the number and frequency of disputes, inquiries and activity on fraudulent tradelines. For example, a large percentage of entities may be determined to submit between one and nine disputes over yearlong windows, while far from the central distribution of entity behaviors some entities may be identified as submitting greater than ten disputes in the same period. The smaller subset of entity behavior captured further away from the central distribution can be referred to as the “long-tail”. According to some examples, the dispute patterns can be further analyzed, per a second risk score model (discussed with respect to block 206), to further determine the risk that an entity is likely to be performing credit washing.
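The long-tail idea in this paragraph can be illustrated with a simple count of disputes per entity over a yearlong window. The sketch below is a simplification under stated assumptions: the fixed cutoff of ten mirrors the example above rather than any value prescribed by the disclosure, and the function names and input layout are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, Iterable


def disputes_in_window(
    dispute_dates: Iterable[date], as_of: date, window_days: int = 365
) -> int:
    """Count disputes filed within the window ending on the scoring date."""
    start = as_of - timedelta(days=window_days)
    return sum(1 for d in dispute_dates if start < d <= as_of)


def long_tail_entities(
    disputes_by_entity: Dict[str, Iterable[date]],
    as_of: date,
    cutoff: int = 10,  # assumed cutoff, taken from the example in the text
) -> Dict[str, int]:
    """Return entities whose yearly dispute count falls in the long tail."""
    counts = {
        entity: disputes_in_window(dates, as_of)
        for entity, dates in disputes_by_entity.items()
    }
    return {entity: n for entity, n in counts.items() if n > cutoff}
```

A fixed cutoff is the crudest possible stand-in; a density-based estimate over the observed distribution of counts would locate the tail from the data instead of from a constant.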

[0041] At block 204, the process 200 includes generating by a first model, a first risk score associated with the entity, where the first model determines the first risk score based on a threshold number of attribute-based rules triggered based on the first set of entity attributes. The first model, discussed further below in the Example Architecture of the First Model section, operates based on a set of attribute-based rules. Any number of attribute-based rules can be included, but generally the number of attribute-based rules will be determined to improve precision and recall, while managing efficient processing, for instance by reducing redundant or inefficient rules from the final attribute-based rule set. According to certain examples, the rules may be manually generated. In another preferred example, the rules are generated based on a supervised model training structure trained on sets of training labels and training attributes based on entity data stored in an entity data repository.

[0042] At block 206, the process 200 includes generating by a second model, a second risk score associated with the entity, where the second model determines the second risk score based on clustering the second set of entity attributes. The second model applies an unsupervised model to perform a two-step clustering process. The two-step clustering process can involve, for each entity, first clustering dispute submissions (i.e., the second set of entity attributes) into groups. Clustering attributes based on these groups may then be formed. Then, in a second clustering procedure, each entity including the candidate entity can be clustered based on the clustering attributes. Further structure of the second, clustering based model is discussed below in the Example Architecture of the Second Model section.

[0043] Block 208 of process 200 is shown in dotted lines to illustrate an optional operation of process 200 to be performed according to certain examples. At optional block 208, the process 200 includes generating, by a third model, a third risk score associated with the entity, where the third model determines the third risk score based on a third set of entity attributes within the candidate data. Also referred to as the insightful attribute model, the third model can include specific attribute-based rules defined by payment-interval attributes, proxy-dispute attributes, and / or personally identifiable information (“PII”)-variation attributes discussed further with respect to the example architecture of the third model.

[0044] At block 210, the process 200 includes generating a credit washing risk score based on the first risk score and the second risk score. The credit washing risk score represents the output risk score for the candidate entity based on the processing and determinations made by the first model and the second model. The credit washing risk score can be generated based on the first risk score and the second risk score according to various weights. For instance, the first risk score and the second risk score may be afforded equal weights in the credit washing risk score determination or may be assigned different weights. According to certain examples, for instance, when block 208 of process 200 is performed, block 210 can further include generating the credit washing risk score based on each of the first risk score, the second risk score, and the third risk score.

[0045] In some examples, the process 200 can include transmitting, by the credit washing risk scoring system and to a remote computing device, a responsive message including at least the credit washing risk score and associated information for the entity for use in controlling access of the entity to one or more interactive computing environments. The credit washing risk scoring server 118 can transmit the responsive message to a remote computing system such as the client computing system 104, and the like. The responsive message may cause computing resources to be allocated, or to not be allocated, to the entity based on the credit washing risk score. Additionally, or alternatively, the responsive message may cause opportunities to be presented to the entity.

Example Architecture of the First Model Based on Rule Triggering

[0046] The first model, as described with respect to block 204 of process 200, includes a set of attribute-based rules which may or may not be triggered based on the first set of entity attributes. Each rule of the generated set of one or more attribute-based rules can include a binary prediction of whether an entity, based on associated attributes, is likely to be engaging in credit washing activities. Each rule generated may be triggered by different sets of attributes associated with the entity being reviewed. The output of each rule (i.e., whether the rule was triggered or not based on the attributes) may be aggregated to generate the first credit washing risk score. According to some examples, some or all of the attribute-based rules may be manually generated. According to an additional example, the attribute-based rules are generated according to one or more supervised learning model(s).

[0047] The process for generating the set of one or more rules can involve first training the supervised learning model(s) using training labels and identifying sets of attributes for input into the supervised learning model(s). The supervised learning model(s) may then be iteratively trained to generate an initial set of rules (e.g., by use of Skope Rules classifier methods). Once the initial set of rules is generated, the rules may be evaluated and cleaned to reduce the initial set of rules down to a final set of rules. The final set of rules may then be employed to evaluate entities’ risk of credit washing based on the entities’ associated attributes. Once an entity’s attributes are entered into the final set of rules, whether each rule is triggered is logged and aggregated. The aggregated sum or ratio of the final set of rules triggered is then used to determine the first credit washing risk score.

[0048] FIG. 3 is a flow diagram depicting an example set of operations 300 for training and applying a first model including a set of automatically generated rules, according to certain aspects of the present disclosure. For illustrative purposes, the process 300 is described with reference to implementations described above with respect to one or more examples described herein. Other implementations, however, are possible. In some aspects, the operations in FIG. 3 may be implemented in program code that is executed by one or more computing devices such as the credit washing risk scoring server 118 depicted in FIG. 1. In some aspects of the present disclosure, one or more operations shown in FIG. 3 may be omitted or performed in a different order. Similarly, additional operations not shown in FIG. 3 may be performed.

[0049] At block 302, the operations 300 include receiving a set of training labels. To receive and generate the training labels, label data, including dispute label data, can be gathered from a variety of databases and data sets. For instance, credit data repositories, such as Automated Credit Reporting Online (“ACRO”) data repositories, may supply the label data or may provide the initial data from which labels are derived. Some initial labels retrieved may not be used in the supervised training process and may instead be intermediate labels used to generate final labels then input into the supervised model training. Sets of training labels generated can relate to specific tradelines and can include early payment default (“EPD”), first payment default (“FPD”), credit card high utilization, and tradeline bad (e.g., whether a specific tradeline defaulted). Other non-tradeline level labels include labels indicating whether an entity which had previously completed a dispute then satisfied a specified condition within an allotted time period following the completed dispute. As examples, training labels can relate to the number of tradelines opened in the six months following a successful dispute exceeding a threshold, the number of auto-trades opened in the nine months following the successful dispute exceeding a threshold, or the number of product type tradelines opening in the year following the successful dispute exceeding a threshold. Such examples are non-limiting and illustrative; other parameters and time periods for each parameter are considered within the scope of examples of labels generated.

[0050] At block 304, the operations 300 include receiving a set of training attributes. To receive and generate the attributes for use in the supervised learning model(s), attribute data can be gathered from the same and / or additional data sources as used to identify the labels. For instance, dispute data may be accessed from a dispute data repository (e.g., e-OSCAR), while trade data may be accessed from a trade data repository (e.g., Automated Credit Reporting Online database). Examples of dispute data can include data defining employment disputes, PII disputes, and tradeline disputes. Dispute data can also organize disputes by submissions and sequences (i.e., line items). Each attribute generated can then be based at least in part on data accessed from one or more of the dispute data and trade data.

[0051] Each attribute is defined at a level of granularity including the entity (e.g., person represented via CID) and a given date. Examples of attributes include volume-based attributes (e.g., the number of tradeline disputes claiming identity fraud, the number of inquiry disputes claiming fraud, and / or the number of tradeline disputes claiming fraud while also providing a police report), and variety-based attributes (e.g., the number of financial institutions where disputes have been submitted claiming identity fraud, the number of industries such as credit card, auto, and personal loan industries where disputes have been submitted). Additional attributes can include account life cycle attributes indicating the median number of days from account opening until disputes claiming identity fraud are submitted regarding the account, suspicious dispute behavior attributes indicating the number of tradelines disputed through multiple channels claiming fraud and / or the number of tradelines in an allotted time period on which several payments were made prior to disputing the payments by claiming fraud, and recency attributes indicating the number of days since the last dispute claiming identity fraud.

[0052] At block 306, the operations 300 include generating an initial set of attribute-based rules based at least in part on the set of training labels and set of training attributes. Once labels and attributes are generated, the supervised learning model(s) may generate an initial set of one or more rules. The rules generated, including the initial set of rules, and the reduced set of rules determined after a cleaning process (described further at block 308), provide explicit logic for classifying the data, providing transparency in the determination process. Each rule is defined by a set of one or more components which must simultaneously be true for the rule to be triggered. Each component of a rule is a logical expression tied to an attribute. Thus, a rule comprising two components can appear as the number of industries engaged in exceeding two (component one) and the number of fraud tradeline disputes exceeding three (component two).
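As a minimal illustrative sketch of the rule structure described above, a rule can be represented as a list of components that must all hold, and the first risk score as the fraction of rules triggered. The class names, attribute names, and the treatment of missing attributes are assumptions made for illustration only and are not taken from the disclosure.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Component:
    """One logical expression tied to a single attribute."""
    attribute: str
    compare: Callable[[float, float], bool]
    threshold: float

    def holds(self, attributes: Dict[str, float]) -> bool:
        # Assumption: a missing attribute never satisfies a component.
        value = attributes.get(self.attribute)
        return value is not None and self.compare(value, self.threshold)


@dataclass
class Rule:
    name: str
    components: List[Component]

    def triggered(self, attributes: Dict[str, float]) -> bool:
        # Every component must hold simultaneously for the rule to trigger.
        return all(c.holds(attributes) for c in self.components)


def first_risk_score(rules: List[Rule], attributes: Dict[str, float]) -> float:
    """Fraction of rules triggered by the entity's attributes."""
    if not rules:
        return 0.0
    return sum(rule.triggered(attributes) for rule in rules) / len(rules)


# Hypothetical rules; the first mirrors the two-component example in the text.
rules = [
    Rule("industries_and_fraud_disputes", [
        Component("num_industries", operator.gt, 2),
        Component("num_fraud_tradeline_disputes", operator.gt, 3),
    ]),
    Rule("recent_fraud_dispute", [
        Component("days_since_last_fraud_dispute", operator.lt, 30),
    ]),
]
entity = {
    "num_industries": 3,
    "num_fraud_tradeline_disputes": 5,
    "days_since_last_fraud_dispute": 90,
}
print(first_risk_score(rules, entity))  # 0.5
```

Here the score is a ratio; a raw count of triggered rules would be an equally valid reading of the aggregation described.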

[0053] Rules may be generated manually (as described further according to the Example Architecture of the Third Model Based on Insightful Attribute Scoring included below) or may be preferentially generated according to the described supervised learning structures. Any supervised learning structure capable of generating rules based at least in part on labels and attributes may be applied. In one example, Skope rule generation may be applied. Skope rule generation provides a supervised classification technique aimed at providing interpretability of generated constraints. Skope rules can generate the rules in a sequence of steps including training a bagging estimator (e.g., including multiple decision tree-based classifiers), performance filtering, and semantic deduplication. Other supervised learning-based rule generation systems suitable for fraud detection and anomaly detection more generally may be applied. For instance, random forest models, waterfall models, and other supervised rule-generation models may be employed to generate the initial set of rules.

[0054] At block 308, the operations 300 include cleaning the initial set of attribute-based rules to reduce the initial set of attribute-based rules to generate the reduced attribute-based rules. The initial set of rules generated may be evaluated based partially on performance and cleaned, or filtered, to reduce the initial rule set to a final rule set. When automatically generated by supervised learning model(s), the initial set of rules is produced to fit a set of performance criteria (e.g., precision, recall, and the like). The performance criteria can be evaluated by graphing the set of rules based at least in part on the number of rules compared against the performance criteria and identifying the number of rules optimal for balancing performance criteria improvements while handling the complexity of managing more rules. Thus, the number of rules selected based at least in part on marginal effectiveness of increased sets of rules may comprise the first step of performance evaluation and rule cleaning and filtering to reduce the initial rule set to a final rule set.
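One simple way to read the marginal-effectiveness selection described above is as a stopping rule on the performance curve. The sketch below is an assumption-laden illustration: the function name, the choice of a single scalar metric, the `min_gain` cutoff, and the example values are all hypothetical.

```python
from typing import Sequence


def select_rule_count(metric_by_count: Sequence[float], min_gain: float = 0.01) -> int:
    """Pick how many rules to keep from a performance curve.

    metric_by_count[i] is the performance metric (e.g., F1) obtained when
    the best i + 1 rules are used. Rules are added until the marginal gain
    from one more rule drops below min_gain.
    """
    for i in range(1, len(metric_by_count)):
        if metric_by_count[i] - metric_by_count[i - 1] < min_gain:
            return i  # keep the first i rules; rule i + 1 adds too little
    return len(metric_by_count)


# Hypothetical curve: large gains from the first three rules, then a plateau.
curve = [0.40, 0.55, 0.63, 0.635, 0.636]
print(select_rule_count(curve))  # 3
```

In practice the curve could be inspected graphically, as the text describes, rather than thresholded automatically.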

[0055] Initial rule cleaning, performance evaluation, and filtration can also include accounting for the performance criteria as evaluated against the configuration population as well as the percentage of alerts generated and number of rules. The rules may be cleaned and filtered to maximize performance criteria, such as precision and recall, while reducing the alert percentage and the number of rules.

[0056] Rule cleaning can include evaluating the initial rules to ensure that the final rules are internally consistent. Such evaluation can include ensuring that each rule contains at least one fraud-related attribute and replacing at least one attribute if no fraud-based attribute is detected. Initial rule thresholds can be adjusted to improve accuracy. Overlapping rules, such as those with the same attributes and components but with differing thresholds, can be compared and rules with redundant thresholds removed.

[0057] Rule cleaning can include de-correlating the initial rule set. The initial rule set may be iteratively tested on a variety of attributes, and the performance of each rule correlated with other rules within the initial rule set. Highly correlated rules may be indicative of unnecessary redundancies otherwise impairing the efficiency of the rules during a subsequent evaluation stage through requiring increased computing expenditures. Highly correlated rules may then be evaluated and reduced such that one rule selected from a set of rules satisfying a rule correlation threshold is selected for inclusion in the final rule set, while the remaining rules are removed from the final rule set.
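The de-correlation step can be sketched as a greedy filter over each rule's trigger history. In this hypothetical illustration, each rule is represented by a list of 0/1 trigger outcomes over the same test entities, rules are assumed to be supplied in order of preference (for example, descending precision), and the correlation threshold of 0.9 is an arbitrary example value.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def decorrelate(triggers: Dict[str, List[int]], threshold: float = 0.9) -> List[str]:
    """Keep a rule only if it is not highly correlated with a rule already kept."""
    kept: List[str] = []
    for name, outcomes in triggers.items():
        if all(abs(pearson(outcomes, triggers[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical trigger outcomes for three rules over six test entities.
triggers = {
    "rule_a": [1, 1, 0, 0, 1, 0],
    "rule_b": [1, 1, 0, 0, 1, 0],  # fires on exactly the same entities as rule_a
    "rule_c": [0, 1, 0, 1, 0, 1],
}
print(decorrelate(triggers))  # ['rule_a', 'rule_c']
```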

[0058] Subsequent to the initial rule generation and rule cleaning described above, a reduced set of attribute-based rules is generated. The final set of rules can then be incorporated into the credit washing detection algorithm such that once data corresponding to an entity is received, the data including a set of attributes, each rule of the set of rules will or will not trigger. A first risk score is then generated, where the first risk score corresponds to the number or percentage of the reduced set of attribute-based rules triggered based on the entity’s associated attributes and the rule configuration.

Example Architecture of the Second Model Based on Unsupervised Risk Scoring

[0059] Trends in the dispute procedure indicate that a large majority of consumers tend to rarely if ever dispute or inquire into tradelines over extended time windows. On the other hand, there is a long tail (i.e., where a subset of consumers dispute with much greater frequency compared to a majority of the consumers). High activity in the dispute process may be indicative of credit washing risk, and thus unsupervised learning models trained to analyze patterns in dispute timing can augment the process in generating a second risk score pertaining to credit washing which can be combined with the first risk score discussed above to output a credit washing risk score assessing the likely risk of credit washing activity.

[0060] The second model structure, following an unsupervised learning approach, can involve a two-step clustering process where, for each entity, the entity’s dispute submissions are clustered into groups. Cluster-level attributes based on these groups are derived, and then a second set of clusters is developed for the entities based on the cluster-level attributes derived from the first set of clusters.

[0061] FIG. 4 is a flow diagram depicting an example set of operations 400 for training and applying a second model which applies unsupervised clustering techniques to determine risk scores, according to certain aspects of the present disclosure. For illustrative purposes, the process 400 is described with reference to implementations described above with respect to one or more examples described herein. Other implementations, however, are possible. In some aspects, the operations in FIG. 4 may be implemented in program code that is executed by one or more computing devices such as the credit washing risk scoring server 118 depicted in FIG. 1. In some aspects of the present disclosure, one or more operations shown in FIG. 4 may be omitted or performed in a different order. Similarly, additional operations not shown in FIG. 4 may be performed.

[0062] At block 402, the operations 400 include receiving a set of dispute activities corresponding to a set of entities. Dispute activities can be received as dispute data which may be accessed from a dispute data repository (e.g., e-OSCAR). Dispute activities can include data pertaining to various dispute categories including employment disputes, PII disputes, and tradeline disputes. Compared to data sets accessed by the first model for supervised rule generation, the scope of the set of dispute activities received per block 402 can span a much broader time range, such as the entire time range of dispute activity for each entity. By applying a broader time range of dispute activity data (e.g., at least a year or longer), a sufficient time scale of dispute activity data can be gathered to uncover a greater number of dispute patterns including potential credit washing patterns.

[0063] At block 404, the operations 400 include generating a first set of clusters based at least in part on the set of dispute activities. To generate the first set of clusters, individual dispute activities may be clustered via kernel density estimation (KDE). The breadth of dispute activities that may be used to form the first set of clusters may be overbroad and thus a subset of such dispute activities may be selected. As an example, the subset of selected dispute activities for use in the first clustering procedure may comprise any tradeline or inquiry disputes related to fraud. By limiting the set of dispute activities used in the first clustering procedure, noise related to the majority of the disputes which are non-fraudulent may be removed from the clustering process.
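The text does not specify how KDE is turned into clusters. One common approach, shown here purely as an assumption, is to estimate a one-dimensional density over an entity's dispute dates and cut the timeline at local minima of that density. The function name, the Gaussian kernel, the integer-day grid, and the 10-day bandwidth are all hypothetical choices.

```python
from bisect import bisect_right
from math import exp
from typing import List, Sequence


def kde_clusters(days: Sequence[int], bandwidth: float = 10.0) -> List[List[int]]:
    """Group dispute days by cutting at local minima of a Gaussian KDE.

    days holds dispute submission dates as integer day offsets for one entity.
    """
    points = sorted(days)
    if not points:
        return []
    grid = list(range(points[0], points[-1] + 1))
    # Unnormalized density; normalization does not move the minima.
    density = [
        sum(exp(-0.5 * ((g - p) / bandwidth) ** 2) for p in points) for g in grid
    ]
    # A grid day is a cut when the density dips there relative to its neighbors.
    cuts = [
        grid[i]
        for i in range(1, len(grid) - 1)
        if density[i] < density[i - 1] and density[i] <= density[i + 1]
    ]
    clusters: List[List[int]] = [[] for _ in range(len(cuts) + 1)]
    for p in points:
        clusters[bisect_right(cuts, p)].append(p)
    return [c for c in clusters if c]


# Hypothetical fraud-related dispute days for one entity.
print(kde_clusters([0, 3, 5, 120, 124, 400]))
# [[0, 3, 5], [120, 124], [400]]
```

The bandwidth controls how close in time two disputes must be to fall into the same cluster, so it would need tuning against real dispute data.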

[0064] At block 406, the operations 400 include generating a set of cluster-level attributes based at least in part on the first set of clusters. Once the first set of clusters is formed, based at least in part on the dispute activities or a reduced set of the dispute activities, cluster-level attributes for each entity may be derived. Such cluster-level attributes can include the number of clusters in the first set of clusters, a number or percentage of inquiry-only clusters, an average or max number of disputes per cluster, a number of tradelines, and / or inquiry disputes per cluster in the first set of clusters. Such examples are non-limiting and other cluster-level attributes for each entity may be derived.
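The cluster-level attributes listed above can be sketched as a small aggregation over one entity's first-step clusters. The `Dispute` type, its `kind` values, and the attribute key names are hypothetical; each cluster is assumed to be a non-empty list of disputes.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Dispute:
    day: int
    kind: str  # assumed to be "tradeline" or "inquiry"


def cluster_level_attributes(clusters: List[List[Dispute]]) -> Dict[str, float]:
    """Summarize one entity's first-step clusters into entity-level attributes."""
    if not clusters:
        return {
            "num_clusters": 0,
            "pct_inquiry_only_clusters": 0.0,
            "avg_disputes_per_cluster": 0.0,
            "max_disputes_per_cluster": 0,
            "avg_tradeline_disputes_per_cluster": 0.0,
        }
    sizes = [len(c) for c in clusters]
    inquiry_only = sum(all(d.kind == "inquiry" for d in c) for c in clusters)
    tradeline_counts = [sum(d.kind == "tradeline" for d in c) for c in clusters]
    return {
        "num_clusters": len(clusters),
        "pct_inquiry_only_clusters": inquiry_only / len(clusters),
        "avg_disputes_per_cluster": mean(sizes),
        "max_disputes_per_cluster": max(sizes),
        "avg_tradeline_disputes_per_cluster": mean(tradeline_counts),
    }


# Hypothetical entity with two clusters of disputes.
example_clusters = [
    [Dispute(0, "tradeline"), Dispute(3, "tradeline"), Dispute(5, "inquiry")],
    [Dispute(120, "inquiry")],
]
attrs = cluster_level_attributes(example_clusters)
print(attrs["num_clusters"], attrs["pct_inquiry_only_clusters"])  # 2 0.5
```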

[0065] At block 408, the operations 400 include generating a set of second clusters based at least in part on the set of cluster-level attributes. After the cluster-level attributes are generated, the entities may again be clustered based at least in part on the cluster-level attributes. The second clustering procedure can use the same or different clustering techniques as the first clustering procedure. In one example, while the first clustering procedure applies KDE clustering, the second clustering procedure applies K-means clustering. Other centroid-based clustering techniques may be applied in addition to other clustering paradigms such as density or distribution-based clustering. The second clustering procedure can thus cluster entities into different risk-based clusters according to associated attributes, particularly timing-based attributes indicating the rate at which entities dispute transactions.
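For the K-means example, a bare-bones version of Lloyd's algorithm over per-entity attribute vectors might look as follows. This is a sketch only: the feature choice, the seeded random initialization, and the absence of feature scaling are simplifying assumptions, and a production system would more likely rely on an established library implementation.

```python
import random
from typing import List, Sequence, Tuple


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    points: List[List[float]], k: int, iters: int = 100, seed: int = 0
) -> Tuple[List[int], List[List[float]]]:
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels: List[int] = []
    for _ in range(iters):
        # Assignment step: each entity joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical entity vectors: [num_clusters, avg_disputes_per_cluster].
entities = [[1, 1], [1, 2], [8, 5], [9, 6]]
labels, centroids = kmeans(entities, k=2)
print(labels[0] == labels[1], labels[2] == labels[3], labels[0] != labels[2])
# True True True
```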

[0066] At block 410, the operations 400 include generating a second risk score based at least in part on the second set of clusters. Different risk levels may be assigned to each cluster. For instance, a first group of attribute-based clusters may be a low-risk group, where in a first cluster, entities dispute quickly (e.g., within a threshold time of a given transaction) and are highly likely to dispute the same trades again. A second cluster within the low-risk group can cluster entities based on high frequency of inquiries compared to low tradeline deletion rates, indicating a low credit washing risk.

[0067] Assigning risk scores, per block 410, can include one or more of human-in-the-loop or automated based scoring processes. In human-in-the-loop examples, users may manually review the clusters generated per block 408 and assign certain risk levels for each cluster. In automated examples, one or more of relative distance mapping between clusters and attributes within clusters may be used to assign scores. In an example, certain clustered attributes may be closer to a ground truth for credit washing behavior, such as tradeline deletion attributes. Tradeline deletion attributes within each cluster may then be used as a proxy to assign the risk score for each cluster. If a given cluster includes a higher tradeline deletion score, that cluster, and each attribute in that cluster, may then be assigned a higher risk score compared to a cluster including a lower tradeline deletion score.
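The automated, proxy-based assignment can be illustrated by ranking clusters on their members' tradeline deletion rates. In this hypothetical sketch the cluster score is the mean deletion rate rescaled to the range 0 to 1; the function name, the min-max rescaling, and the example rates are assumptions rather than the disclosed scoring method.

```python
from statistics import mean
from typing import Dict, List


def cluster_risk_scores(deletion_rates: Dict[int, List[float]]) -> Dict[int, float]:
    """Score each cluster by its mean tradeline deletion rate, rescaled to [0, 1].

    deletion_rates maps a cluster label to the deletion rates of its member
    entities. A higher mean deletion rate yields a higher risk score.
    """
    means = {label: mean(rates) for label, rates in deletion_rates.items()}
    low, high = min(means.values()), max(means.values())
    if high == low:
        return {label: 0.0 for label in means}  # no cluster stands out
    return {label: (m - low) / (high - low) for label, m in means.items()}


# Hypothetical deletion rates for entities in three second-step clusters.
scores = cluster_risk_scores({
    0: [0.05, 0.15],
    1: [0.40, 0.60],
    2: [0.20, 0.30],
})
print(sorted(scores, key=scores.get))  # [0, 2, 1], lowest to highest risk
```

Every entity in a cluster would then inherit that cluster's score as its second risk score.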

[0068] A second group of attribute-based clusters may be assigned a medium risk score. The second, medium risk group, may be defined as including entities fitting within multiple clusters who are unlikely to dispute the same trade again, and are defined as disputing according to a set schedule. A significant difference between the first low-risk cluster group and the second medium risk cluster group is the time between clusters, in addition to the medium risk group having a higher tradeline deletion rate compared to the first low-risk group.

[0069] A third group of attribute-based clusters may be assigned a high-risk score. The third, high-risk group can consist of entities who are identified as fitting in the greatest number of clusters, are highly unpredictable in their dispute timing, and are most likely to dispute the same trade again. The third, high-risk group can also have the highest tradeline deletion rate of the three groups.

[0070] While three groups of attribute-based clusters are described according to increasing levels of severity, it is to be appreciated that the unsupervised model risk-based scoring can include more or fewer risk score assignments. For instance, a binary score of “risk” or “no risk” may be applied as opposed to a three-tier ranking structure. Alternatively, four or more degrees of risk assignment can be applied. In other examples, the risk score generated by the unsupervised clustering approach can be a numerical or continuous score to be integrated with the first, rule-based score.

Example Architecture of the Third Model Based on Insightful Attribute Scoring

[0071] According to some examples, the credit washing risk score system 130 can include a third model for generating a third set of risk scores associated with an entity based at least in part on the entity attributes. The third set of risk scores may be evaluated with the first set of risk scores and the second set of risk scores to generate the final output credit washing risk score.

[0072] The third model can include a specific set of attribute-based rules, similar to those generated by the first model, where the specific set of attribute-based rules of the third model are afforded special weight in relation to the first model set of rules. For instance, the third model rules may be weighted independently, as opposed to being summed against other rules as described with respect to the first model rules. Referred to as insightful attributes, the attribute-based rules of the third model provide additional means of evaluating credit washing risk according to given attributes associated with an entity.

[0073] A first example of an insightful attribute which may be applied per the third model set of rules includes a payment-interval attribute corresponding to a given tradeline. Also referred to as a “paid a while” tradeline attribute, the payment-interval attribute measures the time between a tradeline being opened and an identified delinquency. The time until delinquency may be compared against a time until delinquency threshold, where if the time until delinquency exceeds the time until delinquency threshold, the third risk score may be assigned a higher value than if the time until delinquency threshold was not met.

[0074] Different delinquencies may be compared against. For instance, delinquencies may be assigned on a threshold scale, where a subset of all delinquencies may be identified as major delinquencies, the major delinquencies identified as occurring after a certain period of continuous delinquency. A first time until delinquency may describe the time between a tradeline being opened until the first identified delinquency. The first time until delinquency may be compared against a first time until delinquency threshold to determine a first time until delinquency attribute score. A second time until delinquency attribute may describe the time between a tradeline being opened until the first major identified delinquency. The second time until delinquency may be compared against a second time until delinquency threshold to determine a second time until delinquency attribute score. The first and second time until delinquency attribute scores may be combined to generate a final time until delinquency attribute score, which may form the payment-interval attribute. If no delinquency is on file, then the number of days from the tradeline account opening to a dispute submission date may otherwise provide a score to form the payment-interval attribute.
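A minimal sketch of the payment-interval attribute follows, assuming binary per-threshold scores that are averaged into the final score. The threshold values (180 and 365 days), the averaging, and the function and parameter names are hypothetical; the text leaves the scoring and combination method open.

```python
from datetime import date
from typing import Optional


def payment_interval_score(
    opened: date,
    first_delinquency: Optional[date],
    first_major_delinquency: Optional[date],
    dispute_submitted: Optional[date],
    first_threshold_days: int = 180,
    major_threshold_days: int = 365,
    no_delinquency_threshold_days: int = 365,
) -> float:
    """Score how long a tradeline was paid before it went delinquent."""

    def exceeds(event: Optional[date], threshold_days: int) -> float:
        # 1.0 when the event happened later than the threshold after opening.
        if event is None:
            return 0.0
        return 1.0 if (event - opened).days > threshold_days else 0.0

    if first_delinquency is None and first_major_delinquency is None:
        # No delinquency on file: fall back to days from opening to dispute.
        return exceeds(dispute_submitted, no_delinquency_threshold_days)
    first_score = exceeds(first_delinquency, first_threshold_days)
    major_score = exceeds(first_major_delinquency, major_threshold_days)
    return (first_score + major_score) / 2  # assumed combination: simple average


# First delinquency after 243 days (over 180); major after 334 days (under 365).
print(payment_interval_score(
    date(2023, 1, 1), date(2023, 9, 1), date(2023, 12, 1), None
))  # 0.5
```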

[0075] A second example of an insightful attribute which may be applied per the third model set of rules includes a proxy-dispute attribute. A common method credit washers use to improve their chances of successfully removing a bad debt is leveraging proxies, such as credit clinics, to dispute on their behalf. Credit clinics include organizations associated with entities assigned to boost the entities’ credit score. Thus, use of such proxies may be indicative of credit washing fraud, and so the third model may determine risk scores based in part on the use of such proxies. The proxy-dispute attribute may be defined as the number of disputes associated with an entity claiming fraud through a proxy over an allotted period.

[0076] A third example of an insightful attribute which may be applied per the third model set of rules includes a PII-variation attribute. In some cases, credit washers may attempt to remove compromised PII from their credit file (e.g., cutting off ties with PII that are associated with previous fraudulent activities). The PII-variation attribute may be defined by the number of PII disputes over an allotted period. In some examples, the period can include a complete period (e.g., from creation of the entity’s CID account).
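Both the proxy-dispute attribute and the PII-variation attribute reduce to counting disputes of a given kind within a period. The sketch below illustrates this; the `DisputeRecord` fields, the category strings, and the convention that a missing window means the complete history are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class DisputeRecord:
    submitted: date
    category: str       # assumed values: "tradeline", "inquiry", "pii"
    claims_fraud: bool
    via_proxy: bool     # True when filed by a proxy such as a credit clinic


def in_window(d: DisputeRecord, as_of: date, window_days: Optional[int]) -> bool:
    # window_days=None is taken to mean the complete history up to as_of.
    if d.submitted > as_of:
        return False
    return window_days is None or d.submitted >= as_of - timedelta(days=window_days)


def proxy_dispute_attribute(
    disputes: List[DisputeRecord], as_of: date, window_days: Optional[int] = 365
) -> int:
    """Count fraud-claiming disputes filed through a proxy within the period."""
    return sum(
        d.claims_fraud and d.via_proxy and in_window(d, as_of, window_days)
        for d in disputes
    )


def pii_variation_attribute(
    disputes: List[DisputeRecord], as_of: date, window_days: Optional[int] = None
) -> int:
    """Count PII disputes within the period (complete history by default)."""
    return sum(d.category == "pii" and in_window(d, as_of, window_days) for d in disputes)


history = [
    DisputeRecord(date(2022, 3, 1), "pii", False, False),
    DisputeRecord(date(2024, 1, 10), "tradeline", True, True),
    DisputeRecord(date(2024, 2, 5), "tradeline", True, False),
    DisputeRecord(date(2024, 4, 20), "pii", False, True),
]
print(proxy_dispute_attribute(history, date(2024, 6, 1)))  # 1
print(pii_variation_attribute(history, date(2024, 6, 1)))  # 2
```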

[0077] The third model including the insightful attributes can generate the third set of risk scores based on one or more of the attributes above including the payment-interval attribute, the proxy-dispute attribute, and / or the PII-variation attribute. When multiple of the payment-interval attribute, the proxy-dispute attribute, and the PII-variation attributes are included, each included attribute may be afforded a different or same weight in generating the third set of risk scores.

Example of a Credit Washing Risk Computing System

[0078] As described according to the above examples, multiple models may each be used to generate respective sets of risk scores including a first model generating first risk scores, second model generating second risk scores, and third model generating third risk scores. One or more of the sets of risk scores may then be used to generate the final risk score for assessing a risk that users are engaging in fraudulent credit washing activities.

[0079] In examples where multiple sets of risk scores are aggregated to generate the final risk score, the sets of risk scores can be aggregated in a variety of manners. In some instances, the sets of risk scores can be aggregated according to equal weights, where each of the first, second, and third sets may be evenly weighted as a third of the final risk score. In the same and other instances, users may manipulate the weights of the sets of risk scores for further tuning. In further examples still, the sets of risk scores can be aggregated via an ensemble approach, where a fourth model may be implemented to modify the weights of the sets of risk scores.
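The equal-weight and user-tuned aggregation options can be sketched as a normalized weighted average. This illustration assumes that the model scores have already been placed on a common 0 to 1 scale; the function name, the dictionary keys, and the example values are hypothetical, and the ensemble (fourth model) option is not shown.

```python
from typing import Dict, Optional


def aggregate_risk_score(
    scores: Dict[str, float], weights: Optional[Dict[str, float]] = None
) -> float:
    """Combine per-model risk scores into one final score.

    With no weights given, every model contributes equally (e.g., one third
    each for three models). Weights are normalized by their sum.
    """
    if not scores:
        raise ValueError("at least one model score is required")
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total


model_scores = {"first": 0.6, "second": 0.3, "third": 0.9}
equal = aggregate_risk_score(model_scores)
tuned = aggregate_risk_score(model_scores, {"first": 0.5, "second": 0.5, "third": 0.0})
print(round(equal, 3), round(tuned, 3))  # 0.6 0.45
```

Setting a weight to zero, as in the tuned example, corresponds to the case where the optional third model is not used.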

[0080] The final generated risk score, determined based on the one or more sets of risk scores generated by the respective first, second, and third models, can be used to affect operations of a greater computing environment. The final risk score can be output in the form of a notification, alert, warning, or other message to various user interfaces, such as that of an administrator or merchant who is party to a given dispute.

[0081] In the same or other examples, the final generated risk score can be used to control access to computing resources and interactive computing environments. Particularly low scores (e.g., below an exceedingly low threshold) can indicate that a given dispute is an attempt at credit washing. A repeated pattern of disputes below the threshold can more strongly indicate that a particular entity is likely to be engaging in credit washing. A set quantity of disputes below the dispute threshold may then be used to identify fraudulent behavior within a dispute service. In response, the credit washing risk computing system can flag the entity initiating the disputes as a fraudulent entity and respond by restricting or limiting the entity’s access to one or more secure computing environments, including the dispute service. Further, responsive to such determined anomalous behavior, the data enrichment service may automatically initiate further authentication requests to establish the identity of the disputing entity.

Example of a Credit Washing Risk Computing System

[0082] Any suitable computing system or group of computing systems can be used to perform the operations described herein. For example, FIG. 5 is a block diagram illustrating an example of a computing device 500 that can be used to implement the credit washing risk scoring described above, according to certain examples. The computing device 500 can include various devices for communicating with other devices in a computing environment.

[0083] The computing device 500 can include a processor 502 that is communicatively coupled to a memory 504. The processor 502 can execute computer-executable program code stored in the memory 504, can access information stored in the memory 504, or both. Program code may include machine-executable instructions that may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc., may be passed, forwarded, or transmitted via any suitable means including memory sharing, message passing, token passing, network transmission, among others.

[0084] Examples of a processor 502 can include a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or any other suitable processing device. The processor 502 can include any suitable number of processing devices, including one. The processor 502 can include or communicate with a memory 504. The memory 504 can store program code that, when executed by the processor 502, causes the processor 502 to perform the operations described herein.

[0085] The memory 504 can include any suitable non-transitory computer-readable medium. The computer-readable medium can include any electronic, optical, magnetic, or other storage device capable of providing a processor with computer-readable program code or other program code. Non-limiting examples of a computer-readable medium can include a magnetic disk, memory chip, optical storage, flash memory, storage class memory, ROM, RAM, an ASIC, magnetic storage, or any other medium from which a computer processor can read and execute program code. The program code may include processor-specific program code generated by a compiler or an interpreter from code written in any suitable computer-programming language. Examples of suitable programming languages can include Hadoop, C, C++, C#, Visual Basic, Java, Python, Perl, JavaScript, ActionScript, etc.

[0086] The computing device 500 may also include a number of external or internal devices such as input or output devices. For example, the computing device 500 is illustrated with an input / output interface 508 that can receive input from input devices or provide output to output devices. A bus 506 can also be included in the computing device 500. The bus 506 can communicatively couple one or more components of the computing device 500.

[0087] The computing device 500 can execute program code 514 that can include the presently described models, testing operations, and the like. The computing device 500 can also generate and store program data 516 that can include any of the described data structures. The program code 514 and program data 516 may be resident in any suitable computer-readable medium and may be executed on any suitable processing device. Executing the program code 514 can configure the processor 502 to perform one or more of the operations described herein.

[0088] In some aspects, the computing device 500 can include one or more output devices. One example of an output device can be the network interface device 510 depicted in FIG. 5. A network interface device 510 can include any device or group of devices suitable for establishing a wired or wireless data connection to one or more data networks described herein. Non-limiting examples of the network interface device 510 can include an Ethernet network adapter, a modem, etc.

[0089] Another example of an output device can include the presentation device 512 depicted in FIG. 5. A presentation device 512 can include any device or group of devices suitable for providing visual, auditory, or other suitable sensory output. Non-limiting examples of the presentation device 512 can include a touchscreen, a monitor, a speaker, a separate mobile computing device, etc. In some aspects, the presentation device 512 can include a remote client-computing device that can communicate with the computing device 500 using one or more data networks described herein. In other aspects, the presentation device 512 can be optional.

[0090] The foregoing description of some examples has been presented only for the purpose of illustration and description and is not intended to be exhaustive or to limit the disclosure to the precise forms disclosed. Numerous modifications and adaptations thereof will be apparent to those skilled in the art without departing from the spirit and scope of the disclosure.

Claims

1. A computer-implemented method comprising: receiving candidate data associated with an entity, the candidate data including a first set of entity attributes and a second set of entity attributes, the first set of entity attributes including attributes related to anomalous behavior patterns and the second set of entity attributes including attributes related to entity dispute patterns; generating by a first model, a first risk score associated with the entity, wherein the first model determines the first risk score based at least in part on a threshold number of reduced attribute-based rules triggered based at least in part on the first set of entity attributes; generating by a second model, a second risk score associated with the entity, wherein the second model determines the second risk score based at least in part on clustering the second set of entity attributes; and generating an aggregate risk score based at least in part on the first risk score and the second risk score.

2. The computer-implemented method of claim 1, wherein the aggregate risk score comprises a credit washing risk score that is predictive of a likelihood that the entity is engaging in an act of credit washing.

3. The computer-implemented method of claim 2, further comprising transmitting to a remote computing device, a responsive message including at least the credit washing risk score and associated information for the entity for use in controlling access of the entity to one or more interactive computing environments.

4. The computer-implemented method of claim 1, wherein the candidate data includes a third set of entity attributes, and the method further comprises: generating by a third model, a third risk score associated with the entity; and generating the aggregate risk score further based on the third risk score, wherein the third model determines the third risk score based at least on the third set of entity attributes compared with one or more insightful attributes including: a payment-interval attribute; a proxy-dispute attribute; or a personally identifiable information ("PII") variation attribute.

5. The computer-implemented method of claim 1, wherein the reduced attribute-based rules are generated by the first model, wherein the first model generates the reduced attribute-based rules through operations comprising: receiving a set of training labels; receiving a set of training attributes; generating an initial set of attribute-based rules based at least in part on the set of training labels and set of training attributes; and cleaning the initial set of attribute-based rules to reduce the initial set of attribute-based rules to generate the reduced attribute-based rules.

6. The computer-implemented method of claim 1, wherein clustering the second set of entity attributes comprises: receiving a set of dispute activities corresponding to a set of entities; generating a first set of clusters based at least in part on the set of dispute activities; generating a set of cluster-level attributes based at least in part on the first set of clusters; generating a second set of clusters based at least in part on the set of cluster-level attributes; and generating the second risk score based at least in part on the second set of clusters.

7. The computer-implemented method of claim 6, wherein the cluster-level attributes include: a number of clusters in the first set of clusters; a number of disputes per cluster in the first set of clusters; and a number of tradelines per cluster in the first set of clusters.

8. A non-transitory computer-readable storage medium having program code executable by a processing device to perform operations comprising: receiving candidate data associated with an entity, the candidate data including a first set of entity attributes and a second set of entity attributes, the first set of entity attributes including attributes related to anomalous behavior patterns and the second set of entity attributes including attributes related to entity dispute patterns; generating by a first model, a first risk score associated with the entity, wherein the first model determines the first risk score based at least in part on a threshold number of reduced attribute-based rules triggered based at least in part on the first set of entity attributes; generating by a second model, a second risk score associated with the entity, wherein the second model determines the second risk score based at least in part on clustering the second set of entity attributes; and generating an aggregate risk score based at least in part on the first risk score and the second risk score.

9. The non-transitory computer-readable storage medium of claim 8, wherein the aggregate risk score comprises a credit washing risk score, predictive of a likelihood that the entity is engaging in an act of credit washing.

10. The non-transitory computer-readable storage medium of claim 9, wherein the operations further comprise transmitting to a remote computing device, a responsive message including at least the credit washing risk score and associated information for the entity for use in controlling access of the entity to one or more interactive computing environments.

11. The non-transitory computer-readable storage medium of claim 8, wherein the candidate data includes a third set of entity attributes, and wherein the operations further comprise: generating by a third model, a third risk score associated with the entity; and generating the aggregate risk score further based at least in part on the third risk score, wherein the third model determines the third risk score based at least in part on the third set of entity attributes, wherein the third model determines the third risk score based at least in part on comparing the third set of entity attributes with one or more insightful attributes including: a payment-interval attribute; a proxy-dispute attribute; or a PII variation attribute.

12. The non-transitory computer-readable storage medium of claim 8, wherein the reduced attribute-based rules are generated by the first model, wherein the first model generates the reduced attribute-based rules through operations comprising: receiving a set of training labels; receiving a set of training attributes; generating an initial set of attribute-based rules based at least in part on the set of training labels and set of training attributes; and cleaning the initial set of attribute-based rules to reduce the initial set of attribute-based rules to generate the reduced attribute-based rules.

13. The non-transitory computer-readable storage medium of claim 8, wherein clustering the second set of entity attributes comprises: receiving a set of dispute activities corresponding to a set of entities; generating a first set of clusters based at least in part on the set of dispute activities; generating a set of cluster-level attributes based at least in part on the first set of clusters; generating a second set of clusters based at least in part on the set of cluster-level attributes; and generating the second risk score based at least in part on the second set of clusters.

14. The non-transitory computer-readable storage medium of claim 13, wherein the cluster-level attributes include: a number of clusters in the first set of clusters; a number of disputes per cluster in the first set of clusters; and a number of tradelines per cluster in the first set of clusters.

15. A computing system comprising: a processing device; a non-transitory computer-readable storage medium having program code executable by the processing device to perform operations comprising: receiving candidate data associated with an entity, the candidate data including a first set of entity attributes and a second set of entity attributes, the first set of entity attributes including attributes related to anomalous behavior patterns and the second set of entity attributes including attributes related to entity dispute patterns; generating by a first model, a first risk score associated with the entity, wherein the first model determines the first risk score based at least in part on a threshold number of reduced attribute-based rules triggered based at least in part on the first set of entity attributes; generating by a second model, a second risk score associated with the entity, wherein the second model determines the second risk score based at least in part on clustering the second set of entity attributes; and generating an aggregate risk score based at least in part on the first risk score and the second risk score.

16. The computing system of claim 15, wherein the aggregate risk score comprises a credit washing risk score, predictive of a likelihood that the entity is engaging in an act of credit washing.

17. The computing system of claim 16, wherein the operations further comprise transmitting to a remote computing device, a responsive message including at least the credit washing risk score and associated information for the entity for use in controlling access of the entity to one or more interactive computing environments.

18. The computing system of claim 15, wherein the candidate data includes a third set of entity attributes, and wherein the operations further comprise: generating by a third model, a third risk score associated with the entity; and generating the aggregate risk score further based on the third risk score, wherein the third model determines the third risk score based at least in part on the third set of entity attributes, wherein the third model determines the third risk score at least in part based on comparing the third set of entity attributes with one or more insightful attributes including: a payment-interval attribute; a proxy-dispute attribute; or a PII variation attribute.

19. The computing system of claim 15, wherein the reduced attribute-based rules are generated by the first model, wherein the first model generates the reduced attribute-based rules through operations comprising: receiving a set of training labels; receiving a set of training attributes; generating an initial set of attribute-based rules based at least in part on the set of training labels and set of training attributes; and cleaning the initial set of attribute-based rules to reduce the initial set of attribute-based rules to generate the reduced attribute-based rules.

20. The computing system of claim 15, wherein clustering the second set of entity attributes comprises: receiving a set of dispute activities corresponding to a set of entities; generating a first set of clusters based at least in part on the set of dispute activities; generating a set of cluster-level attributes based at least in part on the first set of clusters; generating a second set of clusters based at least in part on the set of cluster-level attributes; and generating the second risk score based at least in part on the second set of clusters.