Fund chain gang identification method and device, computing device, and medium

By acquiring the transaction chains associated with black seeds and using a directed edge attention dual representation model to eliminate low-risk transactions, fund chain gangs are identified. This addresses the low identification accuracy and heavy reliance on manual effort of existing technologies, and achieves efficient and comprehensive identification of fund chain gangs.

CN115271939B · Active · Publication Date: 2026-06-12 · ALIPAY (HANGZHOU) INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
ALIPAY (HANGZHOU) INFORMATION TECH CO LTD
Filing Date
2022-06-20
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

Existing technologies identify fund chain gangs with low accuracy, struggle to identify gangs with clear fund flows quickly and comprehensively, and incur high labor costs.

Method used

Predetermined black seeds are acquired, the transaction chains associated with the black seeds are determined from full-domain transaction data, low-risk transactions are eliminated using a directed edge attention dual representation model, and the fund chain gangs are identified from the remaining fund chains associated with the black seeds.

Benefits of technology

It improves the accuracy and coverage of identifying fund chain gangs, reduces labor costs, and enables rapid identification of gangs with clear fund flows, with high timeliness.

Generated by Eureka AI based on patent content.

Abstract

The embodiments of this specification provide a fund chain gang identification method and device, a computing device, and a medium. The method comprises: acquiring a predetermined black seed, the black seed being a customer identified as having a money laundering risk after review; determining a transaction chain associated with the black seed according to global transaction data; determining a first fund chain according to the transaction chain associated with the black seed, the first fund chain being a fund chain associated with the black seed; determining the risk degree of each transaction in the first fund chain, and removing transactions whose risk degree is lower than a preset risk degree from the first fund chain to obtain a second fund chain; and determining the fund chain gang to which the black seed belongs according to the second fund chain. The method can improve the identification accuracy of fund chain gangs.

Description

Technical Field

[0001] One or more embodiments of this specification relate to the field of anti-money laundering technology, and in particular to a method and apparatus for identifying fund chain gangs, a computing device, and a computer-readable storage medium.

Background Technology

[0002] Money laundering through suspicious transactions takes many forms. After receiving illicit funds, money launderers use various methods to obscure the source of the funds so that the illegally obtained funds appear usable. This is what is commonly referred to as money laundering. Such money laundering crimes have a clear organizational structure, with customers in different roles collaborating to conduct illegal transactions. They are characterized by their wide impact and high degree of harm.

[0003] In the field of anti-money laundering, the detection of suspicious transactions is a crucial component of risk control. The core element of money laundering is the abnormal flow of funds; therefore, identifying abnormal transactions and gaining a clear understanding of the groups behind the fund flows is a key objective of anti-money laundering operations.

Summary of the Invention

[0004] One or more embodiments of this specification describe a method and apparatus for identifying fund chain gangs, a computing device, and a computer-readable storage medium, which can improve the accuracy of identifying fund chain gangs.

[0005] According to the first aspect, a method for identifying fund chain gangs is provided, including:

[0006] Obtain a predetermined black seed; the black seed is a customer identified as having a money laundering risk after review;

[0007] Based on the global transaction data, determine the transaction chain associated with the black seed;

[0008] Based on the transaction chain associated with the black seed, determine a first fund chain; the first fund chain is the fund chain associated with the black seed;

[0009] Determine the risk level of each transaction in the first fund chain, and remove the transactions in the first fund chain whose risk level is lower than a preset risk level to obtain a second fund chain;

[0010] Based on the second fund chain, determine the fund chain gang to which the black seed belongs.

[0011] According to the second aspect, a device for identifying fund chain gangs is provided, comprising:

[0012] A first acquisition module, used to acquire a predetermined black seed; the black seed is a customer identified as having a money laundering risk after review;

[0013] A first determining module, used to determine the transaction chain associated with the black seed based on the global transaction data;

[0014] A second determining module, used to determine a first fund chain based on the transaction chain associated with the black seed; the first fund chain is the fund chain associated with the black seed;

[0015] A third determining module, used to determine the risk level of each transaction in the first fund chain and remove the transactions in the first fund chain whose risk level is lower than a preset risk level to obtain a second fund chain;

[0016] A fourth determining module, used to determine the fund chain gang to which the black seed belongs based on the second fund chain.

[0017] According to a third aspect, a computing device is provided, including a memory and a processor, wherein executable code is stored in the memory, and when the processor executes the executable code, it implements the method provided in the first aspect.

[0018] According to a fourth aspect, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed in a computer, causes the computer to perform the method provided in the first aspect.

[0019] The method, apparatus, computing device, and computer-readable storage medium for identifying fund chain gangs provided in the embodiments of this specification first acquire black seeds (customers identified as having money laundering risks after review), then determine the transaction chains associated with the black seeds from the global transaction data, determine a first fund chain based on those transaction chains, remove the lower-risk transactions from the first fund chain to obtain a second fund chain, and finally determine the fund chain gang to which each black seed belongs based on the second fund chain. The method thus starts from black seeds, searches for the relevant transaction chains, derives the corresponding fund chains from them, and determines the fund chain gang to which each black seed belongs from its associated fund chains. This requires almost no operational manpower. Moreover, the fund flow is clearly visible in the resulting fund chain gangs, so the method has strong fund-flow interpretability. Because a corresponding fund chain gang can be identified for each black seed, omissions are avoided and coverage is comprehensive. Because the method relies on almost no manual labor, it can improve identification accuracy and identify fund chain gangs quickly. In summary, the method has the advantages of saving labor costs, comprehensive coverage, high accuracy, strong fund-flow interpretability, and high timeliness.

Attached Figure Description

[0020] To more clearly illustrate the technical solutions in the embodiments or prior art of this specification, the drawings used in the description of the embodiments or prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this specification. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0021] Figure 1 is a flowchart illustrating a method for identifying fund chain gangs in one embodiment of this specification;

[0022] Figure 2 is a flowchart illustrating one implementation of step S400 in one embodiment of this specification;

[0023] Figure 3 is a flowchart illustrating one implementation of determining the risk level of each transaction in the first fund chain in S800 in one embodiment of this specification;

[0024] Figure 4 is a flowchart illustrating each iteration process in S820 in one embodiment of this specification;

[0025] Figure 5 is a flowchart illustrating one implementation of S1000 in one embodiment of this specification;

[0026] Figure 6 is a schematic diagram of each of the second fund chains in one embodiment of this specification;

[0027] Figure 7 is a flowchart illustrating one implementation of determining the optimal fund chain among the second fund chains corresponding to the target state in S1040 in one embodiment of this specification;

[0028] Figure 8 is an example schematic diagram of a return matrix in one embodiment of this specification;

[0029] Figure 9 is a schematic diagram of the initial Q matrix in one embodiment of this specification;

[0030] Figure 10 is a structural block diagram of a fund chain gang identification device in one embodiment of this specification.

Detailed Implementation

[0031] The solution provided in this specification will now be described with reference to the accompanying drawings.

[0032] The core element of money laundering is the abnormal flow of funds. Identifying unusual transactions and identifying groups with clear fund flows is a crucial requirement for anti-money laundering operations. Previously, various solutions for identifying money laundering groups have been explored, such as the Louvain algorithm for identifying criminal groups. While these solutions have shown some effectiveness on domestic websites, their accuracy rate is relatively low. Currently, controlling the level of money laundering risk has entered a critical stage, making accurate and rapid risk detection extremely important.

[0033] Therefore, according to the first aspect, embodiments of this specification provide a method for identifying money laundering groups. First, a predetermined "black seed" is obtained; the black seed is a customer identified as having money laundering risk after review. Then, based on the overall transaction data, the transaction chain associated with the black seed is determined. Next, based on the transaction chain associated with the black seed, a first money chain is determined; the first money chain is the money chain associated with the black seed. Then, the risk level of each transaction in the first money chain is determined, and transactions with a risk level lower than a preset risk level in the first money chain are removed to obtain a second money chain. Finally, based on the second money chain, the money chain group to which the black seed belongs is determined.

[0034] The following describes the specific implementation of the above concept.

[0035] Figure 1 is a flowchart illustrating a method for identifying fund chain gangs in one embodiment of this specification. It is understood that this method can be executed by any apparatus, device, platform, or device cluster with computing and processing capabilities. Referring to Figure 1, the method for identifying fund chain gangs includes the following steps S200 to S1000:

[0036] S200. Obtain a pre-determined "black seed"; the "black seed" refers to customers identified as having a money laundering risk after review.

[0037] S400. Based on the global transaction data, determine the transaction chain associated with the black seed;

[0038] S600. Based on the transaction chain associated with the black seed, determine the first funding chain; the first funding chain is the funding chain associated with the black seed.

[0039] S800: Determine the risk level of each transaction in the first capital chain, and remove transactions in the first capital chain whose risk level is lower than the preset risk level to obtain the second capital chain;

[0040] S1000. Based on the second funding chain, determine the funding chain group to which the black seed belongs.

[0041] The method for identifying fund chain gangs illustrated in Figure 1 first obtains black seeds, that is, customers identified as having a money laundering risk after review. It then identifies the transaction chains associated with these black seeds from the global transaction data. Based on these transaction chains, a first fund chain is determined. The lower-risk transactions in the first fund chain are removed to obtain a second fund chain. Finally, the fund chain gang to which each black seed belongs is determined based on the second fund chain. The method thus starts from black seeds, searches for the relevant transaction chains, derives the corresponding fund chains from them, and determines the fund chain gang to which each black seed belongs from its associated fund chains. This requires almost no operational manpower. Furthermore, the fund flow is clearly visible in the resulting fund chain gangs, giving strong fund-flow interpretability. Because a corresponding fund chain gang can be identified for each black seed, omissions are avoided and coverage is comprehensive. Because the method relies on almost no manual labor, it improves identification accuracy and allows fund chain gangs to be identified quickly. Therefore, the method has the advantages of saving labor costs, comprehensive coverage, high accuracy, strong fund-flow interpretability, and high timeliness.

[0042] The following describes how each step in Figure 1 is carried out.

[0043] S200. Obtain a pre-determined "black seed"; the "black seed" refers to customers identified as having a money laundering risk after review.

[0044] Understandably, so-called "black seeds" are customers identified as having money laundering risks through manual or platform-based review. Money laundering takes many forms, and these illegal transactions can also be described as crime-related, meaning illegal transactions that constitute money laundering.

[0045] In this context, trading platforms typically monitor and review clients through platform programs or manual processes. After a period of monitoring and review, if certain clients are found to pose a money laundering risk, they are flagged accordingly. For example, if a client has engaged in illegal fundraising transactions, they are flagged for illegal fundraising. In S200, clients with relevant flags can be obtained from the trading platform as "black seeds" (or "blacklisted clients").

[0046] S400. Based on the global transaction data, determine the transaction chain associated with the black seed;

[0047] The "full-domain transaction data" refers to all transaction data. Extracting the transaction chains associated with black seeds from all transaction data can avoid omissions. Of course, since the full-domain transaction data is very large, transaction data within a certain period can be selected.

[0048] Here, the transaction chain associated with a black seed refers to a chain formed by one or more transactions that have upstream or downstream transaction relationships with the black seed.

[0049] In a specific implementation, referring to Figure 2, the above-mentioned S400 may include the following steps S420 to S460:

[0050] S420. Extract the transaction data within the most recent preset time period from the global transaction data;

[0051] For example, transaction data from the most recent month or the most recent 90 days can be extracted from the overall transaction data.

[0052] The total transaction data includes all transaction data from domestic websites. This transaction data is stored in a database, allowing the extraction of transaction data within the most recent preset time period from all the data stored in the database.

[0053] S440. Convert the transaction data within the most recent preset time period into a corresponding transaction graph; wherein, the nodes in the transaction graph are customers, the edges between the nodes are transaction information between customers with transaction relationships, and the transaction information includes the transaction amount;

[0054] The transaction data is presented in tabular form. To make subsequent operations more intuitive and convenient, the tables are transformed into a transaction graph. The transaction graph contains many nodes and edges. Nodes represent customers, and the edges between two customers represent transaction information such as transaction time and amount. The edges are directional, determined by the flow of funds. For example, in a transaction where the transaction amount flows from customer A to customer B, the edge direction is from the node corresponding to customer A to the node corresponding to customer B.
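Steps S420 and S440 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout `(payer, payee, amount, timestamp)`, the function names, and the sample values are all assumptions introduced here.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical tabular records: (payer, payee, amount, timestamp).
records = [
    ("A", "B", 2_000_000, datetime(2022, 6, 1, 10, 0)),
    ("B", "C", 1_900_000, datetime(2022, 6, 1, 11, 30)),
    ("C", "D", 1_850_000, datetime(2022, 6, 1, 12, 15)),
    ("E", "F", 500, datetime(2022, 1, 5, 9, 0)),  # older than the window
]

def recent_records(rows, now, days=90):
    """S420: keep only transactions inside the most recent window."""
    cutoff = now - timedelta(days=days)
    return [row for row in rows if row[3] >= cutoff]

def build_transaction_graph(rows):
    """S440: nodes are customers; each directed edge follows the fund flow."""
    graph = defaultdict(list)  # payer -> [(payee, amount, timestamp), ...]
    for payer, payee, amount, timestamp in rows:
        graph[payer].append((payee, amount, timestamp))
    return graph

now = datetime(2022, 6, 20)
graph = build_transaction_graph(recent_records(records, now))
```

An adjacency list keyed by payer keeps the edge direction explicit, which matters later because the risk model treats inflows and outflows differently.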

[0055] S460. Track the flow of transaction funds related to the black seed in the transaction graph to obtain the transaction chain associated with the black seed.

[0056] Specifically, the node corresponding to the black seed can be found in the transaction graph. Then, based on the transaction relationships between that node and other nodes, the upstream and downstream nodes of the black seed are traced, followed by the upstream nodes of the upstream nodes, the downstream nodes of the downstream nodes, and so on, until the final downstream node and the initial upstream node are reached. The black seed and all its upstream and downstream nodes form the transaction chain associated with the black seed.
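The tracing in S460 amounts to following edges forward and backward from the seed. A minimal sketch using breadth-first search is shown below; the edge list and function names are hypothetical, and the patent does not specify a particular search algorithm.

```python
from collections import deque

# Hypothetical directed edges (payer, payee) taken from a transaction graph.
edges = [("X", "A"), ("A", "B"), ("B", "C"), ("C", "D"), ("P", "Q")]

def reachable(start, adjacency):
    """Breadth-first search along one direction of the graph."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

def chain_nodes(seed, edge_list):
    """Collect the seed plus everything upstream and downstream of it."""
    downstream, upstream = {}, {}
    for payer, payee in edge_list:
        downstream.setdefault(payer, []).append(payee)
        upstream.setdefault(payee, []).append(payer)
    return reachable(seed, downstream) | reachable(seed, upstream)

nodes = chain_nodes("A", edges)
```

Here the unrelated pair P and Q is left out because neither is reachable from seed A in either direction.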

[0057] In specific implementation, before executing S460, the method may further include: screening transaction chains with money laundering risks from the transaction graph according to a preset screening strategy; the preset screening strategy is determined based on the transaction characteristics of money laundering behavior. Correspondingly, S460 may specifically include: tracking the flow of transaction funds related to the black seed in the transaction chains with money laundering risks to obtain the transaction chains associated with the black seed.

[0058] Based on experience gained from anti-money laundering operations, transactions with money laundering risks generally exhibit the following three characteristics: ① High transaction amounts: The average transaction amount per customer is relatively large, and the inflow and outflow amounts are close within a relatively short period. ② High transaction speed: Due to the time cost of funds, the inflow and outflow time intervals in most money laundering transactions are within 3 hours, and the funds remain in the account for a very short time. ③ Abnormal transaction patterns: The fund flow of money laundering transactions is mostly more than two hops and involves fund exchanges between multiple customers.

[0059] Based on the above three transaction characteristics, a preset screening strategy can be determined: the transaction amount exceeds a certain amount, the inflow and outflow time interval is within a certain time, and the fund flow has more than two hops. Based on this preset screening strategy, transaction chains with money laundering risks are screened from the transaction graph. Then, in step S460, the flow of funds related to the black seed in the transaction chains with money laundering risks can be tracked to obtain the transaction chains associated with the black seed, thus further accurately identifying abnormal transaction chains related to the black seed.
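The preset screening strategy can be illustrated with a simple predicate over a candidate chain. This is a sketch under assumptions: the amount threshold is a made-up value, the 3-hour interval is taken from the characteristic described above, and the chain is assumed to be a list of hops ordered along the fund flow.

```python
from datetime import datetime, timedelta

def is_suspicious(chain, min_amount=100_000,
                  max_gap=timedelta(hours=3), min_hops=3):
    """chain: list of (payer, payee, amount, timestamp) ordered along the flow."""
    if len(chain) < min_hops:  # the fund flow must have more than two hops
        return False
    if any(amount <= min_amount for _, _, amount, _ in chain):
        return False
    # Funds must leave each intermediate account soon after arriving.
    gaps = (later[3] - earlier[3] for earlier, later in zip(chain, chain[1:]))
    return all(timedelta(0) <= gap <= max_gap for gap in gaps)

base = datetime(2022, 6, 1, 9, 0)
fast_chain = [
    ("A", "B", 500_000, base),
    ("B", "C", 490_000, base + timedelta(hours=1)),
    ("C", "D", 480_000, base + timedelta(hours=2)),
]
slow_chain = [
    ("A", "B", 500_000, base),
    ("B", "C", 490_000, base + timedelta(days=2)),
    ("C", "D", 480_000, base + timedelta(days=4)),
]
```

With these inputs, `is_suspicious(fast_chain)` passes all three conditions, while `slow_chain` fails the interval check and any two-hop prefix fails the hop check.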

[0060] By following the steps S420 to S460 above, the transaction chain associated with the black seed can be found conveniently and quickly.

[0061] S600. Based on the transaction chain associated with the black seed, determine the first funding chain; the first funding chain is the funding chain associated with the black seed.

[0062] A transaction chain may contain multiple transactions between the same pair of customers. This is because the transaction chain associated with the black seed is determined from transaction data covering a period of time, and within that period the two customers may have transacted with each other more than once. For example, in the transaction data of one day, customer A transfers 2 million RMB to customer B. In the transaction data of another day, customer B transfers 3 million RMB to customer A. In this case, the two transactions can be combined into one. Because the transaction directions are opposite, the combined amount is 1 million RMB, and the combined direction is from customer B to customer A. That is, the funds of the two transactions are netted, and the direction of the combined transaction is determined by the direction of the net fund flow.
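The netting in this example can be written out directly. The function name and tuple layout below are illustrative assumptions; only the amounts and directions come from the example above.

```python
def net_transfer(transfers):
    """Combine transfers between one pair of customers into a single net edge.

    transfers: list of (payer, payee, amount), all between the same two customers.
    Returns (payer, payee, net_amount) in the direction of the net fund flow.
    """
    first, second = transfers[0][0], transfers[0][1]
    net = 0  # positive means the net flow runs first -> second
    for payer, payee, amount in transfers:
        net += amount if (payer, payee) == (first, second) else -amount
    if net >= 0:
        return (first, second, net)
    return (second, first, -net)

combined = net_transfer([("A", "B", 2_000_000), ("B", "A", 3_000_000)])
# combined == ("B", "A", 1_000_000)
```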

[0063] In a specific implementation, S600 may include: in the transaction chain associated with the black seed, aggregating the transaction funds of all transactions between the same pair of customers to obtain the first fund chain.

[0064] For example, the transaction chains within 3 days are as follows:

[0065] Day 1: The transaction chain associated with Black Seed A is ABCD;

[0066] Day 2: The transaction chain associated with Black Seed A is ABC;

[0067] Day 3: The transaction chain associated with Black Seed A is AB;

[0068] The funds of the transactions between A and B on days 1, 2, and 3 are aggregated to obtain the aggregated funds between customers A and B. Similarly, the funds of the transactions between B and C on days 1 and 2 are aggregated to obtain the aggregated funds between customers B and C. After aggregation, a fund chain ABCD is obtained. In this fund chain, the funds between A and B are the aggregated funds of the A-B transactions on days 1, 2, and 3, and the funds between B and C are the aggregated funds of the B-C transactions on days 1 and 2.
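The three-day aggregation can be sketched as a per-edge sum. The daily amounts below are invented for illustration, since the example above gives only the chain shapes (ABCD, ABC, AB).

```python
from collections import Counter

# Hypothetical per-day hops (payer, payee, amount) for black seed A.
daily_chains = {
    1: [("A", "B", 100), ("B", "C", 90), ("C", "D", 80)],  # chain ABCD
    2: [("A", "B", 50), ("B", "C", 40)],                   # chain ABC
    3: [("A", "B", 30)],                                   # chain AB
}

totals = Counter()
for hops in daily_chains.values():
    for payer, payee, amount in hops:
        totals[(payer, payee)] += amount

# The first fund chain keeps one edge per customer pair with the summed funds.
first_fund_chain = [(payer, payee, amount)
                    for (payer, payee), amount in totals.items()]
```

All transfers in this sample run in the same direction; opposite-direction transfers would be netted as in the A and B example of paragraph [0062].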

[0069] As can be seen, the first funding chain mentioned above can be obtained through the above methods.

[0070] S800: Determine the risk level of each transaction in the first capital chain, and remove transactions in the first capital chain whose risk level is lower than the preset risk level to obtain the second capital chain;

[0071] In this step, lower-risk transactions are removed from the first funding chain, leaving only higher-risk transactions to obtain the second funding chain. In the second funding chain, because low-risk transactions have been eliminated, the scope of the funding group is further narrowed.
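Once each transaction has a risk level, the pruning itself is a threshold filter. In the sketch below the risk values and the 0.5 threshold are placeholders; the source only states that a preset risk level is used.

```python
def prune_low_risk(first_chain, risk, threshold=0.5):
    """Keep only the edges whose risk level is not below the preset threshold."""
    return [edge for edge in first_chain if risk[edge] >= threshold]

# Hypothetical first fund chain and per-transaction risk levels.
first_chain = [("A", "B"), ("B", "C"), ("C", "D")]
risk = {("A", "B"): 0.9, ("B", "C"): 0.2, ("C", "D"): 0.5}

second_chain = prune_low_risk(first_chain, risk)
```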

[0072] In practical implementation, various methods can be used to determine the risk level of each transaction in the first capital chain. Here, one optional method is provided: the risk level of each transaction in the first capital chain is determined by using a dual representation model of directed edge attention. The dual representation model of directed edge attention is a network model obtained by machine learning based on the influence of neighborhood node information and neighborhood edge information on the node.

[0073] That is, the directed edge attention dual representation model takes into account the influence of neighboring nodes and neighboring edges on nodes, and is a neural network model trained by machine learning.

[0074] Understandably, in risk control scenarios, transactions are often connected through networks such as transfer relationship networks, device relationship networks, and regional relationship networks. Using this graph topology information to generate feature representations brings additional benefits to risk control. Moreover, generating these features automatically by learning graph structure representations avoids tedious manual design and can uncover deeper hidden information than manual methods. Currently, whether they are graph representation models based on random walks or graph neural networks such as graph convolutional networks, most models either analyze undirected graphs or simply transfer undirected graph methods to directed graphs, thus losing directional information. However, most real-world graphs are directed, such as follow relationships in social networks and citation relationships between academic papers. Directionality is particularly important in risk control scenarios. For example, an edge on which a victim transfers money to a fraudster is marked as a fraudulent transaction, while a transfer in the other direction, such as a fraudster's everyday payment to a merchant, is not. In addition, risk control platforms typically handle security risk control scenarios with hundreds of millions to tens of billions of edges. In these scenarios, organizing data as a directed graph can reduce the amount of data by 50% compared with organizing it as an undirected graph, which has significant engineering value for accelerating complex graph feature extraction, representation, and model scoring.

[0075] Therefore, in the identification scheme of financial chain gangs, the directionality on complex graphs is taken as the entry point to study the attention mechanism for directed graphs, explain the importance of graph directionality in practical applications, and then propose a dual representation model of directed edge attention in the embodiment of the present invention, which can be simply referred to as the DADEdge model.

[0076] The directed edge attention dual representation model characterizes the information differences caused by direction by computing separately the information flowing in and the information flowing out, both for neighborhood nodes and for neighborhood edges. It therefore involves two kinds of information: the dual representation obtained by aggregating neighborhood node information, and the dual representation obtained by aggregating neighborhood edge information.

[0077] A node in the first fund chain may be a source node, a target node, or both. That is, each node may act as a target node in one or more branches and, at the same time, as a source node in one or more other branches. To distinguish the representation vectors in the two directions, a dual representation is used for each node. For example, when node i acts as a source node its representation vector is s_i, and when it acts as a target node its representation vector is t_i. The target representation vector t_i of node i is a neighborhood aggregation of the source representation vectors s_j of its neighboring nodes j; the source representation vector s_i of node i is a neighborhood aggregation of the target representation vectors t_j of its neighboring nodes j. Finally, the two representation vectors updated by neighborhood aggregation form the node's new dual representation vector.
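One round of the dual update can be sketched as below. Note the assumption: a plain mean stands in for the aggregation, whereas the model described here is attention-based and also uses edge information, so this shows only the direction-aware wiring (targets read source vectors, sources read target vectors). All names and values are illustrative.

```python
def mean(vectors):
    """Element-wise mean of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def aggregate_once(edges, s, t):
    """edges: list of (source, target); s and t map node -> vector."""
    new_s, new_t = dict(s), dict(t)
    for i in s:
        incoming = [s[j] for j, k in edges if k == i]  # edges j -> i
        outgoing = [t[k] for j, k in edges if j == i]  # edges i -> k
        if incoming:  # i acts as a target: aggregate neighbors' source vectors
            new_t[i] = mean(incoming)
        if outgoing:  # i acts as a source: aggregate neighbors' target vectors
            new_s[i] = mean(outgoing)
    return new_s, new_t

edges = [("A", "B"), ("B", "C")]
s = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
t = {"A": [2.0, 2.0], "B": [4.0, 0.0], "C": [0.0, 6.0]}
new_s, new_t = aggregate_once(edges, s, t)
```

Node B is both a target (of A) and a source (to C), so both of its vectors change; A has no inflow and C has no outflow, so the corresponding vectors are left as they were.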

[0078] Referring to Figure 3, the process of determining the risk level of each transaction in the first fund chain in S800 may include the following steps S820 to S840:

[0079] S820. Based on the neighboring node information and the neighboring edge information, determine the representation vector of each target node and each source node in the first capital chain; in a transaction of the first capital chain, the transaction direction is from the source node to the target node.

[0080] S840. Based on the representation vectors of each target node and each source node, determine the risk level of each transaction in the transaction chain associated with the black seed.

[0081] In step S820 above, a node's neighboring nodes are the nodes that have transaction relationships with that node. For example, if node A has transaction relationships with nodes B, C, and D, then nodes B, C, and D are all neighboring nodes of node A, and the edges between node A and each of nodes B, C, and D are all neighboring edges of node A. Therefore, a node's neighboring node information is information about the nodes that have transaction relationships with that node, such as those nodes' representation vectors. A node's neighboring edge information is information about the edges between that node and each of its neighboring nodes, such as those edges' representation vectors.

[0082] In S820, a node is represented based on its neighboring node information and neighboring edge information. When a node is only a source node, its representation vector is the source node representation vector. When a node is only a target node, its representation vector is the target node vector. When a node is both a source node and a target node, it has a corresponding source representation vector when acting as a source node and a corresponding target node representation vector when acting as a target node. That is, the node's representation vector includes both the source node representation vector and the target node representation vector, which is called dual representation in this case.

[0083] It is understood that, in this embodiment of the invention, in a transaction, the direction of the transaction is from the source node to the target node, in order to distinguish the target node from the source node.

[0084] Understandably, step S820 enables the representation of each node, i.e., obtaining the representation vector of each node, and then determining the risk level of each transaction based on the representation vectors of each node in the first funding chain. Specifically, the representation vectors of each node can be used as input information for the neural network model. That is, after inputting the representation vectors of each node into the neural network model, the risk level of each transaction can be obtained. In this process, the neural network model uses the representation vectors of each node to achieve multi-classification of risk levels. For example, the risk levels are divided into 11 categories: 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0.
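The 11-way classification can be mapped back to a risk level as follows. The class scores are placeholder numbers standing in for the output of the trained network; only the list of 11 levels comes from the text above.

```python
# The 11 risk levels named in the text: 0, 0.1, ..., 1.0.
RISK_LEVELS = [round(i / 10, 1) for i in range(11)]

def predicted_risk(class_scores):
    """Return the risk level whose class received the highest score."""
    best = max(range(len(class_scores)), key=class_scores.__getitem__)
    return RISK_LEVELS[best]

# Hypothetical scores for one transaction; index 7 is the largest.
scores = [0.01, 0.01, 0.02, 0.02, 0.04, 0.05, 0.10, 0.60, 0.10, 0.03, 0.02]
level = predicted_risk(scores)
```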

[0085] As can be seen, the dual representation model of directed edge attention consists of two parts: one part is used to represent each node with vectors, and the other part is used to output the risk level of each transaction based on the representation vector of each node, which is the neural network model mentioned above.

[0086] The neural network model used to output the risk level can be trained in advance by converting each node in a transaction sample with risk labels into a representation vector. These risk labels include crime-related labels and / or penalty labels. Penalty labels, such as asset freeze labels, correspond to different risk levels.

[0087] In practical implementation, S820 may include updating the representation vector of each target node and each source node through multiple iterations. After multiple iterations, the final representation vector of each target node and each source node is obtained. These representation vectors are then input into the neural network model mentioned above to obtain the risk level of each transaction in the first funding chain.

[0088] Referring to Figure 4, each iteration includes the following steps S822 to S836:

[0089] S822. Determine the initial representation vector of each target node and each source node in this iteration process;

[0090] In the first iteration, the initial representation of a node can be determined from the node's own characteristic information, for example the customer name, transaction times, and transaction amounts. This characteristic information is transformed into an initial representation vector, which gives the initial representation of the node. For a target node, for instance, the characteristic information could include the target node's name, the transaction amounts entering that node, and the relevant transaction times.

[0091] In non-first iterations, the initial representation vector of the target node is the final representation vector of the target node in the previous iteration, and the initial representation vector of the source node is the final representation vector of the source node in the previous iteration.

[0092] Of course, before performing step S824, a linear transformation can be applied to the initial representation vectors. For example, let W be the linear transformation matrix corresponding to the target representation vector and U be the linear transformation matrix corresponding to the source representation vector; the representation vectors can then be linearly transformed using these matrices. Each linear transformation matrix can be a fully connected layer matrix. In the following text, U^(l) and W^(l) denote the transformation matrices in the l-th iteration.

[0093] For example, in the l-th iteration, the initial source representation vector of a source node i is multiplied by U^(l), and the initial target representation vector of a target node j is multiplied by W^(l), giving in each case the representation vector after linear transformation. Here i and j are the node numbers and l is the iteration number.

[0094] Understandably, linear transformations can uncover hidden information within vectors, allowing the representation vector to better represent the nodes. If a linear transformation is applied to the initial representation vector, then S824 and S826 will use the representation vector obtained after the linear transformation.

[0095] S824. For each target node, calculate the attention matrix of the neighboring nodes for that target node based on the initial representation vectors of each source node associated with that target node and the initial representation vector of that target node.

[0096] Here, each target node is treated as a central node. A central node needs neighbor information from both its incoming side and its outgoing side, which is used to update the node's target representation vector and source representation vector respectively.

[0097] When the central node is used as the target node, the attention matrix can be calculated based on the initial representation vectors of each source node associated with the target node and the initial representation vector of the target node.

[0098] Alternatively, the attention matrix can be calculated based on the linearly transformed representation vectors of each source node associated with the target node and the linearly transformed representation vector of the target node itself. For example, the attention matrix can be expressed as follows:

[0099]

[0100] Here, τ(i) denotes the set of all source nodes associated with target node i. The other quantities in the formula are the attention parameters; W^(l) and U^(l), the two transformation matrices in the l-th iteration; the initial representation vector of target node i in the l-th iteration; the initial representation vector of each source node j associated with target node i in the l-th iteration; and the representation vectors obtained from these two initial representation vectors after linear transformation.

[0101] S826. For each source node, calculate the attention matrix of the neighboring nodes to the source node based on the initial representation vectors of each target node associated with the source node and the initial representation vector of the source node.

[0102] When the central node is used as the source node, the attention matrix can be calculated based on the initial representation vectors of each target node associated with the source node and the initial representation vector corresponding to the source node.

[0103] Alternatively, the attention matrix can be calculated based on the linearly transformed representation vectors of each target node associated with the source node and the linearly transformed representation vector of the source node itself. For example, the attention matrix can be expressed as follows:

[0104]

[0105] Here, s(i) denotes the set of all target nodes associated with source node i. The other quantities in the formula are the attention parameters; W^(l) and U^(l), the two transformation matrices in the l-th iteration; the initial representation vector of source node i in the l-th iteration; the initial representation vector of each target node j associated with source node i in the l-th iteration; and the representation vectors obtained from these two initial representation vectors after linear transformation.

[0106] In practice, after obtaining the attention matrices for the two branches, the softmax function can be used to normalize the attention matrices for each branch, resulting in normalized attention matrices:

[0107] Understandably, the attention matrix above takes into account the information of neighboring nodes, that is, the attention matrix reflects the influence of neighboring nodes on the central node.
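
The neighborhood-node attention of one target node over its source nodes, followed by softmax normalization, can be sketched as follows. The scoring formula itself is not reproduced in this text, so the sketch assumes a graph-attention-style score: the attention parameters `a` dotted with the concatenation of the linearly transformed target vector and the linearly transformed source vector. That scoring form and the names `matvec`, `softmax`, and `node_attention` are illustrative assumptions only.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(values):
    """Normalize raw attention scores so that they sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def node_attention(a, W, U, h_target, h_sources):
    """Normalized attention of one target node over its source neighbors.

    Assumed score for each source j: a . concat(W h_target, U h_source_j).
    """
    z_target = matvec(W, h_target)
    scores = []
    for h_source in h_sources:
        z_source = matvec(U, h_source)
        scores.append(sum(p * x for p, x in zip(a, z_target + z_source)))
    return softmax(scores)


# Hypothetical 2-dimensional example with identity transformation matrices.
identity = [[1.0, 0.0], [0.0, 1.0]]
weights = node_attention(
    a=[0.5, 0.5, 1.0, 1.0],
    W=identity,
    U=identity,
    h_target=[1.0, 0.0],
    h_sources=[[1.0, 0.0], [0.0, 3.0]],
)
print(weights)
```

The source-node branch would follow the same pattern with the roles of the two matrices exchanged, which keeps the two directions on separate branches as the text describes.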

[0108] S828. Determine the initial representation vector of each edge connected to the target node and the initial representation vector of each edge connected to each source node.

[0109] Understandably, the main idea behind the edge attention mechanism is to treat the edges connected to the central node as information sources relative to the central node. The attention of the central node to its neighboring edges is calculated using an attention mechanism, and the edge information is then weighted and aggregated onto the central node. In a directed graph, the attention of each central node's incoming edges and outgoing edges is calculated on two separate branches. This matches how transactions behave in practice. For example, when customer A transfers money to customer B, the characteristics of this transaction have different impacts on the source node A and the target node B. In a fraud scenario, if the transaction is marked as fraudulent, the recipient B can easily be identified as a fraudster based on the transaction's anomalies. However, if the transaction's anomalies also affect A, the victim A will be misclassified. Therefore, edge attention needs to be calculated separately for each direction.

[0110] Understandably, the initial representation vector of an edge can be obtained by transforming information such as the two clients, the transaction time, and the transaction amount of the edge into a representation vector, which gives the initial representation of the edge. Since the initial edge attributes exist in only one space, they are mapped to two spaces through two different transformations. The edge representation vector in one space is used to update the target representation vector of the central node, and the edge representation vector in the other space is used to update the source representation vector of the central node. For example, the initial representation vector of the edge between node i and node j that acts on the source node can also be called the initial representation vector of the edge connected to the source node. Likewise, the initial representation vector of the edge between node i and node j that acts on the target node can also be called the initial representation vector of the edge connected to the target node.

[0111] Of course, a linear transformation using a linear transformation matrix can be performed before executing the steps below. For example, in the l-th iteration, a linear transformation is applied to the initial representation vector of the edge: the initial representation vector of the edge acting on the source node is transformed by Q^(l), and the initial representation vector of the edge acting on the target node is transformed by P^(l). Q and P are both linear transformation matrices, and P^(l) and Q^(l) denote the transformation matrices in the l-th iteration.
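
The mapping of a single edge attribute vector into two spaces can be sketched as follows. P and Q are the matrices named in the text; the function name `edge_views` and the numeric values are hypothetical and chosen only for illustration.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def edge_views(P, Q, edge_vector):
    """Map one edge vector into two spaces.

    Returns (target_view, source_view): the first is used when the edge acts
    on its target node, the second when it acts on its source node.
    """
    return matvec(P, edge_vector), matvec(Q, edge_vector)


# Hypothetical 2-dimensional edge vector and transformation matrices.
P = [[1.0, 0.0], [0.0, 1.0]]
Q = [[0.0, 1.0], [1.0, 0.0]]
target_view, source_view = edge_views(P, Q, [2.0, 3.0])
print(target_view, source_view)
```

Keeping two separate matrices means the same transaction can contribute differently to the payer and to the payee, which is the point of the two-branch design.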

[0112] S830. For each target node, calculate the attention matrix of the neighborhood edges to the target node based on the initial representation vectors of each edge connected to the target node and the initial representation vector of the target node.

[0113] When a node is used as the target node, the attention matrix can be calculated using the initial representation vectors of the edges connected to the target node and the initial representation vector of the target node itself.

[0114] Alternatively, the attention matrix can be calculated using the linearly transformed representation vectors of each edge connected to the target node and the linearly transformed representation vector of the target node itself. The attention matrix can be expressed as:

[0115]

[0116] Here, the formula involves the attention vector between node i and node j; the attention parameters; W^(l), a transformation matrix in the l-th iteration; T, the transpose sign; the initial representation vector of target node i in the l-th iteration; P^(l), a transformation matrix in the l-th iteration; the initial representation vector, in the l-th iteration, of the edge between node i and node j acting on the target node; and the representation vectors obtained from the node's initial representation vector and the edge's initial representation vector after linear transformation.

[0117] S832. For each source node, calculate the attention matrix of the neighboring edges to the source node based on the initial representation vectors of each edge connected to the source node and the initial representation vector of the source node.

[0118] When a node is used as a source node, the attention matrix can be calculated using the initial representation vectors of the edges connected to the source node and the initial representation vector of the source node itself.

[0119] Alternatively, the attention matrix can be calculated using the linearly transformed representation vectors of the edges connected to the source node and the linearly transformed representation vector of the source node itself. The attention matrix, made up of the attention vectors between node i and node j, can be expressed as:

[0120]

[0121] Here, the formula involves the attention parameters; U^(l), a transformation matrix in the l-th iteration; the initial representation vector of source node i in the l-th iteration; Q^(l), a transformation matrix in the l-th iteration; the initial representation vector of the edge between node i and node j acting on the source node; and the representation vectors obtained from the node's initial representation vector and the edge's initial representation vector after linear transformation.

[0122] Understandably, edge attention and node attention use different parameters because although edge features have been mapped to the same dimension as node features, the edge representation and node representation in the actual vector space may be far apart. Therefore, independent attention parameters are used for edges.

[0123] In practice, after calculating the attention matrices for the two branches, the softmax function can be used to normalize the attention matrices for each branch, resulting in normalized attention matrices.

[0124] Understandably, these two attention matrices take into account neighborhood edge information, meaning that the attention matrices reflect the influence of neighborhood edges on the central node.

[0125] S834. For each target node, determine the final representation vector of the target node in this iteration process based on the attention matrix of the neighboring nodes and the attention matrix of the neighboring edges to the target node.

[0126] Understandably, for a target node, after calculating the attention matrices of the neighborhood nodes and neighboring edges for that target node through the above steps, these two attention matrices can be used to calculate the final representation vector of the target node in this iteration. Considering the aggregation of neighborhood node information and neighborhood edge information, the updated representation vector of the target node is as follows:

[0127]

[0128] Here, the combined attention term is the sum of the neighborhood-node attention matrix and the neighborhood-edge attention matrix for the target node, the attention parameters and other symbols are those described above, and σ is the activation function.

[0129] S836. For each source node, determine the final representation vector of the source node in this iteration process based on the attention matrix of the neighboring nodes and the attention matrix of the neighboring edges to the source node.

[0130] Understandably, for a source node, after calculating the attention matrices of the neighboring nodes and their edges respectively through the above steps, these two attention matrices can be used to calculate the final representation vector of the source node in this iteration. Considering the aggregation of neighboring node information and the aggregation of neighboring edge information, the updated representation vector of the source node is as follows:

[0131]

[0132] Here, the combined attention term is the sum of the neighborhood-node attention matrix and the neighborhood-edge attention matrix for the source node, and the attention parameters and other symbols are those described above.

[0133] σ is the activation function.

[0134] After obtaining the final representation vector of each node in this iteration, if a next iteration is needed, the final representation vector of the node in this iteration is used as the initial representation vector for the next iteration. If this iteration is the last iteration, the final representation vector obtained through this iteration is the final representation vector for the entire iteration process.

[0135] Understandably, after a predetermined number of iterations, the target representation vector for each target node and the source representation vector for each source node can be obtained. Then, the target representation vector and source representation vector for each node are input into the neural network model to obtain the risk level of each transaction in the first capital chain.

[0136] Specifically, a preset risk level can be set as a threshold. Transactions with a risk level higher than the threshold are retained, while transactions with a risk level lower than the threshold are removed, thus obtaining a second funding chain.

[0137] Understandably, when removing low-risk transactions, a node may lose connection with the black seed. In this case, the node that has lost connection can be discarded and will not be considered again in subsequent processes.

[0138] In other words, the method provided in this embodiment of the invention further includes: after removing transactions with a risk level lower than a preset risk level in the first capital chain, determining whether there are disconnected nodes; if so, discarding the disconnected nodes.
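
The threshold step and the discarding of disconnected nodes can be sketched as follows. Two points are assumptions of this sketch rather than statements of the embodiment: a transaction whose risk level equals the threshold is retained, and a node counts as connected to the black seed if it can be reached from the seed when edge direction is ignored. The function name `prune_chain` and the example values are hypothetical.

```python
from collections import deque


def prune_chain(edges, seed, threshold):
    """Drop low-risk transactions, then drop nodes cut off from the seed.

    `edges` maps (source, target) to a risk level. Returns the retained
    edges and the set of nodes still connected to the seed.
    """
    # Assumption: a risk level equal to the threshold is kept.
    kept = {edge: risk for edge, risk in edges.items() if risk >= threshold}

    # Assumption: connectivity ignores edge direction.
    adjacency = {}
    for source, target in kept:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)

    connected = {seed}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in connected:
                connected.add(neighbor)
                queue.append(neighbor)

    second_chain = {
        edge: risk
        for edge, risk in kept.items()
        if edge[0] in connected and edge[1] in connected
    }
    return second_chain, connected


# Hypothetical first funding chain: removing A->D cuts D and F off from R.
first_chain = {("R", "A"): 0.6, ("A", "D"): 0.2, ("D", "F"): 0.8}
print(prune_chain(first_chain, seed="R", threshold=0.5))
```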

[0139] S1000. Based on the second funding chain, determine the funding chain group to which the black seed belongs.

[0140] Further screening of the aforementioned second funding chain yielded the final funding chain group. This group includes information such as clients, aggregated transaction amounts among clients, and the overall direction of these transactions.

[0141] In practice, reinforcement learning algorithms can be used to identify the funding chain groups to which the black seeds belong.

[0142] Referring to Figure 5, S1000 may specifically include the following steps:

[0143] S1020. For each target state, determine whether the number of second capital chains corresponding to the target state is greater than 1; the initial state of each second capital chain is the black seed, and the target state is the node in each second capital chain that is not adjacent to the black seed.

[0144] S1040. If so, determine the best funding chain among the second funding chains corresponding to the target state, and retain the best funding chain.

[0145] S1060. Otherwise, retain the second funding chain corresponding to the target state.

[0146] S1080. Based on the second funding chain retained for each target state, determine the funding chain group to which the black seed belongs.

[0147] In the funding chains shown in Figure 6, node R is the black seed. Starting from node R, there are 5 funding chains: R->A->D->F, R->B->D->F, R->B->E->F, R->E->F, and R->C. Nodes D, E, and F are not directly connected to the black seed; therefore, nodes D, E, and F are the target states.

[0148] For each target state, it is determined whether the number of corresponding second funding chains is greater than one. For example, for node D, the second funding chains are R->A->D and R->B->D, that is, two second funding chains. One of these two must be selected as the optimal funding chain from node R to node D. The optimal funding chain is the one with the highest overall risk. For target state D, the optimal funding chain is retained and the other second funding chains are discarded. Similarly, for node E, there are two corresponding second funding chains, R->B->E and R->E, from which the optimal funding chain is selected; for target state E, the optimal funding chain is retained and the others are discarded. For node F, there are four second funding chains: R->A->D->F, R->B->D->F, R->B->E->F, and R->E->F, from which the optimal funding chain is likewise selected; for target state F, the optimal funding chain is retained and the others are discarded. Finally, the second funding chains remaining for each target state form the funding chain gang where the black seed is located.
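
The counting of second funding chains per target state can be sketched as follows, using the edges of the five funding chains listed for Figure 6. The function name `chains_to` and the adjacency-list layout are hypothetical; the target states D, E, and F are taken directly from the text.

```python
def chains_to(graph, seed, target):
    """Return every directed path from the seed to the target state."""
    found = []

    def walk(node, path):
        if node == target:
            found.append(path)
            return
        for neighbor in graph.get(node, ()):
            if neighbor not in path:  # never revisit a node on the same path
                walk(neighbor, path + [neighbor])

    walk(seed, [seed])
    return found


# Edges taken from the chains R->A->D->F, R->B->D->F, R->B->E->F, R->E->F, R->C.
graph = {
    "R": ["A", "B", "C", "E"],
    "A": ["D"],
    "B": ["D", "E"],
    "D": ["F"],
    "E": ["F"],
}

for state in ["D", "E", "F"]:
    chains = chains_to(graph, "R", state)
    needs_selection = len(chains) > 1  # S1020: more than one chain?
    print(state, len(chains), needs_selection, chains)
```

For this graph the sketch finds two chains for D, two for E, and four for F, matching the counts given in the text.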

[0149] In reality, there may also exist funding chains with only two nodes, for example R->C in Figure 6. Although this second funding chain is relatively short, its transaction risk is relatively high, because what remains after the step of eliminating low-risk transactions described above are high-risk transactions. Such a short second funding chain cannot be ignored.

[0150] Based on the above considerations, S1080 can specifically include the following steps:

[0151] Determine whether there exists a second funding chain in each second funding chain that consists of only two nodes, including the black seed; if so, determine the funding chain group to which the black seed belongs based on the second funding chains retained for each target state and the second funding chains consisting of only two nodes; otherwise, determine the funding chain group to which the black seed belongs based on the second funding chains retained for each target state.

[0152] In other words, if there is no second funding chain consisting of only two nodes, such as R->C in Figure 6, the second funding chain retained for each target state is directly used to form the funding chain group containing the black seed. If a second funding chain consisting of only two nodes exists, then both the second funding chain retained for each target state and the second funding chain consisting of only two nodes are used to form the funding chain group containing the black seed. This method avoids the omission of funding chains.

[0153] In practice, the Q-Learning reinforcement learning algorithm can be used to select the optimal funding chain. Referring to Figure 7, the process of determining the optimal funding chain among the second funding chains corresponding to the target state in S1040 can specifically include the following steps S1042 to S1046:

[0154] S1042. Construct a reward matrix based on the risk level of each transaction in each of the second capital chains;

[0155] The risk level of each transaction can be obtained from step S800 above. Based on these risk levels, the reward matrix shown in Figure 8 can be constructed; it is also called the Reward matrix, or simply the R matrix. The R matrix shown in Figure 8 is indexed by the current node (STATE) and the node's next action (ACTION). The five funding chains shown in Figure 6, together with the risk level of each transaction within those chains, form the R matrix shown in Figure 8. If there is no association between two nodes, the corresponding position is filled with -1. For example, in Figure 8, when the node corresponding to the current state is node R and the node corresponding to the next action is node C, the risk of the transaction from node R to node C is 0.9, so the position where STATE is R and ACTION is C is filled with 0.9.

[0156] In other words, in the R matrix, the row names represent the current state (STATE), and the column names represent the available actions (ACTION) in the current state. Specifically, for the funding path R->A->D->F, the risk levels are 0.6, 0.7, and 0.8, respectively. For non-existent transactions, the elements of the R matrix are uniformly marked as -1.
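
The construction of the R matrix can be sketched as follows. Only the risk levels stated in the text are used (0.6, 0.7, and 0.8 along R->A->D->F, and 0.9 for R->C); the remaining transactions of Figure 8 are not reproduced here, so the example matrix is deliberately partial. The function name `build_reward_matrix` and the node ordering are hypothetical.

```python
def build_reward_matrix(nodes, risks):
    """Build the R matrix: rows are STATE, columns are ACTION.

    Positions without a transaction between the two nodes are filled with -1.
    """
    index = {node: position for position, node in enumerate(nodes)}
    matrix = [[-1.0] * len(nodes) for _ in nodes]
    for (state, action), risk in risks.items():
        matrix[index[state]][index[action]] = risk
    return matrix


nodes = ["R", "A", "B", "C", "D", "E", "F"]
# Risk levels stated in the text; other edges of Figure 8 are omitted.
risks = {("R", "A"): 0.6, ("A", "D"): 0.7, ("D", "F"): 0.8, ("R", "C"): 0.9}

for name, row in zip(nodes, build_reward_matrix(nodes, risks)):
    print(name, row)
```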

[0157] S1044. For the target state, construct an initial Q matrix, and iteratively update the Q matrix according to the reward matrix until the Q matrix satisfies the convergence condition.

[0158] In this step, an initial Q matrix is constructed to represent the knowledge learned from experience; the Q matrix can be understood as a knowledge matrix. The Q matrix is of the same order as the R matrix, with rows representing states and columns representing actions. Initially, the agent has no knowledge of the outside world, so the Q matrix is initialized to zero. For example, see the initial Q matrix shown in Figure 9.

[0159] For the target state D shown in Figure 6, there are two second funding chains: R->A->D and R->B->D. The Q matrix corresponding to the target state D is updated using the risk level of each transaction in these two second funding chains. Since the risk level of each transaction in these two second funding chains is already reflected in the aforementioned reward matrix, this can also be understood as updating the Q matrix based on the reward matrix.

[0160] Specifically, the Q matrix is iteratively updated using the following rule:

[0161] Q(s,a)=R(s,a)+γmax{Q(s′,a′)}

[0162] Where s and a represent the current state and action, and s′ and a′ represent the next state and action, with discount coefficient γ∈[0,1]. Q(s,a) is the Q value corresponding to the current state s and action a. R(s,a) is the R value corresponding to the current state s and action a, i.e., the risk level, which can be found in the reward matrix.

[0163] First, let the discount factor γ = 0.8, the initial state be R, the target state be D, and the initial Q matrix be an all-zero matrix. Observing the first row of matrix R, corresponding to node R, the next state has 4 possible actions: node A, B, C, or E.

[0164] For the second funding chain R->A->D, the next action from R is A, with R(R,A) = 0.6, and the next action from node A is D, with R(A,D) = 0.7. Then Q(A,D) = R(A,D) = 0.7 and Q(R,A) = R(R,A) + 0.8 × max{Q(A,D)} = 0.6 + 0.8 × 0.7 = 1.16, which completes one update of the Q matrix. The second funding chain R->B->D is updated in the same way. For a target state, updating the Q matrix once based on all the second funding chains corresponding to that target state constitutes one iteration. Multiple iterations are performed in this way until the change in the Q matrix between two iterations is very small, meeting the convergence condition, at which point the iteration stops.
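
The iterative update for target state D can be sketched as follows, applying Q(s,a) = R(s,a) + γ·max{Q(s′,a′)} with γ = 0.8 to the transactions on the two chains R->A->D and R->B->D. The risk levels 0.6 and 0.7 come from the text; the risk levels for R->B and B->D are not given in the text and are hypothetical placeholders. The function name `q_iterate`, the dictionary layout, and the tolerance are likewise assumptions of this sketch.

```python
def q_iterate(reward, target, gamma=0.8, tolerance=1e-9, max_sweeps=100):
    """Iterate the Q update over the edges of the chains ending at `target`.

    `reward` maps (state, action) to the risk level of that transaction.
    """
    q = {edge: 0.0 for edge in reward}  # the Q matrix starts at zero
    for _ in range(max_sweeps):
        largest_change = 0.0
        for (state, action), risk in reward.items():
            if action == target:
                updated = risk  # no further step beyond the target state
            else:
                future = [q[edge] for edge in q if edge[0] == action]
                updated = risk + gamma * max(future, default=0.0)
            largest_change = max(largest_change, abs(updated - q[(state, action)]))
            q[(state, action)] = updated
        if largest_change < tolerance:  # convergence condition
            break
    return q


reward = {
    ("R", "A"): 0.6,  # from the text
    ("A", "D"): 0.7,  # from the text
    ("R", "B"): 0.5,  # hypothetical value
    ("B", "D"): 0.6,  # hypothetical value
}
q = q_iterate(reward, target="D")
print(q)
```

With these inputs, Q(R,A) converges to approximately 1.16, the value worked out in the text.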

[0165] S1046. Select the maximum value from the Q matrix that satisfies the convergence condition, and take the second capital chain corresponding to the maximum value as the best capital chain among the second capital chains corresponding to the target state.

[0166] After stopping the iteration, the best funding chain can be selected from the Q matrix corresponding to a target state. The specific selection method is to select the maximum value from each element of the Q matrix of a target state, and take the second funding chain containing the maximum value element as the best funding chain corresponding to the target state.
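
The selection in S1046 can be sketched as follows: find the largest element of the converged Q matrix and return the second funding chain that contains the corresponding transaction. The function name `best_chain` is hypothetical, and the Q values other than Q(R,A) = 1.16 and Q(A,D) = 0.7 are hypothetical placeholders.

```python
def best_chain(q, chains):
    """Return the chain containing the transaction with the largest Q value."""
    top_edge = max(q, key=q.get)
    for chain in chains:
        # Consecutive node pairs are the transactions on this chain.
        if top_edge in zip(chain, chain[1:]):
            return chain
    return None


# Converged Q values for target state D; the R->B and B->D values are hypothetical.
q = {("R", "A"): 1.16, ("A", "D"): 0.7, ("R", "B"): 0.98, ("B", "D"): 0.6}
chains = [["R", "B", "D"], ["R", "A", "D"]]
print(best_chain(q, chains))
```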

[0167] It is understood that the method provided in the embodiments of the present invention has an identification accuracy of over 90%, and has high timeliness and interpretability. It can strengthen the ability to prevent and control money laundering risks, improve the timeliness of operational review and reporting, and has significant business value. It can save operational review manpower and is of great significance for improving the efficiency of intelligent review of money laundering.

[0168] The foregoing has described specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in a different order than that shown in the embodiments and may still achieve the desired result. Furthermore, the processes depicted in the drawings do not necessarily require the specific or sequential order shown to achieve the desired result. In some embodiments, multitasking and parallel processing are possible or may be advantageous.

[0169] According to a second aspect, embodiments of the present invention provide a device for identifying money laundering groups. Referring to Figure 10, the device includes:

[0170] The first acquisition module 200 is used to acquire a pre-determined black seed; the black seed refers to customers who have been identified as having money laundering risks after review.

[0171] The first determining module 400 is used to determine the transaction chain associated with the black seed based on the global transaction data;

[0172] The second determining module 600 is used to determine the first funding chain based on the transaction chain associated with the black seed; the first funding chain is the funding chain associated with the black seed.

[0173] The third determining module 800 is used to determine the risk level of each transaction in the first capital chain and remove transactions in the first capital chain whose risk level is lower than the preset risk level to obtain the second capital chain.

[0174] The fourth determining module 1000 is used to determine the funding chain group to which the black seed belongs based on the second funding chain.

[0175] In one embodiment, the first determining module includes:

[0176] The first extraction unit is used to extract transaction data within the most recent preset time period from the global transaction data;

[0177] The data conversion unit is used to convert the transaction data within the most recent preset time period into a corresponding transaction graph; wherein, the nodes in the transaction graph are customers, the edges between the nodes are transaction information between customers with transaction relationships, and the transaction information includes the transaction amount;

[0178] The transaction tracking unit is used to track the flow of transaction funds related to the black seed in the transaction graph, and obtain the transaction chain associated with the black seed.

[0179] In one embodiment, the second determining module is specifically used to: in the transaction chain associated with the black seed, aggregate the transaction funds of each transaction in which both parties have the same customer to obtain the first fund chain.

[0180] In one embodiment, the third determining module is specifically used to: determine the risk level of each transaction in the first capital chain using a directed edge attention dual representation model; the directed edge attention dual representation model is a network model obtained by machine learning based on the influence of neighborhood node information and neighborhood edge information on the node.

[0181] Furthermore, the third determining module includes:

[0182] The node representation unit is used to determine the representation vector of each target node and each source node in the first capital chain based on the neighboring node information and the neighboring edge information; in a transaction of the first capital chain, the transaction direction is from the source node to the target node.

[0183] The risk determination unit is used to determine the risk level of each transaction in the transaction chain associated with the black seed based on the representation vectors of each target node and each source node.

[0184] Furthermore, the node representation unit is specifically used to: update the representation vectors of each target node and each source node through multiple iterations. Each iteration includes: determining the initial representation vectors of each target node and each source node in the current iteration; for each target node, calculating the attention matrix of neighboring nodes for that target node based on the initial representation vectors of each source node associated with that target node and the initial representation vector of that target node; for each source node, calculating the attention matrix of neighboring nodes for that source node based on the initial representation vectors of each target node associated with that source node and the initial representation vector of that source node; determining the initial representation vector of each edge connected to the target node and the initial representation vector of each edge connected to each source node; for each target node, calculating the attention matrix of neighboring edges to the target node based on the initial representation vectors of each edge connected to the target node and the initial representation vector of the target node itself; for each source node, calculating the attention matrix of neighboring edges to the source node based on the initial representation vectors of each edge connected to the source node and the initial representation vector of the source node itself; for each target node, determining the final representation vector of the target node in this iteration based on the attention matrices of neighboring nodes and the attention matrices of neighboring edges to the target node; and for each source node, determining the final representation vector of the source node in this iteration based on the attention matrices of neighboring nodes and the attention matrices of neighboring edges to the source node.

[0185] In one embodiment, the fourth determining module includes:

[0186] The first judgment unit is used to: for each target state, determine whether the number of second fund chains corresponding to the target state is greater than 1; the initial state of each second fund chain is the black seed, and the target state is the node in each second fund chain that is not adjacent to the black seed.

[0187] The first retention unit is used to: if the number of second fund chains corresponding to the target state is greater than 1, determine the best fund chain among the second fund chains corresponding to the target state, and retain the best fund chain.

[0188] The second retention unit is used to: if the number of second fund chains corresponding to the target state is equal to 1, retain the second fund chain corresponding to the target state.

[0189] The gang identification unit is used to identify the fund chain gang to which the black seed belongs based on the second fund chain retained for each target state.
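The per-target-state retention described by these units can be sketched as below. This is a hedged illustration: it assumes each second fund chain is a list of nodes starting at the black seed and that the target state is the chain's final node; the function `retain_per_target` and the ranking function `mean_risk` are hypothetical stand-ins, and the embodiments select the best chain with a Q matrix rather than a simple average.

```python
from collections import defaultdict

def retain_per_target(chains, score):
    """Keep one second fund chain per target state.

    chains: lists of nodes, each starting at the black seed.
    score:  hypothetical callable used to rank chains sharing a target state.
    """
    groups = defaultdict(list)
    for chain in chains:
        if len(chain) > 2:  # a two-node chain has no node non-adjacent to the seed
            groups[chain[-1]].append(chain)
    retained = {}
    for target, candidates in groups.items():
        if len(candidates) > 1:
            retained[target] = max(candidates, key=score)
        else:
            retained[target] = candidates[0]
    return retained

# Illustrative risk levels per transaction (made-up values).
risk = {("S", "A"): 0.9, ("A", "T"): 0.8, ("S", "B"): 0.6,
        ("B", "T"): 0.7, ("S", "C"): 0.5, ("C", "U"): 0.5}

def mean_risk(chain):
    """Average risk level along a chain; an assumed placeholder criterion."""
    pairs = list(zip(chain, chain[1:]))
    return sum(risk[p] for p in pairs) / len(pairs)

chains = [["S", "A", "T"], ["S", "B", "T"], ["S", "C", "U"], ["S", "D"]]
retained = retain_per_target(chains, mean_risk)
```

Here target state `T` has two candidate chains and keeps the higher-scoring one, while `U` has a single chain that is retained as is.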

[0190] Furthermore, the gang identification unit includes:

[0191] The first judgment subunit is used to: determine whether any of the second fund chains consists of only two nodes, where the two nodes include the black seed.

[0192] The first determining subunit is used to: if there is a second fund chain consisting of only two nodes, determine the fund chain gang to which the black seed belongs based on the second fund chain retained for each target state and the second fund chain consisting of only two nodes.

[0193] The second determining subunit is used to: if there is no second fund chain consisting of only two nodes, determine the fund chain gang to which the black seed belongs based on the second fund chain retained for each target state.
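A minimal sketch of how these subunits might combine the chains is given below. It assumes that the gang is simply the set of all customers appearing on the retained chains plus any two-node chains containing the black seed; the specification does not spell out the combination step, and `assemble_gang` is a hypothetical name.

```python
def assemble_gang(seed, retained, all_chains):
    """Collect gang members from retained chains and two-node chains.

    seed:       the black seed node.
    retained:   target state -> retained second fund chain (list of nodes).
    all_chains: every second fund chain, including two-node ones.
    """
    direct = [c for c in all_chains if len(c) == 2 and seed in c]
    members = set()
    for chain in list(retained.values()) + direct:
        members.update(chain)
    return members

gang = assemble_gang(
    "S",
    {"T": ["S", "A", "T"]},
    [["S", "A", "T"], ["S", "B", "T"], ["S", "D"]],
)
```

When no two-node chain exists, `direct` is empty and the result depends only on the retained chains, matching the second determining subunit.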

[0194] Furthermore, the first retention unit includes:

[0195] The matrix construction subunit is used to: construct a reward matrix based on the risk level of each transaction in each of the second fund chains;

[0196] The matrix update subunit is used to: construct an initial Q matrix for the target state, and iteratively update the Q matrix according to the reward matrix until the Q matrix satisfies the convergence condition;

[0197] The fund chain screening subunit is used to select the maximum value in the Q matrix that satisfies the convergence condition, and to take the second fund chain corresponding to the maximum value as the best fund chain among the second fund chains corresponding to the target state.
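The three subunits above can be illustrated with a small tabular Q-learning sketch. The following assumptions are not taken from the specification: the reward of a transaction equals its risk level, missing transactions are marked `None`, the update is the standard deterministic rule Q(s, a) = R(s, a) + gamma * max Q(a, ·) with a discount `gamma` of 0.8, convergence means the largest change in a sweep falls below `tol`, and the best chain is read out by following the largest Q value from the black seed to the target state. The names `q_iterate` and `greedy_chain` are hypothetical.

```python
def q_iterate(R, target, gamma=0.8, tol=1e-9, max_iter=1000):
    """Iteratively update a Q matrix from reward matrix R until it converges."""
    n = len(R)
    Q = [[0.0] * n for _ in range(n)]  # initial Q matrix: all zeros
    for _ in range(max_iter):
        delta = 0.0
        for s in range(n):
            for a in range(n):
                if R[s][a] is None:  # no transaction from s to a
                    continue
                if a == target:
                    future = 0.0  # the target state ends the chain
                else:
                    future = max((Q[a][b] for b in range(n)
                                  if R[a][b] is not None), default=0.0)
                new = R[s][a] + gamma * future
                delta = max(delta, abs(new - Q[s][a]))
                Q[s][a] = new
        if delta < tol:
            break
    return Q

def greedy_chain(Q, R, seed, target):
    """Follow the largest Q value from the seed; returns None on a dead end."""
    chain = [seed]
    while chain[-1] != target:
        s = chain[-1]
        options = [a for a in range(len(R))
                   if R[s][a] is not None and a not in chain]
        if not options:
            return None
        chain.append(max(options, key=lambda a: Q[s][a]))
    return chain

# Nodes: 0 = black seed, 3 = target state. Two candidate chains:
# 0 -> 1 -> 3 (risk 0.9, 0.8) and 0 -> 2 -> 3 (risk 0.6, 0.7).
R = [[None, 0.9, 0.6, None],
     [None, None, None, 0.8],
     [None, None, None, 0.7],
     [None, None, None, None]]
Q = q_iterate(R, target=3)
best = greedy_chain(Q, R, seed=0, target=3)
```

With these made-up risk levels the converged Q values at the seed are 1.54 for the step to node 1 and 1.16 for the step to node 2, so the chain through node 1 is selected.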

[0198] According to a third aspect, one embodiment of this specification provides a computer-readable storage medium having a computer program stored thereon that, when executed in a computer, causes the computer to perform the methods of any embodiment of the specification.

[0199] According to a fourth aspect, one embodiment of this specification provides a computing device including a memory and a processor, wherein the memory stores executable code, and the processor, when executing the executable code, implements the method of any embodiment of the specification.

[0200] It is understood that the structures illustrated in the embodiments of this specification do not constitute a specific limitation on the apparatus of the embodiments of this specification. In other embodiments of the specification, the above-described apparatus may include more or fewer components than illustrated, or combine some components, or split some components, or have different component arrangements. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.

[0201] The information interaction and execution process between the modules in the above-mentioned device and system are based on the same concept as the method embodiments in this specification, and the specific details can be found in the descriptions in the method embodiments in this specification, so they will not be repeated here.

[0202] The various embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus embodiments are basically similar to the method embodiments, so their description is relatively brief; for relevant parts, refer to the descriptions of the method embodiments.

[0203] Those skilled in the art will recognize that, in one or more of the examples above, the functions described in this invention can be implemented using hardware, software, firmware, or any combination thereof. When implemented in software, these functions can be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium.

[0204] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above description is only a specific embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made on the basis of the technical solution of the present invention should be included within the scope of protection of the present invention.

Claims

1. A method for identifying a fund chain gang, comprising: obtaining a predetermined black seed, the black seed being a customer who has been identified, after review, as having a money laundering risk; determining, based on global transaction data, a transaction chain associated with the black seed; determining a first fund chain based on the transaction chain associated with the black seed, the first fund chain being a fund chain associated with the black seed; determining the risk level of each transaction in the first fund chain, and removing from the first fund chain the transactions whose risk level is lower than a preset risk level to obtain a second fund chain; and determining, based on the second fund chain, the fund chain gang to which the black seed belongs; wherein determining the transaction chain associated with the black seed based on the global transaction data comprises: extracting transaction data within the most recent preset time period from the global transaction data; converting the transaction data within the most recent preset time period into a corresponding transaction graph, wherein the nodes in the transaction graph are customers, the edges between the nodes are transaction information between customers having a transaction relationship, and the transaction information includes the transaction amount; and tracing the flow of transaction funds related to the black seed in the transaction graph to obtain the transaction chain associated with the black seed; and wherein tracing the flow of transaction funds related to the black seed in the transaction graph to obtain the transaction chain associated with the black seed comprises: finding the node corresponding to the black seed in the transaction graph; and determining, based on the transaction relationships between the node corresponding to the black seed and other nodes, the upstream nodes and downstream nodes of the black seed, the upstream nodes of the upstream nodes, and the downstream nodes of the downstream nodes, until the final downstream node and the initial upstream node are reached; wherein, in the transaction graph, the black seed and each upstream and downstream node form the transaction chain associated with the black seed.
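By way of illustration only, the upstream and downstream tracing recited in claim 1 could be sketched as two breadth-first traversals of the transaction graph. The sketch assumes transactions are given as (payer, payee, amount) tuples; `trace_chain` and the sample data are hypothetical and do not limit the claim.

```python
from collections import defaultdict, deque

def trace_chain(transactions, seed):
    """Return (upstream, downstream) node sets reachable from the black seed.

    transactions: iterable of (payer, payee, amount) tuples.
    """
    succ, pred = defaultdict(set), defaultdict(set)
    for payer, payee, _amount in transactions:
        succ[payer].add(payee)  # funds flow payer -> payee
        pred[payee].add(payer)

    def reach(adjacency):
        # Breadth-first walk until no further node can be reached.
        seen, queue = set(), deque([seed])
        while queue:
            for nxt in adjacency[queue.popleft()]:
                if nxt not in seen and nxt != seed:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    return reach(pred), reach(succ)

transactions = [("A", "B", 100.0), ("B", "S", 90.0), ("S", "C", 80.0),
                ("C", "D", 70.0), ("X", "Y", 10.0)]
upstream, downstream = trace_chain(transactions, "S")
```

Customers `X` and `Y` have no transaction path to or from the seed and are therefore left out of the chain.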

2. The method according to claim 1, wherein determining the first fund chain based on the transaction chain associated with the black seed comprises: in the transaction chain associated with the black seed, aggregating the transaction funds of the transactions whose two parties are the same customers, to obtain the first fund chain.
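As a hedged sketch of this aggregation, the code below sums the amounts of all transactions that share the same payer and payee. Treating the pair as ordered (so that A paying B is kept separate from B paying A) is an assumption, and `aggregate_funds` is a hypothetical name.

```python
from collections import defaultdict

def aggregate_funds(transactions):
    """Sum transaction amounts per (payer, payee) pair.

    transactions: iterable of (payer, payee, amount) tuples.
    """
    totals = defaultdict(float)
    for payer, payee, amount in transactions:
        totals[(payer, payee)] += amount
    return dict(totals)

first_fund_chain = aggregate_funds(
    [("A", "B", 100.0), ("A", "B", 50.0), ("B", "A", 30.0)]
)
```

Each aggregated pair then corresponds to a single edge of the first fund chain, carrying the total amount transferred.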

3. The method according to claim 1, wherein determining the risk level of each transaction in the first fund chain comprises: determining the risk level of each transaction in the first fund chain by a directed edge attention dual representation model; the directed edge attention dual representation model is a network model obtained by machine learning based on the influence of neighboring node information and neighboring edge information on a node.

4. The method according to claim 3, wherein determining the risk level of each transaction in the first fund chain by the directed edge attention dual representation model comprises: determining, based on the neighboring node information and the neighboring edge information, the representation vector of each target node and each source node in the first fund chain, wherein, in a transaction of the first fund chain, the transaction direction is from the source node to the target node; and determining, based on the representation vectors of each target node and each source node, the risk level of each transaction in the transaction chain associated with the black seed.

5. The method according to claim 4, wherein determining the representation vector of each target node and each source node in the first fund chain based on the neighboring node information and the neighboring edge information comprises: updating the representation vector of each target node and each source node through multiple iterations, each iteration including: determining the initial representation vector of each target node and each source node in the current iteration; for each target node, calculating the attention matrix of the neighboring nodes for that target node based on the initial representation vectors of the source nodes associated with that target node and the initial representation vector of that target node; for each source node, calculating the attention matrix of the neighboring nodes for that source node based on the initial representation vectors of the target nodes associated with that source node and the initial representation vector of that source node; determining the initial representation vector of each edge connected to each target node and the initial representation vector of each edge connected to each source node; for each target node, calculating the attention matrix of the neighboring edges for that target node based on the initial representation vectors of the edges connected to that target node and the initial representation vector of that target node; for each source node, calculating the attention matrix of the neighboring edges for that source node based on the initial representation vectors of the edges connected to that source node and the initial representation vector of that source node; for each target node, determining the final representation vector of that target node in the current iteration based on the attention matrix of the neighboring nodes and the attention matrix of the neighboring edges for that target node; and, for each source node, determining the final representation vector of that source node in the current iteration based on the attention matrix of the neighboring nodes and the attention matrix of the neighboring edges for that source node.

6. The method according to claim 1, wherein determining the fund chain gang to which the black seed belongs based on the second fund chain comprises: for each target state, determining whether the number of second fund chains corresponding to that target state is greater than 1, wherein the initial state of each second fund chain is the black seed, and the target state is the node in each second fund chain that is not adjacent to the black seed; if so, determining the best fund chain among the second fund chains corresponding to the target state, and retaining the best fund chain; otherwise, retaining the second fund chain corresponding to the target state; and determining, based on the second fund chain retained for each target state, the fund chain gang to which the black seed belongs.

7. The method according to claim 6, wherein determining the fund chain gang to which the black seed belongs based on the second fund chain retained for each target state comprises: determining whether any of the second fund chains consists of only two nodes, where the two nodes include the black seed; if so, determining the fund chain gang to which the black seed belongs based on the second fund chain retained for each target state and the second fund chain consisting of only two nodes; otherwise, determining the fund chain gang to which the black seed belongs based on the second fund chain retained for each target state.

8. The method according to claim 6, wherein determining the best fund chain among the second fund chains corresponding to the target state comprises: constructing a reward matrix based on the risk level of each transaction in each of the second fund chains; for the target state, constructing an initial Q matrix, and iteratively updating the Q matrix according to the reward matrix until the Q matrix satisfies the convergence condition; and selecting the maximum value from the Q matrix that satisfies the convergence condition, and taking the second fund chain corresponding to the maximum value as the best fund chain among the second fund chains corresponding to the target state.

9. A device for identifying a fund chain gang, comprising: a first acquisition module, used to acquire a predetermined black seed, the black seed being a customer who has been identified, after review, as having a money laundering risk; a first determining module, used to determine the transaction chain associated with the black seed based on global transaction data; a second determining module, used to determine a first fund chain based on the transaction chain associated with the black seed, the first fund chain being a fund chain associated with the black seed; a third determining module, used to determine the risk level of each transaction in the first fund chain and to remove from the first fund chain the transactions whose risk level is lower than a preset risk level to obtain a second fund chain; and a fourth determining module, used to determine the fund chain gang to which the black seed belongs based on the second fund chain; wherein the first determining module includes: a first extraction unit, used to extract transaction data within the most recent preset time period from the global transaction data; a data conversion unit, used to convert the transaction data within the most recent preset time period into a corresponding transaction graph, wherein the nodes in the transaction graph are customers, the edges between the nodes are transaction information between customers having a transaction relationship, and the transaction information includes the transaction amount; and a transaction tracking unit, used to trace the flow of transaction funds related to the black seed in the transaction graph and obtain the transaction chain associated with the black seed; and wherein tracing the flow of transaction funds related to the black seed in the transaction graph to obtain the transaction chain associated with the black seed comprises: finding the node corresponding to the black seed in the transaction graph; and determining, based on the transaction relationships between the node corresponding to the black seed and other nodes, the upstream nodes and downstream nodes of the black seed, the upstream nodes of the upstream nodes, and the downstream nodes of the downstream nodes, until the final downstream node and the initial upstream node are reached; wherein, in the transaction graph, the black seed and each upstream and downstream node form the transaction chain associated with the black seed.

10. A computing device comprising a memory and a processor, wherein the memory stores executable code, and the processor, when executing the executable code, implements the method of any one of claims 1-8.

11. A computer-readable storage medium having a computer program stored thereon, which, when executed in a computer, causes the computer to perform the method of any one of claims 1-8.