A method and system for intelligent tax risk assessment and early warning

By constructing a corporate transaction network graph and identifying closed-loop transaction paths, the problem of insufficient in-depth exploration of inter-corporate relationships in the existing tax risk assessment system has been solved, enabling intelligent identification and risk warning of abnormal transaction patterns.

CN122199175A · Pending · Publication Date: 2026-06-12 · HENAN POLYTECHNIC

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
HENAN POLYTECHNIC
Filing Date
2026-02-28
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

The existing tax risk assessment system is unable to delve into the relationships between enterprises and cannot identify abnormal patterns hidden beneath normal transactions, especially closed-loop transaction chains built by multiple levels and multiple entities, making it difficult for regulatory authorities to provide effective early warnings.

Method used

Transaction details are extracted from the tax declaration data of the enterprise to be tested, an enterprise transaction network graph is constructed, closed-loop transaction paths are identified, fund return features are extracted and abnormal transaction patterns are identified, and finally a quantitative risk assessment is conducted to generate a risk warning report.

Benefits of technology

It enables in-depth analysis of the transaction network topology, identifies various abnormal transaction behaviors, empowers the system to proactively identify problems, and improves the efficiency of tax supervision.

✦ Generated by Eureka AI based on patent content.


Abstract

The present application relates to an intelligent tax risk assessment and early warning method and system in the technical field of tax assessment. The method includes the following steps: transaction details are extracted from tax declaration data; an association matrix is formed by statistical analysis of upstream and downstream enterprise relationships, and a transaction network graph is constructed. A path search is carried out to identify closed-loop transactions, and fund return features and abnormal patterns are extracted to obtain a set of suspicious closed loops. Related enterprises are quantitatively evaluated to generate a score table, and when the score of the enterprise to be tested exceeds the threshold, it is marked as high risk and an early warning report is generated. The present application solves the technical problems that the prior art lacks in-depth analysis capability for transaction network topology and lacks a tracking mechanism for fund return paths.

Description

Technical Field

[0001] This invention relates to the field of tax assessment technology, and in particular to intelligent tax risk assessment and early warning methods and systems.

Background Technology

[0002] Existing tax risk assessment systems have significant limitations, primarily in their insufficient depth of analysis into inter-company relationships, making it difficult to uncover abnormal patterns hidden beneath seemingly normal transactions. Traditional methods often focus only on the reported data of individual companies, neglecting the characteristics of fund flows between these companies and their upstream and downstream trading partners. This makes it difficult to identify closed-loop transaction chains constructed through multiple levels and entities. In particular, meticulously designed fraudulent transaction networks often involve the coordinated efforts of multiple related companies, with funds circulating among these companies and ultimately returning to the starting point, forming a seemingly compliant but substantively illegal transaction loop. Such highly concealed risk behaviors are difficult to detect using traditional financial indicator analysis alone and require more intelligent technological support.

[0003] Tax authorities urgently need an automated and intelligent assessment method to analyze corporate transaction networks and quickly identify high-risk enterprises. While some regions have begun to experiment with data analytics, most remain at the level of simple indicator comparison, lacking in-depth analysis of transaction network topology and mechanisms for tracking fund flow. This technological gap often leaves regulatory authorities helpless when faced with complex related-party transactions, unable to provide effective early warnings, and only discovering problems during post-event investigations, missing the optimal regulatory opportunity. Therefore, building a complete intelligent system encompassing data extraction, network construction, risk identification, and early warning scoring is crucial to improving the effectiveness of tax supervision.

Summary of the Invention

[0004] The purpose of this invention is to at least partially solve one of the technical problems existing in the prior art.

[0005] To achieve the above objectives, this invention provides an intelligent tax risk assessment and early warning method, comprising the following steps:

[0006] Transaction details are extracted from the tax declaration data of the companies to be tested to obtain transaction detail data;

[0007] Based on the transaction details data, statistical analysis is performed on the relationships between upstream and downstream enterprises of the enterprise to be tested to obtain an enterprise association matrix, and an enterprise transaction network graph is constructed based on the enterprise association matrix.

[0008] Closed-loop transactions in the enterprise transaction network graph are identified by a path search method to obtain closed-loop transaction paths. Based on the closed-loop transaction paths, fund return features are extracted and abnormal transaction patterns are identified to obtain a set of suspicious closed-loop transactions.

[0009] Based on the set of suspicious closed-loop transactions, a risk quantification assessment is performed on all related parties in the closed-loop transaction path to obtain an enterprise risk score table. If the risk score exceeds a preset threshold, the enterprise to be detected is marked as a high-risk enterprise, and a risk warning report is generated.

[0010] Furthermore, the transaction details of the tax declaration data of the enterprise to be tested are extracted to obtain transaction detail data, including:

[0011] The multi-source heterogeneous tax declaration data of the enterprise to be tested is processed to unify the format to obtain unified format declaration data, and the transaction record field in the unified format declaration data is identified and located to obtain transaction record location data.

[0012] Based on the transaction record location data, the transaction information of both parties, transaction amount, and transaction time in the unified format declaration data are extracted one by one to obtain preliminary transaction details data. Duplicate transaction records in the preliminary transaction details data are then deduplicated to obtain the final transaction details data.

[0013] Furthermore, the statistical analysis of the relationships between upstream and downstream enterprises of the enterprise under test based on the transaction details data yields an enterprise association matrix, including:

[0014] The names of the companies of both parties in the transaction details are extracted to obtain a set of company names. The companies in the set of company names are then classified into the company to be detected and its upstream and downstream companies to obtain a set of company classifications.

[0015] Based on the enterprise classification set, each transaction in the transaction details data is associated and marked, and the upstream and downstream enterprise relationships involved in each transaction are marked to obtain associated transaction mark data. The associated transaction mark data is then sorted according to the transaction amount to obtain sorted associated transaction data.

[0016] Based on the sorted and associated transaction data, the number of transactions and total transaction amount between each upstream and downstream enterprise and the enterprise to be tested are counted to obtain inter-enterprise transaction statistics. Based on the inter-enterprise transaction statistics, a two-dimensional enterprise association matrix is constructed with enterprises as row and column identifiers and transaction number and total transaction amount as elements.
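The classification and counting steps above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the record layout `(seller, buyer, amount)`, the enterprise names, and the function names `classify` and `association_matrix` are all assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical record layout: (seller, buyer, amount).
transactions = [
    ("Supplier C", "Target Co", 600_000.0),
    ("Supplier C", "Target Co", 400_000.0),
    ("Target Co", "Customer D", 250_000.0),
]

def classify(transactions, target):
    """Label counterparties: upstream sells to the target, downstream buys from it."""
    roles = {target: "target"}
    for seller, buyer, _ in transactions:
        if buyer == target and seller != target:
            roles.setdefault(seller, "upstream")
        elif seller == target and buyer != target:
            roles.setdefault(buyer, "downstream")
    return roles

def association_matrix(transactions):
    """Map (seller, buyer) -> (transaction count, total amount)."""
    matrix = defaultdict(lambda: [0, 0.0])
    for seller, buyer, amount in transactions:
        cell = matrix[(seller, buyer)]
        cell[0] += 1
        cell[1] += amount
    return {pair: tuple(cell) for pair, cell in matrix.items()}

roles = classify(transactions, "Target Co")
matrix = association_matrix(transactions)
```

A sparse dictionary keyed by enterprise pair stands in for the two-dimensional matrix here. Note that `classify` keeps the first role it sees, so an enterprise that both buys from and sells to the target would need extra handling.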

[0017] Furthermore, based on the aforementioned enterprise association matrix, an enterprise transaction network graph is constructed, including:

[0018] The company names in the company association matrix are extracted to obtain a list of company names, and each company in the list of company names is uniquely identified and encoded to obtain a set of company codes;

[0019] Based on the enterprise code set, the number of transactions and the total transaction amount in the enterprise association matrix are mapped and transformed, and the number of transactions and the total transaction amount are converted into weighted connection relationships to obtain enterprise connection weight data. The enterprise connection weight data is then sorted according to the weight to obtain sorted enterprise connection data.

[0020] Based on the enterprise code set and sorted enterprise connection data, an enterprise transaction network graph is constructed with enterprises as nodes and weighted connection relationships as edges.
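One possible way to encode the enterprises and turn the matrix into weighted edges is sketched below. The source does not say how transaction count and total amount are combined into a single weight; the linear form `alpha * count + amount / amount_scale`, the code format `E0001`, and the function name `build_graph` are assumptions made purely for illustration.

```python
def build_graph(matrix, alpha=1.0, amount_scale=1_000_000):
    """matrix: {(seller, buyer): (count, total_amount)} -> (codes, adjacency dict).

    The weight formula is an illustrative assumption, not taken from the source.
    """
    names = sorted({name for pair in matrix for name in pair})
    codes = {name: f"E{idx:04d}" for idx, name in enumerate(names, start=1)}
    edges = [
        (codes[src], codes[dst], alpha * count + amount / amount_scale)
        for (src, dst), (count, amount) in matrix.items()
    ]
    edges.sort(key=lambda edge: edge[2], reverse=True)  # heaviest connections first
    graph = {code: {} for code in codes.values()}
    for src, dst, weight in edges:
        graph[src][dst] = weight
    return codes, graph

matrix = {
    ("Supplier C", "Target Co"): (8, 5_000_000.0),
    ("Target Co", "Customer D"): (12, 3_000_000.0),
}
codes, graph = build_graph(matrix)
```

Edge direction simply follows the key order of the matrix (seller to buyer in this sketch); a fund-flow analysis might prefer payer to payee instead.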

[0021] Furthermore, the step of identifying closed-loop transactions in the enterprise transaction network graph using a path search method to obtain closed-loop transaction paths includes:

[0022] The enterprise transaction network graph is traversed to obtain traversed nodes. Path exploration is then performed with each traversed enterprise node as a starting point, and the transaction paths formed by the weighted connection relationships from the starting point to other enterprise nodes are recorded to obtain initial transaction paths.

[0023] For each of the initial transaction paths, the endpoint is determined, and directed transaction paths whose endpoint is the same as their starting point are selected to obtain preliminary closed-loop paths. The weights of all connection relationships on each preliminary closed-loop path are accumulated to obtain the total weight of that closed-loop path.

[0024] The preliminary closed-loop paths are filtered by total weight, and closed-loop paths whose total weight is greater than a preset threshold are retained to obtain the set of closed-loop transaction paths.
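The endpoint check, weight accumulation, and threshold filter can be sketched as below, assuming the graph is held as a nested dictionary of edge weights and each path is a list of node identifiers. The names `path_weight`, `filter_closed_loops`, and `min_total_weight` are hypothetical, as are the sample weights.

```python
def path_weight(graph, path):
    """Sum the edge weights along a path given as a node list, e.g. ['A', 'B', 'A']."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))

def filter_closed_loops(graph, paths, min_total_weight):
    """Keep paths that end where they start and whose total weight exceeds the threshold."""
    kept = []
    for path in paths:
        if len(path) > 2 and path[0] == path[-1]:
            total = path_weight(graph, path)
            if total > min_total_weight:
                kept.append((path, total))
    return kept

graph = {"A": {"B": 5.0, "D": 1.0}, "B": {"C": 4.0}, "C": {"A": 4.5}, "D": {"A": 0.5}}
paths = [["A", "B", "C", "A"], ["A", "D", "A"], ["A", "B", "C"]]
loops = filter_closed_loops(graph, paths, min_total_weight=10.0)
```

With these made-up numbers, only the loop through B and C survives: the loop through D is closed but too light, and the third path never returns to its start.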

[0025] Furthermore, all enterprise nodes in the enterprise transaction network graph are traversed to obtain the traversed nodes, including:

[0026] Node attributes are extracted from the enterprise nodes in the enterprise transaction network graph to obtain a set of node attributes. The nodes in the set of node attributes are then sorted by the number of edges connected to them to obtain nodes sorted by the number of edges.

[0027] Extract the node with the most edges from the nodes sorted by edge count, and starting from the node with the most edges, sequentially mark each node in the nodes sorted by edge count as visited, record the visited nodes, and number the visited nodes according to the visiting order to obtain the traversed nodes.
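A minimal sketch of this degree-ordered traversal follows. It assumes that "number of edges" counts both incoming and outgoing edges and that ties are broken alphabetically; neither detail is stated in the source, and the function name `traverse_by_degree` is hypothetical.

```python
def traverse_by_degree(graph):
    """Return {node: visit_number}, visiting nodes in descending order of edge count."""
    degree = {node: 0 for node in graph}
    for src, targets in graph.items():
        for dst in targets:
            degree[src] += 1
            degree[dst] = degree.get(dst, 0) + 1
    # Ties are broken by node name so the numbering is deterministic (an assumption).
    ordered = sorted(degree, key=lambda node: (-degree[node], node))
    return {node: number for number, node in enumerate(ordered, start=1)}

graph = {"T": {"C": 1.0, "D": 1.0}, "C": {"E": 1.0}, "E": {"T": 1.0}, "D": {}}
order = traverse_by_degree(graph)
```

Starting the later path exploration from the best-connected nodes is one plausible reason for this ordering, since hubs are more likely to sit on many loops.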

[0028] Furthermore, based on the closed-loop transaction path, fund return characteristics are extracted and abnormal transaction patterns are identified to obtain a set of suspicious closed-loop transactions, including:

[0029] Analyze the transaction time interval for each path in the closed-loop transaction path, calculate the time difference between each transaction and the previous transaction to obtain a transaction time interval sequence, and perform statistical analysis on the transaction time interval sequence to filter out transaction pairs with time intervals less than a preset time threshold to obtain a set of short-time transaction pairs.

[0030] Based on the set of short-time transaction pairs, the flow of transaction amounts along the closed-loop transaction path is tracked, and the source and destination of each transaction amount are recorded to obtain fund flow tracking data. The fund flow tracking data is then aggregated to count the fund inflow and outflow of each enterprise within the short period, yielding enterprise fund flow summary data.

[0031] Based on the enterprise fund flow summary data, the transaction patterns in the closed-loop transaction paths are identified, and transaction paths showing fund return characteristics and abnormal transaction amounts are selected to obtain a set of suspicious closed-loop transactions.
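The interval analysis and fund flow summary might look like the sketch below. The hop layout `(payer, payee, amount, date)`, the three-day threshold, the sample amounts, and the `return_ratio` indicator are assumptions chosen for illustration; the source only says that the thresholds are preset.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical loop: each hop is (payer, payee, amount, date), in path order.
loop = [
    ("Target", "C", 5_000_000, date(2025, 3, 15)),
    ("C", "E", 4_900_000, date(2025, 3, 16)),
    ("E", "Target", 4_800_000, date(2025, 3, 17)),
]

def short_interval_pairs(loop, max_gap):
    """Return consecutive transaction pairs whose time gap is below max_gap."""
    return [
        (prev, cur)
        for prev, cur in zip(loop, loop[1:])
        if cur[3] - prev[3] < max_gap
    ]

def fund_flow_summary(pairs):
    """Total inflow and outflow per enterprise over the transactions in the pairs."""
    seen = set()
    flows = defaultdict(lambda: {"in": 0, "out": 0})
    for pair in pairs:
        for tx in pair:
            if tx not in seen:  # a transaction can appear in two adjacent pairs
                seen.add(tx)
                payer, payee, amount, _ = tx
                flows[payer]["out"] += amount
                flows[payee]["in"] += amount
    return dict(flows)

pairs = short_interval_pairs(loop, max_gap=timedelta(days=3))
summary = fund_flow_summary(pairs)
return_ratio = summary["Target"]["in"] / summary["Target"]["out"]
```

A ratio close to 1 within a short window means most of the money that left the starting enterprise came back, which is the fund return signature the text describes.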

[0032] Furthermore, based on the set of suspicious closed-loop transactions, a risk quantification assessment is performed on all related parties in the closed-loop transaction path to obtain an enterprise risk scoring table, including:

[0033] For each closed-loop transaction path in the suspicious closed-loop transaction set, enterprise nodes are traversed to extract the node position and transaction direction of each related enterprise in the closed-loop path, thereby obtaining enterprise node position data. Based on the enterprise node position data, the transaction participation of each related enterprise is statistically analyzed, including the number of suspicious closed loops in which each enterprise participates, the total transaction amount involved, and the ratio of the number of times it acts as a fund inflow party to an outflow party, thus obtaining an enterprise participation statistics table.

[0034] Based on the enterprise participation statistics table, risk factors are weighted for each related enterprise. The number of participating closed loops, the total transaction amount involved, and the ratio of the number of times funds flow in and out are multiplied by the corresponding preset weight coefficients and then summed to obtain the initial risk score of the enterprise. The initial risk score of the enterprise is then normalized according to the average transaction size of the industry to which the enterprise belongs. The initial risk score is divided by the industry average transaction size coefficient to obtain the enterprise risk scoring table.
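Written as a formula, the weighted sum and the normalization described above could take the following form. The symbols are introduced here for illustration and do not appear in the source: $n_i$ is the number of suspicious closed loops enterprise $i$ participates in, $a_i$ the total transaction amount involved, $r_i$ the ratio of the number of times it acts as fund inflow party to outflow party, $w_1, w_2, w_3$ the preset weight coefficients, and $k_{\mathrm{ind}(i)}$ the average transaction size coefficient of the enterprise's industry.

```latex
S_i^{(0)} = w_1 n_i + w_2 a_i + w_3 r_i,
\qquad
S_i = \frac{S_i^{(0)}}{k_{\mathrm{ind}(i)}}
```

Dividing by the industry coefficient keeps enterprises in high-volume industries from being scored as risky on transaction size alone.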

[0035] This invention also provides an intelligent tax risk assessment and early warning system, comprising:

[0036] The extraction module is used to extract transaction details from the tax declaration data of the enterprise to be tested, and obtain transaction detail data;

[0037] The analysis module is used to perform statistical analysis on the relationships between upstream and downstream enterprises of the enterprise to be tested based on the transaction details data, obtain an enterprise association matrix, and construct an enterprise transaction network graph based on the enterprise association matrix;

[0038] The identification module is used to identify closed-loop transactions in the enterprise transaction network graph through a path search method, obtain closed-loop transaction paths, and extract fund return features and identify abnormal transaction patterns based on the closed-loop transaction paths to obtain a set of suspicious closed-loop transactions.

[0039] The assessment module is used to conduct a risk quantification assessment of all related party enterprises in the closed-loop transaction path based on the suspicious closed-loop transaction set, obtain an enterprise risk score table, and mark the enterprise to be detected as a high-risk enterprise when the risk exceeds a preset threshold, and generate a risk warning report at the same time.

[0040] This invention provides an intelligent tax risk assessment and early warning method, comprising the following steps: extracting transaction details from the tax declaration data of the enterprise to be tested to obtain transaction detail data; statistically analyzing the relationships between upstream and downstream enterprises of the enterprise to be tested based on the transaction detail data to obtain an enterprise association matrix, and constructing an enterprise transaction network graph based on the enterprise association matrix; identifying closed-loop transactions in the enterprise transaction network graph using a path search method to obtain closed-loop transaction paths, and extracting fund return features and identifying abnormal transaction patterns based on the closed-loop transaction paths to obtain a set of suspicious closed-loop transactions; performing risk quantification assessment on all related enterprises in the closed-loop transaction paths based on the set of suspicious closed-loop transactions to obtain an enterprise risk score table, and marking the enterprise to be tested as a high-risk enterprise when the risk exceeds a preset threshold, while generating a risk warning report. This method solves the technical problems of existing technologies lacking in-depth analysis capabilities of transaction network topology and lacking a tracking mechanism for fund return paths, realizing the functions of fund return feature extraction and abnormal pattern identification, and giving the system the ability to proactively discover problems. The system no longer relies on simple, pre-set rules. Instead, through in-depth analysis of transaction characteristics, it can identify various mutated and disguised abnormal trading behaviors. This intelligent identification mechanism is highly adaptable; even as illegal methods are constantly innovated, the system can still maintain a high detection rate through feature learning. 

Attached Figure Description

[0041] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0042] Figure 1 is a schematic diagram of the steps of an intelligent tax risk assessment and early warning method in one embodiment of the present invention;

[0043] Figure 2 is a schematic diagram of an intelligent tax risk assessment and early warning system according to an embodiment of the present invention;

[0044] The objectives, features, and advantages of this invention will be further explained in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed Implementation

[0045] The embodiments of the present invention are described in detail below. Examples of these embodiments are shown in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and are only used to explain the present invention, and should not be construed as limiting the present invention. The step numbers in the following embodiments are set only for ease of explanation, and there is no limitation on the order between the steps. The execution order of each step in the embodiments can be adaptively adjusted according to the understanding of those skilled in the art.

[0046] The following describes in detail, with reference to the accompanying drawings, an intelligent tax risk assessment and early warning method proposed according to an embodiment of the present invention.

[0047] As shown in Figure 1, this invention provides an intelligent tax risk assessment and early warning method, comprising the following steps:

[0048] Step S1: Extract transaction details from the tax declaration data of the enterprise to be tested to obtain transaction detail data.

[0049] Specifically, this step first requires retrieving basic materials submitted by the company to be tested from the tax system, such as VAT returns, corporate income tax returns, and financial statements. These materials are often stored in a structured or semi-structured format in a database. Next, these declaration materials are parsed, and each purchase and sale transaction recorded is extracted separately. In specific operations, the invoice information fields are read, including the taxpayer identification number of the trading partner, company name, transaction amount, invoice date, tax amount, and category of goods or services. Duplicate records and entries with incorrect formats are removed through data cleaning. Then, the data is organized according to a unified data model. For example, if a manufacturing company lists records of purchasing raw materials from 20 suppliers and selling products to 30 customers in a certain month's declaration, the system will read the complete information of these 50 transactions one by one, extract key elements such as the purchaser, seller, amount, and time, and store them in a transaction details data table. This table records the details of all the company's external transactions, providing raw materials for the subsequent construction of the company's association matrix. The entire process relies on an accurate understanding of the declaration data structure and a reasonable setting of field mapping rules to ensure that the extracted details data completely and accurately reflects the company's real transaction activities.

[0050] Step S2: Based on the transaction details data, perform statistical analysis on the relationships between upstream and downstream enterprises of the enterprise to be tested to obtain an enterprise association matrix, and construct an enterprise transaction network graph based on the enterprise association matrix.

[0051] Specifically, the statistical analysis step scans each transaction detail record to summarize the interactions between the enterprise to be tested and its trading partners. The process begins by establishing a two-dimensional matrix whose rows and columns represent the participating enterprises. Each record in the transaction details is then iterated through: when a transaction is found between company A and company B, the number of transactions and the cumulative amount are recorded at the corresponding position in the matrix. For example, if the aforementioned manufacturing company made 8 purchases from supplier C within a quarter, totaling 5 million yuan, the matrix entry for supplier C is filled with the values 8 and 5 million yuan. Similarly, if the enterprise to be tested sold to customer D 12 times, totaling 3 million yuan, the corresponding position is marked with 12 and 3 million yuan. After all transaction records have been processed in this way, the enterprise association matrix is formed. This matrix shows which enterprises have business connections and how strong those connections are. The next step is to transform the matrix into a graph: each enterprise is treated as a node, and the transaction relationship between two enterprises is represented by an edge connecting their nodes, with the thickness or weight of the edge determined by the transaction amount. For example, the enterprise to be tested is the central node, and edges extend outward to connect it to other nodes such as supplier C and customer D. In this way, the abstract numerical matrix is transformed into an intuitive enterprise transaction network graph whose structure reflects the enterprise's position in the industry chain and its connections with upstream and downstream parties.

[0052] Step S3: Closed-loop transactions in the enterprise transaction network graph are identified by a path search method to obtain closed-loop transaction paths. Based on the closed-loop transaction paths, fund return features are extracted and abnormal transaction patterns are identified to obtain a set of suspicious closed-loop transactions.

[0053] Specifically, the path search method operates on the enterprise transaction network graph by starting from the enterprise to be tested and traversing outwards along the edges of the graph to find paths that eventually return to the starting point. In practice, a depth-first or breadth-first search algorithm is used, recording the sequence of visited nodes. When a path is found to return to the enterprise to be tested after passing through several intermediate nodes, a closed loop is considered found. For example, the enterprise to be tested pays supplier C, supplier C then transfers part of the funds to trading company E, and trading company E then returns the money to the enterprise to be tested under the guise of purchasing. This forms a closed-loop transaction path: "Enterprise to be tested → C → E → Enterprise to be tested". The system stores all such paths found and then processes them one by one. Analyzing the characteristics of each closed loop mainly involves looking at indicators such as path length, number of companies involved, time interval of fund flow, and patterns of transaction amount changes. For example, if funds in a closed loop quickly circulate within 3 days, the registered addresses of several companies involved highly overlap, and the categories of goods traded are contradictory, these signs indicate a fund return flow. In addition, the rationality of the transactions must be checked. For instance, if 4.8 million of the 5 million in payments flowed out from C but returned to the starting point without any substantial flow of goods in between, or if the transaction price deviates significantly from market prices, these are considered abnormal transaction patterns. By identifying and summarizing closed loops with these characteristics, a set of suspicious closed loop transactions is obtained. The entire identification process relies on the analysis of the path structure and the judgment of the logic of the transaction behavior.
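The depth-first variant of the search can be sketched as follows, using the "Enterprise to be tested → C → E → Enterprise to be tested" example from the paragraph above. The adjacency-list layout, the function name `find_closed_loops`, and the `max_len` bound on path length are assumptions added for this illustration.

```python
def find_closed_loops(graph, start, max_len=6):
    """Depth-first search for simple directed cycles that begin and end at `start`."""
    loops = []

    def dfs(node, path):
        for nxt in graph.get(node, ()):
            if nxt == start and len(path) >= 2:
                loops.append(path + [start])  # the path returned to its origin
            elif nxt not in path and len(path) < max_len:
                dfs(nxt, path + [nxt])

    dfs(start, [start])
    return loops

# Edges point from payer to payee, i.e. in the direction the funds move.
graph = {
    "Target": ["C", "D"],
    "C": ["E"],
    "E": ["Target"],
    "D": [],
}
loops = find_closed_loops(graph, "Target")
```

Bounding the path length keeps the search tractable on dense graphs, at the cost of missing loops longer than the bound.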

[0054] Step S4: Based on the suspicious closed-loop transaction set, perform a risk quantification assessment on all related party enterprises in the closed-loop transaction path to obtain an enterprise risk score table. If the risk score exceeds a preset threshold, the enterprise to be detected is marked as a high-risk enterprise, and a risk warning report is generated.

[0055] Specifically, the risk quantification assessment step scores each enterprise involved in every path within the suspicious closed-loop transaction set. Several scoring dimensions are set first, including the number of times an enterprise participates in closed loops, its role within the loop, the percentage of the transaction amount involved, and the enterprise's historical compliance record, and a weighting coefficient is assigned to each dimension. For example, an enterprise that appears in three different suspicious closed loops simultaneously receives 15 points for the number of participations. Supplier C, although in only one closed loop, receives 20 points for the percentage of transaction amount because its transaction volume is abnormally high (60% of its annual turnover). Trading company E, which acts as an intermediary and was established less than six months ago, receives 25 points for its suspicious role. Each enterprise's scores across the dimensions are multiplied by the corresponding weights and summed to give that enterprise's overall risk score, and the scores of all participating enterprises together form the enterprise risk scoring table. The table lists the enterprise to be tested, C, E, and all other related parties in descending order of score. The score of the enterprise to be tested is then compared with a preset threshold. If the threshold is set at 80 points and the enterprise scores 92 points, the marking mechanism is triggered and the system automatically labels it as a high-risk enterprise. A report template is then called and filled in with the enterprise's basic information, details of the closed-loop paths it participated in, the scoring basis, and the amount involved, generating a risk warning report. The report states specific facts such as "the enterprise to be tested participated in multiple fund return closed loops with an amount of 5 million yuan," which helps investigators quickly understand the risk points and take verification measures.
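The weighted scoring and threshold check can be sketched as below. Only the threshold of 80 and the resulting score of 92 echo the example in the text; the weights, the individual dimension scores, and the names `risk_table` and `high_risk` are invented here so that the arithmetic is easy to follow.

```python
def risk_table(dimension_scores, weights):
    """Weighted sum per enterprise, returned in descending order of score."""
    table = {
        name: sum(weights[dim] * value for dim, value in scores.items())
        for name, scores in dimension_scores.items()
    }
    return sorted(table.items(), key=lambda item: item[1], reverse=True)

def high_risk(table, enterprise, threshold):
    """True when the enterprise's score exceeds the preset threshold."""
    return dict(table)[enterprise] > threshold

# All weights and dimension scores below are made up for illustration.
weights = {"participation": 2.0, "amount_share": 1.5, "role": 1.0, "history": 1.0}
dimension_scores = {
    "Target": {"participation": 15, "amount_share": 20, "role": 10, "history": 22},
    "C": {"participation": 5, "amount_share": 20, "role": 10, "history": 5},
    "E": {"participation": 5, "amount_share": 10, "role": 25, "history": 0},
}
table = risk_table(dimension_scores, weights)
flagged = high_risk(table, "Target", threshold=80)
```

Here the enterprise to be tested scores 2.0 × 15 + 1.5 × 20 + 1.0 × 10 + 1.0 × 22 = 92, which exceeds 80, so it would be marked as high risk and passed on to report generation.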

[0056] In a specific embodiment, the step of extracting transaction details from the tax declaration data of the enterprise to be tested to obtain transaction detail data includes:

[0057] The multi-source heterogeneous tax declaration data of the enterprise to be tested is processed to unify the format to obtain unified format declaration data, and the transaction record field in the unified format declaration data is identified and located to obtain transaction record location data.

[0058] Based on the transaction record location data, the transaction information of both parties, transaction amount, and transaction time in the unified format declaration data are extracted one by one to obtain preliminary transaction details data. Duplicate transaction records in the preliminary transaction details data are then deduplicated to obtain the final transaction details data.

[0059] Specifically, this step elaborates on the "transaction details extraction" element of the higher-level solution. In practice, the tax declaration data of the enterprise to be tested often comes from different systems: VAT declarations may be in XML format, corporate income tax returns may be Excel spreadsheets, and some financial statements may be submitted as PDF files. This multi-source, heterogeneous data first needs format conversion. The XML, Excel, and PDF files are parsed into structured records that the database can read. For example, the text of a table in a PDF is recognized by OCR and then reorganized by rows and columns, and the values in XML tags are extracted into the corresponding fields. After conversion, all the data becomes declaration data in a unified format and is stored in a temporary table. Next, the fields that actually contain transaction information must be found in these unified-format records. Because a declaration form contains summary data and detailed data as well as basic company information and various notes, the fields are determined from the field name and data type; field names such as "seller name," "buyer's taxpayer identification number," "invoice amount," and "invoice date" clearly belong to transaction records. The system scans the header of the unified-format declaration data and marks the positions of these key fields to form the transaction record location data. With the location information, the system can accurately extract the transaction parties, transaction amount, and transaction time from each row. Suppose a row shows that the seller is "XX Machinery Factory", the buyer is the enterprise to be tested, the amount column is 500,000, and the date column is March 15, 2025. The system combines these four items into a record and puts it into the preliminary transaction details data. However, enterprises sometimes report the same business repeatedly in different declaration forms, or corrected declarations introduce redundant data, so the preliminary transaction details data must be deduplicated. The method is to compare the transaction parties, amount, and time of each record: when two records are identical in all three fields, they are judged to be duplicates and one of them is deleted. After deduplication, each record in the final transaction details data corresponds to a real and independent transaction. This completes the transformation from raw multi-source data to clean transaction details data.
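The deduplication rule can be sketched as follows, assuming each unified-format record is a dictionary. The field names, the `source` field, the file names, and the function name `deduplicate` are hypothetical; the first record reuses the "XX Machinery Factory" example from the text.

```python
from datetime import date

def deduplicate(records):
    """Drop records whose parties, amount, and date all match an earlier record."""
    seen, unique = set(), []
    for rec in records:
        key = (rec["seller"], rec["buyer"], rec["amount"], rec["date"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

preliminary = [
    {"seller": "XX Machinery Factory", "buyer": "Target Co",
     "amount": 500_000, "date": date(2025, 3, 15), "source": "vat.xml"},
    {"seller": "XX Machinery Factory", "buyer": "Target Co",
     "amount": 500_000, "date": date(2025, 3, 15), "source": "cit.xlsx"},
    {"seller": "XX Machinery Factory", "buyer": "Target Co",
     "amount": 120_000, "date": date(2025, 3, 20), "source": "vat.xml"},
]
final = deduplicate(preliminary)
```

Matching on parties, amount, and date alone would also merge two genuinely separate same-day transactions of equal value; adding an invoice number to the key, where one is available, would likely reduce that risk.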

[0060] In a specific embodiment, the step of statistically analyzing the relationships between upstream and downstream enterprises of the enterprise under test based on the transaction details data to obtain an enterprise association matrix includes:

[0061] The names of the companies of both parties in the transaction details are extracted to obtain a set of company names. The companies in the set of company names are then classified into the company to be detected and its upstream and downstream companies to obtain a set of company classifications.

[0062] Based on the enterprise classification set, each transaction in the transaction details data is associated and marked, and the upstream and downstream enterprise relationships involved in each transaction are marked to obtain associated transaction mark data. The associated transaction mark data is then sorted according to the transaction amount to obtain sorted associated transaction data.

[0063] Based on the sorted and associated transaction data, the number of transactions and total transaction amount between each upstream and downstream enterprise and the enterprise to be tested are counted to obtain inter-enterprise transaction statistics. Based on the inter-enterprise transaction statistics, a two-dimensional enterprise association matrix is constructed with enterprises as row and column identifiers and transaction number and total transaction amount as elements.

[0064] Specifically, this refined solution further expands two elements of the higher-level steps: "statistical analysis of business relationships" and "construction of the enterprise association matrix." During execution, the first step is to extract the names of both the buyer and the seller from each record of the transaction details data. After all transaction records have been traversed, a set containing dozens or even hundreds of unique enterprise names is obtained; this is the enterprise name set. Next, the relationship between each of these enterprises and the enterprise under test must be determined by checking the position of the enterprise under test in each transaction. If an enterprise sells goods to the enterprise under test, it is a supplier and belongs to the upstream; conversely, if an enterprise purchases goods from the enterprise under test, it is a customer and belongs to the downstream. For example, supplier C mentioned earlier provides raw materials to the enterprise under test and is therefore classified as an upstream enterprise, while customer D purchases products from the enterprise under test and is therefore a downstream enterprise. Classifying all members of the enterprise name set in this way yields the enterprise classification set, which clearly indicates whether each enterprise is the enterprise under test itself or one of its upstream or downstream enterprises. With this classification in place, the transaction details can be processed further. As each transaction record is read, the transaction is tagged according to the enterprise classification set: a transaction such as "the enterprise under test purchases from supplier C" is marked as an upstream purchasing relationship, while "the enterprise under test sells to customer D" is marked as a downstream selling relationship. After all transactions are tagged, the associated transaction mark data is obtained and then sorted by amount from largest to smallest.
The tagged transactions are sorted because larger transactions often carry more concentrated risk and deserve priority. The resulting sorted associated transaction data places the 5 million purchase transaction first, while smaller transactions of a few thousand yuan are ranked lower. The statistical process, however, must cover all transactions without omission. The system counts the number of transactions between each upstream or downstream enterprise and the enterprise under test and sums the total amount. For example, supplier C made 8 transactions totaling 5 million with the enterprise under test in a year, and customer D made 12 purchases totaling 3 million. These figures are recorded in the inter-enterprise transaction statistics. Finally, the statistical results are filled into a matrix whose rows and columns are identified by enterprise: for example, the first row represents the enterprise under test, the second row supplier C, and the third row customer D, and the columns are arranged in the same order. To express the relationship between supplier C and the enterprise under test, the number of transactions (8) and the total transaction amount (5 million) are filled into the first column of the second row. For customer D, the number of transactions (12) and the total transaction amount (3 million) are filled into the first column of the third row. After the transaction information between all enterprises has been filled in according to this rule, a complete two-dimensional enterprise association matrix is formed. A position in the matrix that holds a value indicates that the corresponding two enterprises have business dealings, and the value reflects the frequency and scale of those dealings. This matrix provides the quantitative basis for the subsequent construction of the enterprise transaction network graph.
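The classification, counting, and matrix-filling steps above can be sketched as follows. This is only an illustrative sketch: the function name `build_association_matrix`, the per-transaction amounts (chosen so that the totals match the 5 million and 3 million of the example), and the use of units of 10,000 yuan are assumptions made for the example.

```python
from collections import defaultdict

TARGET = "Enterprise under test"

# (seller, buyer, amount in units of 10,000 yuan); individual amounts are
# illustrative and chosen so that the totals are 500 and 300.
transactions = (
    [("Supplier C", TARGET, 62.5)] * 8
    + [(TARGET, "Customer D", 25.0)] * 12
)


def build_association_matrix(transactions, target):
    """Classify counterparties and fill a (count, total) association matrix."""
    stats = defaultdict(lambda: [0, 0.0])  # counterparty -> [count, total]
    roles = {}
    for seller, buyer, amount in transactions:
        if buyer == target:
            partner, role = seller, "upstream"      # sells to the target
        elif seller == target:
            partner, role = buyer, "downstream"     # buys from the target
        else:
            continue  # transactions not involving the target are skipped here
        roles[partner] = role
        stats[partner][0] += 1
        stats[partner][1] += amount

    # Row/column 0 is the enterprise under test; the others follow in the
    # order in which they were first encountered.
    names = [target] + list(stats)
    index = {name: i for i, name in enumerate(names)}
    matrix = [[(0, 0.0)] * len(names) for _ in names]
    for partner, (count, total) in stats.items():
        matrix[index[partner]][0] = (count, total)
    return names, roles, matrix


names, roles, matrix = build_association_matrix(transactions, TARGET)
```

Following the example in the text, each counterparty's statistics are written into the first column of its own row; a fuller version would also fill the positions for transactions between two counterparties.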

[0065] In a specific embodiment, constructing an enterprise transaction network graph based on the enterprise association matrix includes:

[0066] The company names in the company association matrix are extracted to obtain a list of company names, and each company in the list of company names is uniquely identified and encoded to obtain a set of company codes;

[0067] Based on the enterprise code set, the number of transactions and the total transaction amount in the enterprise association matrix are mapped and transformed, and the number of transactions and the total transaction amount are converted into weighted connection relationships to obtain enterprise connection weight data. The enterprise connection weight data is then sorted according to the weight to obtain sorted enterprise connection data.

[0068] Based on the enterprise code set and sorted enterprise connection data, an enterprise transaction network graph is constructed with enterprises as nodes and weighted connection relationships as edges.

[0069] Specifically, this detailed step is the technical implementation of the "constructing an enterprise transaction network graph" element of the higher-level solution. The operation begins by collecting all the enterprise names that appear in the row and column headings of the enterprise association matrix. The first row and first column of the matrix represent the enterprise under test, the second row corresponds to supplier C, the third row to customer D, and so on, potentially including trading company E and other entities involved in the transactions. Arranging these names in order yields the enterprise name list. However, enterprise names are often long and may contain special characters, which makes subsequent calculation and processing inconvenient, so each enterprise is assigned a short and unique identifier. Common practices include numerical or alphanumeric codes; for example, the enterprise under test is assigned N001, supplier C is N002, customer D is N003, and trading company E is N004. The resulting enterprise code set contains both the original enterprise names and the newly assigned codes, which facilitates rapid indexing during graph construction. Next, the transaction counts and total transaction amounts in the matrix must be transformed into connections that the graph structure can represent. Specifically, each non-zero element of the matrix is interpreted as an edge between two enterprises, with the edge weight determined by both the transaction count and the amount. For example, if the position from supplier C to the enterprise under test contains 8 transactions totaling 5 million yuan, a weight formula can be designed that normalizes the count and the amount and adds them together: the weight equals the number of transactions multiplied by 0.3, plus the transaction amount divided by 1 million and then multiplied by 0.7.
Therefore, the weight of the edge from supplier C to the enterprise under test is 8 × 0.3 + 500 / 100 × 0.7 = 5.9, where the amount is written as 500 in units of 10,000 yuan so that dividing by 100 expresses it in millions. Similarly, the weight for the 12 transactions totaling 3 million between the enterprise under test and customer D is 12 × 0.3 + 300 / 100 × 0.7 = 5.7. This transformation is repeated for every position in the matrix that contains transaction records to obtain the enterprise connection weight data, which records the connection between each pair of enterprises and its weight. The connections are then sorted from highest to lowest weight, so that connections involving close financial dealings come first; this yields the sorted enterprise connection data and allows important relationships to be handled first in later steps. During the graph construction phase, nodes are created from the enterprise code set. On the visualization interface, each enterprise is represented by a circle or square. The enterprise under test, N001, is drawn in the center, supplier C (N002) is placed on the left, customer D (N003) on the right, and trading company E (N004) at the top. Nodes are connected by lines to indicate a transaction relationship. An arrow drawn from N002 to N001 represents supplier C supplying goods to the enterprise under test, and the thickness or color of this line is determined by the previously calculated weight of 5.9: the higher the weight, the thicker the line, indicating a closer relationship. Similarly, an arrow with a weight of 5.7 is drawn from N001 to N003. If trading company E has dealings with both the enterprise under test and supplier C, then N004 has multiple edges, forming a more complex network structure. After all the edges recorded in the sorted enterprise connection data have been added, the enterprise transaction network graph is complete.
The graph clearly shows the position of the company under test, which companies are closely connected to it, and whether the overall shape of the transaction network is star-shaped or chain-shaped.
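The weight conversion and graph construction described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `edge_weight`, the nested-dictionary graph representation, and the convention that amounts are given in units of 10,000 yuan (so that dividing by 100 gives millions) are choices made for the example, while the coefficients 0.3 and 0.7 and the codes N001 to N004 are taken from the worked example.

```python
def edge_weight(count, amount_wan, count_coef=0.3, amount_coef=0.7):
    """Weight = count * 0.3 + (amount in millions) * 0.7.

    amount_wan is assumed to be in units of 10,000 yuan, so amount_wan / 100
    is the amount in millions. The result is rounded to suppress
    floating-point noise.
    """
    return round(count * count_coef + amount_wan / 100 * amount_coef, 4)


codes = {
    "Enterprise under test": "N001",
    "Supplier C": "N002",
    "Customer D": "N003",
    "Trading company E": "N004",
}

# (source, target, transaction count, total amount in units of 10,000 yuan)
matrix_entries = [
    ("Supplier C", "Enterprise under test", 8, 500),
    ("Enterprise under test", "Customer D", 12, 300),
]

# Enterprise connection weight data, sorted from highest to lowest weight.
connections = sorted(
    ((codes[s], codes[t], edge_weight(n, a)) for s, t, n, a in matrix_entries),
    key=lambda c: c[2],
    reverse=True,
)

# Directed, weighted graph: graph[source][target] = weight.
graph = {code: {} for code in codes.values()}
for src, dst, weight in connections:
    graph[src][dst] = weight
```

A nested dictionary is used here only because it keeps the sketch free of third-party dependencies; any directed weighted graph structure would serve the same purpose.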

[0070] In a specific embodiment, the step of identifying closed-loop transactions in the enterprise transaction network graph using a path search method to obtain closed-loop transaction paths includes:

[0071] The enterprise transaction network graph is traversed to obtain traversed nodes. Taking each enterprise node among the traversed nodes as a starting point, paths are explored, and the weighted transaction paths formed by the connection relationships from the starting point to other enterprise nodes are recorded to obtain initial transaction paths.

[0072] For each transaction path in the initial transaction path, the endpoint is determined, and directed transaction paths with the same endpoint and starting point are selected to obtain preliminary closed-loop paths. The weights of each closed-loop path in the preliminary closed-loop path are accumulated, and the total weight of all connection relationships on each closed-loop path is calculated to obtain the total weight of the closed-loop path.

[0073] The preliminary closed-loop paths are filtered based on the total weight of each closed-loop path, and closed-loop paths whose total weight is greater than a preset threshold are retained to obtain a set of closed-loop transaction paths.

[0074] Specifically, this refined solution is the concrete implementation of the "closed-loop transaction identification" element of the higher-level steps. During execution, all nodes in the enterprise transaction network graph are first visited one by one. The graph may contain several nodes, such as the enterprise under test (N001), supplier C (N002), customer D (N003), and trading company E (N004). The system creates a list that records all these nodes as traversed nodes and then takes them out one by one as starting points for the search. For example, N001 is selected as the starting point, and the system moves forward along the directed edges leaving N001. If N001 has an edge with a weight of 5.7 pointing to N003, the path "N001→N003" is recorded. The system then continues to explore from N003 to see which edges leave it. If N003 has an edge pointing to trading company E (N004), the path extends to "N001→N003→N004". From N004 it might connect to supplier C (N002), and the path becomes "N001→N003→N004→N002". This exploration continues until a previously visited node is encountered or there are no more outgoing edges. At each step the entire path is saved, including the node codes and edge weights. After all possible routes have been explored, the initial transaction paths are obtained. A path may be short (two or three nodes) or complex (seven or eight intermediate steps). The next step is to check the endpoints of these paths. Taking the previously mentioned path "N001→N003→N004→N002" as an example, it does not qualify as a closed loop because the endpoint N002 differs from the starting point N001. However, if the path is extended further and an edge with a weight of 5.9 is found pointing from N002 back to N001, the complete path becomes "N001→N003→N004→N002→N001". In this case both the endpoint and the starting point are N001, which satisfies the closed-loop condition.
The system selects such paths, whose endpoint coincides with the starting point, and adds them to the preliminary closed-loop path set. Searching from different starting points may yield several closed loops, such as "N002→N004→N001→N002" or "N003→N004→N002→N001→N003". Each discovered loop is recorded, but not all closed loops deserve attention. Loops involving small amounts and infrequent transactions may simply represent normal business dealings, so the importance of each closed loop must be calculated by adding up the weights of the edges on the path. For example, in the closed loop "N001→N003→N004→N002→N001", the weight of the first segment from N001 to N003 is 5.7, the weight of the second segment from N003 to N004 is assumed to be 4.2, the weight of the third segment from N004 to N002 is 3.8, and the weight of the final segment from N002 back to N001 is 5.9. The total weight of these four segments is 5.7 + 4.2 + 3.8 + 5.9 = 19.6. Other closed loops are calculated in the same way, and a threshold, such as 15.0, is then set. Only closed loops whose total weight exceeds this value are retained. For example, the closed loop with a total weight of 19.6 is retained because it exceeds 15.0, whereas another closed loop with a total weight of only 12.3 would be filtered out. After this filtering, only important closed loops with large transaction amounts and frequent transactions remain, forming the set of closed-loop transaction paths for subsequent analysis. This set lists which enterprise nodes each closed loop passes through, which edges it contains, and its total weight, providing a precise target range for identifying suspicious fund return.
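The path search and weight filtering described above can be sketched as a depth-first search over the directed graph. This is a minimal sketch, not necessarily the search method of the present application: the function name `find_closed_loops` is hypothetical, the node N005 and its two low-weight edges are invented to show the filter at work, and, as a design choice for the example, loops found from different starting points are rotated to start at the smallest node code so that the same cycle is stored only once.

```python
def find_closed_loops(graph, threshold):
    """Return {loop_nodes: total_weight} for loops whose weight exceeds threshold."""
    loops = {}

    def explore(start, node, path, weight):
        for nxt, w in graph.get(node, {}).items():
            if nxt == start:
                # Endpoint equals starting point: a closed loop is found.
                # Rotate so that the smallest code comes first.
                i = path.index(min(path))
                canonical = tuple(path[i:] + path[:i])
                loops[canonical] = round(weight + w, 4)
            elif nxt not in path:
                # Only extend the path through nodes not yet visited on it.
                explore(start, nxt, path + [nxt], weight + w)

    for start in graph:
        explore(start, start, [start], 0.0)

    return {loop: total for loop, total in loops.items() if total > threshold}


# Edge weights from the worked example, plus a hypothetical weak loop via N005.
graph = {
    "N001": {"N003": 5.7, "N005": 2.0},
    "N003": {"N004": 4.2},
    "N004": {"N002": 3.8},
    "N002": {"N001": 5.9},
    "N005": {"N001": 1.5},
}

closed_loops = find_closed_loops(graph, threshold=15.0)
```

With these inputs the loop N001→N003→N004→N002→N001 is retained with a total weight of 19.6, while the hypothetical loop through N005 (total 3.5) is filtered out. Exhaustive path enumeration grows quickly with graph size, so a production system would likely bound the path length.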

[0075] In a specific embodiment, traversing all enterprise nodes in the enterprise transaction network graph to obtain the traversed nodes includes:

[0076] Node attributes are extracted from the enterprise nodes in the enterprise transaction network graph to obtain a set of node attributes. The nodes in the set of node attributes are then sorted by the number of edges connected to them to obtain nodes sorted by the number of edges.

[0077] Extract the node with the most edges from the nodes sorted by edge count, and starting from the node with the most edges, sequentially mark each node in the nodes sorted by edge count as visited, record the visited nodes, and number the visited nodes according to the visiting order to obtain the traversed nodes.

[0078] Specifically, this detailed step is a technical breakdown of the "traversing all enterprise nodes" operation of the higher-level solution. In actual execution, the basic information carried by each node in the enterprise transaction network graph is read first. This information includes the enterprise code represented by the node, the enterprise name, and the number of incoming and outgoing edges. The system scans the graph and records the attributes of the enterprise under test, N001. It finds that N001 has two incoming edges, from supplier C (N002) and trading company E (N004), and one outgoing edge pointing to customer D (N003), so N001 has a total of 3 connected edges. Next, it checks node N002 and finds one incoming edge from N004 and two outgoing edges pointing to N001 and another node, for a total of 3 edges. Customer D (N003) has 1 incoming edge from N001 and 1 outgoing edge to N004, for a total of 2 edges. Trading company E (N004) has 1 incoming edge from N003 and 2 outgoing edges, to N001 and N002 respectively, also for a total of 3 edges. Summarizing this attribute information for every node forms the node attribute set, which records each node's code, name, and number of connected edges. The nodes are then rearranged according to the number of edges. N001, N002, and N004 are tied for first place with 3 edges each, while N003 has only 2 edges and is ranked later. In actual operation, however, N001 may be given a slightly higher attribute value because it is both the object under test and located at the center of the graph, or the tie may be broken by other secondary indicators. Assume the final ranking is N001 first, N004 second, N002 third, and N003 fourth. This arrangement gives the nodes sorted by edge count.
The subsequent traversal does not start arbitrarily; instead, it prioritizes the most densely connected nodes, as they are often key hubs in the transaction network. The traversal therefore starts with the first node in the edge-count-sorted list, N001. The system marks N001 as visited and assigns it traversal sequence number 1. It then moves to N004, the second node in the ranking, marks it as visited, and assigns it number 2. Next, the third node, N002, is marked and assigned number 3, and finally N003 is marked and assigned number 4. In this way all nodes are traversed in descending order of edge count. Each visit is recorded in the visited node list, which includes the node code, visit timestamp, and traversal sequence number. Once all nodes in the graph have been visited, the visited node list, arranged in order of visit, becomes the traversed node list. This list includes all enterprise nodes in the graph and retains the order of visit. Subsequent path searches use N001, N004, N002, and N003 in turn as starting points to explore closed-loop paths. This strategy of processing complex nodes first allows the algorithm to identify suspicious transaction loops involving multiple enterprises more quickly, because nodes with more edges tend to participate in more transaction chains and are more likely to form closed loops. Numbering the nodes in traversal order also makes it easier to track which nodes have already been searched and to avoid duplicate calculations. The entire process ensures that every enterprise in the graph is checked and that enterprises are processed in order of importance.
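The edge counting and ordering described above can be sketched as follows. This is an illustrative sketch only: the edge list is reconstructed from the example, N005 is a hypothetical code standing in for the unnamed "another node" that N002 points to, and the `secondary_rank` table is an assumed stand-in for the unspecified secondary indicators used to break the three-way tie.

```python
from collections import Counter

# Directed edges (source, target) reconstructed from the worked example.
edges = [
    ("N002", "N001"),
    ("N004", "N001"),
    ("N001", "N003"),
    ("N004", "N002"),
    ("N002", "N005"),  # N005: hypothetical code for the unnamed other node
    ("N003", "N004"),
]

# Number of connected edges per node = incoming + outgoing.
degree = Counter()
for src, dst in edges:
    degree[src] += 1
    degree[dst] += 1

# Assumed tie-break order among nodes with equal edge counts.
secondary_rank = {"N001": 0, "N004": 1, "N002": 2}

# Sort by edge count (descending), then by the assumed secondary rank.
ordered = sorted(degree, key=lambda n: (-degree[n], secondary_rank.get(n, 99)))

# Traversed nodes with their traversal sequence numbers, starting at 1.
traversed = list(enumerate(ordered, start=1))
```

The first four entries reproduce the order N001, N004, N002, N003 assumed in the text; the hypothetical N005, with a single edge, comes last.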

[0079] In a specific embodiment, based on the closed-loop transaction path, fund return feature extraction and abnormal transaction pattern identification are performed to obtain a set of suspicious closed-loop transactions, including:

[0080] Analyze the transaction time interval for each path in the closed-loop transaction path, calculate the time difference between each transaction and the previous transaction to obtain a transaction time interval sequence, and perform statistical analysis on the transaction time interval sequence to filter out transaction pairs with time intervals less than a preset time threshold to obtain a set of short-time transaction pairs.

[0081] Based on the short-term trading pair set, the flow of transaction amount in the closed-loop trading path is tracked, the source and destination of each transaction amount are recorded, and the fund flow tracking data is obtained. The fund flow tracking data is summarized and analyzed to statistically analyze the fund inflow and outflow of each enterprise in a short period of time, and to obtain the enterprise fund flow summary data.

[0082] Based on the aggregated corporate cash flow data, the transaction patterns in the closed-loop transaction path are identified, and transaction paths with characteristics of cash return and abnormal transaction amounts are selected to obtain a set of suspicious closed-loop transactions.

[0083] Specifically, this detailed step is an in-depth expansion of the two elements in the higher-level solution: "extraction of fund return characteristics" and "identification of abnormal transaction patterns." During implementation, it first involves analyzing each of the previously identified closed-loop transaction paths. For example, the closed loop "N001→N003→N004→N002→N001" contains four transactions: the first is N001 paying N003 3 million on March 15th; the second is N003 transferring 2.8 million to N004 on March 18th; the third is N004 remitting 2.7 million to N002 on March 19th; and the fourth is N002 returning 5 million to N001 on March 21st. The system will arrange these transactions chronologically and then calculate the interval between adjacent transactions. The time intervals are as follows: the first transaction to the second is March 18th minus March 15th, which is 3 days; the second transaction to the third is March 19th minus March 18th, which is only 1 day; and the third transaction to the fourth is March 21st minus March 19th, which is 2 days. These time differences are recorded to form a transaction interval sequence, i.e., "3 days, 1 day, 2 days". Then, a time threshold is set, such as 5 days. All adjacent transaction pairs in the sequence with intervals less than 5 days are selected. For example, the first and second transactions with an interval of 3 days meet the condition and are paired; the second and third transactions with an interval of 1 day also meet the condition; and the third and fourth transactions with an interval of 2 days also meet the condition. These three pairs of transactions are categorized into a short-time transaction pair set. This set reflects which fund flows occur particularly quickly and may be suspected of being manipulated. Then, we need to track how these funds flow within a short period of time. 
Taking the closed loop mentioned earlier as an example, the first transaction of 3 million flows out of N001 and into N003. The system records "N001→N003: 3 million outflow," while for N003 it records "N003: 3 million inflow, source N001." The second transaction of 2.8 million flows out of N003 and into N004, so "N003→N004: 2.8 million outflow" and "N004: 2.8 million inflow, source N003" are recorded. This process continues, marking the source and destination of each transaction. All these records are compiled into the fund flow tracking data, which details which enterprise account each transaction in the short-time transaction pair set leaves and which it enters. Based on this flow information, the amount each enterprise received and paid out within those few days can be calculated. For example, in this closed loop, N003 received 3 million on March 15th and paid out 2.8 million on March 18th, resulting in a net inflow of 200,000. N004 received 2.8 million on March 18th and paid out 2.7 million the next day, resulting in a net inflow of 100,000. N002 received 2.7 million on March 19th and paid out 5 million two days later, resulting in a net outflow of 2.3 million, although it may have received funds in other transactions, so a comprehensive calculation is needed. After the fund inflows and outflows of all enterprises involved in short-time transactions have been calculated, the enterprise fund flow summary data is obtained. This summary shows that at some enterprises large amounts of funds flow in and out rapidly within a short period, which is clearly inconsistent with a normal operating rhythm. The final step is to determine, based on the summary data, which closed loops are indeed problematic. Two aspects are examined. The first is the fund return characteristic: the money travels around the loop and returns to the starting point with little change in amount.
For example, N001 initially paid 3 million and finally received 5 million, seemingly making a profit of 2 million, but N002 already owed it money. Once normal business transactions are deducted, it turns out that the money simply went around the loop and came back, which is a typical example of fund return. The second is abnormal transaction amounts, for example a transaction amount far exceeding the enterprise's normal level or a price deviating significantly from the market price. For instance, N003 buys a certain commodity from N004 at three times the market price, or the goods sold by N004 to N002 are completely outside their business scope. Such transactions are considered abnormal. The system filters out closed-loop paths that exhibit both the fund return characteristic and abnormal transaction amounts. For example, the path "N001→N003→N004→N002→N001" was found to have funds circulating rapidly within 6 days, and the transaction price from N003 to N004 was inflated. Because it meets these suspicious characteristics, it is included in the suspicious closed-loop transaction set. Each path in this set is accompanied by detailed evidence, including time intervals, fund flows, and abnormal transaction details, providing a sufficient basis for subsequent risk assessment.
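The interval calculation, short-time pair selection, and net fund flow summary described above can be sketched as follows. This is a minimal sketch: the tuple layout and variable names are hypothetical, amounts are assumed to be in units of 10,000 yuan, and the year 2025 is assumed because the example gives only months and days.

```python
from collections import defaultdict
from datetime import date

# (payer, payee, amount in units of 10,000 yuan, date); year assumed.
loop_transactions = [
    ("N001", "N003", 300, date(2025, 3, 15)),
    ("N003", "N004", 280, date(2025, 3, 18)),
    ("N004", "N002", 270, date(2025, 3, 19)),
    ("N002", "N001", 500, date(2025, 3, 21)),
]
THRESHOLD_DAYS = 5  # preset time threshold from the example

# Arrange chronologically and compute intervals between adjacent transactions.
ordered = sorted(loop_transactions, key=lambda t: t[3])
adjacent = list(zip(ordered, ordered[1:]))
intervals = [(later[3] - earlier[3]).days for earlier, later in adjacent]

# Short-time transaction pairs: adjacent pairs closer than the threshold.
short_pairs = [
    (earlier, later)
    for earlier, later in adjacent
    if (later[3] - earlier[3]).days < THRESHOLD_DAYS
]

# Collect each transaction that appears in a short-time pair exactly once.
short_transactions = []
for pair in short_pairs:
    for txn in pair:
        if txn not in short_transactions:
            short_transactions.append(txn)

# Fund flow summary: net inflow (positive) or outflow (negative) per enterprise.
net_flow = defaultdict(float)
for payer, payee, amount, _ in short_transactions:
    net_flow[payer] -= amount
    net_flow[payee] += amount
```

The intervals come out as 3, 1, and 2 days, so all three adjacent pairs fall under the 5-day threshold. The fund return and abnormal-amount judgments that follow depend on business rules and market price data not given in the example, so they are not sketched here.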

[0084] In a specific embodiment, a risk quantification assessment is performed on all related party enterprises in the closed-loop transaction path based on the suspicious closed-loop transaction set to obtain an enterprise risk scoring table, including:

[0085] For each closed-loop transaction path in the suspicious closed-loop transaction set, enterprise nodes are traversed to extract the node position and transaction direction of each related enterprise in the closed-loop path, thereby obtaining enterprise node position data. Based on the enterprise node position data, the transaction participation of each related enterprise is statistically analyzed, including the number of suspicious closed loops in which each enterprise participates, the total transaction amount involved, and the ratio of the number of times it acts as a fund inflow party to an outflow party, thus obtaining an enterprise participation statistics table.

[0086] Based on the enterprise participation statistics table, risk factors are weighted for each related enterprise. The number of participating closed loops, the total transaction amount involved, and the ratio of the number of times funds flow in and out are multiplied by the corresponding preset weight coefficients and then summed to obtain the initial risk score of the enterprise. The initial risk score of the enterprise is then normalized according to the average transaction size of the industry to which the enterprise belongs. The initial risk score is divided by the industry average transaction size coefficient to obtain the enterprise risk scoring table.

[0087] Specifically, this detailed step is the technical implementation of the "risk quantification assessment" element of the higher-level solution. During execution, each path in the suspicious closed-loop transaction set is broken down to examine which enterprises are involved. Taking the closed loop "N001→N003→N004→N002→N001" as an example, the system traverses the path from beginning to end and finds that N001 is both the starting point and the endpoint; N003 is the second node on the path, with funds flowing to it from N001; N004 is the third node, receiving funds from N003 and then transferring them to the next party; and N002 is the fourth node, the final link before the funds return. The position of each enterprise in the closed loop and its direction of receiving or paying money are recorded in detail. For example, on this path N001 acts as both an outflow party (paying out 3 million) and an inflow party (receiving 5 million), while N003 is an inflow party receiving 3 million and at the same time an outflow party paying out 2.8 million. Summarizing this position information and transaction direction gives the enterprise node position data. If the suspicious closed-loop set contains two other paths, "N002→N004→N001→N002" and "N003→N004→N002→N001→N003", the system continues to traverse these paths and marks the position of each enterprise. Next, the participation of each enterprise is calculated. N001 appears in all three suspicious closed loops, so its participation count is 3. In the first closed loop it paid out 3 million and received 5 million, a net change of 2 million. Assuming an expenditure of 4 million in the second path and an income of 3.5 million in the third, the total transaction amount involving N001 across the three paths is 15.5 million.
Next, N001 is counted as appearing twice as an inflow party, receiving 5 million and 3.5 million respectively, and twice as an outflow party, paying out 3 million and 4 million respectively. The inflow-outflow ratio is 2:2, which equals 1.0. Similarly, N003 participates in two closed loops involving 9.3 million, with an inflow-outflow ratio of 0.67; N004 participates in three closed loops involving 11.8 million, with a ratio of 1.5; and N002 participates in two closed loops involving 10.7 million, with a ratio of 0.8. The data for all enterprises are compiled into the enterprise participation statistics table. Each row represents one enterprise and includes three key indicators: the number of closed loops participated in, the total transaction amount involved, and the inflow-outflow ratio. With this basic data, the risk score can be calculated. Assume the weight coefficient for the number of closed loops participated in is preset to 0.4, the weight for the total transaction amount involved to 0.35, and the weight for the inflow-outflow ratio to 0.25, with amounts written in units of 10,000 yuan and divided by 100. The initial risk score for N001 is then 3 × 0.4 + 1550 / 100 × 0.35 + 1.0 × 0.25 = 1.2 + 5.425 + 0.25 = 6.875. The score for N003 is 2 × 0.4 + 930 / 100 × 0.35 + 0.67 × 0.25, approximately 4.222. The score for N004 is 3 × 0.4 + 1180 / 100 × 0.35 + 1.5 × 0.25 = 5.705. The score for N002 is 2 × 0.4 + 1070 / 100 × 0.35 + 0.8 × 0.25 = 4.745. However, transaction scale varies greatly among enterprises in different industries. Manufacturing enterprises may have an average annual transaction volume of hundreds of millions, while small trading companies may have only a few million. Directly comparing absolute scores is unfair, so normalization is necessary.
A query of the industry database reveals that the average transaction scale coefficient for the manufacturing industry, to which N001 belongs, is 1.2. Therefore, N001's score of 6.875 is divided by 1.2 to obtain a normalized score of 5.729. Assuming the coefficient for the trading industry, to which N003 and N004 belong, is 0.8, their scores become 4.222 / 0.8 = 5.278 and 5.705 / 0.8 = 7.131 respectively. N002, belonging to the logistics industry, has a coefficient of 1.0, so its score of 4.745 remains unchanged. After normalization, the risk scores of the enterprises are comparable under the same standard. The final enterprise risk score table is arranged from highest to lowest score: N004 ranks first with 7.131, N001 second with 5.729, N003 third with 5.278, and N002 last with 4.745. This table clearly shows which enterprises have a higher risk level in suspicious closed-loop transactions, providing a quantitative basis for subsequent early warning decisions.
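The weighted scoring and industry normalization described above can be sketched as follows. This is a minimal sketch under stated assumptions: the dictionary layout and the names `initial_score`, `participation`, and `industry_coef` are hypothetical, amounts are assumed to be in units of 10,000 yuan and divided by 100 as in the worked arithmetic, and the weights and industry coefficients are the example values rather than recommended settings.

```python
# Preset weight coefficients from the example.
WEIGHTS = {"loops": 0.4, "amount": 0.35, "ratio": 0.25}

# Enterprise participation statistics:
# code -> (closed loops participated in,
#          total amount in units of 10,000 yuan,
#          inflow-outflow ratio)
participation = {
    "N001": (3, 1550, 1.0),
    "N003": (2, 930, 0.67),
    "N004": (3, 1180, 1.5),
    "N002": (2, 1070, 0.8),
}

# Industry average transaction scale coefficients from the example
# (manufacturing 1.2, trading 0.8, logistics 1.0).
industry_coef = {"N001": 1.2, "N003": 0.8, "N004": 0.8, "N002": 1.0}


def initial_score(loops, amount_wan, ratio):
    """Weighted sum of the three participation indicators."""
    return (
        loops * WEIGHTS["loops"]
        + amount_wan / 100 * WEIGHTS["amount"]
        + ratio * WEIGHTS["ratio"]
    )


initial = {code: initial_score(*row) for code, row in participation.items()}

# Normalize by dividing by the industry coefficient.
normalized = {code: initial[code] / industry_coef[code] for code in initial}

# Enterprise risk score table, highest score first.
ranking = sorted(normalized, key=normalized.get, reverse=True)
```

With these inputs the ranking is N004, N001, N003, N002. Comparing each normalized score against a preset threshold, whose value the example does not specify, would then decide whether an enterprise is marked as high risk.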

[0088] The above describes an intelligent tax risk assessment and early warning method according to an embodiment of the present invention. The following describes an intelligent tax risk assessment and early warning system according to an embodiment of the present invention. Referring to Figure 2, one embodiment of the intelligent tax risk assessment and early warning system of the present invention includes:

[0089] The extraction module 21 is used to extract transaction details from the tax declaration data of the enterprise to be tested to obtain transaction detail data;

[0090] The analysis module 22 is used to perform statistical analysis, based on the transaction detail data, on the relationships between the enterprise to be tested and its upstream and downstream enterprises to obtain an enterprise association matrix, and to construct an enterprise transaction network graph based on the enterprise association matrix;

[0091] The identification module 23 is used to identify closed-loop transactions in the enterprise transaction network graph through a path search method to obtain closed-loop transaction paths, and to extract fund return features and identify abnormal transaction patterns based on the closed-loop transaction paths to obtain a set of suspicious closed-loop transactions;

[0092] The assessment module 24 is used to conduct a risk quantification assessment of all related enterprises in the closed-loop transaction paths based on the set of suspicious closed-loop transactions to obtain an enterprise risk score table, and, when the risk score exceeds a preset threshold, to mark the enterprise to be tested as a high-risk enterprise and generate a risk warning report.

[0093] In this embodiment, the specific implementation of each module in the above system embodiment is described in the above method embodiment, and will not be repeated here.

Claims

1. A method for intelligent tax risk assessment and early warning, characterized in that it includes the following steps: extracting transaction details from the tax declaration data of the enterprise to be tested to obtain transaction detail data; performing, based on the transaction detail data, statistical analysis on the relationships between the enterprise to be tested and its upstream and downstream enterprises to obtain an enterprise association matrix, and constructing an enterprise transaction network graph based on the enterprise association matrix; identifying closed-loop transactions in the enterprise transaction network graph by a path search method to obtain closed-loop transaction paths, and, based on the closed-loop transaction paths, extracting fund return features and identifying abnormal transaction patterns to obtain a set of suspicious closed-loop transactions; and performing, based on the set of suspicious closed-loop transactions, a risk quantification assessment on all related enterprises in the closed-loop transaction paths to obtain an enterprise risk score table, and, when the risk score exceeds a preset threshold, marking the enterprise to be tested as a high-risk enterprise and generating a risk warning report.

2. The intelligent tax risk assessment and early warning method according to claim 1, characterized in that extracting transaction details from the tax declaration data of the enterprise to be tested to obtain transaction detail data includes: unifying the format of the multi-source heterogeneous tax declaration data of the enterprise to be tested to obtain unified-format declaration data, and identifying and locating the transaction record fields in the unified-format declaration data to obtain transaction record location data; extracting, based on the transaction record location data, the information of both transaction parties, the transaction amount, and the transaction time from the unified-format declaration data one by one to obtain preliminary transaction detail data; and deduplicating the transaction records in the preliminary transaction detail data to obtain the final transaction detail data.
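By way of illustration only, the format unification and deduplication steps recited above might be sketched as follows. The field aliases in `FIELD_ALIASES`, the dictionary-based record layout, and the rule that two records are duplicates when parties, amount, and date all match are assumptions made for this sketch; real declaration schemas and duplicate criteria will differ.

```python
from datetime import date

# Hypothetical aliases under which different source formats might store each field.
FIELD_ALIASES = {
    "seller": ("seller", "seller_name", "supplier"),
    "buyer": ("buyer", "buyer_name", "customer"),
    "amount": ("amount", "total", "sum"),
    "date": ("date", "trade_date"),
}


def unify(record):
    """Map one raw record (a dict) onto the unified field names."""
    unified = {}
    for field, aliases in FIELD_ALIASES.items():
        for alias in aliases:
            if alias in record:
                unified[field] = record[alias]
                break
        else:
            # No alias matched: the transaction field could not be located.
            raise KeyError(f"no source field found for {field!r}")
    unified["amount"] = float(unified["amount"])
    unified["date"] = date.fromisoformat(str(unified["date"]))
    return unified


def extract_transactions(raw_records):
    """Unify every record, then drop duplicates while keeping first-seen order."""
    seen = set()
    result = []
    for raw in raw_records:
        tx = unify(raw)
        key = (tx["seller"], tx["buyer"], tx["amount"], tx["date"])
        if key not in seen:
            seen.add(key)
            result.append(tx)
    return result
```

Normalizing amounts and dates before building the duplicate key means the same transaction reported in two source formats collapses to a single record.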

3. The intelligent tax risk assessment and early warning method according to claim 1, characterized in that performing statistical analysis on the relationships between the enterprise to be tested and its upstream and downstream enterprises based on the transaction detail data to obtain an enterprise association matrix includes: extracting the names of the enterprises on both sides of each transaction in the transaction detail data to obtain a set of enterprise names, and classifying the enterprises in the set into the enterprise to be tested and its upstream and downstream enterprises to obtain an enterprise classification set; marking, based on the enterprise classification set, each transaction in the transaction detail data with the upstream and downstream enterprise relationship it involves to obtain associated transaction mark data, and sorting the associated transaction mark data by transaction amount to obtain sorted associated transaction data; counting, based on the sorted associated transaction data, the number of transactions and the total transaction amount between each upstream or downstream enterprise and the enterprise to be tested to obtain inter-enterprise transaction statistics; and constructing, based on the inter-enterprise transaction statistics, a two-dimensional enterprise association matrix with enterprises as row and column identifiers and the number of transactions and total transaction amount as elements.
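As a minimal sketch of the counting and matrix construction recited above, the following assumes each transaction is a `(source, target, amount)` tuple and stores a `(count, total)` pair in each matrix cell. The tuple layout and the function name `association_matrix` are assumptions for illustration; the upstream/downstream classification and the sorting by amount are omitted because they do not change the resulting counts.

```python
def association_matrix(transactions):
    """Build a square matrix of (transaction count, total amount) per ordered pair.

    transactions: iterable of (source, target, amount) tuples.
    Returns (names, matrix), where matrix[i][j] describes the transactions
    from names[i] to names[j].
    """
    transactions = list(transactions)
    # Row and column identifiers: every enterprise seen on either side.
    names = sorted({party for src, dst, _ in transactions for party in (src, dst)})
    index = {name: i for i, name in enumerate(names)}
    matrix = [[(0, 0.0) for _ in names] for _ in names]
    for src, dst, amount in transactions:
        count, total = matrix[index[src]][index[dst]]
        matrix[index[src]][index[dst]] = (count + 1, total + amount)
    return names, matrix
```

Keeping the matrix directional (row to column) preserves the flow direction needed later for closed-loop detection.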

4. The intelligent tax risk assessment and early warning method according to claim 1, characterized in that constructing an enterprise transaction network graph based on the enterprise association matrix includes: extracting the enterprise names in the enterprise association matrix to obtain a list of enterprise names, and assigning each enterprise in the list a unique identification code to obtain an enterprise code set; mapping, based on the enterprise code set, the number of transactions and the total transaction amount in the enterprise association matrix into weighted connection relationships to obtain enterprise connection weight data, and sorting the enterprise connection weight data by weight to obtain sorted enterprise connection data; and constructing, based on the enterprise code set and the sorted enterprise connection data, an enterprise transaction network graph with enterprises as nodes and weighted connection relationships as edges.
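The encoding and graph construction recited above can be sketched as follows. How the number of transactions and the total amount are combined into a single weight is not fixed here, so the sketch takes a `weight` function and, as an assumption, defaults to the total amount. The `N001`-style codes follow the naming used in the worked example; the adjacency-dictionary representation and the function name `build_graph` are illustrative choices.

```python
def build_graph(names, matrix, weight=lambda count, total: total):
    """Turn an association matrix into a weighted directed graph.

    names:  list of enterprise names (row/column identifiers).
    matrix: matrix[i][j] == (transaction count, total amount) from names[i] to names[j].
    weight: maps (count, total) to an edge weight; defaulting to the total
            amount is an assumption of this sketch.
    Returns (codes, graph), where graph[src][dst] is the edge weight.
    """
    # Assign each enterprise a unique code such as N001, N002, ...
    codes = {name: f"N{i + 1:03d}" for i, name in enumerate(names)}
    edges = []
    for i, src in enumerate(names):
        for j, dst in enumerate(names):
            count, total = matrix[i][j]
            if count > 0:
                edges.append((codes[src], codes[dst], weight(count, total)))
    # Heaviest connections first, mirroring the sorting step.
    edges.sort(key=lambda e: e[2], reverse=True)
    graph = {code: {} for code in codes.values()}
    for src, dst, w in edges:
        graph[src][dst] = w
    return codes, graph
```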

5. The intelligent tax risk assessment and early warning method according to claim 1, characterized in that identifying closed-loop transactions in the enterprise transaction network graph by a path search method to obtain closed-loop transaction paths includes: traversing all enterprise nodes in the enterprise transaction network graph to obtain traversed nodes; taking each enterprise node among the traversed nodes as a starting point, exploring paths from it and recording the transaction paths formed by its weighted connection relationships with other enterprise nodes to obtain initial transaction paths; determining the endpoint of each of the initial transaction paths, and selecting the directed transaction paths whose endpoint is the same as their starting point to obtain preliminary closed-loop paths; accumulating the weights of all connection relationships on each preliminary closed-loop path to obtain the total weight of that closed-loop path; and filtering the preliminary closed-loop paths by total weight, retaining those whose total weight is greater than a preset threshold to obtain a set of closed-loop transaction paths.
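One possible realization of this path search is a depth-first enumeration of simple directed cycles, sketched below. The adjacency-dictionary graph, the `max_length` bound, and the rule of reporting each cycle only from its smallest node code (to avoid listing rotations of the same loop) are assumptions of this sketch rather than features recited above.

```python
def closed_loops(graph, min_total_weight, max_length=6):
    """Enumerate simple directed cycles whose summed edge weight exceeds a threshold.

    graph: dict mapping node -> {neighbor: weight}.
    Returns a list of (path, total_weight); each path starts and ends at the same node.
    """
    loops = []

    def explore(start, node, path, total):
        for nxt, w in graph.get(node, {}).items():
            if nxt == start and len(path) >= 2:
                # The path has returned to its starting point: a closed loop.
                if total + w > min_total_weight:
                    loops.append((path + [start], total + w))
            elif nxt > start and nxt not in path and len(path) < max_length:
                # Only visit nodes with a larger code than the start, so each
                # cycle is reported once, from its smallest node.
                explore(start, nxt, path + [nxt], total + w)

    for start in sorted(graph):
        explore(start, start, [start], 0.0)
    return loops
```

For example, with edges N001→N002 (500), N002→N003 (450), and N003→N001 (480), a threshold of 1000 retains the single loop N001→N002→N003→N001 with total weight 1430.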

6. The intelligent tax risk assessment and early warning method according to claim 5, characterized in that traversing all enterprise nodes in the enterprise transaction network graph to obtain the traversed nodes includes: extracting node attributes from the enterprise nodes in the enterprise transaction network graph to obtain a node attribute set, and sorting the nodes in the node attribute set by the number of edges connected to them to obtain edge-count-sorted nodes; and extracting the node with the most edges from the edge-count-sorted nodes, then, starting from that node, marking each of the edge-count-sorted nodes as visited in sequence, recording the visited nodes, and numbering them in visiting order to obtain the traversed nodes.
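The edge-count ordering recited above might look like the following sketch. It counts both incoming and outgoing edges for each node and breaks ties by node code; both choices, like the function name `traversal_order`, are assumptions made for illustration.

```python
def traversal_order(graph):
    """Number nodes in visiting order, starting from the node with the most edges.

    graph: dict mapping node -> {neighbor: weight}.
    Returns a dict mapping node -> visiting position (1-based).
    """
    degree = {node: 0 for node in graph}
    for src, targets in graph.items():
        for dst in targets:
            degree[src] += 1                      # outgoing edge
            degree[dst] = degree.get(dst, 0) + 1  # incoming edge
    # Most-connected node first; ties broken by node code (an assumption).
    ordered = sorted(degree, key=lambda n: (-degree[n], n))
    return {node: position for position, node in enumerate(ordered, start=1)}
```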

7. The intelligent tax risk assessment and early warning method according to claim 1, characterized in that extracting fund return features and identifying abnormal transaction patterns based on the closed-loop transaction paths to obtain a set of suspicious closed-loop transactions includes: analyzing the transaction time intervals of each of the closed-loop transaction paths by calculating the time difference between each transaction and the previous transaction to obtain a transaction time interval sequence, and statistically analyzing the transaction time interval sequence to select transaction pairs whose time interval is less than a preset time threshold, obtaining a set of short-interval transaction pairs; tracking, based on the set of short-interval transaction pairs, the flow of transaction amounts in the closed-loop transaction paths and recording the source and destination of each transaction amount to obtain fund flow tracking data; summarizing the fund flow tracking data to count the fund inflow and outflow of each enterprise within a short period, obtaining enterprise fund flow summary data; and identifying, based on the enterprise fund flow summary data, the transaction patterns in the closed-loop transaction paths, and selecting the transaction paths that exhibit fund return features and abnormal transaction amounts to obtain the set of suspicious closed-loop transactions.
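The interval filtering and fund flow summary recited above can be sketched as follows, assuming each transaction on a closed loop is a `(payer, payee, amount, date)` tuple. The tuple layout and the function names are hypothetical, and the final pattern identification step is left out because its criteria are not specified here.

```python
from datetime import date, timedelta


def short_interval_pairs(loop_transactions, max_gap):
    """Return consecutive transaction pairs whose time gap is below max_gap.

    loop_transactions: list of (payer, payee, amount, date) along one closed loop.
    max_gap: a timedelta acting as the preset time threshold.
    """
    ordered = sorted(loop_transactions, key=lambda tx: tx[3])
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b[3] - a[3] < max_gap]


def fund_flow_summary(pairs):
    """Sum inflow and outflow per enterprise over the transactions in the pairs."""
    seen = set()
    summary = {}
    for pair in pairs:
        for tx in pair:
            if tx in seen:
                continue  # a transaction shared by two pairs is counted once
            seen.add(tx)
            payer, payee, amount, _ = tx
            summary.setdefault(payer, {"in": 0.0, "out": 0.0})["out"] += amount
            summary.setdefault(payee, {"in": 0.0, "out": 0.0})["in"] += amount
    return summary
```

An enterprise whose short-period inflow and outflow are both large and close in value is a natural candidate for the fund return check.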

8. The intelligent tax risk assessment and early warning method according to claim 1, characterized in that performing a risk quantification assessment on all related enterprises in the closed-loop transaction paths based on the set of suspicious closed-loop transactions to obtain an enterprise risk score table includes: traversing the enterprise nodes of each closed-loop transaction path in the set of suspicious closed-loop transactions to extract the node position and transaction direction of each related enterprise in the closed-loop path, obtaining enterprise node position data; counting, based on the enterprise node position data, the transaction participation of each related enterprise, including the number of suspicious closed loops in which the enterprise participates, the total transaction amount involved, and the ratio of the number of times it acts as a fund inflow party to the number of times it acts as a fund outflow party, obtaining an enterprise participation statistics table; weighting, based on the enterprise participation statistics table, the risk factors of each related enterprise by multiplying the number of closed loops participated in, the total transaction amount involved, and the inflow-outflow ratio by their corresponding preset weight coefficients and summing the products to obtain the initial risk score of the enterprise; and normalizing the initial risk score according to the average transaction scale of the industry to which the enterprise belongs by dividing the initial risk score by the industry average transaction scale coefficient, obtaining the enterprise risk score table.

9. An intelligent tax risk assessment and early warning system, characterized in that it is used to execute the intelligent tax risk assessment and early warning method of any one of claims 1 to 8 and includes: an extraction module, used to extract transaction details from the tax declaration data of the enterprise to be tested to obtain transaction detail data; an analysis module, used to perform statistical analysis, based on the transaction detail data, on the relationships between the enterprise to be tested and its upstream and downstream enterprises to obtain an enterprise association matrix, and to construct an enterprise transaction network graph based on the enterprise association matrix; an identification module, used to identify closed-loop transactions in the enterprise transaction network graph through a path search method to obtain closed-loop transaction paths, and to extract fund return features and identify abnormal transaction patterns based on the closed-loop transaction paths to obtain a set of suspicious closed-loop transactions; and an assessment module, used to conduct a risk quantification assessment of all related enterprises in the closed-loop transaction paths based on the set of suspicious closed-loop transactions to obtain an enterprise risk score table, and, when the risk score exceeds a preset threshold, to mark the enterprise to be tested as a high-risk enterprise and generate a risk warning report.