Method and device for denoising error transaction abnormal fluctuation data of a business platform
By combining a large language model with historical transaction data to identify misjudged abnormal fluctuation data, the problem of misjudgment in automatic detection is solved, and flexible and efficient noise reduction effect is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- ALIPAY COM CO LTD
- Filing Date
- 2026-03-17
- Publication Date
- 2026-06-30
AI Technical Summary
Existing technologies are insufficient to effectively filter out normal data that is mistakenly identified as abnormal fluctuations during automatic detection, leading to a waste of resources.
By combining a large language model with historical long-cycle transaction data, the system determines, based on the anomaly judgment task description information, whether abnormal fluctuation data is actually normal data that was misjudged, and if so filters it out.
It improves the flexibility and reusability of noise reduction for abnormal fluctuation data, adapts to different business scenarios, and reduces misjudgment and resource waste.
[Figure CN122309918A_ABST: abstract drawing]
Abstract
Description
Technical Field
[0001] This specification relates to the field of computer technology, and more particularly to a method and apparatus for denoising erroneous transaction abnormal fluctuation data of a business platform.
Background Technology
[0002] In the payment sector, platforms involved in financial services can connect with external institutions (such as banks and shopping platforms), playing a crucial role in shielding external entities from institutional differences and providing unified fund and information exchange services internally. Some platforms may handle massive transaction volumes. For example, one platform's financial network processes an average of 1.1 billion transactions daily, with over 10 million erroneous transactions daily, 18,000 existing channel changes, and over 800 new channels added annually. With such a massive scale of business, even minor errors can be amplified exponentially, severely impacting user experience. Some errors within a reasonable range may be permissible. Therefore, it is necessary to detect erroneous transaction data and monitor for abnormal fluctuations, such as a surge in the number of erroneous transactions. Due to the enormous volume of erroneous transaction data and the continuously increasing number of channels, traditional manual screening and filtering methods are insufficient to meet business needs; therefore, automated processes are often used to detect abnormal fluctuations in erroneous transactions.
[0003] However, noise may occur during the automatic detection process, meaning normal data may be mistakenly identified as abnormal fluctuations. Misjudged fluctuation data wastes human and material resources. Therefore, how to further filter noise from the abnormal fluctuation data obtained by automatic detection is an important technical problem.
Summary of the Invention
[0004] This specification describes one or more embodiments of a method and apparatus for denoising erroneous transaction abnormal fluctuation data of a business platform, in order to solve one or more problems mentioned in the background art.
[0005] According to a first aspect, a method for denoising abnormal fluctuation data of erroneous transactions on a business platform is provided. The method includes: acquiring first abnormal fluctuation data of a first business; querying based on a first business identifier corresponding to the first business to acquire first historical long-cycle transaction data of the first business during a first predetermined time period; based on the first abnormal fluctuation data, the first historical long-cycle transaction data, and preset abnormality judgment task description information, calling a large language model to determine whether the first abnormal fluctuation data is misjudged normal data; and, upon receiving feedback from the large language model that the first abnormal fluctuation data is misjudged normal data, filtering out the first abnormal fluctuation data as noise data.
[0006] In one embodiment, the first business identifier includes at least one of the following: transaction channel, business institution, and error type.
[0007] In one embodiment, the first abnormal fluctuation data, obtained after initial screening of erroneous transaction data, further includes: a first sequence consisting of the number of erroneous transactions of the first business in each of the m time units, and a second sequence consisting of the total number of transactions of the first business in each of the m time units; the predetermined period includes at least a time cycle consisting of n time units, where n is greater than m, and the first historical long-cycle transaction data includes a third sequence consisting of the number of erroneous transactions in each time unit of the predetermined period, and a fourth sequence consisting of the total number of transactions in each time unit of the predetermined period.
[0008] In one embodiment, the anomaly determination task description information includes anomaly determination conditions for erroneous transactions. The anomaly determination conditions include at least one of the following: the number of erroneous transactions exhibits a periodic peak, and the difference between the peak and the average number of erroneous transactions in a certain number of time units before and after is greater than a first predetermined value; the number of erroneous transactions increases continuously within X1 time units or exceeds a second predetermined value within X2 time units; the cumulative number of erroneous transactions exceeds a third predetermined value within S consecutive time units.
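The anomaly determination conditions listed above can be made concrete as simple checks over a per-time-unit count series. The following is a minimal sketch under stated assumptions, not the method of this specification: in the described method these conditions are supplied to the large language model as text. All function names, the neighbor window `k`, and the threshold parameters are hypothetical, and the sketch does not test whether a peak is periodic.

```python
from typing import Sequence


def peak_exceeds_neighbors(counts: Sequence[int], idx: int, k: int,
                           first_value: float) -> bool:
    """Condition 1 (sketch): the count at idx exceeds the mean of up to k
    time units before and after it by more than first_value."""
    neighbors = list(counts[max(0, idx - k):idx]) + list(counts[idx + 1:idx + 1 + k])
    if not neighbors:
        return False
    return counts[idx] - sum(neighbors) / len(neighbors) > first_value


def rises_continuously(counts: Sequence[int], x1: int) -> bool:
    """Condition 2a (sketch): counts strictly increase over the last x1 units."""
    if x1 < 2 or len(counts) < x1:
        return False
    tail = counts[-x1:]
    return all(a < b for a, b in zip(tail, tail[1:]))


def exceeds_within(counts: Sequence[int], x2: int, second_value: float) -> bool:
    """Condition 2b (sketch): some count in the last x2 units exceeds second_value."""
    if x2 < 1:
        return False
    return any(c > second_value for c in counts[-x2:])


def cumulative_exceeds(counts: Sequence[int], s: int, third_value: float) -> bool:
    """Condition 3 (sketch): the sum over the last s consecutive units
    exceeds third_value."""
    if s < 1:
        return False
    return sum(counts[-s:]) > third_value
```

Each check looks only at the most recent time units, which is one possible reading of "within X1 time units"; a sliding scan over the whole series would be an equally valid choice.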
[0009] In one embodiment, the anomaly determination task description information includes a step of instructing a large language model to perform noise discrimination, specifically including: obtaining the cutoff time, number of erroneous transactions, and total number of transactions in the first abnormal fluctuation data, as well as the number of erroneous transactions and total number of transactions in each time unit of the first historical long-cycle transaction data; detecting the consistency between the first historical long-cycle transaction data and the first abnormal fluctuation data; and determining that the first abnormal fluctuation data is misjudged normal data if the first abnormal fluctuation data is consistent with the first historical long-cycle transaction data.
[0010] In one embodiment, detecting the consistency between the first historical long-cycle transaction data and the first abnormal fluctuation data includes: extracting, from the first historical long-cycle transaction data, the number of erroneous transactions and the total number of transactions within each time unit that coincides with the time period of the first abnormal fluctuation data, as first historical reference data; comparing the first historical reference data and the first abnormal fluctuation data in at least one aspect among the number of erroneous transactions, the total number of transactions, and the error rate to determine whether the changing trends are consistent, wherein the error rate is the ratio of the number of erroneous transactions to the corresponding total number of transactions; and determining that the first abnormal fluctuation data is normal data if the changing trends in all compared aspects are consistent.
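To illustrate what a trend consistency comparison could look like, here is a minimal sketch. The specification leaves the comparison to the large language model and does not define a numeric trend measure, so the `trend` function (second-half mean against first-half mean, with tolerance `tol`) is purely a hypothetical stand-in, as are all names. The sketch compares all three aspects; the embodiment requires only at least one.

```python
from typing import List, Sequence


def error_rates(errors: Sequence[int], totals: Sequence[int]) -> List[float]:
    """Error rate per time unit: erroneous transactions / total transactions.
    A unit with zero total transactions is given rate 0.0 (an assumption)."""
    return [e / t if t else 0.0 for e, t in zip(errors, totals)]


def trend(series: Sequence[float], tol: float = 0.05) -> int:
    """Hypothetical trend measure: +1 rising, -1 falling, 0 flat, based on the
    relative change from the first-half mean to the second-half mean."""
    half = len(series) // 2
    if half == 0:
        return 0
    first = sum(series[:half]) / half
    second = sum(series[-half:]) / half
    if first == 0:
        return 0 if second == 0 else 1
    change = (second - first) / first
    if change > tol:
        return 1
    if change < -tol:
        return -1
    return 0


def trends_consistent(cur_err: Sequence[int], cur_tot: Sequence[int],
                      ref_err: Sequence[int], ref_tot: Sequence[int]) -> bool:
    """True if error count, total count and error rate each move in the same
    direction in the current window and in the historical reference window."""
    pairs = [
        (cur_err, ref_err),
        (cur_tot, ref_tot),
        (error_rates(cur_err, cur_tot), error_rates(ref_err, ref_tot)),
    ]
    return all(trend(a) == trend(b) for a, b in pairs)
```

With this reading, a rise in errors that also appears at the same time of day in the reference data counts as consistent, while a rise that the reference data does not show counts as inconsistent.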
[0011] In one embodiment, if the corresponding historical long-cycle transaction data cannot be found, the first abnormal fluctuation data is determined to be erroneous traffic data in a new business scenario, and no noise reduction is performed.
[0012] According to the second aspect, an apparatus is provided for denoising abnormal fluctuation data of erroneous transactions on a business platform, comprising:
[0013] The acquisition unit is configured to acquire the first abnormal fluctuation data of the first business.
[0014] The query unit is configured to perform a query based on the first business identifier corresponding to the first business, and obtain the first historical long-cycle transaction data of the first business in a first predetermined time period.
[0015] The judgment unit is configured to call a large language model based on the first abnormal fluctuation data, the first historical long-cycle transaction data, and the preset abnormality judgment task description information, so that the large language model determines whether the first abnormal fluctuation data is misjudged normal data.
[0016] The filtering unit is configured to filter out the first abnormal fluctuation data as noise data upon receiving feedback from the large language model that the first abnormal fluctuation data is misjudged normal data.
[0017] According to a third aspect, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed in a computer, causes the computer to perform the method of the first aspect.
[0018] According to a fourth aspect, a computing device is provided, including a memory and a processor, characterized in that the memory stores executable code, and when the processor executes the executable code, it implements the method of the first aspect.
[0019] The methods and apparatus provided in the embodiments of this specification, during the noise reduction process for abnormal fluctuation data of erroneous transactions on a business platform, use a large language model to compare the initially screened abnormal fluctuation data with the corresponding historical long-cycle transaction data and detect their consistency. If the abnormal fluctuation data is consistent with the historical long-cycle transaction data, it can be filtered out as noise. This implementation, which relies on a large language model and historical long-cycle transaction data, offers strong flexibility and extends readily to new scenarios.
Attached Figure Description
[0020] To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the following description of the embodiments will be briefly introduced. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0021] Figure 1 illustrates an application scenario for denoising erroneous transaction abnormal fluctuation data of a business platform;
[0022] Figure 2 illustrates a specific applicable architecture for denoising erroneous transaction abnormal fluctuation data of a business platform;
[0023] Figure 3 is a flowchart of a method for denoising erroneous transaction abnormal fluctuation data of a business platform according to an embodiment of this specification;
[0024] Figure 4 is a schematic block diagram of an apparatus for denoising erroneous transaction abnormal fluctuation data of a business platform according to one embodiment of this specification.
Detailed Implementation
[0025] The solution provided in this specification will now be described with reference to the accompanying drawings.
[0026] Figure 1 shows an example application scenario of an embodiment of this specification. As shown in Figure 1, this application scenario involves a business platform, several user terminals, and multiple resource service platforms. A single resource service platform can provide at least one service among resource storage, conversion, and transaction; for example, a bank (for currency storage, conversion and transfer between different currencies) or a shopping platform (for money-for-goods transactions). The business platform is a platform involved in resource transfer services; for example, it could be a social platform with financial services, a comprehensive service platform integrating multiple mini-programs or micro-applications (such as food delivery mini-programs, shopping mini-programs, vehicle damage mini-programs, etc.), or a payment platform connecting multiple business institutions (resource service platforms), etc.
[0027] On the one hand, the business platform can connect to multiple user terminals via the network to provide users with at least one service involving resource storage, transformation, and transaction. On the other hand, the business platform can connect to the servers of multiple resource service platforms via the network, allowing users to process resources from multiple resource service platforms through the business platform. For example, a user can transfer money to a bank card at a certain bank through the current business platform, shop through a shopping mini-program within the business platform, pay for the purchase through the business platform using the bank card at a certain bank, or transfer resources based on the business platform, such as using a bank card opened at Bank A as the payment method to transfer money to a bank card opened at Bank B, and so on.
[0028] It is understandable that user terminals can also connect directly to the server of the corresponding resource service platform through the client of the resource service platform to conduct related business. Optionally, the current business platform can be one of multiple resource service platforms.
[0029] The platform involved in financial business mentioned in the background technology can serve as the business platform in Figure 1, and the external institutions may include the resource service platforms in Figure 1 (such as banks and shopping platforms) that are involved in resource transactions. Since multiple resource service platforms may be involved, various errors may occur, such as insufficient balance, device risk, incorrect merchant parameters, asynchronous notification verification failure, bank card limits, abnormal card status, etc. Errors usually lead to transaction failures, and transactions with any such error are referred to as erroneous transactions in this specification. Depending on the resource processing capacity of the resource service platform, a number of erroneous transactions within a certain tolerance range is considered normal and is usually permissible. When the number of erroneous transactions exceeds the normal range, it is considered an abnormal fluctuation, which may indicate platform service system problems, network problems, etc., requiring manual intervention or problem aggregation. Therefore, it is necessary to monitor the number of erroneous transactions to determine whether abnormal fluctuations have occurred.
[0030] Abnormal fluctuations in erroneous transactions can be detected in various ways, such as hard coding (where certain logic, rules, parameters, or data are written directly into the program code) or manual processing. This specification also provides an adaptive judgment method based on time-series model prediction. Manual judgment requires a large number of professionals with judgment experience, keen observation skills, and sustained focus. Therefore, in practice, hard coding is more commonly used, but the adaptive judgment method proposed in this specification can also be used. These automatic detection processes screen out suspected abnormal fluctuation data. Some of this data may be misjudged; misjudged data is noise within the abnormal data (that is, it is actually normal data) and needs to be filtered out, which is the noise reduction processing.
[0031] Figure 2 illustrates an application example of a specific architecture for detecting abnormal fluctuation data, based on embodiments of this specification. In this architecture, an adaptive judgment method based on time-series model prediction is used for initial screening. As shown in Figure 2, the scheme for detecting abnormal fluctuations in erroneous transactions under this architecture can be divided into four modules: a data processing module, a model training module, an anomaly detection module, and a result processing module.
[0032] The data processing module cleans transaction log data and statistically analyzes erroneous transactions within each time unit according to the business key (it can also statistically analyze the total number of transactions within each time unit). The resulting data is stored in tables, key-value pairs, and other formats. Within the data processing module, for a single business defined by a single business key, it can also obtain a continuous N+1 data sequence as training samples. Here, "N" represents the number of erroneous transactions over N consecutive time units, used as feature data, and "1" represents the number of erroneous transactions in the next time unit, used as label data. This approach, through a hierarchical data dimensionality reduction strategy, significantly reduces the daily volume of tens of millions of erroneous transaction data by aggregating it at the time unit (e.g., minute) level, reducing it to the hundreds of thousands level. With secondary aggregation based on the business key as the key dimension, the data volume can be further compressed to the thousands level, greatly improving processing efficiency.
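The aggregation and N+1 sample construction described above can be sketched as follows. This is a minimal illustration under assumptions, not the implementation of the data processing module: the record shape `(business_key, time_unit_index)` and both function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple


def count_errors_per_unit(records: Iterable[Tuple[str, int]]) -> Counter:
    """Aggregate raw error records, each given as (business_key, time_unit_index),
    into a count per business key and time unit."""
    return Counter(records)


def make_samples(counts: Sequence[int], n: int) -> List[Tuple[List[int], int]]:
    """Slide a window of length n + 1 over a per-unit count series.
    The first n values are the features; the (n + 1)-th value is the label."""
    return [(list(counts[i:i + n]), counts[i + n]) for i in range(len(counts) - n)]


# Usage: three error records for one business key, then samples with N = 3.
per_unit = count_errors_per_unit([("k", 0), ("k", 0), ("k", 1)])
samples = make_samples([1, 2, 3, 4, 5], 3)
```

Here `samples` holds two feature/label pairs, `([1, 2, 3], 4)` and `([2, 3, 4], 5)`, which is the N+1 layout the paragraph describes.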
[0033] In the model training module, the time-series model can read the training samples obtained from the data processing module and undergo supervised training. The time-series model learns the cyclical fluctuation patterns under various business scenarios and predicts the number of erroneous transactions in the next time unit based on the data sequence. Notably, the time-series model does not carry business scenario identifiers during the learning process, and thus adapts automatically to the relevant scenario based on the acquired data sequence. This innovative approach employs a scenario-independent judgment mechanism, replacing the original scenario whitelist configuration mode. For example, in financial transaction scenarios, it does not rely on specific transaction channels or error codes, focusing instead on time-series prediction inference based on transaction volume. This fundamentally solves the maintenance challenges caused by scenario switching (such as frequent additions or changes to channels in financial transactions), achieving dynamic adaptation.
[0034] Once the time-series model is trained, it can be provided to the anomaly detection module for detecting abnormal fluctuations. The anomaly detection module can determine, through the process illustrated in Figure 2, whether the number of erroneous transactions fluctuates abnormally. Specifically, the anomaly detection module obtains a sequence of N data points from the data processing module in real time, processes the sequence using the time-series model, and obtains a predicted number of erroneous transactions. This predicted number is then compared with the actual number obtained from the data processing module to determine whether the number of erroneous transactions has fluctuated abnormally.
[0035] Data identified as abnormal fluctuations by the anomaly detection module can be written to an anomaly data table or stored as key-value pairs. The result processing module can then further process this data, performing noise filtering (to identify false alarms), issuing alarm notifications, and so on. Optionally, the anomaly data obtained by the result processing module can also be fed back to the model training module for parameter fine-tuning of the time-series model.
[0036] Thus, through the collaboration of the modules, abnormal fluctuations in erroneous transactions can be detected in real time. The technical solution of this specification concerns the noise reduction process in the result processing module, which determines whether the initially screened abnormal data is a false alarm (shown in bold in Figure 2).
[0037] In conventional techniques, noise reduction is performed manually or through noise assessment criteria hard-coded in the program. This approach is typically highly customized, making it difficult to apply across different business scenarios. Furthermore, it is prone to missed or false filtering when the periodic parameters of business volume (such as peak and trough values) change, so the noise reduction effect needs improvement.
[0038] In light of this, this specification provides a solution for denoising erroneous transaction abnormal fluctuation data on a business platform. Leveraging the language processing and understanding capabilities of a large language model, it obtains historical data covering a longer period for the initially screened abnormal fluctuation data. This data, along with the anomaly judgment task description, serves as the prompt with which the large language model is called to determine whether the data is misjudged normal data. Upon receiving feedback from the large language model that the corresponding abnormal fluctuation data is misjudged normal data, the corresponding abnormal fluctuation data is filtered out as noise data.
[0039] Thus, the large language model, using historical data as a reference, determines whether the initially screened abnormal fluctuation data is noise. This has generalizability and can be directly reused in various business scenarios, effectively solving the problem of poor reusability of hard-coded noise reduction. On the other hand, the large language model can dynamically differentiate and treat abnormal data with different characteristics based on historical data, adapting to and being compatible with more types of erroneous transaction data, thus solving the problem of inflexibility in hard-coded noise reduction judgment.
[0040] The technical concept of this specification is described in detail below with reference to the accompanying drawings.
[0041] Figure 3 illustrates a process for denoising erroneous transaction abnormal fluctuation data of a business platform, provided by one embodiment of this specification. The execution entity of this process can be a computer, device, or server with a certain computing power; more specifically, for example, the computing platform shown in Figure 1.
[0042] As shown in Figure 3, the process for denoising abnormal fluctuation data of erroneous transactions on the business platform provided in this specification may include: Step 301, obtaining the first abnormal fluctuation data of the first business; Step 302, querying based on the first business identifier corresponding to the first business to obtain the first historical long-cycle transaction data of the first business during the first predetermined time period; Step 303, based on the first abnormal fluctuation data, the first historical long-cycle transaction data, and the preset abnormality judgment task description information, calling the large language model so that it determines whether the first abnormal fluctuation data is misjudged normal data; Step 304, upon receiving feedback from the large language model that the first abnormal fluctuation data is misjudged normal data, filtering out the first abnormal fluctuation data as noise data.
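Steps 301 to 304 can be outlined in code as a rough sketch. Everything here is assumed for illustration: `query_history` and `ask_llm` are hypothetical callables standing in for the history store and the model call, the `"business_id"` field name is invented, and the one-word verdict `NOISE` is a hypothetical answer format that the specification does not prescribe.

```python
from typing import Callable, Optional


def denoise(
    anomaly: dict,
    query_history: Callable[[str], Optional[dict]],
    ask_llm: Callable[[str], str],
    task_description: str,
) -> bool:
    """Return True if the anomaly should be filtered out as noise."""
    # Step 301: `anomaly` is the first abnormal fluctuation data after initial screening.
    # Step 302: look up long-cycle history by the business identifier.
    history = query_history(anomaly["business_id"])
    if history is None:
        # No history found: treated as a new business scenario, no noise reduction.
        return False
    # Step 303: pass the data and the task description to the model.
    prompt = f"{task_description}\nANOMALY: {anomaly}\nHISTORY: {history}"
    verdict = ask_llm(prompt)
    # Step 304: filter only when the model reports misjudged normal data.
    return verdict.strip().upper() == "NOISE"
```

Passing the store and the model in as callables keeps the sketch runnable with stubs, for example `denoise(a, lambda k: None, lambda p: "NOISE", "task")`, which returns `False` because no history exists.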
[0043] First, in step 301, the first abnormal fluctuation data of the first business is obtained.
[0044] It is understandable that a business platform may contain one or more transaction scenarios, such as transferring money from a bank card in Bank A to a bank card in Bank B, or transferring money from a bank card in Bank C to a bank card in Bank A, etc.
[0045] The term "business" can refer to any transaction scenario. For example, in detecting abnormal fluctuations in erroneous transactions, erroneous transactions can be detected separately by category, and different categories of erroneous transactions can correspond to different business identifiers. The business identifier can include one or more of the following: transaction channel, business institution, and error type. The business institution can include the current business platform and its associated resource service platforms, etc. The error type is the specific source of the error, such as insufficient balance, device risk, incorrect merchant parameters, asynchronous notification verification failure, bank card limit, abnormal card status, etc., as described above. The transaction channel is the means of resource transfer transactions, such as UnionPay CUPS, NetsUnion NUCC, etc. As a specific example, the business key for an erroneous transaction is a composite key consisting of the business institution, the transaction channel, and the error type. In a business identifier “GLBANK~glbanknucc907~01-PB520045”, “GLBANK” represents the business institution (a bank), “glbanknucc907” represents the transaction channel “bank identifier gl + NetsUnion nucc + payment type 907”, and “01-PB520045” is the error code (i.e., error type) reported by the failed transaction.
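The composite key in this example can be split into its three parts with a few lines of code. This is a minimal sketch; the `BusinessKey` type and `parse_business_key` function are hypothetical names, and the only assumption about the format is the `~` separator visible in the example. Whitespace around each field is stripped.

```python
from typing import NamedTuple


class BusinessKey(NamedTuple):
    institution: str   # business institution, e.g. "GLBANK"
    channel: str       # transaction channel, e.g. "glbanknucc907"
    error_type: str    # error code, e.g. "01-PB520045"


def parse_business_key(raw: str) -> BusinessKey:
    """Split a '~'-separated business identifier into its three fields."""
    parts = [p.strip() for p in raw.split("~")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 '~'-separated fields, got {len(parts)}")
    return BusinessKey(*parts)


key = parse_business_key("GLBANK~glbanknucc907~01-PB520045")
```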
[0046] The abnormal fluctuation data here can be data that may exhibit abnormal fluctuations after initial screening of erroneous transactions according to business requirements. Abnormal fluctuation data can include the number of erroneous transactions for one or more time units (e.g., 30-minute units). Having erroneous transaction counts for multiple time units is helpful for recording abnormal situations related to erroneous transactions.
[0047] The first abnormal fluctuation data can be any abnormal fluctuation data obtained through preliminary screening under the first business. The preliminary screening of abnormal fluctuation data can be done in various ways, such as hard coding or adaptive judgment based on time-series model prediction. The hard-coding method uses preset logic, rules, parameters, etc., to perform preliminary screening based on the number of erroneous transactions acquired in real time.
[0048] The adaptive judgment method based on time-series model prediction is an innovative approach. One specific implementation uses a pre-trained prediction model for time-series data to predict, from a data sequence formed by the numbers of erroneous transactions of the first business, the number of erroneous transactions in subsequent time units. After the actual number of erroneous transactions in those subsequent time units is obtained, whether the erroneous transactions of the corresponding business fluctuate abnormally is determined from the predicted and actual numbers. For example, the data is identified as a preliminarily screened abnormal fluctuation if at least one of the following abnormal fluctuation conditions is met: the cumulative number of abnormal events across the time units within a predetermined duration exceeds a first threshold; the change in a periodic parameter of the actual number of erroneous transactions across the time units exceeds a predetermined error range; or the number of consecutive time units with abnormal events exceeds a second threshold. Specifically, if the actual number is greater than the predicted number and a predetermined abnormal condition is met, an abnormal event is recorded. Predetermined abnormal conditions include, for example: the actual number in a single time unit exceeds the corresponding predicted number; the amount by which the actual number in a single time unit exceeds the corresponding predicted number is greater than a predetermined threshold; the first error rate determined from the actual number in a single time unit and the total number of transactions is greater than the second error rate determined from the corresponding predicted number and the total number of transactions, and the difference between the two is greater than a predetermined error rate difference; and so on.
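Two of the screening conditions in this paragraph, the cumulative count of abnormal events and the run of consecutive abnormal events, can be sketched as below. This is an illustrative reading only: the function names and the `min_excess` parameter are hypothetical, the predicted values are assumed to come from some already trained time-series model, and the periodic-parameter condition is left out.

```python
from typing import List, Sequence


def abnormal_events(actual: Sequence[int], predicted: Sequence[float],
                    min_excess: float = 0.0) -> List[bool]:
    """Mark a time unit as an abnormal event when the actual count exceeds
    the predicted count by more than min_excess."""
    return [a - p > min_excess for a, p in zip(actual, predicted)]


def longest_run(flags: Sequence[bool]) -> int:
    """Length of the longest run of consecutive abnormal events."""
    best = cur = 0
    for f in flags:
        cur = cur + 1 if f else 0
        best = max(best, cur)
    return best


def is_preliminary_anomaly(flags: Sequence[bool], first_threshold: int,
                           second_threshold: int) -> bool:
    """Flag the window when the cumulative number of abnormal events exceeds
    first_threshold or the longest consecutive run exceeds second_threshold."""
    return sum(flags) > first_threshold or longest_run(flags) > second_threshold


# Usage with made-up numbers: actual counts against predicted counts.
flags = abnormal_events([10, 30, 40, 12, 50], [12, 20, 25, 15, 30], min_excess=5)
```

In this made-up window three of the five time units are abnormal events and the longest consecutive run is two, so the outcome depends entirely on the chosen thresholds.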
[0049] The first abnormal fluctuation data of the first business can be any abnormal fluctuation data under the first business. It can correspond to the first business identifier, a time point, the numbers of erroneous transactions, and abnormality description information. The abnormality description information can describe the error situation corresponding to the first abnormal fluctuation data, such as the duration of the abnormal fluctuation and the basis for judging the abnormal fluctuation (such as the number of erroneous transactions exceeding a predetermined threshold).
[0050] As a specific example, the first abnormal fluctuation data is as follows: "GLBANK~glbanknucc907~01-PB520045~2025-10-28 20:53:00.000~error:92,77,74,88,66,67,76,71,68,66,68,56,67,49,71,78,43,69,65,49,61,56,55,69,69,79,92,95,132,150,189~total:592,544,555,581,546,534,557,584,534,510,500,491,470,479,508,484,462,508,442,456,466,455,478,519,464,447,521,540,564,639,669". Here, "GLBANK~glbanknucc907~01-PB520045" is the first business identifier, "2025-10-28 20:53:00.000" is the end time of the first abnormal fluctuation data, "error:92,77,74,88,66,67,76,71,68,66,68,56,67,49,71,78,43,69,65,49,61,56,55,69,69,79,92,95,132,150,189" gives the number of erroneous transactions in each minute of the 30 minutes up to the end time, and "total:592,544,555,581,546,534,557,584,534,510,500,491,470,479,508,484,462,508,442,456,466,455,478,519,464,447,521,540,564,639,669" gives the total number of transactions in each of those same minutes.
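The record layout in this example can be parsed mechanically. The sketch below assumes only what the example shows: `~` as the field separator, the last three fields being the end time, `error:` list, and `total:` list, and a timestamp with millisecond precision. The `AnomalyRecord` type and `parse_anomaly_record` function are hypothetical names.

```python
from datetime import datetime
from typing import List, NamedTuple


class AnomalyRecord(NamedTuple):
    business_id: str
    end_time: datetime
    errors: List[int]   # erroneous transactions per minute
    totals: List[int]   # total transactions per minute


def parse_anomaly_record(raw: str) -> AnomalyRecord:
    """Parse '<business id fields>~<end time>~error:...~total:...'."""
    *key_parts, end_time, err_field, tot_field = raw.split("~")
    if not err_field.startswith("error:") or not tot_field.startswith("total:"):
        raise ValueError("unexpected field layout")
    errors = [int(x) for x in err_field[len("error:"):].split(",")]
    totals = [int(x) for x in tot_field[len("total:"):].split(",")]
    if len(errors) != len(totals):
        raise ValueError("error and total sequences differ in length")
    return AnomalyRecord(
        business_id="~".join(key_parts),
        end_time=datetime.strptime(end_time, "%Y-%m-%d %H:%M:%S.%f"),
        errors=errors,
        totals=totals,
    )


RAW = (
    "GLBANK~glbanknucc907~01-PB520045~2025-10-28 20:53:00.000"
    "~error:92,77,74,88,66,67,76,71,68,66,68,56,67,49,71,78,43,69,65,49,"
    "61,56,55,69,69,79,92,95,132,150,189"
    "~total:592,544,555,581,546,534,557,584,534,510,500,491,470,479,508,484,"
    "462,508,442,456,466,455,478,519,464,447,521,540,564,639,669"
)
record = parse_anomaly_record(RAW)
```

Parsed this way, each sequence holds 31 per-minute values. The error rate in the final minute is 189/669, about 28%, against 92/592, about 16%, in the first minute, which is the kind of rise the initial screening flags.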
[0051] Then, through step 302, a query is performed based on the first business identifier corresponding to the first business to obtain the first historical long-cycle transaction data for the first predetermined time period.
[0052] The business identifier used to mark the first business can be denoted as the first business identifier. Based on the first business identifier, various data corresponding to the first business can be retrieved from historical data. Historical data can be pre-stored offline and can be processed in the form of data tables, key-value pairs, etc. Specifically, when processed in the form of data tables, the first business identifier can correspond to a single field or a combination of multiple fields; when stored in the form of key-value pairs, the first business identifier can correspond to the key in the key-value pair.
[0053] Here, historical long-cycle transaction data for the first predetermined time period can be obtained. This historical long-cycle transaction data is historical data of the first business; compared to the first abnormal fluctuation data, it covers a longer period, typically including at least one complete cycle, and so reflects the periodic changes in the number of transactions of the first business. The first predetermined time period can be determined according to preset rules, such as 0:00 to 24:00 of the day before the first abnormal fluctuation data, or the 24 hours before the last time unit of the first abnormal fluctuation data (e.g., if the last time unit is 9:00 AM, from 9:00 AM the previous day to 9:00 AM the current day), and so on.
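The two example rules for the first predetermined time period translate directly into window computations. The following is a small sketch with hypothetical function names; it assumes the end time of the abnormal fluctuation data is available as a `datetime` and ignores time zones.

```python
from datetime import datetime, timedelta
from typing import Tuple


def previous_day_window(end_time: datetime) -> Tuple[datetime, datetime]:
    """0:00 to 24:00 of the calendar day before end_time."""
    midnight = end_time.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=1), midnight


def trailing_24h_window(end_time: datetime) -> Tuple[datetime, datetime]:
    """The 24 hours leading up to end_time (the last time unit of the anomaly)."""
    return end_time - timedelta(hours=24), end_time


end = datetime(2025, 10, 28, 20, 53)
day_start, day_end = previous_day_window(end)
trail_start, trail_end = trailing_24h_window(end)
```

Both windows span one full day, so either choice covers a complete daily cycle for the consistency comparison.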
[0054] The first historical long-cycle transaction data may include at least the number of erroneous transactions in each time unit within the first predetermined time period, and may also include the total number of transactions in each time unit within the first predetermined time period, which will not be elaborated here.
[0055] In some possible designs, if the corresponding historical long-cycle transaction data cannot be found, the first abnormal fluctuation data is identified as erroneous traffic data in a new business scenario, and no noise reduction is performed.
[0056] Next, in step 303, based on the first abnormal fluctuation data, the first historical long-cycle transaction data, and the preset abnormality judgment task description information, the large language model is invoked so that it determines whether the first abnormal fluctuation data is misjudged normal data.
[0057] Large Language Models (LLMs), also known simply as large models, are natural language processing models based on deep learning techniques. Their parameter count typically ranges from billions to hundreds of billions or even higher, possessing powerful language understanding and generation capabilities. LLMs can employ the Transformer architecture or its variants (such as GPT and BERT), which utilizes an attention mechanism to globally model sequential data, efficiently handling long-distance dependencies and thus performing exceptionally well in natural language tasks. By pre-training on large-scale corpora, LLMs learn the statistical features and semantic relationships of language, enabling them to generalize effectively. The core capabilities of LLMs include, but are not limited to: understanding contextual semantics, generating coherent and grammatically correct text, performing logical reasoning, and handling multi-task scenarios. Their usage typically includes two modes: direct inference and fine-tuning. In direct inference mode, users design prompts to guide the LLM in generating specific outputs. These prompts can be textual descriptions of the task or instructions used to stimulate the LLM's semantic understanding and generation capabilities. In fine-tuning mode, large language models are further trained on small-scale datasets within a specific domain to optimize their performance on specific tasks. The powerful generalization capabilities and flexibility of large language models make them an important tool in the field of artificial intelligence, providing efficient and accurate solutions for automated text generation and understanding.
[0058] Here, the large language model is used in direct inference mode. The first abnormal fluctuation data and the first historical long-cycle transaction data serve as the data to be processed and can be provided to the large model as part of the prompt, which the large model then uses as a reference for the anomaly determination task. In this specification, "anomaly" is an abbreviation for abnormal fluctuation.
[0059] The anomaly determination task description information is used to describe the anomaly determination task and can be used as part of the prompt of the large language model. It instructs the large model to determine, based on the input data, whether the first abnormal fluctuation data is misjudged normal data. It can include: determination steps, anomaly determination conditions (anomaly definitions or determination rules), output data content and its format, etc.
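As a rough illustration, the two data sets and the task description can be joined into a single prompt string. The constant `TASK_DESCRIPTION`, the labels, and the function name below are assumptions made for this sketch; no particular model API is implied.

```python
from typing import List

# Hypothetical, abbreviated task description; a real one would also carry
# the determination steps and anomaly determination conditions.
TASK_DESCRIPTION = (
    "Decide whether the anomaly data is a normal periodic fluctuation. "
    'Return "Y" for normal data and "N" for abnormal data.'
)


def build_prompt(anomaly_record: str,
                 hist_errors: List[int],
                 hist_totals: List[int]) -> str:
    """Assemble the prompt from the anomaly record, history, and task text."""
    history = (
        "error:" + ",".join(map(str, hist_errors))
        + "~total:" + ",".join(map(str, hist_totals))
    )
    return "\n".join([
        "Anomaly data: " + anomaly_record,
        "Historical long-cycle data: " + history,
        TASK_DESCRIPTION,
    ])
```

The resulting string would then be passed to whichever large language model is deployed.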
[0060] In one embodiment, the anomaly determination conditions may include: the number of erroneous transactions exhibits periodic peaks, and the difference between a peak and the average number of erroneous transactions over a certain number of time units before and after the peak is greater than a first predetermined value. Such intermittent but regular spikes may be caused by periodic interference or errors in the system.
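One possible reading of this condition is sketched below. The parameter names are hypothetical, and the rule that "periodic" means at least three spikes at a constant spacing is an assumption of this sketch, not something the specification states.

```python
from statistics import mean
from typing import List


def spike_indices(errors: List[int], window: int, first_value: float) -> List[int]:
    """Indices whose count exceeds the mean of `window` neighbours on each
    side by more than first_value. Requires window >= 1."""
    spikes = []
    for i in range(window, len(errors) - window):
        neighbours = errors[i - window:i] + errors[i + 1:i + 1 + window]
        if errors[i] - mean(neighbours) > first_value:
            spikes.append(i)
    return spikes


def has_periodic_spikes(errors: List[int], window: int, first_value: float) -> bool:
    """True when at least three spikes occur at one constant interval
    (assumed definition of 'periodic' for this sketch)."""
    idx = spike_indices(errors, window, first_value)
    gaps = {b - a for a, b in zip(idx, idx[1:])}
    return len(idx) >= 3 and len(gaps) == 1
```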
[0061] In one embodiment, the anomaly determination criteria may include: the number of erroneous transactions continuously increases within X1 time units or exceeds a second predetermined value within X2 time units. This is because a continuous increase in the number of erroneous transactions may indicate a system malfunction, while exceeding the second predetermined value within multiple time units may indicate that the number of erroneous transactions remains high.
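A minimal sketch of this condition, evaluated on the most recent time units. It assumes that "exceeds a second predetermined value within X2 time units" means every one of the last X2 counts is above the value, which matches the "remains high" explanation; the function names are hypothetical.

```python
from typing import List


def continuously_increasing(errors: List[int], x1: int) -> bool:
    """True if the last x1 counts are strictly increasing (x1 >= 2)."""
    if x1 < 2 or len(errors) < x1:
        return False
    tail = errors[-x1:]
    return all(a < b for a, b in zip(tail, tail[1:]))


def stays_above(errors: List[int], x2: int, second_value: float) -> bool:
    """True if each of the last x2 counts exceeds second_value (x2 >= 1)."""
    if x2 < 1 or len(errors) < x2:
        return False
    return all(v > second_value for v in errors[-x2:])


def rise_or_high_level(errors: List[int], x1: int, x2: int,
                       second_value: float) -> bool:
    """Either sub-condition is enough to flag an abnormal fluctuation."""
    return continuously_increasing(errors, x1) or stays_above(errors, x2, second_value)
```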
[0062] In one embodiment, the anomaly determination conditions may include: the cumulative number of erroneous transactions exceeds a third predetermined value within S consecutive time units. Unlike the conditions above, this condition does not require the number of erroneous transactions to satisfy a consistency constraint in every time unit (such as continuously increasing within X1 time units or exceeding a second predetermined value within X2 time units). Instead, it considers the total across multiple time units, which avoids misjudgment when, during an abnormal fluctuation, the number of erroneous transactions in individual time units does not meet the consistency constraint.
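The cumulative condition can be checked with a sliding-window sum, as in the sketch below. The function and parameter names are hypothetical.

```python
from typing import List


def cumulative_exceeds(errors: List[int], s: int, third_value: float) -> bool:
    """True if any s consecutive time units sum to more than third_value."""
    if s <= 0 or len(errors) < s:
        return False
    window_sum = sum(errors[:s])
    if window_sum > third_value:
        return True
    for i in range(s, len(errors)):
        # Slide the window one unit: add the new count, drop the oldest.
        window_sum += errors[i] - errors[i - s]
        if window_sum > third_value:
            return True
    return False
```

Because only the window total matters, a dip in a single time unit does not prevent the condition from triggering.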
[0063] In other embodiments, there may be more anomaly determination conditions, which will not be elaborated here.
[0064] In some optional implementations, the anomaly determination task description information may also include steps for determining whether the first abnormal fluctuation data is misjudged normal data, for example: detecting the consistency between the first abnormal fluctuation data and the first historical long-cycle transaction data; if the two are consistent, determining that the first abnormal fluctuation data is misjudged normal data; otherwise, determining that the first abnormal fluctuation data is genuinely abnormal fluctuation data.
[0065] According to one embodiment, the consistency between the first abnormal fluctuation data and the first historical long-cycle transaction data is detected, for example, by fitting the first historical long-cycle transaction data to obtain corresponding first period parameters, fitting the first abnormal fluctuation data to obtain second period parameters, and comparing whether the first period parameters and the second period parameters are consistent. The first period parameters and the second period parameters include, for example, at least one of the following: period length, peak value, trough value, etc. The parameters are considered consistent when all differences between the first period parameters and the second period parameters are within corresponding preset ranges, or when consistency is confirmed by the large language model according to its internal processing mechanism.
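The preset-range comparison can be illustrated with a deliberately simplified sketch. It uses only the peak and trough values, takes them directly as the maximum and minimum rather than from a fitted curve, and omits period length; all names and tolerances are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PeriodParams:
    peak: float
    trough: float


def extract_params(series: Sequence[float]) -> PeriodParams:
    """Simplification: use max/min instead of parameters of a fitted curve."""
    return PeriodParams(peak=max(series), trough=min(series))


def params_consistent(first: PeriodParams, second: PeriodParams,
                      peak_tol: float, trough_tol: float) -> bool:
    """Consistent only if every parameter difference is within its range."""
    return (abs(first.peak - second.peak) <= peak_tol
            and abs(first.trough - second.trough) <= trough_tol)
```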
[0066] According to one embodiment, the consistency between the first abnormal fluctuation data and the first historical long-cycle transaction data is detected, for example, as follows: extracting, from the first historical long-cycle transaction data, the number of erroneous transactions and the total number of transactions within each time unit that coincides with the time period of the first abnormal fluctuation data (e.g., both from 9:00 AM to 10:00 AM), as the first historical reference data; comparing the first historical reference data and the first abnormal fluctuation data in at least one of the following aspects (the number of erroneous transactions, the total number of transactions, and the error rate) to determine whether the trends are consistent; if the trends in all compared aspects are consistent, determining that the first abnormal fluctuation data is normal data; and returning the judgment result according to a predetermined return format. Here, the error rate is the ratio of the number of erroneous transactions to the corresponding total number of transactions. Whether the trend in a given aspect is consistent can be determined in any of the following ways: the difference in that aspect is within a corresponding preset range; the similarity between the vectors formed from that aspect's data in the first historical reference data and in the first abnormal fluctuation data is greater than a predetermined similarity threshold; or the consistency is confirmed by the large language model according to its internal processing mechanism.
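The vector-similarity variant can be sketched as below. Cosine similarity is an assumed choice here, since the specification does not name a specific similarity measure, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def error_rates(errors: Sequence[int], totals: Sequence[int]) -> List[float]:
    """Error rate per time unit: erroneous transactions / total transactions."""
    return [e / t if t else 0.0 for e, t in zip(errors, totals)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def trends_consistent(ref_errors, ref_totals, cur_errors, cur_totals,
                      threshold: float) -> bool:
    """True only if all three aspects exceed the similarity threshold."""
    aspects = [
        (ref_errors, cur_errors),
        (ref_totals, cur_totals),
        (error_rates(ref_errors, ref_totals), error_rates(cur_errors, cur_totals)),
    ]
    return all(cosine_similarity(r, c) > threshold for r, c in aspects)
```

Note that cosine similarity ignores magnitude: a curve with the same shape but a much higher level would still score as similar, so in practice it would likely be combined with the preset-range check on differences.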
[0067] In other embodiments, the consistency between the first abnormal fluctuation data and the first historical long-cycle transaction data can also be detected in other ways, for example by relying on the language understanding and processing capabilities of the large language model to determine whether the first abnormal fluctuation data is misjudged normal data, which will not be elaborated here.
[0068] The description information of the anomaly detection task can also set the output format of the large language model's detection result, for example: return "N" for anomaly, return "Y" for normal data, etc.
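A constrained output format makes the model's reply easy to check mechanically. The sketch below assumes the "Y"/"N" convention from the example above; the function name and the choice to raise on anything else are assumptions.

```python
def parse_verdict(reply: str) -> bool:
    """Return True if the reply marks the data as normal ("Y"),
    False if it marks it as an anomaly ("N")."""
    token = reply.strip().strip("\"'").upper()
    if token == "Y":
        return True
    if token == "N":
        return False
    raise ValueError(f"unexpected model reply: {reply!r}")
```

A caller might treat a `ValueError` conservatively, keeping the data as abnormal rather than filtering it out.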
[0069] As a specific example, when calling a large language model, the corresponding prompt message is as follows:
[0070] "'GLBANK~glbanknucc907~01-PB520045~2025-10-28 20:53:00.000~error:92,77,74,88,66,67,76,71,68,66,68,56,67,49,71,78,43,69,65,49,61,56,55,69,69,79,92,95,132,150,189~total:592,544,555,581,546,534,557,584,534,510,500,491,470,479,508,484,462,508,442,456,466,455,478,519,464,447,521,540,564,639,669~pass:N~anomaly_report_json:{"Abnormal Combinations": "GLBANK~glbanknucc907~01-PB520045~2025-10-28 20:53:00.000 (End Time)", "Abnormality Type": "Continuous Rise / High Water Level", "Total Trading Volume per Minute": 669, "Number of Errors per Minute": 189, "Abnormality Level": "Critical", "Unified Standard Score": 9.53, "Judgment Basis": {"Persistence": {"Maximum Consecutive Minutes": 3, "Consecutive Threshold": 2, "Continuous Triggering": true}, "Intermittentness": {"Number of Abnormal Minutes": 3, "Cumulative Excess": 4.5, "Interval CV": 0.0, "Intermittent Triggering": true}, "Failure Rate": {"Current Failure Rate": 0.282511, "Recent Failure Rate": 0.149512}, "Excess Characteristics": {"Maximum Excess": 2.44, "Average Excess": 1.5}, "Notes": "Maximum consecutive minutes = 3, average excess = 1.5"}};'
[0071] Its format is 'Institution ID~Channel~Error Code~Cutoff time~Number of errors per minute (consecutive minutes)~Total number of transactions per minute (consecutive minutes)~Whether the anomaly judgment passed (Y indicates no anomaly, N indicates anomaly)~Specific anomaly judgment report content'. GLBANK~glbanknucc907~01-PB520045 are the feature values of the combination, error is the number of errors in each consecutive minute up to 2025-10-28 20:53:00.000 (X min), and total is the total number of transactions in each consecutive minute up to 2025-10-28 20:53:00.000 (X min).
[0072] The historical long-term trading data is defined as follows: error is the error data from 00:00 to 23:59 of the previous day, totaling 1440 minutes; total is the total trading data from 00:00 to 23:59 of the previous day, totaling 1440 minutes.
[0073] # Task
[0074] You are given two sets of data. The first set represents the anomaly data that has been initially filtered out. The second set represents longer-period data filtered out based on the business identifier of this anomaly. You need to help me perform secondary filtering and noise reduction on the anomaly data based on the longer-period data to determine whether it is normal periodic fluctuation or a genuine anomaly, and analyze the data that you consider normal. The specific steps are as follows:
[0075] 1. Extract the cutoff time, error data, and total number of transactions from the abnormal data; extract the error data and total number of transactions from the historical long-period transaction data.
[0076] 2. Fit the abnormal data with historical data, compare the abnormal data with the data of the corresponding time period in the historical data, and determine whether the same trend of change occurred at this time point in the past or if there is an inconsistency. Use this as a benchmark to make a second judgment on the abnormality.
[0077] 3. The return format is: return "N" for abnormal data and "Y" for normal data. Do not return any unnecessary information, including the basis for the analysis.
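For reference, the '~'-delimited anomaly record used in the example prompt can be parsed as sketched below. The dataclass and function names are hypothetical. Because the JSON report itself contains '~' characters, the split is limited to the first seven separators.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AnomalyRecord:
    institution: str
    channel: str
    error_code: str
    cutoff_time: str
    errors: List[int]
    totals: List[int]
    passed: bool  # True for "pass:Y" (no anomaly), False for "pass:N"
    report: Dict


def _int_list(field: str, prefix: str) -> List[int]:
    if not field.startswith(prefix):
        raise ValueError(f"expected field starting with {prefix!r}")
    # int() tolerates stray whitespace around each number.
    return [int(v) for v in field[len(prefix):].split(",")]


def parse_record(line: str) -> AnomalyRecord:
    body = line.strip().strip("'\"").rstrip(";")
    parts = body.split("~", 7)  # the eighth field keeps any '~' inside the JSON
    if len(parts) != 8:
        raise ValueError("expected 8 '~'-separated fields")
    json_prefix = "anomaly_report_json:"
    if not parts[7].startswith(json_prefix):
        raise ValueError("missing anomaly report field")
    return AnomalyRecord(
        institution=parts[0],
        channel=parts[1],
        error_code=parts[2],
        cutoff_time=parts[3],
        errors=_int_list(parts[4], "error:"),
        totals=_int_list(parts[5], "total:"),
        passed=parts[6] == "pass:Y",
        report=json.loads(parts[7][len(json_prefix):]),
    )
```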
[0078] Then, in step 304, upon receiving feedback from the large language model that the first abnormal fluctuation data is misjudged normal data, the first abnormal fluctuation data is filtered out as noise data.
[0079] It is understandable that if the feedback from the large language model indicates that the first abnormal fluctuation data is misjudged normal data (e.g., feedback "Y" is received), the first abnormal fluctuation data is likely to be noise (non-abnormal) data and can be filtered out as noise data. If the feedback from the large language model indicates that the first abnormal fluctuation data is abnormal fluctuation data (e.g., feedback "N" is received), the first abnormal fluctuation data is likely to be genuinely abnormal data and can be treated as abnormal data.
[0080] To review the above process: during noise reduction of abnormal fluctuation data of erroneous transactions on the business platform, historical long-cycle transaction data and a large language model are used. The large model uses the historical data as a reference to judge whether the initially screened abnormal fluctuation data is misjudged normal data. Upon receiving feedback from the large language model that the initially screened abnormal fluctuation data is misjudged normal data, the corresponding abnormal fluctuation data is filtered out as noise data. Thus, on the one hand, the large model's judgment has generalization capability and can be directly reused in various business scenarios, improving the reusability of noise reduction; on the other hand, the large model can dynamically differentiate erroneous transaction data under different business scenarios based on historical data, adapting to more types of erroneous transactions and improving the flexibility of noise reduction across application scenarios.
[0081] According to another embodiment, an apparatus for denoising abnormal fluctuation data of erroneous transactions on a business platform is also provided, which can be installed in a computer, device, or server with a certain computing power.
[0082] Figure 4 shows an apparatus 400 for denoising abnormal fluctuation data of erroneous transactions on a business platform, according to one embodiment. As shown in Figure 4, the apparatus 400 may include: an acquisition unit 401 configured to acquire first abnormal fluctuation data of a first business; a query unit 402 configured to query, based on the first business identifier corresponding to the first business, to acquire first historical long-cycle transaction data of the first business in a first predetermined time period; a judgment unit 403 configured to call a large language model based on the first abnormal fluctuation data, the first historical long-cycle transaction data, and preset anomaly determination task description information, and have the large language model determine whether the first abnormal fluctuation data is misjudged normal data; and a filtering unit 404 configured to filter out the first abnormal fluctuation data as noise data when receiving feedback from the large language model that the first abnormal fluctuation data is misjudged normal data.
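The four units can be pictured as a small pipeline. In the sketch below each unit is a plain callable supplied by the caller. The class name, field names, and the dictionary key `business_id` are hypothetical, and the judgment unit is represented by any function that returns True for misjudged normal data, so no specific model API is implied.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class DenoisingDevice:
    acquire: Callable[[], List[dict]]              # acquisition unit 401
    query: Callable[[str], Optional[List[int]]]    # query unit 402
    judge: Callable[[dict, List[int]], bool]       # judgment unit 403

    def run(self) -> List[dict]:
        """Filtering unit 404: return the anomalies that are not noise."""
        kept = []
        for anomaly in self.acquire():
            history = self.query(anomaly["business_id"])
            if history is None:
                # No history found: new business scenario, no noise reduction.
                kept.append(anomaly)
                continue
            if not self.judge(anomaly, history):
                kept.append(anomaly)  # judged to be a genuine anomaly
        return kept
```

Passing the units in as callables keeps the sketch independent of how the history is stored and of which large language model is deployed.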
[0083] It is worth noting that the apparatus 400 shown in Figure 4 corresponds to the method embodiment shown in Figure 3. Therefore, the corresponding descriptions in the method embodiment of Figure 3 also apply to the apparatus 400 shown in Figure 4 and will not be repeated here.
[0084] According to another embodiment, a computer-readable storage medium is also provided, on which a computer program is stored, which, when executed in a computer, causes the computer to perform the method described in conjunction with Figure 3.
[0085] According to another embodiment, a computing device is also provided, including a memory and a processor, wherein the memory stores executable code, and when the processor executes the executable code, it implements the method described in conjunction with Figure 3.
[0086] Those skilled in the art will recognize that the functions described in the embodiments of this specification in one or more of the above examples can be implemented using hardware, software, firmware, or any combination thereof. When implemented in software, these functions can be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium.
[0087] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of the technical concept in this specification. It should be understood that the above description is only a specific embodiment of the technical concept in this specification and is not intended to limit the scope of protection of the technical concept in this specification. Any modifications, equivalent substitutions, improvements, etc., made on the basis of the technical solutions of the embodiments in this specification should be included within the scope of protection of the technical concept in this specification.
Claims
1. A method for denoising erroneous transaction abnormal fluctuation data of a business platform, comprising: obtaining first abnormal fluctuation data of a first business; querying, based on a first business identifier corresponding to the first business, to obtain first historical long-cycle transaction data of the first business in a first predetermined time period; invoking a large language model based on the first abnormal fluctuation data, the first historical long-cycle transaction data, and preset anomaly determination task description information, so that the large language model determines whether the first abnormal fluctuation data is misjudged normal data; and upon receiving feedback from the large language model that the first abnormal fluctuation data is misjudged normal data, filtering out the first abnormal fluctuation data as noise data.
2. The method of claim 1, wherein the first business identifier includes at least one of the following: traffic channel, business organization, and error type.
3. The method of claim 1, wherein: the first abnormal fluctuation data is obtained after initial screening of erroneous transaction data and includes a first sequence consisting of the number of erroneous transactions of the first business in each of m time units, and a second sequence consisting of the total number of transactions of the first business in each of the m time units; the first predetermined time period includes at least one time cycle consisting of n time units, where n is greater than m; and the first historical long-cycle transaction data includes a third sequence consisting of the number of erroneous transactions in each time unit of the first predetermined time period, and a fourth sequence consisting of the total number of transactions in each time unit of the first predetermined time period.
4. The method of claim 1, wherein the anomaly determination task description information includes anomaly determination conditions for erroneous transactions, and the anomaly determination conditions include at least one of the following: the number of erroneous transactions shows periodic peaks, and the difference between a peak and the average number of erroneous transactions in a certain number of time units before and after the peak is greater than a first predetermined value; the number of erroneous transactions increases continuously within X1 time units or exceeds a second predetermined value within X2 time units; the cumulative number of erroneous transactions exceeds a third predetermined value within S consecutive time units.
5. The method of claim 1, wherein the anomaly determination task description information includes steps instructing the large language model to perform noise discrimination, specifically including: obtaining the cutoff time, the number of erroneous transactions, and the total number of transactions from the first abnormal fluctuation data, as well as the number of erroneous transactions and the total number of transactions for each time unit from the first historical long-cycle transaction data; detecting the consistency between the first historical long-cycle transaction data and the first abnormal fluctuation data; and if the first abnormal fluctuation data is consistent with the first historical long-cycle transaction data, determining that the first abnormal fluctuation data is misjudged normal data.
6. The method of claim 1, wherein detecting the consistency between the first historical long-cycle transaction data and the first abnormal fluctuation data includes: extracting, from the first historical long-cycle transaction data, the number of erroneous transactions and the total number of transactions within each time unit that coincides with the time period of the first abnormal fluctuation data, as first historical reference data; comparing the first historical reference data and the first abnormal fluctuation data in at least one of the following aspects: number of erroneous transactions, total number of transactions, and error rate, to determine whether the trends of change are consistent, wherein the error rate is the ratio of the number of erroneous transactions to the corresponding total number of transactions; and if the trends of change in all compared aspects are consistent, determining that the first abnormal fluctuation data is normal data.
7. The method of claim 1, wherein, if no corresponding historical long-cycle transaction data can be found, the first abnormal fluctuation data is determined to be erroneous traffic data in a new business scenario, and no noise reduction is performed.
8. An apparatus for denoising erroneous transaction abnormal fluctuation data of a business platform, comprising: an acquisition unit configured to acquire first abnormal fluctuation data of a first business; a query unit configured to query, based on a first business identifier corresponding to the first business, to obtain first historical long-cycle transaction data of the first business in a first predetermined time period; a judgment unit configured to call a large language model based on the first abnormal fluctuation data, the first historical long-cycle transaction data, and preset anomaly determination task description information, so that the large language model determines whether the first abnormal fluctuation data is misjudged normal data; and a filtering unit configured to filter out the first abnormal fluctuation data as noise data upon receiving feedback from the large language model that the first abnormal fluctuation data is misjudged normal data.
9. A computer-readable storage medium having a computer program stored thereon, which, when executed in a computer, causes the computer to perform the method as described in any one of claims 1-7.
10. A computing device, comprising a memory and a processor, characterized in that, The memory stores executable code, and when the processor executes the executable code, it implements the method as described in any one of claims 1-7.