Rate limiter supporting prioritized batch traffic request
By designing a rate limiter that supports priority batch traffic requests, and utilizing a splitting module and a token management module, the problem of low pass rate of existing rate limiters in batch traffic requests is solved, achieving more efficient traffic management and resource utilization.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- ANXINWANGDUN TECHNOLOGY CO LTD
- Filing Date
- 2025-10-24
- Publication Date
- 2026-06-25
Abstract
Description
A rate limiter that supports priority-based batch traffic requests
[0001] Cross-reference to related applications
[0002] This application claims priority to Chinese patent application No. 202411851913.4, filed on December 6, 2024, entitled "A Rate Limiter Supporting Priority Batch Traffic Requests", which is incorporated herein by reference in its entirety.
Technical Field
[0003] This application relates to the field of server-side software resource allocation technology, specifically to a rate limiter that supports priority batch traffic requests.
Background Technology
[0004] The server-side software provides services to the outside world through an API. Since the amount of requests from upstream callers cannot be controlled, a sudden surge in the number of requests can cause the server to consume too many resources, resulting in reduced response speed, timeouts, or even crashes, and may even trigger a cascading failure that renders the entire system unusable.
[0005] A rate limiter controls the flow of API requests, quickly rejecting, failing, or dropping requests that exceed the limit to prevent the system from crashing due to overload and to ensure the stability of the service and downstream resource systems.
[0006] Rate limiters can be implemented based on different algorithms and strategies. Industry-standard algorithms include fixed window, sliding window, token bucket, and leaky bucket algorithms. However, in some application scenarios, they still have shortcomings in terms of rate limiting effectiveness, expected functionality, and business development complexity. For example, to improve the efficiency of inter-service interaction, the server provides batch read / write interfaces. Replacing multiple requests with a single batch call saves network call overhead, and the server and database typically perform better with batch processing. However, existing algorithms have low batch traffic request success rates. When traffic quota resources are scarce, batch traffic requests are frequently limited, resulting in low success rates. In extreme cases, batch traffic requests may be continuously rejected, making it difficult to process business logic using rate limiters.
[0007] Therefore, a rate limiter that can improve the throughput of batch traffic requests is needed.
Summary of the Invention
[0008] (I) Purpose of the Invention
[0009] The purpose of this application is to provide a rate limiter that supports priority batch traffic requests and can improve the pass rate of batch traffic requests.
[0010] (II) Technical Solution
[0011] To address the aforementioned issues, this application provides a rate limiter that supports priority-based batch traffic requests, comprising:
[0012] a rate limiter parameter configuration module, a traffic splitting module, a token management module, and a rate limiting judgment and execution module.
[0013] The rate limiter parameter configuration module is used to set the parameters of the rate limiter;
[0014] The traffic splitting module is used to split batch traffic requests into multiple batch fragments;
[0015] The token management module is used to record and manage the accumulated current token quota in the token bucket;
[0016] The rate limiting judgment and execution module is used to perform rate limiting judgment on batch traffic requests based on the rate limiter parameters and the current token quota, and to perform actions to allow or split the batch traffic requests based on the judgment result. The action of splitting the batch traffic requests is implemented by the traffic splitting module.
[0017] In another aspect of this application, preferably,
[0018] The parameters of the rate limiter include: maximum bucket traffic quota, rate limiting period length, number of intervals per period, maximum allowed burst traffic, maximum compensation amount per interval, average quota per interval, and interval length.
[0019] The average quota per interval is obtained by dividing the maximum bucket traffic quota by the number of intervals per period.
[0020] The interval length is obtained by dividing the rate limiting period length by the number of intervals per period.
[0021] In another aspect of this application, preferably,
[0022] Rate limiting for batch traffic requests includes: rate limiting based on the first rule;
[0023] The first rule includes:
[0024] When there is a single batch traffic request in a time slot:
[0025] If the number of the batch traffic requests is less than or equal to the average quota per interval of the rate limiter, the rate limiter allows the request directly.
[0026] If the number of the batch traffic requests is greater than the average quota per interval of the rate limiter, the rate limiter splits the batch traffic request into multiple batch fragments according to the first formula through the traffic splitting module, and the rate limiter allows the fragments to pass in sequence.
[0027] The first formula is expressed as:
$$S = \left\lceil \frac{\text{qps\_batch\_req}}{\text{qps\_per\_interval}} \right\rceil$$
[0028] Where S represents the number of batch fragments given by the first formula, qps_batch_req represents the number of batch traffic requests, and qps_per_interval represents the average quota per interval.
[0029] In another aspect of this application, preferably,
[0030] The first rule also includes:
[0031] When there is a single batch traffic request in a time slot and burst traffic is allowed:
[0032] If the number of the batch traffic requests is greater than the average quota per interval, but less than or equal to the maximum allowed burst traffic, the rate limiter allows the request to pass directly.
[0033] If the number of the batch traffic requests exceeds the maximum allowed burst traffic, the rate limiter splits the batch traffic request into multiple batch fragments according to the second formula through the traffic splitting module, and the rate limiter allows the fragments to pass in sequence.
[0034] The second formula is expressed as:
$$U = \left\lceil \frac{\text{qps\_batch\_req}}{\text{max\_bust}} \right\rceil$$
[0035] Where U represents the number of batch fragments given by the second formula, qps_batch_req represents the number of batch traffic requests, and max_bust represents the maximum allowed burst traffic.
[0036] In another aspect of this application, preferably,
[0037] When the number of the batch traffic requests exceeds the maximum allowed burst traffic, the current token quota in the token bucket is used first;
[0038] If the current token quota is less than the maximum allowed burst traffic, the token quota of subsequent time slots is used for compensation, and the compensation taken from each time slot is at most the maximum compensation amount per interval in the rate limiter parameters.
[0039] In another aspect of this application, preferably,
[0040] Rate limiting for batch traffic requests also includes: rate limiting based on the second rule;
[0041] The second rule includes:
[0042] When the number of batch traffic requests is greater than 1, the batch traffic requests include a first batch traffic request and a second batch traffic request;
[0043] If the second batch traffic request has obtained part of the token quota of the interval, the first batch traffic request is split according to the remaining token quota of the interval to obtain a first batch fragment and a second batch fragment.
[0044] The first batch fragment is allowed through using the remaining token quota of the interval;
[0045] The second batch fragment is processed according to the first rule.
[0046] In another aspect of this application, preferably,
[0047] The second rule also includes:
[0048] When there is more than one batch traffic request in a time slot and burst traffic is allowed:
[0049] When the second batch traffic request has obtained token quota of the interval;
[0050] If the maximum allowed burst traffic minus the number of the second batch traffic requests is greater than or equal to the number of the first batch traffic requests, the rate limiter allows the first batch traffic request to pass directly.
[0051] If the maximum allowed burst traffic minus the number of the second batch traffic requests is less than the number of the first batch traffic requests, the first batch traffic request is processed according to the first rule.
[0052] In another aspect of this application, preferably,
[0053] It also includes a priority module;
[0054] The priority module sorts the batch traffic requests according to a preset first priority rule or a preset second priority rule.
[0055] In another aspect of this application, preferably,
[0056] The preset first priority rule includes:
[0057] When high-priority batch traffic needs to be split, all token quotas in each time slot are preferentially allocated to the batch fragments resulting from the splitting of the high-priority batch traffic, until all quotas are met.
[0058] In another aspect of this application, preferably,
[0059] The preset second priority rule includes:
[0060] When high-priority batch traffic needs to be split, a portion of the traffic quota for each time slot is preferentially allocated to the batch fragments resulting from the splitting of the high-priority batch traffic. The remaining quota is then fairly contested by the unsatisfied batch fragments resulting from the splitting of the high-priority batch traffic and other batch traffic.
[0061] (III) Beneficial Effects
[0062] The above-mentioned technical solution of this application has the following beneficial technical effects:
[0063] This application uses a traffic splitting module to divide batch traffic requests into multiple batch fragments. This avoids frequent throttling of batch traffic requests when the available quota falls short of the requested quota, and avoids permanent throttling of batch traffic requests when the requested quota exceeds the maximum limit of the rate limiter. Splitting batch traffic into appropriately sized sub-batches also makes fuller use of the rate limiter's quota resources, avoiding waste when the quota cannot cover the whole batch, and thus improves the pass rate of batch traffic requests.
Attached Figure Description
[0064] Figure 1 is a schematic diagram of the overall structure of an embodiment of this application;
[0065] Figure 2 is a schematic diagram of a single batch traffic request entering the rate limiter according to an embodiment of this application;
[0066] Figure 3 is a schematic diagram of a single batch traffic request being split and passed through the rate limiter according to an embodiment of this application;
[0067] Figure 4 is a schematic diagram of a single batch traffic request passing through the rate limiter with burst traffic allowed, according to an embodiment of this application;
[0068] Figure 5 is a schematic diagram of the rate limiter when there is more than one batch traffic request, according to an embodiment of this application;
[0069] Figure 6 is a schematic diagram of burst traffic passing through the rate limiter when there is more than one batch traffic request, according to an embodiment of this application;
[0070] Figure 7 is a schematic diagram of traffic passing through the rate limiter under the preset second priority rule, according to an embodiment of this application;
[0071] Figure 8 is a schematic diagram of burst handling degrading to normal splitting when there is more than one batch traffic request, according to an embodiment of this application.
Detailed Implementation
[0072] To make the objectives, technical solutions, and advantages of this application clearer, the application will be further described in detail below with reference to specific embodiments and accompanying drawings. It should be understood that these descriptions are merely exemplary and not intended to limit the scope of this application. Furthermore, descriptions of well-known structures and technologies are omitted in the following description to avoid unnecessarily obscuring the concepts of this application.
[0073] Obviously, the described embodiments are only a part of the embodiments of this application, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of this application without inventive effort are within the scope of protection of this application.
[0074] Furthermore, the technical features involved in the different embodiments of this application described below can be combined with each other as long as they do not conflict with each other.
[0075] The present application will now be described in more detail with reference to the accompanying drawings. In the various drawings, the same elements are indicated by similar reference numerals. For clarity, the various parts in the drawings are not drawn to scale.
[0076] Example 1
[0077] A rate limiter supporting priority batch traffic requests is provided. Figure 1 shows a schematic diagram of the overall structure of one embodiment of this application. As shown in Figure 1, the rate limiter includes:
[0078] a rate limiter parameter configuration module, a traffic splitting module, a token management module, and a rate limiting judgment and execution module.
[0079] The rate limiter parameter configuration module is used to set the parameters of the rate limiter. In this embodiment, the parameters of the rate limiter include: maximum bucket traffic quota, rate limiting period length, number of intervals per period, maximum allowed burst traffic, maximum compensation amount per interval, average quota per interval, and interval length. The average quota per interval is obtained by dividing the maximum bucket traffic quota by the number of intervals per period. The interval length is obtained by dividing the rate limiting period length by the number of intervals per period.
[0080] The parameters of the rate limiter are represented by the following identifiers:
[0081] bucket_capacity: Maximum bucket capacity;
[0082] bucket_unit_time: The length of the rate limiting period;
[0083] interval_unit_count: Number of intervals per cycle;
[0084] max_bust: The maximum burst traffic allowed, provided that qps_per_interval <= max_bust;
[0085] interval_compensate_max_count: Maximum number of compensations per interval;
[0086] The condition is satisfied that interval_compensate_max_count <= qps_per_interval;
[0087] qps_per_interval: Average quota per interval;
[0088] qps_per_interval=bucket_capacity / interval_unit_count;
[0089] interval_unit_time: The length of each interval;
[0090] interval_unit_time=bucket_unit_time / interval_unit_count;
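As a minimal sketch of how the configured parameters and the two derived values relate, the following Python reuses the identifiers listed above. The `LimiterConfig` class name, the sample values, the choice of seconds as the time unit, and the use of integer division for `qps_per_interval` are illustrative assumptions; the application does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LimiterConfig:
    """Hypothetical container for the rate limiter parameters."""
    bucket_capacity: int                # maximum bucket traffic quota per period
    bucket_unit_time: float             # rate limiting period length (seconds assumed)
    interval_unit_count: int            # number of intervals per period
    max_bust: int                       # maximum allowed burst traffic (identifier as in the text)
    interval_compensate_max_count: int  # maximum compensation per interval

    def __post_init__(self) -> None:
        # Constraints stated in the parameter list.
        if self.qps_per_interval > self.max_bust:
            raise ValueError("qps_per_interval must be <= max_bust")
        if self.interval_compensate_max_count > self.qps_per_interval:
            raise ValueError("interval_compensate_max_count must be <= qps_per_interval")

    @property
    def qps_per_interval(self) -> int:
        # Assumption: integer division; the text does not say how remainders are handled.
        return self.bucket_capacity // self.interval_unit_count

    @property
    def interval_unit_time(self) -> float:
        return self.bucket_unit_time / self.interval_unit_count


# Sample values chosen to match the figures (quota 5 per slot, burst 10, compensation 1).
cfg = LimiterConfig(bucket_capacity=50, bucket_unit_time=1.0, interval_unit_count=10,
                    max_bust=10, interval_compensate_max_count=1)
print(cfg.qps_per_interval, cfg.interval_unit_time)  # 5 0.1
```

Deriving the two values as properties rather than storing them keeps them consistent with the configured inputs.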
[0091] The following identifiers are used to describe batch traffic requests:
[0092] qps_batch_req: Number of batch traffic requests;
[0093] qps_batch_req_0...b: The batch traffic fragments to be split, numbered by integers from 0 to b;
[0094] split_fun: A splitting function that can split qps_batch_req_x traffic based on the desired traffic volume;
[0095] merge_fun: A merge function that can merge b+1 return values of qps_batch_req_0...b into a single return value;
[0096] token_num: To cope with sudden surges in traffic, unused tokens are accumulated using a token bucket. token_num represents the current token quota.
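One possible shape for `split_fun` and `merge_fun` is sketched below, assuming a batch request is a sequence of items and each fragment's return value is a list. Both assumptions, and the exact signatures, are illustrative; the application leaves the functions to the business logic.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def split_fun(items: Sequence[T], size: int) -> Tuple[List[T], List[T]]:
    """Split off the first `size` items; return (fragment, remainder)."""
    if size < 0:
        raise ValueError("size must be non-negative")
    return list(items[:size]), list(items[size:])


def merge_fun(results: Sequence[List[R]]) -> List[R]:
    """Merge the b+1 per-fragment return values into a single return value."""
    merged: List[R] = []
    for part in results:
        merged.extend(part)
    return merged


fragment, remainder = split_fun(list(range(12)), 5)
print(len(fragment), len(remainder))  # 5 7
```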
[0097] The traffic splitting module is used to split batch traffic requests into multiple batch fragments;
[0098] The token management module is used to record and manage the accumulated current token quota in the token bucket;
[0099] The rate limiting judgment and execution module is used to perform rate limiting judgment on batch traffic requests based on the rate limiter parameters and the current token quota, and to perform actions to allow or split the batch traffic requests based on the judgment result. The action of splitting the batch traffic requests is implemented by the traffic splitting module.
[0100] Furthermore, in this embodiment, the rate limiting judgment for batch traffic requests includes: rate limiting judgment based on the first rule; the batch traffic request can be a single request or multiple requests;
[0101] The first rule includes:
[0102] When there is a single batch traffic request in a given time slot:
[0103] If the number of batch traffic requests is less than or equal to the average quota per gap of the rate limiter, the rate limiter will allow the requests directly. Figure 2 shows a schematic diagram of a batch traffic request entering the rate limiter according to an embodiment of this application. As shown in Figure 2, the horizontal axis is the time axis, representing each time gap; the vertical axis is the number of traffic requests, and the dashed line represents the average quota per gap. As shown in Figure 2, the average quota per gap is 5. If the number of batch traffic requests is less than or equal to 5, the rate limiter will allow the requests directly.
[0104] If the number of batch traffic requests is greater than the average quota per gap of the rate limiter, the rate limiter splits the batch traffic requests into multiple batch fragments according to the first formula through the traffic splitting module, and the rate limiter allows them to pass in sequence.
[0105] The first formula is expressed as:
$$S = \left\lceil \frac{\text{qps\_batch\_req}}{\text{qps\_per\_interval}} \right\rceil$$
[0106] Where S represents the number of batch fragments given by the first formula, qps_batch_req represents the number of batch traffic requests, and qps_per_interval represents the average quota per interval.
[0107] Figure 3 shows a schematic diagram of a batch traffic request being split and passed through the rate limiter according to an embodiment of this application. As shown in Figure 3, the number of batch traffic requests is 12. Time slot 1 provides 5 quotas, so the split function `split_fun` splits off `qps_batch_req_0` with 5 requests, leaving 12 - 5 = 7 requests. Time slot 2 provides 5 quotas, so `split_fun` splits off `qps_batch_req_1` with 5 requests, leaving 7 - 5 = 2 requests. Time slot 3 provides 5 quotas, and the remaining 2 requests directly constitute `qps_batch_req_2`. The entire batch traffic request passes through the rate limiter in sequence.
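The Figure 3 walkthrough can be sketched as follows, taking the fragment count as the ceiling of qps_batch_req / qps_per_interval, which matches the 12-into-3 example. The function name `split_by_interval_quota` is hypothetical, and the sketch only computes fragment sizes; it does not wait for time slots.

```python
from typing import List


def split_by_interval_quota(qps_batch_req: int, qps_per_interval: int) -> List[int]:
    """Return fragment sizes for a single batch request under the first rule."""
    if qps_batch_req <= 0 or qps_per_interval <= 0:
        raise ValueError("both arguments must be positive")
    if qps_batch_req <= qps_per_interval:
        return [qps_batch_req]  # fits in one interval: allowed directly
    # Ceiling division without floating point (assumed form of the first formula).
    shard_count = -(-qps_batch_req // qps_per_interval)
    sizes = [qps_per_interval] * (shard_count - 1)
    sizes.append(qps_batch_req - qps_per_interval * (shard_count - 1))
    return sizes


print(split_by_interval_quota(12, 5))  # [5, 5, 2]
```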
[0108] Furthermore, in this embodiment, the first rule also includes:
[0109] When there is a single batch traffic request in a time slot and burst traffic is allowed; the condition for allowing burst traffic is that there is no token quota still waiting to be compensated.
[0110] If the quantity of the batch traffic requests is greater than the average quota per interval and less than or equal to the maximum allowed burst traffic, the rate limiter directly allows it to pass.
[0111] If the quantity of the batch traffic requests is greater than the maximum allowed burst traffic, the rate limiter splits the batch traffic requests into multiple batch shards according to the second formula through the traffic splitting module, and the rate limiter allows them to pass in sequence.
[0112] The second formula is expressed as:
$$U = \left\lceil \frac{\text{qps\_batch\_req}}{\text{max\_bust}} \right\rceil$$
[0113] Where U represents the number of batch fragments given by the second formula, qps_batch_req represents the number of batch traffic requests, and max_bust represents the maximum allowed burst traffic.
[0114] Furthermore, in this embodiment,
[0115] When the quantity of the batch traffic requests is greater than the maximum allowed burst traffic, the current token quota in the token bucket is used.
[0116] If the number of tokens in the bucket token_num >= max_bust, then qps_batch_req_0 with the size of max_bust batch traffic is split out, and the rate limiter allows it to pass. Because the cumulative quota is used, no compensation is required in subsequent time slots.
[0117] If the current token quota is less than the maximum allowed burst traffic, the token quota of subsequent time slots is used for compensation, and the compensation taken from each time slot is at most the maximum compensation amount per interval in the rate limiter parameters. That is, when token_num < max_bust, the additional quota used in this time slot, max_bust - token_num - qps_per_interval, is compensated evenly out of the quotas of subsequent time slots, and each time slot compensates at most interval_compensate_max_count quotas. The quota consumed by a burst is taken from the current token quota first; if that is not enough, the excess is taken from the quota pre-allocated to subsequent time slots. Until the additionally used quota has been fully compensated, no new burst traffic exceeding the normal time slot quota is allowed to pass. Figure 4 shows a schematic diagram of a single batch traffic request passing through the rate limiter with burst traffic allowed, according to an embodiment of this application. As shown in Figure 4, a burst of 10 is allowed in time slot 1. The batch traffic request of 10 passes directly without splitting, but it uses 10 - 0 - 5 = 5 additional quotas. Each subsequent time slot can compensate at most 1 quota, so the 5 quotas are compensated over the next 5 consecutive time slots, and the quota of each of these 5 time slots drops from 5 to 4. After compensation is complete, the time slot quota returns from 4 to 5.
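The compensation arithmetic of Figure 4 can be sketched as below. The function name `compensation_schedule` is hypothetical, and the sketch assumes that each following time slot repays the maximum allowed amount until the debt is cleared, which is one reading of "compensated evenly" under the per-slot cap.

```python
from typing import List


def compensation_schedule(used_in_slot: int, token_num: int, qps_per_interval: int,
                          interval_compensate_max_count: int) -> List[int]:
    """Return the reduced quotas of the following time slots until the debt is repaid."""
    if interval_compensate_max_count <= 0:
        raise ValueError("interval_compensate_max_count must be positive")
    # Extra quota borrowed from the future: burst - accumulated tokens - this slot's quota.
    debt = max(0, used_in_slot - token_num - qps_per_interval)
    reduced_quotas: List[int] = []
    while debt > 0:
        repay = min(interval_compensate_max_count, debt)
        reduced_quotas.append(qps_per_interval - repay)
        debt -= repay
    return reduced_quotas


# Figure 4 values: burst of 10, no accumulated tokens, quota 5, compensation 1 per slot.
print(compensation_schedule(10, 0, 5, 1))  # [4, 4, 4, 4, 4]
```

An empty list means the burst was fully covered by accumulated tokens and the current slot, so later slots keep their normal quota.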
[0118] Furthermore, in this embodiment, the rate limiting judgment for batch traffic requests also includes: performing rate limiting judgment according to the second rule;
[0119] The second rule includes:
[0120] When the number of batch traffic requests is greater than 1, the batch traffic requests include a first batch traffic request and a second batch traffic request; the batch traffic requests are not a single request in a certain time slot.
[0121] If the second batch traffic request has obtained part of the token quota of the interval, the first batch traffic request is split according to the remaining token quota of the interval to obtain a first batch fragment and a second batch fragment.
[0122] The first batch fragment is allowed through using the remaining token quota of the interval;
[0123] The second batch fragment is processed according to the first rule.
[0124] Figure 5 shows a schematic diagram of the rate limiter when there is more than one batch traffic request, according to an embodiment of this application. As shown in Figure 5, the second batch traffic request occupies 2 quotas in time slot 1. Therefore, the first batch traffic request of 12 is split into qps_batch_req_0 of 3, qps_batch_req_1 of 5, and qps_batch_req_2 of 4, which use time slots 1, 2, and 3 respectively. The entire batch passes through the rate limiter.
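A minimal sketch of the second rule, assuming the first fragment takes whatever quota is left in the current time slot and the rest is cut into interval-sized fragments as in the first rule. The function name `split_with_remaining` is hypothetical, and later slots are assumed to be uncontended.

```python
from typing import List


def split_with_remaining(qps_batch_req: int, remaining_in_slot: int,
                         qps_per_interval: int) -> List[int]:
    """Return fragment sizes when part of the current slot is already taken."""
    if qps_per_interval <= 0:
        raise ValueError("qps_per_interval must be positive")
    first = min(qps_batch_req, max(0, remaining_in_slot))
    sizes = [first] if first > 0 else []
    rest = qps_batch_req - first
    while rest > 0:  # remainder handled as in the first rule
        size = min(rest, qps_per_interval)
        sizes.append(size)
        rest -= size
    return sizes


# Figure 5 values: quota 5 per slot, 2 already used in slot 1, batch of 12.
print(split_with_remaining(12, 5 - 2, 5))  # [3, 5, 4]
```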
[0125] Furthermore, in this embodiment,
[0126] The second rule also includes:
[0127] When there is more than one batch traffic request in a time slot and burst traffic is allowed:
[0128] When the second batch traffic request has obtained token quota of the interval;
[0129] If the value of the maximum allowed burst traffic minus the number of the second batch of traffic requests is greater than or equal to the number of the first batch of traffic requests, the rate limiter will allow the flow directly.
[0130] If the value of the maximum allowed burst traffic minus the number of second batch traffic requests is less than the number of first batch traffic requests, it shall be processed according to the first rule.
[0131] Figure 6 illustrates a schematic diagram of a burst of instantaneous traffic passing through a rate limiter in an embodiment of this application, where the batch traffic request is greater than 1. As shown in Figure 6, after the second batch traffic in time slot 1 occupies 2 quotas, the burst traffic quota is max_bust-2=10-2=8 (quotas). When the first batch traffic is 8, the rate limiter uses an additional 8+2-5=5 (quotas). Each subsequent time slot can only compensate for a maximum of 1 quota, so a total of 5 quotas need to be compensated for the next 5 consecutive time slots. At the same time, the quota number of these 5 consecutive time slots changes from 5 to 4. After compensation, the time slot quota number is restored from 4 to 5.
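The Figure 6 arithmetic can be checked with a short sketch. The helper names are hypothetical, and `token_num` is assumed to be 0 because the figure's calculation contains no accumulated-token term.

```python
def burst_budget(max_bust: int, second_req: int) -> int:
    """Burst quota left for the first batch after the second batch took its share."""
    return max_bust - second_req


def extra_quota_used(first_req: int, second_req: int, qps_per_interval: int,
                     token_num: int = 0) -> int:
    """Quota consumed in the slot beyond its normal quota and accumulated tokens."""
    return max(0, first_req + second_req - token_num - qps_per_interval)


def compensation_slots(extra: int, interval_compensate_max_count: int) -> int:
    """Number of following slots needed to repay `extra` (ceiling division)."""
    return -(-extra // interval_compensate_max_count)


budget = burst_budget(max_bust=10, second_req=2)
extra = extra_quota_used(first_req=8, second_req=2, qps_per_interval=5)
print(budget, 8 <= budget, extra, compensation_slots(extra, 1))  # 8 True 5 5
```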
[0132] Furthermore, this embodiment also includes a priority module;
[0133] The priority module sorts the batch traffic requests according to a preset first priority rule or a preset second priority rule.
[0134] The preset first priority rule includes:
[0135] When high-priority batch traffic needs to be split, all token quotas in each time slot are preferentially allocated to the batch fragments resulting from the splitting of the high-priority batch traffic, until all quotas are met.
[0136] High-priority traffic is given absolute priority. When high-priority batch traffic does not require splitting, it normally preempts the traffic limit quota. When splitting is required, if the first high-priority batch traffic acquires a traffic quota, then all traffic quotas in each subsequent time slot are prioritized for that high-priority split traffic. After satisfying the high-priority split traffic, the remaining traffic quota is then allocated to other traffic. During this process, if higher-priority batch traffic arrives, the traffic quota is prioritized for that higher-priority batch traffic. The drawback of this strategy is that after high-priority traffic preempts the traffic quota, it will lengthen the overall call return time for low-priority batch split traffic; in extreme scenarios, if high-priority traffic continues to arrive, low-priority traffic will remain in a state of starvation. In practice, batch traffic that fails to acquire a traffic quota within a certain time can return an error message to the caller to avoid timeouts caused by the call not returning.
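The absolute-priority strategy can be sketched as a per-slot allocator that serves pending batches strictly in priority order. The function name `allocate_absolute`, the `(priority, size)` tuple layout, and the sample batches are illustrative assumptions; arrival of new batches and timeouts are not modeled.

```python
from typing import Dict, List, Tuple


def allocate_absolute(pending: Dict[str, Tuple[int, int]], qps_per_interval: int,
                      slots: int) -> List[Dict[str, int]]:
    """pending maps name -> (priority, size); a higher priority number wins.

    Returns, per time slot, the quota granted to each batch.
    """
    remaining = {name: size for name, (_, size) in pending.items()}
    order = sorted(pending, key=lambda name: -pending[name][0])
    plan: List[Dict[str, int]] = []
    for _ in range(slots):
        quota = qps_per_interval
        grant: Dict[str, int] = {}
        for name in order:
            if quota == 0:
                break
            take = min(quota, remaining[name])
            if take > 0:
                grant[name] = take
                remaining[name] -= take
                quota -= take
        plan.append(grant)
    return plan


plan = allocate_absolute({"low": (1, 6), "high": (2, 12)}, qps_per_interval=5, slots=4)
print(plan)  # [{'high': 5}, {'high': 5}, {'high': 2, 'low': 3}, {'low': 3}]
```

The low-priority batch receives nothing until slot 3, which illustrates the longer return time, and in the extreme case the starvation, described above.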
[0137] The preset second priority rule includes:
[0138] When high-priority batch traffic needs to be split, a portion of the traffic quota for each time slot is preferentially allocated to the batch fragments resulting from the splitting of the high-priority batch traffic. The remaining quota is then fairly contested by the unsatisfied batch fragments resulting from the splitting of the high-priority batch traffic and other batch traffic.
[0139] High-priority traffic is given relative priority. When high-priority batch traffic does not need to be split, it normally preempts the traffic limit quota. When splitting is required, if the first high-priority batch traffic acquires a traffic quota, then the traffic quota for each subsequent time slot is preferentially allocated to that high-priority split traffic. In each time slot, the remaining quota is then fairly competed for by unmet high-priority batch traffic and other batch traffic. The drawback of this strategy is that competition from other batch traffic can lengthen the overall call return time for high-priority batch traffic. The advantage is that it avoids low-priority batch traffic from being constantly starved.
[0140] When other traffic is competing for quotas, if there is no priority, all traffic competes for quotas fairly; if there is a priority, high-priority traffic has priority in time slot 1, and quota competition in subsequent time slots is determined according to the priority scheduling strategy. Figure 7 shows a schematic diagram of traffic passing through the rate limiter under the preset second priority rule, according to an embodiment of this application. As shown in Figure 7, when high-priority traffic is given relative priority, other traffic may compete for traffic quotas. If other traffic occupies 2 quotas in time slot 2, the batch traffic of 12 is split into qps_batch_req_0 of 5, qps_batch_req_1 of 3, and qps_batch_req_2 of 4, which use time slots 1, 2, and 3 respectively. The entire batch passes through the rate limiter.
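One way to picture the relative-priority strategy is an allocator that first gives the high-priority batch a reserved share of each time slot and then hands out the rest one unit at a time in round-robin order. The reserved share `reserved_for_high`, the round-robin interpretation of "fair" competition, the fixed name `"high"`, and the sample sizes are all assumptions of this sketch; it does not reproduce the Figure 7 values.

```python
from typing import Dict, List


def allocate_relative(high_size: int, other_sizes: Dict[str, int], qps_per_interval: int,
                      reserved_for_high: int, slots: int) -> List[Dict[str, int]]:
    """Return, per time slot, the quota granted to each batch."""
    if not 0 <= reserved_for_high <= qps_per_interval:
        raise ValueError("reserved_for_high must be within the slot quota")
    remaining = {"high": high_size, **other_sizes}
    plan: List[Dict[str, int]] = []
    for _ in range(slots):
        grant = {name: 0 for name in remaining}
        # Step 1: the reserved share goes to the high-priority batch.
        take = min(reserved_for_high, remaining["high"])
        grant["high"] += take
        remaining["high"] -= take
        quota = qps_per_interval - take
        # Step 2: the rest is contested by every unsatisfied batch, one unit per turn.
        while quota > 0:
            contenders = [name for name in remaining if remaining[name] > 0]
            if not contenders:
                break
            for name in contenders:
                if quota == 0:
                    break
                grant[name] += 1
                remaining[name] -= 1
                quota -= 1
        plan.append({name: g for name, g in grant.items() if g > 0})
    return plan


plan = allocate_relative(high_size=12, other_sizes={"low": 4},
                         qps_per_interval=5, reserved_for_high=3, slots=4)
print(plan)
```

With these sample values the low-priority batch makes progress in every slot instead of waiting for the high-priority batch to finish, at the cost of the high-priority batch needing three slots.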
[0141] Figure 8 shows a schematic diagram of burst handling degrading to normal splitting when there is more than one batch traffic request, according to an embodiment of this application. As shown in Figure 8, after other traffic occupies 2 quotas in time slot 1, the remaining burst quota is max_bust-2=10-2=8 (quotas). A batch traffic request of 10 therefore cannot pass through the rate limiter directly, and it degrades to the normal split mode. It is split into qps_batch_req_0 of 3, qps_batch_req_1 of 5, and qps_batch_req_2 of 2, which then pass through the rate limiter.
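The fallback in Figure 8 can be sketched as a single decision function: admit the batch as a burst if it fits the remaining burst quota, otherwise split it against the normal time slot quotas. The function name `admit_or_degrade` and the returned labels are hypothetical, and later slots are assumed to be uncontended.

```python
from typing import List, Tuple


def admit_or_degrade(qps_batch_req: int, used_in_slot: int, max_bust: int,
                     qps_per_interval: int) -> Tuple[str, List[int]]:
    """Return ("burst", [size]) or ("split", fragment_sizes)."""
    if qps_per_interval <= 0:
        raise ValueError("qps_per_interval must be positive")
    if max_bust - used_in_slot >= qps_batch_req:
        return "burst", [qps_batch_req]
    # Degrade: the first fragment uses what is left of the current slot's normal quota.
    sizes: List[int] = []
    remaining = qps_batch_req
    quota = max(0, qps_per_interval - used_in_slot)
    while remaining > 0:
        take = min(remaining, quota)
        if take > 0:
            sizes.append(take)
            remaining -= take
        quota = qps_per_interval  # following slots offer the full quota
    return "split", sizes


print(admit_or_degrade(10, 2, 10, 5))  # ('split', [3, 5, 2])
print(admit_or_degrade(8, 2, 10, 5))   # ('burst', [8])
```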
[0142] This embodiment uses a traffic splitting module to divide batch traffic requests into multiple batch fragments, avoiding frequent throttling of batch traffic requests when the quota does not meet the expected quota; avoiding permanent throttling of batch traffic requests when the expected quota exceeds the maximum limit of the rate limiter; splitting batch traffic into appropriately sized sub-batch traffic can make fuller use of the rate limiter's quota resources, avoiding waste when quota resources cannot meet the overall batch traffic, and improving the pass rate of batch traffic requests.
[0143] Batch traffic requests are passed through the rate limiter in a controlled and orderly manner. In existing technologies, if the traffic exceeds the time slot quota, it will be rejected; if the batch traffic request exceeds the period quota, the batch traffic request will never pass. This embodiment can break down the batch traffic into batch fragments that are suitable for the current time slot's throughput size, and then pass them through the rate limiter.
[0144] At the rate limiter level, a traffic splitting module is introduced, allowing for finer-grained adjustment of batch traffic. Batch traffic is split into multiple batch traffic as needed using a splitting function, and the return values of each batch are merged by a merging function and returned to the caller in a unified manner.
[0145] Featuring a priority mechanism, high-priority traffic has a greater probability of competing for rate limiter quotas. It employs flexible and customizable priority scheduling strategies to meet the needs of real-world business scenarios for priority traffic to compete for quotas. Currently, industry approaches to priority support include constructing multiple rate limiters with different priorities; and pre-configuring traffic source priorities and predetermined quotas within a single rate limiter, comparing priorities and allocating specified quotas based on traffic source identification. This application eliminates the need for pre-configuration, utilizing the inherent priority attributes of the traffic itself and employing customizable priority scheduling strategies to allocate quotas for each time slot, thus maximizing resource utilization. It avoids situations where, even if a time slot still has quota available, traffic cannot be used due to reaching the predetermined traffic quota limit for that business, resulting in underutilization of resources. The native priority mechanism makes it easier for high-priority traffic to pass through the rate limiter, improving interface response speed and meeting the needs of business scenarios.
[0146] It supports scenarios with sudden bursts of traffic and introduces a quota compensation mechanism, combining the characteristics of the token bucket algorithm in handling sudden traffic surges with the traffic uniformity of the leaky bucket algorithm. For sudden bursts of traffic, the quota compensation mechanism prevents the quota within a period from being exhausted instantly, leaving no traffic in subsequent time slots, thus reducing the instantaneous pressure on the system. The quota compensation mechanism also reduces the traffic quota in subsequent time slots for a certain period, providing the system with time to alleviate pressure.
[0147] It should be understood that the specific embodiments described above are merely illustrative or explanatory of the principles of this application and do not constitute a limitation thereof. Therefore, any modifications, equivalent substitutions, improvements, etc., made without departing from the spirit and scope of this application should be included within the protection scope of this application. Furthermore, the appended claims are intended to cover all variations and modifications falling within the scope and boundaries of the appended claims, or equivalent forms of such scope and boundaries.
[0148] The present application has been described above with reference to embodiments thereof. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of the present application. The scope of the present application is defined by the appended claims and their equivalents. Various substitutions and modifications can be made by those skilled in the art without departing from the scope of the present application, and all such substitutions and modifications should fall within the scope of the present application.
[0149] Although the embodiments of this application have been described in detail, it should be understood that various changes, substitutions and modifications can be made to the embodiments of this application without departing from the spirit and scope of this application.
[0150] Obviously, the above embodiments are merely illustrative examples for clear explanation and are not intended to limit the implementation. Those skilled in the art will recognize that other variations or modifications can be made based on the above description. It is neither necessary nor possible to exhaustively list all possible implementations here. However, obvious variations or modifications derived therefrom are still within the scope of protection of this application.
Claims
1. A rate limiter that supports priority-based batch traffic requests, comprising: a rate limiter parameter configuration module, a traffic splitting module, a token management module, and a rate limiting judgment and execution module; wherein the rate limiter parameter configuration module is used to set the parameters of the rate limiter; the traffic splitting module is used to split batch traffic requests into multiple batch fragments; the token management module is used to record and manage the current token quota accumulated in the token bucket; and the rate limiting judgment and execution module is used to perform rate limiting judgment on batch traffic requests based on the rate limiter parameters and the current token quota, and to perform an action of allowing or splitting the batch traffic requests based on the judgment result, the action of splitting the batch traffic requests being implemented by the traffic splitting module.
2. The rate limiter according to claim 1, wherein the parameters of the rate limiter include: maximum bucket traffic quota, rate limiting cycle length, number of intervals per cycle, maximum allowed burst traffic, maximum compensation amount per interval, average quota per interval, and interval length; the average quota per interval is obtained by dividing the maximum bucket traffic quota by the number of intervals per cycle; and the interval length is obtained by dividing the rate limiting cycle length by the number of intervals per cycle.
3. The rate limiter according to claim 2, wherein rate limiting of batch traffic requests includes rate limiting based on a first rule; the first rule includes: when there is one batch traffic request, if the number of requests in the batch traffic request is less than or equal to the average quota per interval of the rate limiter, the rate limiter allows the batch traffic request directly; if the number of requests in the batch traffic request is greater than the average quota per interval of the rate limiter, the rate limiter splits the batch traffic request into multiple batch fragments according to a first formula through the traffic splitting module and allows the batch fragments in sequence. The first formula is expressed as: Where S represents the number of batch fragments in the first formula, qps_batch_req represents the number of requests in the batch traffic request, and qps_per_interval represents the average quota per interval.
4. The rate limiter according to claim 3, wherein the first rule further includes: when there is one batch traffic request and bursty instantaneous traffic is allowed, if the number of requests in the batch traffic request is greater than the average quota per interval but less than or equal to the maximum allowed burst traffic, the rate limiter allows the batch traffic request directly; if the number of requests in the batch traffic request exceeds the maximum allowed burst traffic, the rate limiter splits the batch traffic request into multiple batch fragments according to a second formula through the traffic splitting module and allows the batch fragments in sequence. The second formula is expressed as: Where U represents the number of batch fragments in the second formula, qps_batch_req represents the number of requests in the batch traffic request, and max_bust represents the maximum allowed burst traffic.
5. The rate limiter according to claim 4, wherein, when the number of requests in the batch traffic request exceeds the maximum allowed burst traffic, the current token quota in the token bucket is used; if the current token quota is less than the maximum allowed burst traffic, the token quota of subsequent intervals is used for compensation, and the compensation taken from each interval is at most the maximum compensation amount per interval in the rate limiter parameters.
6. The rate limiter according to claim 5, wherein rate limiting of batch traffic requests further includes rate limiting based on a second rule; the second rule includes: when the number of batch traffic requests is greater than 1, the batch traffic requests include a first batch traffic request and a second batch traffic request; if the second batch traffic request obtains token quota of the interval, the first batch traffic request is split according to the remaining token quota of the interval into a first batch fragment and a second batch fragment; the first batch fragment is processed using the remaining token quota of the interval; and the second batch fragment is processed according to the first rule.
7. The rate limiter according to claim 6, wherein the second rule further includes: when the number of batch traffic requests is greater than 1, bursty instantaneous traffic is allowed, and the second batch traffic request obtains the token quota of the interval: if the maximum allowed burst traffic minus the number of requests in the second batch traffic request is greater than or equal to the number of requests in the first batch traffic request, the rate limiter allows the first batch traffic request directly; if the maximum allowed burst traffic minus the number of requests in the second batch traffic request is less than the number of requests in the first batch traffic request, the first batch traffic request is processed according to the first rule.
8. The rate limiter according to claim 1, further comprising a priority module, wherein the priority module sorts the batch traffic requests according to a preset first priority rule or a preset second priority rule.
9. The rate limiter according to claim 8, wherein the preset first priority rule includes: when high-priority batch traffic needs to be split, the entire token quota of each interval is preferentially allocated to the batch fragments resulting from splitting the high-priority batch traffic, until all of those batch fragments are satisfied.
10. The rate limiter according to claim 8, wherein the preset second priority rule includes: when high-priority batch traffic needs to be split, a portion of the traffic quota of each interval is preferentially allocated to the batch fragments resulting from splitting the high-priority batch traffic, and the remaining quota is fairly contested by the still-unsatisfied batch fragments of the high-priority batch traffic and other batch traffic.
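As a non-limiting illustration, the parameter relationships of claim 2 and the allow-or-split decision of claims 3 and 4 can be sketched as follows. The first and second formulas are not reproduced in the text above, so this sketch assumes that the fragment count is the batch size divided by the applicable limit, rounded up; that assumption, the class name LimiterParams, the function name fragment_count, and the identifier max_burst are hypothetical and not taken from the application.

```python
import math
from dataclasses import dataclass


@dataclass
class LimiterParams:
    max_bucket_quota: int      # maximum bucket traffic quota
    cycle_seconds: float       # rate limiting cycle length
    intervals_per_cycle: int   # number of intervals per cycle
    max_burst: int             # maximum allowed burst traffic

    @property
    def qps_per_interval(self) -> float:
        """Average quota per interval: bucket quota / intervals per cycle."""
        return self.max_bucket_quota / self.intervals_per_cycle

    @property
    def interval_seconds(self) -> float:
        """Interval length: cycle length / intervals per cycle."""
        return self.cycle_seconds / self.intervals_per_cycle


def fragment_count(qps_batch_req: int, params: LimiterParams, allow_burst: bool) -> int:
    """Return 1 if the batch is allowed directly, else the number of fragments.

    ASSUMPTION: the count is ceil(batch size / limit); the actual formulas
    are not given in the text.
    """
    limit = params.max_burst if allow_burst else params.qps_per_interval
    if qps_batch_req <= limit:
        return 1
    return math.ceil(qps_batch_req / limit)
```

With a bucket quota of 100 over a 1-second cycle of 10 intervals, the average quota per interval is 10; under the stated assumption a batch of 25 is split into 3 fragments when bursts are not allowed, and is allowed directly when bursts up to 30 are allowed.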