A tenant profiling method and device of a cloud-native database

By collecting resource consumption data in real time from a cloud-native database and generating tenant profiles, we can achieve fine-grained isolation and dynamic management of resources, solve the problem of resource contention in multi-tenant environments, improve the responsiveness of core businesses and the fairness of tenant experience, and promote the intelligence of resource management and operational efficiency.

CN122243535A · Pending · Publication Date: 2026-06-19 · HIGHGO SOFTWARE


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: HIGHGO SOFTWARE
Filing Date: 2026-03-09
Publication Date: 2026-06-19

Application Information


IPC: G06Q30/0201; G06F18/213; G06F18/25


AI Technical Summary

⚠Technical Problem

In multi-tenant environments, cloud-native databases struggle to effectively guarantee the responsiveness of core businesses and the fairness of tenant service experiences due to resource contention caused by heterogeneous tenant businesses and dynamic load fluctuations. Traditional monitoring methods cannot explain the reasons for resource consumption and lack differentiated management tools.

⚗Method used

Multi-tenant resource consumption data is collected in real time to build a fine-grained monitoring foundation. Elastic resource thresholds are configured and resources are logically partitioned. Dynamic token buckets combined with priority queuing and rate limiting provide fine-grained isolation and graceful control of resources, and tenant profile tags are generated to drive per-tenant policy adjustments.

🎯Benefits of technology

It enables fine-grained and elastic isolation of core resources such as CPU and I/O, alleviates resource contention issues, ensures the responsiveness of core businesses and the fairness of tenant experience, provides data-driven differentiated resource management, and improves overall operational efficiency and service quality.

✦ Generated by Eureka AI based on patent content.


Abstract

This application discloses a method and device for generating tenant profiles for cloud-native databases, comprising: real-time collection of multi-dimensional resource consumption data of each tenant in a multi-tenant environment during the operation of the cloud-native database, and association of the resource consumption data with the resource pool identifier to which the resource consumption request belongs; based on the resource consumption data, configuring elastic resource thresholds for each tenant, and performing logical partitioning management of the overall resources of the cloud-native database according to the resource pool identifier, generating a real-time interaction pool and a batch processing pool; when the resource consumption of a specified tenant reaches or exceeds the corresponding elastic resource threshold, initiating a queuing and rate limiting mechanism based on priority and dynamic token bucket for subsequent resource requests of the specified tenant, and obtaining execution result data; analyzing the resource consumption data and execution result data, extracting the resource usage behavior characteristics of each tenant, and generating profile tags for each tenant based on the resource usage behavior characteristics.


Description

Technical Field

[0001] This application relates to the field of computer technology, and in particular to a method and device for generating tenant profiles for cloud-native databases.

Background Technology

[0002] With the maturity of cloud computing technology and the widespread application of the Software as a Service (SaaS) model, cloud-native databases have become the core infrastructure supporting multi-tenant SaaS platforms. By sharing underlying hardware and software resources, they provide data storage, processing, and management services to hundreds or thousands of tenants (such as e-commerce merchants and enterprise customers). This model significantly reduces the deployment and maintenance costs for individual tenants and improves overall resource utilization.

[0003] In typical scenarios such as e-commerce SaaS, platforms need to serve tenant groups with vastly different scales and business models. These range from large retail enterprises with daily order volumes exceeding 100,000 that require minute-level full inventory synchronization, to small and medium-sized merchants with only a few thousand daily orders that mainly perform low-frequency queries and data entry. This heterogeneity of tenant businesses and the dynamic fluctuations in load (such as "tidal" daytime peaks and nighttime troughs) pose significant challenges to the resource management of the underlying database.

Summary of the Invention

[0004] This specification provides one or more embodiments of a method and device for generating tenant profiles for cloud-native databases, which are used to solve the technical problems mentioned in the background art.

[0005] One or more embodiments of this specification employ the following technical solutions. One or more embodiments provide a method for generating tenant profiles for a cloud-native database, the method comprising:
collecting, in real time, multi-dimensional resource consumption data of each tenant while the cloud-native database runs in a multi-tenant environment, and associating the resource consumption data with the resource pool identifier of the pool to which the resource consumption request belongs, where the resource pool identifiers cover at least a real-time interaction pool for high-priority real-time interactive services and a batch processing pool for low-priority batch processing tasks;
configuring elastic resource thresholds for each tenant based on the resource consumption data, and logically partitioning the overall resources of the cloud-native database according to the resource pool identifier to generate the real-time interaction pool and the batch processing pool;
in response to detecting that the resource utilization rate of the real-time interaction pool reaches a first preset threshold, automatically borrowing quota from the idle resources of the batch processing pool to ensure the execution of real-time interactive services;
in response to detecting that the resource utilization rate of the real-time interaction pool falls back to a second preset threshold, automatically returning the borrowed quota to the batch processing pool;
when it is detected that the resource consumption of a specified tenant reaches or exceeds the corresponding elastic resource threshold, or that the overall utilization rate of the resource pool reaches a third preset threshold, initiating a queuing and rate limiting mechanism based on priority and a dynamic token bucket for the subsequent resource requests of the specified tenant, and obtaining execution result data;
analyzing the resource consumption data and the execution result data to extract the resource usage behavior characteristics of each tenant, and generating a profile tag for each tenant based on the resource usage behavior characteristics.

[0006] It should be noted that this method builds a fine-grained resource monitoring foundation by collecting tenants' multi-dimensional resource consumption data in real time and associating it with resource pool identifiers. Based on this data, elastic thresholds are configured for each tenant and the overall resources are logically partitioned. This achieves fine-grained, elastic isolation of core runtime resources such as CPU and I/O, and allows high-priority real-time services to automatically borrow idle quota from the low-priority pool when resources are scarce. It thereby alleviates the "resource preemption" problem caused by tenant business heterogeneity and dynamic load fluctuations, and helps ensure the responsiveness of core services and a fair service experience for all tenants. In addition, when a tenant's resource usage exceeds its limit or the resource pool is overloaded, the queuing and rate limiting mechanism based on priority and a dynamic token bucket manages subsequent requests gracefully, avoiding the damage to business continuity that abruptly interrupting tasks would cause. Finally, by analyzing long-term resource consumption data and execution results such as rate limiting events, the system automatically extracts each tenant's behavioral characteristics in terms of resource consumption intensity, resource pool preference, and load fluctuation, and generates profile tags. This opens the management black box of traditional monitoring, which can show "what" resources are consumed but cannot explain "why" or "who consumes them in what pattern." Service providers can thus identify tenants with different behavioral patterns (such as the tidal load type and the high-value core business type), which gives a data-driven basis for differentiated resource protection strategies, forward-looking capacity planning, and optimization of the platform's overall operational efficiency. Resource management shifts from static, passive "one-size-fits-all" restrictions to dynamic, "tailored" resource management and service optimization.

[0007] Furthermore, configuring elastic resource thresholds for each tenant includes: configuring, for the same tenant and the same resource dimension, different elastic resource thresholds for different time periods.

[0008] It's important to note that this method dynamically matches resource allocation strategies with the cyclical fluctuations of tenant business load by allowing different elastic threshold limits to be set for the same type of resource (such as CPU) for the same tenant based on different time periods (such as peak and off-peak periods). Compared to traditional solutions using a single, fixed threshold, this method enables resource allocation strategies to proactively adapt to the "tidal" patterns of tenant business (such as daytime access peaks and nighttime processing troughs). Therefore, during preset periods of high tenant business load, the system can automatically provide more abundant resource quotas to ensure the smoothness and responsiveness of their critical business operations, avoiding request throttling or blocking due to reaching a fixed low threshold; during periods of low load, quotas are automatically reverted to prevent resources from being idle and wasted due to long-term invalid reservations. Ultimately, this mechanism drives resource management from a rigid "one-size-fits-all" model to a flexible "on-demand adaptation" model that aligns with actual business needs, significantly improving the overall platform resource utilization efficiency and allocation rationality while prioritizing the service experience (SLA) of each tenant.
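The time-dependent thresholds described above can be sketched as a lookup over time windows. This is a minimal illustration and not the patented implementation: the names `ThresholdWindow` and `threshold_for`, the window boundaries, and the limit values are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ThresholdWindow:
    start: time    # inclusive start of the window
    end: time      # exclusive end of the window
    limit: float   # elastic threshold in this window, e.g. CPU cores


def threshold_for(windows: list[ThresholdWindow], now: time, default: float) -> float:
    """Return the elastic threshold that applies at `now`."""
    for w in windows:
        if w.start <= w.end:
            inside = w.start <= now < w.end
        else:
            # The window wraps past midnight, e.g. 21:00 to 09:00.
            inside = now >= w.start or now < w.end
        if inside:
            return w.limit
    return default


# Hypothetical CPU thresholds for one tenant with a daytime peak.
cpu_windows = [
    ThresholdWindow(time(9, 0), time(21, 0), limit=8.0),  # peak period
    ThresholdWindow(time(21, 0), time(9, 0), limit=2.0),  # off-peak period
]
```

With these assumed windows, a request at 10:00 is checked against a limit of 8.0 and a request at 23:00 against 2.0, so the same tenant and resource dimension get different thresholds in different periods.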

[0009] Furthermore, the step of automatically borrowing quotas from the idle resources of the batch processing pool in response to detecting that the resource utilization rate of the real-time interaction pool has reached a first preset threshold includes: When the resource utilization rate of the real-time interaction pool reaches a first preset threshold, the idle resource quota in the batch processing pool is determined, and borrowing is performed based on the idle resource quota. The borrowing process also includes locking the preset basic resource quota in the batch processing pool to prevent batch processing tasks from overflowing into the real-time interaction pool.

[0010] It's important to note that this mechanism first defines clear resource scheduling trigger conditions and refined operations: when high-priority real-time interaction pool resources are scarce, the system doesn't simply reject requests. Instead, it automatically checks for idle resources in the low-priority batch processing pool and securely borrows them. This provides additional elastic resource guarantees for critical real-time services during peak business periods, effectively alleviating performance bottlenecks that might result from static resource allocation. Crucially, this "borrowing" action is not unrestrained resource misappropriation; it simultaneously implements protective measures to lock the basic quota of the batch processing pool. This design ensures that resource scheduling is unidirectional and controlled. While borrowing idle resources to address urgent real-time business needs, it fundamentally eliminates the possibility of large background tasks in the batch processing pool competing for or encroaching on real-time interaction pool resources, thus firmly upholding the bottom line of isolation that guarantees core business performance and determinism. Therefore, this mechanism achieves a balance between the often contradictory goals of "resource elasticity" and "core pool isolation," realizing intelligent and secure cross-resource pool allocation. This improves overall resource utilization while ensuring the deterministic service quality of critical businesses.
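One way to read the borrowing rule is that only batch-pool capacity that is both idle and above the locked base quota may be lent. The sketch below encodes that reading. The function name `borrowable_quota`, its parameters, and the interpretation of "locking" as "never lendable" are assumptions, since the text does not give a formula.

```python
def borrowable_quota(batch_total: float, batch_in_use: float,
                     batch_locked_base: float, requested: float) -> float:
    """Quota the real-time pool may borrow from the batch pool.

    Capacity counts as lendable only if it is idle and lies above the
    locked base quota, so the batch pool always keeps its base share.
    """
    reserved = max(batch_in_use, batch_locked_base)
    idle_above_base = max(0.0, batch_total - reserved)
    return min(requested, idle_above_base)
```

For example, with a batch pool of 40 units, 10 in use, and a locked base of 16, at most 24 units can be lent even though 30 are idle.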

[0011] Furthermore, the step of automatically returning the borrowed quota to the batch processing pool in response to detecting that the resource utilization rate of the real-time interaction pool has fallen back to the second preset threshold includes: when the resource utilization rate of the real-time interaction pool falls back to the second preset threshold, determining the quota amount to be returned to the batch processing pool, and performing the return based on that amount. The return process also includes releasing the borrowed quota from the quota of the real-time interaction pool and restoring it to the available resources of the batch processing pool.

[0012] It should be noted that this return mechanism, together with the aforementioned borrowing mechanism, constitutes a complete, closed-loop resource elastic scheduling cycle. When the business pressure on the real-time interaction pool decreases and resource utilization returns to a safe level, this mechanism is automatically triggered, accurately calculating and executing the return operation of previously borrowed resources. This process not only promptly releases the real-time interaction pool from the additional resource holdings incurred due to temporary borrowing, restoring its resource status to normal, but more importantly, through the "release" and "restore" operations, the borrowed resources are completely and accurately returned to the available resource pool of the batch processing pool. This fundamentally guarantees the temporary and reversible nature of resource scheduling, ensuring the resource integrity of the batch processing pool and the predictability of its background task execution capabilities. Therefore, this mechanism, in conjunction with the borrowing mechanism, enables on-demand and secure bidirectional flow of resources between the two pools. This ensures that elastic scheduling does not disrupt the established logical isolation and guarantee baseline between the resource pools. After successfully handling sudden loads from real-time business operations, it can automatically restore the original resource structure of the system, thereby sustainably and stably improving the overall resource utilization rate in long-term operation and guaranteeing the service quality commitments of both types of business.
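The return step can be sketched as a pure function over the two pool quotas. This is an illustrative sketch under assumptions: `PoolQuotas`, `return_borrowed`, the default value 0.6 for the second preset threshold, and the rule that the real-time pool is never shrunk below its current usage are not taken from the source.

```python
from typing import NamedTuple


class PoolQuotas(NamedTuple):
    realtime: float   # current quota of the real-time interaction pool
    batch: float      # current quota of the batch processing pool
    borrowed: float   # part of `realtime` that is on loan from `batch`


def return_borrowed(q: PoolQuotas, realtime_in_use: float,
                    return_at: float = 0.6) -> PoolQuotas:
    """Give borrowed quota back once real-time utilization has fallen far enough."""
    if q.borrowed <= 0 or realtime_in_use / q.realtime > return_at:
        return q  # nothing on loan, or the real-time pool is still busy
    # Assumed safeguard: never shrink the real-time pool below current usage.
    amount = min(q.borrowed, q.realtime - realtime_in_use)
    return PoolQuotas(q.realtime - amount, q.batch + amount, q.borrowed - amount)
```

The "release" and "restore" operations appear here as the subtraction from `realtime` and the addition to `batch`. Their sum is unchanged, which reflects the reversible nature of the scheduling described above.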

[0013] Furthermore, the execution result data includes at least one of the following quantifiable data:
the actual available quota of each tenant in each resource dimension;
records of events in which a tenant triggers rate limiting, enters a queue, or has a request rejected;
resource usage distribution data of the real-time interaction pool and the batch processing pool;
logs of resource quota borrowing and reclamation between the real-time interaction pool and the batch processing pool.

[0014] It is important to note that this method systematically defines and collects a complete set of quantifiable execution result data from multiple perspectives, constructing a panoramic monitoring data foundation encompassing static resource quotas, dynamic management events, and inter-pool scheduling behavior. Specifically, the actual available quota data for tenants reflects the real-time execution status of resource isolation strategies; rate limiting, queuing, and rejection event records accurately capture the specific timing and objects of resource contention; resource pool occupancy distribution data depicts the macro-level allocation and load of global resources; and resource borrowing and reclamation behavior logs comprehensively record the historical trajectory of the elastic scheduling mechanism. The aggregation of this structured data allows the system to transcend traditional monitoring that only focuses on instantaneous resource consumption, enabling comprehensive and objective auditing and backtracking of the execution effectiveness of resource management strategies, tenant resource acquisition experience, and the actual operation of the elastic scheduling mechanism. This provides a solid, multi-dimensional data basis for subsequent in-depth root cause analysis, strategy effectiveness evaluation, and tenant behavior pattern mining, upgrading resource management from experience-based, reactive operations to a scientific decision-making process based on full-data facts that is analyzable and optimizable.

[0015] Furthermore, the resource usage behavior characteristics include at least one of the following:
a resource consumption intensity characteristic, determined from the frequency of events in which the tenant triggers rate limiting or queuing;
a resource pool preference characteristic, determined from the ratio of the tenant's historical resource usage in the real-time interaction pool to that in the batch processing pool;
a load fluctuation characteristic, determined from the time periods of the logged events in which the tenant's resource consumption reaches or exceeds its elastic resource threshold;
an elastic demand characteristic, determined from the frequency of logged tenant behavior that triggers resource pool borrowing or is restricted from expansion.

[0016] It should be noted that this method systematically defines and associates four specific resource usage behavior characteristics with underlying observable system events and data, transforming and abstracting raw, discrete operation logs (such as rate limiting events, resource pool occupancy records, threshold exceedance records, and borrowing behavior logs) into tenant behavior pattern labels with clear business semantics. This process elevates the understanding of tenant behavior from a simple "resource consumption statistics" level to a "behavioral intent and pattern recognition" level. By analyzing event frequency (consumption intensity, elastic demand), resource distribution ratio (resource pool preference), and the temporal pattern of event occurrence (load fluctuation), the system can automatically distinguish which tenants are high-resource-consuming, which prefer real-time interaction or batch processing, which businesses exhibit regular tidal fluctuations, and which have a sustained high demand for resource elasticity. This feature profiling based on objective behavioral data provides service providers with precise tenant insights that go beyond subjective experience-based judgments. This enables subsequent operational decisions, such as resource quota pre-allocation, flexible strategy adjustments, service level agreement (SLA) customization, and capacity planning, to shift from a passive "one-size-fits-all" or "remedial" approach to a proactive and intelligent management model of "tailored solutions" and "pre-emptive prediction." Ultimately, this achieves a high degree of alignment between resource allocation and actual business needs, improving overall service efficiency and operational sophistication.
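To make the mapping from events to characteristics concrete, the sketch below computes the four characteristics from one tenant's event log and turns them into tags. The event kinds, the feature formulas (events per day, usage share, share of breaches in the busiest hour), the cutoff values, and the tag names are all assumptions chosen for illustration. The source names the characteristics but not how they are computed.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str   # assumed kinds: "throttled", "queued", "over_threshold", "borrow"
    hour: int   # hour of day (0-23) at which the event occurred


def extract_features(events: list[Event], realtime_usage: float,
                     batch_usage: float, days: int) -> dict[str, float]:
    """Compute the four behavior characteristics for one tenant."""
    kinds = Counter(e.kind for e in events)
    over_hours = Counter(e.hour for e in events if e.kind == "over_threshold")
    total_usage = realtime_usage + batch_usage
    total_over = sum(over_hours.values())
    return {
        # Consumption intensity: throttle and queue events per day.
        "intensity": (kinds["throttled"] + kinds["queued"]) / days,
        # Pool preference: share of usage that landed in the real-time pool.
        "realtime_share": realtime_usage / total_usage if total_usage else 0.0,
        # Load fluctuation: share of threshold breaches in the busiest hour.
        "peak_hour_share": max(over_hours.values()) / total_over if total_over else 0.0,
        # Elastic demand: borrow-triggering events per day.
        "elastic_demand": kinds["borrow"] / days,
    }


def tags_for(features: dict[str, float]) -> list[str]:
    """Map features to profile tags using assumed cutoffs."""
    tags = []
    if features["peak_hour_share"] >= 0.5:
        tags.append("tidal-load")
    if features["realtime_share"] >= 0.7:
        tags.append("realtime-heavy")
    if features["intensity"] >= 5:
        tags.append("high-consumption")
    return tags
```

A production system would likely use longer observation windows and statistically derived cutoffs. The fixed numbers here only show the shape of the transformation from discrete logs to labels.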

[0017] Furthermore, the method also includes: feeding the profile tags of each tenant back to the steps of configuring elastic resource thresholds for each tenant and initiating the queuing and rate limiting mechanism, so as to optimize the resource isolation control strategy.

[0018] It's important to note that this method, by establishing a closed-loop pathway that feeds tenant profile tags back to preceding resource isolation control steps, fundamentally transforms resource management from static configuration based on general rules to dynamic, personalized optimization based on the understanding of individual tenant behavior. The system first analyzes historical resource consumption and control event data to abstract tenants into profile tags with specific behavioral patterns (such as "tidal load type" and "high-value core business type"). Then, these tags, imbued with business characteristics and needs, are directly applied to adjust the two core levers of the resource isolation control strategy: elastic resource thresholds and queuing / rate limiting priorities. For example, for identified "tidal load type" tenants, resource thresholds are automatically and temporarily increased during predicted peak business periods to preventatively adapt to their increased demand; or "high-value core business type" tenants are given higher scheduling priority during resource contention to ensure the continuity of their critical business. This closed-loop mechanism ensures that resource control strategies are no longer pre-set and fixed, but rather continuously evolve and precisely adapt as the understanding of tenant behavior deepens. Ultimately, the system evolves from a passive, extensive resource-constrained framework into an intelligent agent capable of continuously learning, predicting, and proactively optimizing resource allocation. This systematically improves the rationality of resource allocation, the fairness of service experience, and the refinement and intelligence of platform operation in complex multi-tenant heterogeneous load environments.

[0019] Furthermore, optimizing the resource isolation control strategy includes at least one of the following:
dynamically adjusting the elastic resource threshold of the corresponding tenant;
adjusting the priority of the corresponding tenant's resource requests in the queuing and rate limiting mechanism.

[0020] It's important to note that this method achieves an intelligent evolution of resource management strategies from general, static configuration to personalized, dynamically adaptable approaches by applying tenant behavior-based profile tags to two core control levers: dynamically adjusting elastic resource thresholds and optimizing priority in queuing mechanisms. Based on tenant behavior patterns revealed by the profile tags (such as "tidal load type" and "high-value core business type"), the system proactively adjusts the resource quota limits for specific tenants (e.g., temporarily increasing thresholds during predicted peak business periods) or assigns differentiated scheduling priorities when resource requests are contested. This transforms resource isolation control strategies from fixed or passive responses based solely on instantaneous load to proactive and precise adjustments tailored to the inherent, continuous business needs and value differences of different tenants. Ultimately, this mechanism significantly improves the matching accuracy between resource allocation and actual business needs while ensuring overall system stability and fairness, thereby optimizing the service continuity experience of critical businesses and driving overall resource utilization towards greater efficiency and intelligence.
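The feedback from profile tags to the two control levers can be sketched as a small policy update. The names `TenantPolicy` and `apply_tags`, the tag strings, the 1.5x threshold increase, and the one-step priority bump are assumptions for illustration only.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TenantPolicy:
    cpu_threshold: float   # elastic CPU threshold of the tenant
    priority: int          # larger value is served earlier in the queue


def apply_tags(policy: TenantPolicy, tags: set[str],
               in_predicted_peak: bool) -> TenantPolicy:
    """Adjust threshold and priority according to the tenant's profile tags."""
    if "tidal-load" in tags and in_predicted_peak:
        # Temporarily raise the threshold during a predicted peak (assumed factor).
        policy = replace(policy, cpu_threshold=policy.cpu_threshold * 1.5)
    if "high-value-core" in tags:
        # Give the tenant a higher scheduling priority under contention.
        policy = replace(policy, priority=policy.priority + 1)
    return policy
```

Because `TenantPolicy` is immutable, each call returns a new policy and leaves the baseline untouched, which makes it straightforward to revert once the predicted peak ends.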

[0021] Furthermore, initiating the queuing and rate limiting mechanism based on priority and a dynamic token bucket includes:
configuring an independent token bucket for each resource dimension of each tenant, where the capacity of the token bucket corresponds to the tenant's elastic resource threshold in that resource dimension;
dynamically adjusting the token generation rate of the token bucket based on the tenant's historical peak resource consumption and the real-time utilization rate of the resource pool;
when a tenant initiates a resource request, requiring the request to obtain a token from the corresponding token bucket;
if the token is obtained, executing the corresponding operation;
if the token is not obtained, placing the resource request in a queue in which requests are sorted by their preset priority;
when the token bucket generates new tokens at the token generation rate, allocating them first to the highest-priority resource requests in the queue.

[0022] It's important to note that this queuing and rate limiting mechanism achieves fine-grained and precise resource management by independently configuring token buckets for each resource dimension of each tenant and mapping bucket capacity to elastic resource thresholds. Its core innovation lies in combining dynamic rate control with static priority scheduling, forming an intelligent and differentiated traffic shaping and queue management strategy. The mechanism dynamically adjusts the token generation rate based on historical load and real-time system utilization, enabling traffic control to adapt to changes in overall system load and smooth out sudden traffic surges. When a request cannot immediately obtain a token, it is not simply rejected but instead placed in a queue ordered by preset priority. This design ensures that high-priority business requests (such as real-time transactions) receive priority waiting positions in the queue during resource-scarce periods. Finally, when system resources (in token form) become available again, they are allocated strictly according to the priority order in the queue, thus guaranteeing the access latency and service continuity of high-priority requests. Therefore, while effectively preventing system overload and achieving graceful rate limiting, this mechanism solves the problem that the traditional token bucket algorithm's "first-come, first-served" mode cannot distinguish the importance of businesses by introducing priority scheduling. This enables critical businesses to still obtain deterministic service quality assurance in resource contention scenarios, achieving a balance between the fairness of resource isolation and the differentiation of business assurance.
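The steps above can be sketched as a token bucket backed by a priority queue. This is a minimal sketch under stated assumptions: the class name `PriorityTokenBucket`, the one-token-per-request cost, the caller-driven `refill` call instead of a timer, and the rate formula in `set_rate` are illustrative choices, since the source does not specify them.

```python
import heapq
import itertools


class PriorityTokenBucket:
    """One bucket per tenant and resource dimension (illustrative sketch)."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity        # corresponds to the elastic resource threshold
        self.rate = rate                # tokens generated per second
        self.tokens = capacity
        self._queue = []                # heap of (-priority, sequence, request)
        self._seq = itertools.count()   # tie-breaker: FIFO within one priority

    def set_rate(self, historical_peak: float, pool_utilization: float) -> None:
        # Assumed formula: scale the historical peak by the pool's free share.
        self.rate = historical_peak * max(0.0, 1.0 - pool_utilization)

    def submit(self, request: str, priority: int) -> bool:
        """Return True if the request may run now, False if it was queued."""
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        heapq.heappush(self._queue, (-priority, next(self._seq), request))
        return False

    def refill(self, elapsed_seconds: float) -> list[str]:
        """Add tokens for the elapsed time and release queued requests by priority."""
        self.tokens = min(self.capacity, self.tokens + self.rate * elapsed_seconds)
        released = []
        while self._queue and self.tokens >= 1:
            self.tokens -= 1
            released.append(heapq.heappop(self._queue)[2])
        return released
```

Because `refill` drains the queue before leaving spare tokens in the bucket, a newly submitted request can only take a token directly when no earlier request is waiting. Passing elapsed time explicitly keeps the sketch deterministic. A real system would drive `refill` from a clock.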

[0023] One or more embodiments of this specification provide a tenant profile generation device for a cloud-native database, comprising: at least one processor and a bus; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to:
collect, in real time, multi-dimensional resource consumption data of each tenant while the cloud-native database runs in a multi-tenant environment, and associate the resource consumption data with the resource pool identifier of the pool to which the resource consumption request belongs, where the resource pool identifiers cover at least a real-time interaction pool for high-priority real-time interactive services and a batch processing pool for low-priority batch processing tasks;
configure elastic resource thresholds for each tenant based on the resource consumption data, and logically partition the overall resources of the cloud-native database according to the resource pool identifier to generate the real-time interaction pool and the batch processing pool;
in response to detecting that the resource utilization rate of the real-time interaction pool reaches a first preset threshold, automatically borrow quota from the idle resources of the batch processing pool to ensure the execution of real-time interactive services;
in response to detecting that the resource utilization rate of the real-time interaction pool falls back to a second preset threshold, automatically return the borrowed quota to the batch processing pool;
when it is detected that the resource consumption of a specified tenant reaches or exceeds the corresponding elastic resource threshold, or that the overall utilization rate of the resource pool reaches a third preset threshold, initiate a queuing and rate limiting mechanism based on priority and a dynamic token bucket for the subsequent resource requests of the specified tenant, and obtain execution result data;
analyze the resource consumption data and the execution result data to extract the resource usage behavior characteristics of each tenant, and generate a profile tag for each tenant based on the resource usage behavior characteristics.

[0024] At least one of the technical solutions adopted in the embodiments of this specification can achieve the following beneficial effects: This method builds a fine-grained resource monitoring foundation by collecting tenants' multi-dimensional resource consumption data in real time and associating it with resource pool identifiers. Based on this data, it configures elastic thresholds for each tenant and logically partitions the overall resources. This achieves fine-grained, elastic isolation of core runtime resources such as CPU and I/O, and allows high-priority real-time services to automatically borrow idle quota from the low-priority pool when resources are scarce. It thereby alleviates the "resource preemption" problem caused by tenant service heterogeneity and dynamic load fluctuations, and helps ensure the responsiveness of core services and a fair service experience for all tenants. In addition, when a tenant's resource usage exceeds its limit or the resource pool is overloaded, the queuing and rate limiting mechanism based on priority and a dynamic token bucket manages subsequent requests gracefully, avoiding the damage to business continuity that abrupt task interruptions would cause. Finally, by analyzing long-term resource consumption data and execution results such as rate limiting events, the system automatically extracts each tenant's behavioral characteristics in terms of resource consumption intensity, resource pool preference, and load fluctuation, and generates profile tags. This opens the management black box of traditional monitoring, which can show "what" resources are consumed but cannot explain "why" or "who consumes them in what pattern." Service providers can thus identify tenants with different behavioral patterns (such as the tidal load type and the high-value core business type), which gives a data-driven basis for differentiated resource protection strategies, forward-looking capacity planning, and optimization of the platform's overall operational efficiency. Resource management shifts from static, passive "one-size-fits-all" restrictions to dynamic, "tailored" resource management and service optimization.

Brief Description of the Drawings

[0025] To illustrate the technical solutions in the embodiments of this specification or in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some of the embodiments recorded in this specification; those skilled in the art can obtain other drawings from them without creative effort. In the drawings:

Figure 1 is a flowchart of a method for generating tenant profiles for a cloud-native database, provided in one or more embodiments of this specification;

Figure 2 is a schematic structural diagram of a tenant profile generation device for a cloud-native database, provided in one or more embodiments of this specification.

Detailed Implementation

[0026] This specification provides a method and device for generating tenant profiles for cloud-native databases.

[0027] To enable those skilled in the art to better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this specification, and not all embodiments. Based on the embodiments of this specification, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of this specification.

[0028] Figure 1 illustrates a process for generating tenant profiles for a cloud-native database, provided in one or more embodiments of this specification. This process can be executed by a cloud-native database tenant profile generation system. Certain input parameters or intermediate results in the process can be manually adjusted to help improve accuracy.

[0029] The method flow steps of the embodiments in this specification are as follows: S101, collect, in real time, multi-dimensional resource consumption data of each tenant while the cloud-native database runs in a multi-tenant environment, and associate the resource consumption data with the resource pool identifier to which each resource consumption request belongs. The resource pools so identified include at least a real-time interaction pool for high-priority real-time interaction services and a batch processing pool for low-priority batch processing tasks.

[0030] In the embodiments described in this specification, the system deploys a resource acquisition module. By being embedded in the database kernel or by calling the database's native monitoring interface (e.g., hooking into the SQL execution engine), this module captures, in real time and at a fixed frequency (e.g., once per second), the multi-dimensional resource metrics consumed by each tenant session or request. These metrics mainly include computing resources (e.g., CPU time), IO resources (e.g., IOPS, throughput), connection resources (e.g., current number of connections), and business operation resources (e.g., SQL execution time).

[0031] During data collection, the system parses the business semantic identifiers (such as SQL tags) carried by each resource consumption request. Based on preset rules (e.g., rules covering requests tagged "order query" or "payment confirmation"), the system automatically associates each request with a resource pool identifier, which indicates whether the request belongs to the "real-time interaction pool" or the "batch processing pool". Ultimately, each resource consumption measurement is formed into a structured record containing "tenant identifier, resource pool identifier, resource consumption metric, and timestamp".
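
The tagging step described above can be sketched as follows. This is a minimal illustration only, not the implementation of the embodiment: the rule table, the pool names "realtime" and "batch", the fallback to the batch pool for untagged requests, and the names ConsumptionRecord and tag_request are all assumptions introduced here.

```python
from dataclasses import dataclass
import time

# Hypothetical rule table: SQL tag -> resource pool identifier.
POOL_RULES = {
    "order query": "realtime",
    "payment confirmation": "realtime",
    "report export": "batch",
}
DEFAULT_POOL = "batch"  # assumption: untagged requests fall back to the batch pool


@dataclass(frozen=True)
class ConsumptionRecord:
    """Structured record: tenant, pool, metric name and value, timestamp."""
    tenant_id: str
    pool_id: str
    metric: str
    value: float
    timestamp: float


def tag_request(tenant_id, sql_tag, metric, value, now=None):
    """Build one structured record, resolving the pool from the SQL tag."""
    pool_id = POOL_RULES.get(sql_tag, DEFAULT_POOL)
    ts = time.time() if now is None else now
    return ConsumptionRecord(tenant_id, pool_id, metric, value, ts)


record = tag_request("tenant-a", "order query", "cpu_ms", 12.5, now=1000.0)
```

A frozen dataclass is used so that a record cannot be altered after collection, which suits data that is both streamed downstream and persisted.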

[0032] The processed data stream is pushed to the downstream resource isolation control module in real time and simultaneously persisted to a time-series database as a basis for long-term analysis. The output of this step (i.e., resource consumption data and its associated resource pool identifiers) forms the basis for decision-making and analysis in all subsequent steps.

[0033] S102, based on the resource consumption data, configure elastic resource thresholds for each tenant, and perform logical partitioning management of the overall resources of the cloud-native database according to the resource pool identifier, generating the real-time interaction pool and the batch processing pool. In response to detecting that the resource utilization rate of the real-time interaction pool reaches a first preset threshold, automatically borrow quotas from the idle resources of the batch processing pool to ensure the execution of real-time interaction services. In response to detecting that the resource utilization rate of the real-time interaction pool falls back to a second preset threshold, automatically return the borrowed quotas to the batch processing pool.

[0034] In the embodiments described in this specification, this step is the core control layer of the system, which uses the data from S101 to achieve fine-grained resource isolation and intelligent scheduling.

[0035] Based on the resource pool identifiers provided by S101, the resource isolation control module logically divides the database cluster's core resources, such as total CPU and IO, into two resource pools: a real-time interaction pool and a batch processing pool. Simultaneously, based on analysis of historical resource consumption data, it configures elastic resource thresholds for each tenant across different resource dimensions (such as CPU and IOPS). These thresholds can be differentiated by time period (such as peak / off-peak hours) to initially adapt to business fluctuations.

[0036] The system continuously monitors the real-time utilization of the two resource pools.

[0037] When the utilization rate of the real-time interaction pool reaches a high-water mark threshold, the control module automatically calculates and borrows a portion of the quota from the current idle resource quota of the batch processing pool to prioritize real-time services. During this process, the system locks the basic quota of the batch processing pool to strictly prevent batch tasks within it from encroaching on real-time pool resources.

[0038] When the utilization rate of the real-time interaction pool drops to a safe threshold, the system automatically returns the previously borrowed quota to the batch processing pool, restoring its full available resources.
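
The two-threshold trigger in S102 behaves like a hysteresis controller. The sketch below shows only the decision logic; the threshold values 0.85 and 0.60 and the function name scheduling_action are assumptions, since the specification does not fix concrete values.

```python
def scheduling_action(utilization, borrowed, high_water=0.85, safe_level=0.60):
    """Decide the inter-pool action for the real-time interaction pool.

    utilization: current real-time pool utilization in [0, 1]
    borrowed:    quota currently on loan from the batch processing pool
    high_water:  assumed value for the first preset threshold
    safe_level:  assumed value for the second preset threshold
    """
    if utilization >= high_water:
        return "borrow"
    if utilization <= safe_level and borrowed > 0:
        return "return"
    return "hold"
```

Keeping the return threshold below the borrow threshold avoids rapid borrow/return oscillation when utilization hovers near a single cut-off.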

[0039] S103, when it is detected that the resource consumption of a specified tenant reaches or exceeds the corresponding elastic resource threshold, or the overall utilization rate of the resource pool reaches the third preset threshold, a queuing and rate limiting mechanism based on priority and dynamic token bucket is initiated for the subsequent resource requests of the specified tenant to obtain execution result data.

[0040] In the embodiments of this specification, this step is the execution unit of the control layer, which performs fine-grained traffic control on the excessive behavior of specific tenants within the framework established in S102.

[0041] The system continuously compares the real-time resource consumption data collected in S101 with the elastic resource thresholds configured for each tenant in S102. When it is found that the resource consumption of any tenant reaches or exceeds its threshold, or the overall utilization rate of a resource pool reaches the warning line, rate limiting is triggered for that tenant.

[0042] The system instantiates a dynamic token bucket for the corresponding resource dimension of the rate-limited tenant. The capacity of the token bucket corresponds to the elastic resource threshold configured for that tenant in S102. The token generation rate is not fixed, but dynamically adjusted based on the tenant's historical peak resource consumption and the real-time utilization rate of its resource pool.

[0043] When a tenant initiates a new resource request, it must first obtain a token from its token bucket. If obtaining the token fails, the request is placed in a queue. Requests in the queue are ordered according to their preset business priority tags (associated in S101). When the token bucket generates a new token, it is preferentially allocated to the highest priority request in the queue.
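
A minimal sketch of a token bucket combined with a priority queue, under stated assumptions: the class name PriorityTokenBucket, the rule that a lower number means higher priority, and the refill formula in set_rate (historical peak scaled by pool headroom) are illustrative choices, not details given in the specification.

```python
import heapq
import itertools


class PriorityTokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity        # corresponds to the tenant's elastic threshold
        self.tokens = float(capacity)
        self.refill_rate = refill_rate  # tokens per second, adjustable at runtime
        self._queue = []                # heap of (priority, sequence, request)
        self._seq = itertools.count()   # FIFO tie-breaker within one priority

    def set_rate(self, peak_rate, pool_utilization):
        # Assumed policy: scale the historical peak rate by the pool's headroom.
        self.refill_rate = peak_rate * max(0.0, 1.0 - pool_utilization)

    def submit(self, request, priority):
        """Return True if the request runs now, False if it was queued."""
        if self.tokens >= 1 and not self._queue:
            self.tokens -= 1
            return True
        heapq.heappush(self._queue, (priority, next(self._seq), request))
        return False

    def tick(self, elapsed_seconds):
        """Refill tokens, then release queued requests, highest priority first."""
        self.tokens = min(self.capacity, self.tokens + self.refill_rate * elapsed_seconds)
        released = []
        while self._queue and self.tokens >= 1:
            _, _, request = heapq.heappop(self._queue)
            self.tokens -= 1
            released.append(request)
        return released
```

New requests are queued whenever the queue is non-empty, so a late arrival cannot take a token ahead of a waiting higher-priority request.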

[0044] During this step, detailed execution result data will be generated, such as rate limiting event records, queue status, and token consumption. This data is an important input for S104's analysis.

[0045] S104, Analyze the resource consumption data and the execution result data, extract the resource usage behavior characteristics of each tenant, and generate profile tags for each tenant based on the resource usage behavior characteristics.

[0046] In the embodiments described in this specification, this step is the intelligent analysis layer of the system, which drives the optimization closed loop of the entire system by analyzing historical data.

[0047] The tenant resource profiling and analysis module continuously aggregates and analyzes two types of data: one is the resource consumption data continuously collected by S101; the other is the execution result data generated by S103 (such as rate limiting frequency, queuing events, and resource pool borrowing logs).

[0048] Based on the above data, the module extracts resource usage behavior characteristics for each tenant through statistical analysis. For example: Resource consumption intensity is determined based on the frequency of events that trigger rate limiting or queuing in S103.

[0049] Resource pool preference is determined based on the ratio of the tenant's historical consumption in the real-time interaction pool to that in the batch processing pool, using the data from S101.

[0050] Load fluctuation is determined based on the time periods in which the resource exceedance events recorded in S103 are concentrated.

[0051] Elastic demand is determined based on the frequency with which resource pool borrowing or expansion limiting is triggered in S102 / S103.

[0052] Based on the extracted features, the system generates descriptive profile tags for each tenant, such as "tidal load tenant" and "high-value - core business tenant".

[0053] The generated profile tags are fed back to the resource isolation control module (i.e., the execution entity of S102 and S103) in real time. Based on them, the control module automatically optimizes the resource isolation control strategy, for example: For "tidal load tenants", it dynamically raises their elastic resource thresholds during their historical peak periods.

[0054] For "high-value - core business tenants", it raises the priority of their requests in the queuing and rate limiting mechanism of S103.

[0055] It should be noted that this method constructs a fine-grained resource monitoring foundation by collecting and correlating multi-dimensional resource consumption data of tenants with resource pool identifiers in real time. Based on this data, elastic thresholds are configured for each tenant, and overall resources are logically partitioned for management. This achieves refined and elastic isolation of core runtime resources such as CPU and I / O, allowing high-priority real-time services to automatically borrow idle quotas from the low-priority pool when resources are scarce. This effectively alleviates the "resource preemption" problem caused by tenant business heterogeneity and dynamic load fluctuations, ensuring the responsiveness of core services and the fairness of service experience for all tenants. Simultaneously, when tenant resource usage exceeds limits or the resource pool is overloaded, a priority-based and dynamic token bucket-based queuing and rate-limiting mechanism gracefully manages subsequent requests, avoiding the damage to business continuity caused by abruptly interrupting tasks. Ultimately, by analyzing long-term resource consumption data and execution results such as rate limiting, the system automatically extracts behavioral characteristics of each tenant in terms of resource consumption intensity, resource pool preference, and load fluctuation, and generates profile tags. This breaks the management black box of traditional monitoring, which can only show "what" the resource consumption is but cannot explain "why" or "who consumes it in what mode." This enables service providers to accurately identify tenants with different behavioral patterns (such as tidal load type and high-value core business type), thereby providing a solid data-driven basis for implementing differentiated resource protection strategies, conducting forward-looking capacity planning, and optimizing the overall operational efficiency of the platform. 
It realizes a fundamental transformation from static, passive "one-size-fits-all" resource restrictions to dynamic, intelligent "tailored" resource management and service optimization.

[0056] Furthermore, in the process of configuring elastic resource thresholds for each tenant, the method flow steps of this embodiment are as follows: S201, configure differentiated elastic resource thresholds for the same resource dimension of the same tenant, based on different time dimensions.

[0057] In the embodiments of this specification, this step is the core configuration link in the resource isolation control strategy to achieve "elasticity" and "refined" management. Its goal is to transform fixed resource constraints into intelligent strategies that can dynamically adapt to the business load patterns.

[0058] The implementation of this step relies on the output of preceding steps (such as resource data collection and behavioral analysis). Based on the analysis of tenants' historical resource consumption data, the system or administrator identifies the periodic patterns of their load (e.g., identifying "tidal load" tenants). These patterns are abstracted into different time dimensions, such as "daytime peak business hours," "nighttime off-peak business hours," and "promotional activity periods."

[0059] Based on the above analysis, the system allows users to bind multiple sets of elastic resource thresholds to the same resource dimension (e.g., CPU utilization or IOPS) of the same tenant via a policy configuration interface or API. For example, a policy for an e-commerce tenant may be configured as follows: During the "daytime peak hours (9:00-18:00)," the elastic resource threshold for CPU usage is set to a higher level (such as 30% of the peak limit) to cope with intensive order queries and transaction requests.

[0060] During the "nighttime off-peak hours (00:00-6:00)," the elastic resource threshold for CPU usage is lowered to a basic level (such as 10% of the peak limit), because this period is used mainly for background tasks such as offline report generation.

[0061] During "major promotional events", a contingency plan with a higher threshold can be temporarily activated.

[0062] Once configured, the resource isolation control module will automatically select the appropriate elastic resource threshold as the benchmark for real-time monitoring and traffic management for the tenant based on the current time dimension. This replaces the approach of using a single fixed threshold.
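
The time-based selection in S201 can be sketched as a lookup over configured windows. The 30% and 10% values and the two windows come from the example above; the default of 20% outside any window, the override parameter standing in for a promotional contingency plan, and the names CPU_POLICY and select_threshold are assumptions.

```python
from datetime import time as dtime

# Hypothetical policy for one tenant and one resource dimension (CPU):
# (window start, window end, threshold as a fraction of the peak limit)
CPU_POLICY = [
    (dtime(9, 0), dtime(18, 0), 0.30),  # daytime peak hours
    (dtime(0, 0), dtime(6, 0), 0.10),   # nighttime off-peak hours
]
DEFAULT_THRESHOLD = 0.20  # assumption: applies outside any configured window


def select_threshold(now, policy=CPU_POLICY, override=None):
    """Pick the elastic threshold for the current time of day.

    override models a temporarily activated contingency plan.
    """
    if override is not None:
        return override
    for start, end, threshold in policy:
        if start <= now < end:
            return threshold
    return DEFAULT_THRESHOLD
```

Windows that cross midnight would need an extra branch; they are left out to keep the sketch short.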

[0063] It's important to note that this method dynamically matches resource allocation strategies with the cyclical fluctuations of tenant business load by allowing different elastic threshold limits to be set for the same type of resource (such as CPU) for the same tenant based on different time periods (such as peak and off-peak periods). Compared to traditional solutions using a single, fixed threshold, this method enables resource allocation strategies to proactively adapt to the "tidal" patterns of tenant business (such as daytime access peaks and nighttime processing troughs). Therefore, during preset periods of high tenant business load, the system can automatically provide more abundant resource quotas to ensure the smoothness and responsiveness of their critical business operations, avoiding request throttling or blocking due to reaching a fixed low threshold; during periods of low load, quotas are automatically reverted to prevent resources from being idle and wasted due to long-term invalid reservations. Ultimately, this mechanism drives resource management from a rigid "one-size-fits-all" model to a flexible "on-demand adaptation" model that aligns with actual business needs, significantly improving the overall platform resource utilization efficiency and allocation rationality while prioritizing the service experience (SLA) of each tenant.

[0064] Furthermore, in the process of automatically borrowing quotas from the idle resources of the batch processing pool in response to detecting that the resource utilization rate of the real-time interaction pool has reached a first preset threshold, the method flow steps of this embodiment are as follows: S301, under the condition that the resource utilization rate of the real-time interaction pool reaches the first preset threshold, determine the idle resource quota in the batch processing pool, and perform borrowing based on the idle resource quota; the borrowing process also includes locking the preset basic resource quota in the batch processing pool to prevent batch processing tasks from overflowing into the real-time interaction pool.

[0065] In the embodiments described in this specification, this step is the core operation for implementing the elastic scheduling component of "dynamic isolation of dual logical resource pools". It defines how the system safely and intelligently allocates resources from the batch processing pool when the real-time interaction pool faces resource pressure, while strictly protecting the isolation of the core pool.

[0066] The execution of this step strictly depends on a clear trigger condition. The resource isolation control module continuously monitors and confirms that the overall resource utilization of the real-time interaction pool has reached a preset high water level (i.e., the first preset threshold). Once the condition is met, the system does not unconditionally demand resources; instead, it first queries the resource acquisition module to determine the current operating status of the batch processing pool. The system calculates the difference between the total quota of the batch processing pool and its currently occupied quota to accurately determine its idle resource quota. This quota represents the upper limit of resources available for safe allocation.

[0067] Based on the calculated idle resource quota, the system performs a borrowing operation, temporarily allocating the management rights of this quota to the real-time interaction pool to immediately alleviate its resource pressure and ensure the real-time interaction services within it. Simultaneously, a crucial protective operation is executed: locking the preset basic resource quota in the batch processing pool. This "basic resource quota" is the amount of resources pre-allocated to the batch processing pool to guarantee the minimum operational requirements of its internal tasks. The locking operation means that regardless of the resource pressure on the real-time interaction pool, tasks in the batch processing pool (such as large data synchronization jobs) are prohibited from using or waiting for this locked basic resource, thus physically cutting off the path for batch tasks to "overflow" or "compete" for resources from the real-time interaction pool.
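
The quota calculation in S301 can be illustrated with a short sketch. The idle quota is computed as total minus occupied, as described above. How the locked basic quota limits lending is not spelled out, so the sketch assumes the basic quota is never lent out even when the batch pool is using less than that amount; the function name and the demand parameter are also assumptions.

```python
def borrowable_quota(batch_total, batch_used, batch_base, demand):
    """Quota the real-time pool may borrow from the batch processing pool.

    batch_total: total quota of the batch pool
    batch_used:  quota currently occupied by batch tasks
    batch_base:  locked basic quota reserved for the batch pool
    demand:      extra quota the real-time pool is asking for
    """
    idle = batch_total - batch_used                       # idle quota per the text
    unlocked = batch_total - max(batch_used, batch_base)  # assumed: idle part above the locked base
    return max(0.0, min(demand, idle, unlocked))
```

With a total of 100, 10 in use, and a locked base of 30, at most 70 can be lent, even though 90 is idle.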

[0068] Therefore, the "borrowing" described in S301 is a protected, one-way elastic resource scheduling. It allows the real-time interaction pool to use the idle resources of the batch processing pool during peak periods, but through the "locking" mechanism, it ensures that the core task operation foundation of the batch processing pool itself is not eroded, fundamentally eliminating cross-pool interference and maintaining the effectiveness of the dual-pool logical isolation architecture.

[0069] It's important to note that this mechanism first defines clear resource scheduling trigger conditions and refined operations: when high-priority real-time interaction pool resources are scarce, the system doesn't simply reject requests. Instead, it automatically checks for idle resources in the low-priority batch processing pool and securely borrows them. This provides additional elastic resource guarantees for critical real-time services during peak business periods, effectively alleviating performance bottlenecks that might result from static resource allocation. Crucially, this "borrowing" action is not unrestrained resource misappropriation; it simultaneously implements protective measures to lock the basic quota of the batch processing pool. This design ensures that resource scheduling is unidirectional and controlled. While borrowing idle resources to address urgent real-time business needs, it fundamentally eliminates the possibility of large background tasks in the batch processing pool competing for or encroaching on real-time interaction pool resources, thus firmly upholding the bottom line of isolation that guarantees core business performance and determinism. Therefore, this mechanism achieves a balance between the often contradictory goals of "resource elasticity" and "core pool isolation," realizing intelligent and secure cross-resource pool allocation. This improves overall resource utilization while ensuring the deterministic service quality of critical businesses.

[0070] Furthermore, in the process of automatically returning the borrowed quota to the batch processing pool in response to the detection that the resource utilization rate of the real-time interaction pool has fallen back to the second preset threshold, the method flow steps of this embodiment are as follows: S401, under the condition that the resource utilization rate of the real-time interaction pool falls back to the second preset threshold, determine the quota amount to be returned to the batch processing pool, and perform the return based on the quota amount; the return process also includes releasing the borrowed quota from the quota of the real-time interaction pool and restoring it to the available resources of the batch processing pool.

[0071] In the embodiments described in this specification, this step is the closing and reset stage of the elastic scheduling loop in the "dynamic isolation of dual logical resource pools". It defines how the system automatically and accurately returns the previously borrowed resources to the batch processing pool after the resource pressure of the real-time interaction pool is relieved, so as to restore the original resource allocation pattern of the system and ensure that the task execution capability of the batch processing pool is fully restored.

[0072] This step is triggered by a specific release condition. The resource isolation control module continuously monitors and confirms that the overall resource utilization of the real-time interaction pool has decreased and stabilized below a preset safety level (i.e., the second preset threshold). Once the condition is met, the system does not arbitrarily return an estimated amount; instead, it needs to determine the quota to be returned. This is typically achieved by querying and verifying the system's "resource borrowing record." This record, generated during a previous borrowing operation (S301), clearly records the specific resource type and quantity borrowed from the batch processing pool. Based on this historical record, the system accurately calculates the quota that needs to be returned this time.

[0073] Executing the return and restoring state: based on the determined quota, the system performs the return operation, which comprises two consecutive atomic actions. First, the system deducts the corresponding amount of resources from the currently available quota of the real-time interaction pool. This signifies that the real-time interaction pool has relinquished management rights over this portion of the resources.

[0074] Next, the system adds the same amount of resources back to the available resource pool of the batch processing pool. At the same time, the "locked" state of the basic resource quota of the batch processing pool during the borrowing phase can be released or maintained according to the policy, but the core is to ensure that the returned resources can be used immediately for scheduling tasks within the batch processing pool.
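
The record-driven return in S401 can be sketched with a small ledger. The class PoolLedger and its method names are hypothetical; a real system would also record resource type and timestamps and would make the two updates atomic.

```python
class PoolLedger:
    """Tracks quotas of both pools plus the borrowing records between them."""

    def __init__(self, realtime_quota, batch_quota):
        self.realtime_quota = realtime_quota
        self.batch_quota = batch_quota
        self.loans = []  # borrowing records: amounts taken from the batch pool

    def borrow(self, amount):
        if amount > self.batch_quota:
            raise ValueError("cannot borrow more than the batch pool holds")
        self.batch_quota -= amount
        self.realtime_quota += amount
        self.loans.append(amount)

    def return_all(self):
        """Return exactly what the borrowing records say was borrowed."""
        owed = sum(self.loans)
        self.realtime_quota -= owed  # release from the real-time interaction pool
        self.batch_quota += owed     # restore to the batch processing pool
        self.loans.clear()
        return owed
```

Deriving the amount from the records, rather than from current utilization, is what makes the borrowing reversible: after return_all both pools hold their original quotas.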

[0075] The "return" described in S401 is a key operation that ensures the temporary and reversible nature of resource scheduling. It ensures that elastic borrowing does not permanently change the ownership of the resource pool, allowing background tasks in the batch processing pool to regain full resource guarantees at the appropriate time after their resources are temporarily used to support front-end business, thereby maintaining the fairness of the system's long-term operation and the determinism of resource planning.

[0076] It should be noted that this return mechanism, together with the aforementioned borrowing mechanism, constitutes a complete, closed-loop resource elastic scheduling cycle. When the business pressure on the real-time interaction pool decreases and resource utilization returns to a safe level, this mechanism is automatically triggered, accurately calculating and executing the return operation of previously borrowed resources. This process not only promptly releases the real-time interaction pool from the additional resource holdings incurred due to temporary borrowing, restoring its resource status to normal, but more importantly, through the "release" and "restore" operations, the borrowed resources are completely and accurately returned to the available resource pool of the batch processing pool. This fundamentally guarantees the temporary and reversible nature of resource scheduling, ensuring the resource integrity of the batch processing pool and the predictability of its background task execution capabilities. Therefore, this mechanism, in conjunction with the borrowing mechanism, enables on-demand and secure bidirectional flow of resources between the two pools. This ensures that elastic scheduling does not disrupt the established logical isolation and guarantee baseline between the resource pools. After successfully handling sudden loads from real-time business operations, it can automatically restore the original resource structure of the system, thereby sustainably and stably improving the overall resource utilization rate in long-term operation and guaranteeing the service quality commitments of both types of business.

[0077] Furthermore, the execution result data includes at least one of the following quantifiable data: the actual available quota data for tenants across various resource dimensions; event records of tenants triggering rate limiting, entering a queue, or having their requests rejected; resource usage distribution data of the real-time interaction pool and the batch processing pool; and logs of resource quota borrowing and reclamation behavior between the real-time interaction pool and the batch processing pool.

[0078] It should be noted that the execution result data is a structured log generated by the resource isolation control module during operation, recording its control actions and system status. The core of its implementation plan lies in definition, collection, and storage.

[0079] The actual available quota data for each tenant across all resource dimensions is calculated in real-time by the resource isolation control module. The system dynamically calculates the tenant's "actual available quota" at the current moment based on the elastic resource threshold configured for each tenant, the real-time overall quota of its resource pool, and deducting the amount of resources currently used. For example, if a tenant's CPU elastic threshold is 20%, and it has already used 5%, then its current available quota is 15%. This data needs to be continuously updated with timestamps, using tenant and resource dimensions as keys, reflecting the real-time, dynamic results after resource policy execution, rather than static configuration.
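
The worked example above (a 20% threshold with 5% used leaves 15%) can be written as a small function. Capping the result by the free share of the tenant's resource pool is an assumption about how the "real-time overall quota of its resource pool" enters the calculation; the function name is hypothetical, and values are whole percentages.

```python
def available_quota(threshold_pct, used_pct, pool_free_pct):
    """Actual available quota for one tenant and one resource dimension.

    threshold_pct: the tenant's elastic resource threshold
    used_pct:      resources the tenant currently uses
    pool_free_pct: assumed cap, the free share left in the tenant's pool
    """
    headroom = threshold_pct - used_pct
    return max(0, min(headroom, pool_free_pct))
```

For example, available_quota(20, 5, 50) gives 15, matching the figure in the text, while a nearly full pool with only 8% free would reduce the result to 8.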

[0080] The event records of tenants triggering rate limiting, entering a queue, or having their requests rejected are generated directly at runtime by the queuing and rate limiting mechanism. When the system detects that a tenant's resource consumption exceeds the limit and triggers rate limiting, a "Trigger Rate Limiting" event is generated; when a request enters the queue due to insufficient tokens, an "Enter Queuing" event is recorded; if a request times out or is directly rejected while waiting in the queue, a "Request Rejected" event is recorded. Each record must include key elements: tenant identifier, trigger time, event type (rate limiting / queuing / rejection), resource dimension involved, associated resource pool identifier, and specific request information (such as SQL ID). This provides a precise audit trail of control actions.

[0081] The resource usage distribution data for the real-time interaction pool and batch processing pool originates from the overall monitoring of the two logical resource pools by the resource acquisition module, but is aggregated and labeled by the resource isolation control module. It describes, at a given moment, the proportion of total resources in each of the real-time interaction pool and batch processing pool that are used by tenants, the proportion that are idle, and the proportion that are locked or borrowed due to elastic scheduling. This data provides a snapshot of resource distribution at the macro-pool level and is an important state view for judging pool-level load and triggering inter-pool elastic scheduling (borrowing / returning).

[0082] The log for resource quota borrowing and reclamation between the real-time interaction pool and the batch processing pool is generated by the submodule implementing dynamic scheduling between the two pools. When the system responds to the real-time interaction pool utilization reaching a threshold and executes a "borrowing" operation, a borrowing log is recorded, including the borrowing time, lender (batch processing pool), borrower (real-time interaction pool), and borrowed resource quota. Similarly, when executing a "return" operation, a corresponding reclamation log is recorded. This log comprehensively records the historical trajectory of elastic flow between resource pools and is the basis for analyzing the system's elastic scheduling activity, evaluating the effectiveness of scheduling strategies, and calculating "elastic resource costs."

[0083] It is important to note that this method systematically defines and collects a complete set of quantifiable execution result data from multiple perspectives, constructing a panoramic monitoring data foundation encompassing static resource quotas, dynamic management events, and inter-pool scheduling behavior. Specifically, the actual available quota data for tenants reflects the real-time execution status of resource isolation strategies; rate limiting, queuing, and rejection event records accurately capture the specific timing and objects of resource contention; resource pool occupancy distribution data depicts the macro-level allocation and load of global resources; and resource borrowing and reclamation behavior logs comprehensively record the historical trajectory of the elastic scheduling mechanism. The aggregation of this structured data allows the system to transcend traditional monitoring that only focuses on instantaneous resource consumption, enabling comprehensive and objective auditing and backtracking of the execution effectiveness of resource management strategies, tenant resource acquisition experience, and the actual operation of the elastic scheduling mechanism. This provides a solid, multi-dimensional data basis for subsequent in-depth root cause analysis, strategy effectiveness evaluation, and tenant behavior pattern mining, upgrading resource management from experience-based, reactive operations to a scientific decision-making process based on full-data facts that is analyzable and optimizable.

[0084] Furthermore, based on the resource usage behavior characteristics, at least one of the following features is extracted: resource consumption intensity characteristics, determined based on the frequency of events in which the tenant triggers rate limiting or queuing; resource pool preference characteristics, determined based on the ratio of the tenant's historical resource usage in the real-time interaction pool to its usage in the batch processing pool; load fluctuation characteristics, determined based on the time periods in which events where the tenant's resource consumption reaches or exceeds its elastic resource threshold are concentrated in the event log; and elastic demand characteristics, determined based on the frequency of behavior logs in which the tenant triggers resource pool borrowing or is restricted from expansion.

[0085] It should be noted that this step is the core processing step of the tenant resource profiling and analysis module, aiming to transform raw system monitoring and control logs into tenant behavior pattern tags with business semantics. The core of its implementation plan is to perform statistical analysis and pattern recognition on specific types of historical data.

[0086] The extraction of resource consumption intensity characteristics relies directly on the "event records of tenants triggering rate limiting, entering queues, or having their requests rejected" in the execution result data. Within a set analysis period (e.g., the past 24 hours or 7 days), the system filters out records belonging to the target tenant that are of the type "triggered rate limiting" or "entered queues" from the event logs. Subsequently, the total frequency of these events is calculated. A higher frequency indicates that the tenant triggers the system's protection mechanisms more frequently per unit of time, thus being marked as having higher resource consumption intensity. This reflects the urgent need and continuous pressure on resources for its business.
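The frequency calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the event tuple layout `(tenant_id, event_type, timestamp)`, the event type strings `"rate_limited"` and `"queued"`, and the function name `consumption_intensity` are hypothetical, and the result is normalized to events per hour only for readability.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple

Event = Tuple[str, str, datetime]  # (tenant_id, event_type, timestamp), assumed layout


def consumption_intensity(events: Iterable[Event], tenant_id: str, now: datetime,
                          window: timedelta = timedelta(hours=24)) -> float:
    """Return rate-limiting and queuing events per hour for one tenant."""
    counted_types = {"rate_limited", "queued"}  # rejected requests are not counted here
    start = now - window
    count = sum(
        1
        for tenant, kind, ts in events
        if tenant == tenant_id and kind in counted_types and start <= ts <= now
    )
    return count / (window.total_seconds() / 3600.0)
```

A higher return value corresponds to a tenant that triggers the protection mechanisms more often per unit of time; where the cut-off for "high intensity" lies is left to the operator.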

[0087] The extraction of resource pool preference features primarily relies on resource consumption data, combined with the associated information of resource pool identifiers. The system aggregates and calculates the total resources consumed by the target tenant in the real-time interaction pool and batch processing pool from historical resource consumption data stored in persistent storage (calculation can be performed separately or comprehensively based on core dimensions such as CPU and IO). Then, it calculates the proportion of the tenant's resource consumption in the real-time interaction pool to its total consumption. Based on the proportion (e.g., exceeding 70% indicates a "strong real-time interaction preference"), it can be determined whether the tenant is a core business user (preferring the real-time pool) or a batch operation user (preferring the batch pool).
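The proportion check above reduces to a short calculation. The sketch below assumes the usage totals have already been aggregated per pool. The 70% cut-off comes from the example in the text, while the symmetric batch cut-off, the "mixed" and "unknown" labels, and the function name `pool_preference` are assumptions added for illustration.

```python
from typing import Optional, Tuple


def pool_preference(realtime_usage: float, batch_usage: float,
                    threshold: float = 0.7) -> Tuple[Optional[float], str]:
    """Return (real-time share of total consumption, preference label)."""
    total = realtime_usage + batch_usage
    if total <= 0:
        return None, "unknown"  # no consumption recorded in either pool
    share = realtime_usage / total
    if share > threshold:
        label = "strong real-time interaction preference"
    elif share < 1 - threshold:
        label = "strong batch processing preference"  # assumed mirror of the 70% rule
    else:
        label = "mixed"
    return share, label
```

The same function can be applied once per resource dimension (for example CPU and IO separately) or to a combined total, matching the two options mentioned in the paragraph.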

[0088] The extraction of load fluctuation characteristics relies on the "tenant triggers rate limiting" event records in the execution result data, together with the elastic resource thresholds configured for the tenant. The system collects all historical event records of the tenant in which "resource consumption reached or exceeded its elastic resource threshold" and extracts the timestamp of each record. By clustering these timestamps and analyzing their distribution (e.g., the statistical distribution over a 24-hour day), the time periods in which events are concentrated can be identified. If events are clearly concentrated in a few fixed peak periods each day (e.g., 10:00 AM to 12:00 PM), the tenant can be determined to have tidal load characteristics; if the events are sparse and irregularly distributed, the tenant may be classified as a stable load type.
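One simple way to approximate the timestamp analysis above is an hour-of-day histogram. The sketch below is a simplification that stands in for the clustering step, which the text does not specify in detail. The parameters `peak_hours`, `concentration`, and `min_events`, and the function name `load_fluctuation`, are hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import List, Sequence, Tuple


def load_fluctuation(timestamps: Sequence[datetime], peak_hours: int = 3,
                     concentration: float = 0.6,
                     min_events: int = 10) -> Tuple[str, List[int]]:
    """Classify threshold-exceedance timestamps as "tidal" or "stable".

    Returns the label and, for tidal tenants, the hours of day where the
    events concentrate.
    """
    if len(timestamps) < min_events:
        return "stable", []  # too few events to show a pattern
    by_hour = Counter(ts.hour for ts in timestamps)
    top = by_hour.most_common(peak_hours)
    share = sum(count for _, count in top) / len(timestamps)
    if share >= concentration:
        return "tidal", sorted(hour for hour, _ in top)
    return "stable", []
```

With the default values, a tenant is labeled tidal when at least 60% of its exceedance events fall into its three busiest hours of the day. A production system would likely use a more robust clustering method and account for weekly patterns.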

[0089] The extraction of elastic demand characteristics relies on the "resource pool borrowing and reclamation behavior logs" in the execution result data. The system filters the entries related to the tenant from these behavior logs. The entries fall into two categories: first, logs showing that the tenant's real-time interaction pool "borrowed" resources from the batch processing pool due to high overall utilization; and second, logs showing that the tenant's requests were "restricted from expansion" when resources were scarce. The system then counts how often these behaviors occur within the analysis period. Frequent borrowing or frequent expansion restriction indicates that the tenant's business has significant elastic demand for resources and that its resource consumption fluctuates considerably.

[0090] It should be noted that this method systematically defines and associates four specific resource usage behavior characteristics with underlying observable system events and data, transforming and abstracting raw, discrete operation logs (such as rate limiting events, resource pool occupancy records, threshold exceedance records, and borrowing behavior logs) into tenant behavior pattern labels with clear business semantics. This process elevates the understanding of tenant behavior from a simple "resource consumption statistics" level to a "behavioral intent and pattern recognition" level. By analyzing event frequency (consumption intensity, elastic demand), resource distribution ratio (resource pool preference), and the temporal pattern of event occurrence (load fluctuation), the system can automatically distinguish which tenants are high-resource-consuming, which prefer real-time interaction or batch processing, which businesses exhibit regular tidal fluctuations, and which have a sustained high demand for resource elasticity. This feature profiling based on objective behavioral data provides service providers with precise tenant insights that go beyond subjective experience-based judgments. This enables subsequent operational decisions, such as resource quota pre-allocation, flexible strategy adjustments, service level agreement (SLA) customization, and capacity planning, to shift from a passive "one-size-fits-all" or "remedial" approach to a proactive and intelligent management model of "tailored solutions" and "pre-emptive prediction." Ultimately, this achieves a high degree of alignment between resource allocation and actual business needs, improving overall service efficiency and operational sophistication.

[0091] Furthermore, the method of the embodiments in this specification further includes the following step: S501, the profile tags of each tenant are fed back to the steps of configuring elastic resource thresholds for each tenant and initiating the queuing and rate limiting mechanism, so as to optimize the resource isolation control strategy.

[0092] In the embodiments described in this specification, this step is the core operation for realizing "closed-loop optimization driven by tenant behavior profiles" and is the final step in forming an intelligent management closed loop. It marks the system's transition from the "analysis and cognition" stage to the "decision execution" stage, transforming the analysis results into specific optimization actions.

[0093] This step is triggered by the tenant resource profiling and analysis module. Once this module completes the analysis of tenant historical data and generates or updates profile tags (such as "tidal load type" or "high value - core business type"), it will immediately push these tags and their corresponding tenant identifiers to the resource isolation control module via internal communication interfaces (such as message bus or API calls). The feedback is targeted and immediate, ensuring that the control module can obtain the latest tenant behavior insights.

[0094] Upon receiving a profile tag, the resource isolation control module activates its built-in policy optimizer. Based on the behavioral semantics implied by the tag, the optimizer automatically matches and triggers predefined optimization rules, thereby dynamically adjusting the resource isolation control policy for that tenant. This is not a complete redesign of the policy, but rather a personalized and adaptive adjustment of parameters based on the existing policy framework (elastic resource thresholds, queuing and rate limiting mechanisms).
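The tag-to-rule matching described above can be pictured as a lookup table of adjustment functions. The following is a minimal sketch, not the disclosed optimizer: `TenantPolicy`, its fields, the rule functions, and the specific adjustment amounts (a 20% peak boost, one priority step) are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class TenantPolicy:
    """Per-tenant parameters of the existing policy framework (assumed fields)."""
    cpu_threshold_pct: float
    queue_priority: int          # larger values are served first
    peak_boost_pct: float = 0.0  # temporary threshold increase during peak hours


def boost_peak(policy: TenantPolicy) -> TenantPolicy:
    return replace(policy, peak_boost_pct=20.0)


def raise_priority(policy: TenantPolicy) -> TenantPolicy:
    return replace(policy, queue_priority=policy.queue_priority + 1)


# Predefined optimization rules, keyed by profile tag.
RULES: Dict[str, Callable[[TenantPolicy], TenantPolicy]] = {
    "tidal load type": boost_peak,
    "high value - core business type": raise_priority,
}


def optimize(policy: TenantPolicy, tags: Iterable[str]) -> TenantPolicy:
    """Apply every rule whose tag is present; unknown tags are ignored."""
    for tag in tags:
        rule = RULES.get(tag)
        if rule is not None:
            policy = rule(policy)
    return policy
```

Each rule returns a modified copy of the existing policy rather than building a new one, which reflects the point that the optimizer adjusts parameters within the existing framework instead of redesigning the policy.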

[0095] It's important to note that this method, by establishing a closed-loop pathway that feeds tenant profile tags back to preceding resource isolation control steps, fundamentally transforms resource management from static configuration based on general rules to dynamic, personalized optimization based on the understanding of individual tenant behavior. The system first analyzes historical resource consumption and control event data to abstract tenants into profile tags with specific behavioral patterns (such as "tidal load type" and "high-value core business type"). Then, these tags, imbued with business characteristics and needs, are directly applied to adjust the two core levers of the resource isolation control strategy: elastic resource thresholds and queuing / rate limiting priorities. For example, for identified "tidal load type" tenants, resource thresholds are automatically and temporarily increased during predicted peak business periods to preventatively adapt to their increased demand; or "high-value core business type" tenants are given higher scheduling priority during resource contention to ensure the continuity of their critical business. This closed-loop mechanism ensures that resource control strategies are no longer pre-set and fixed, but rather continuously evolve and precisely adapt as the understanding of tenant behavior deepens. Ultimately, the system evolves from a passive, extensive resource-constrained framework into an intelligent agent capable of continuously learning, predicting, and proactively optimizing resource allocation. This systematically improves the rationality of resource allocation, the fairness of service experience, and the refinement and intelligence of platform operation in complex multi-tenant heterogeneous load environments.

[0096] Furthermore, optimizing the resource isolation control strategy includes at least one of the following methods: dynamically adjusting the elastic resource threshold for the corresponding tenant; and optimizing the priority of resource requests from the corresponding tenant in the queuing and rate limiting mechanism.

[0097] In the embodiments described in this specification, this step is the specific execution process by which the "strategy optimizer" adjusts the existing resource management strategy in a personalized and dynamic manner after receiving the "profile tag". Its core implementation is to transform the abstract "profile tag" into specific control parameter adjustment instructions.

[0098] For dynamically adjusting the elastic resource thresholds for corresponding tenants, the strategy optimizer triggers this optimization action when it receives a profile label such as "tidal load type" that characterizes load patterns. The specific basis for optimization is the load fluctuation characteristic data behind the label (such as historical time periods in which resource exceedance events occur in clusters). Based on the patterns identified by feature analysis, the system automatically configures a time-dimensional scheduling strategy for the tenant's elastic resource thresholds. For example, during the predicted peak business hours for the tenant (such as 10:00 AM to 12:00 PM daily), the system automatically and temporarily increases the upper limit of its CPU or IOPS threshold by a certain percentage; during its off-peak business hours (such as early morning), it automatically restores or lowers the threshold to the baseline level. This adjustment is dynamic and periodic, requiring no manual intervention.
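The time-dimensional schedule above can be sketched as a function that returns the threshold in force at a given time of day. The peak window of 10:00 to 12:00 follows the example in the text; the 20% boost, the single fixed window, and the function name `effective_threshold` are assumptions for illustration.

```python
from datetime import time


def effective_threshold(baseline: float, now: time,
                        peak_start: time = time(10, 0),
                        peak_end: time = time(12, 0),
                        boost_pct: float = 20.0) -> float:
    """Return the elastic threshold that applies at the given time of day."""
    if peak_start <= now < peak_end:
        # Temporarily raise the upper limit during the predicted peak period.
        return baseline + baseline * boost_pct / 100
    # Outside the peak period, fall back to the baseline level.
    return baseline
```

For an IOPS baseline of 1000, `effective_threshold(1000.0, time(10, 30))` yields 1200.0, and the same call at `time(3, 0)` yields 1000.0. A window that crosses midnight would need an additional branch, which this sketch omits.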

[0099] To optimize the priority of resource requests from a given tenant within the queuing and rate-limiting mechanism, this optimization action is triggered when the strategy optimizer receives a profile label such as "High Value - Core Business Type," which indicates business importance. The optimization is based on the resource pool preference characteristics (strongly biased towards real-time interaction pools) and business criticality implied by the label. The system automatically modifies the scheduling parameters for that tenant within the queuing and rate-limiting mechanism. Specifically, when a tenant's resource request enters the queue due to insufficient tokens, the system will prioritize it based on its higher assigned scheduling priority, placing it further up the queue. When new tokens are generated, they will be prioritized for allocation to such high-priority requests, significantly reducing their waiting time and ensuring the continuity of response for their core business.

[0100] It's important to note that this method achieves an intelligent evolution of resource management strategies from general, static configuration to personalized, dynamically adaptable approaches by applying tenant behavior-based profile tags to two core control levers: dynamically adjusting elastic resource thresholds and optimizing priority in queuing mechanisms. Based on tenant behavior patterns revealed by the profile tags (such as "tidal load type" and "high-value core business type"), the system proactively adjusts the resource quota limits for specific tenants (e.g., temporarily increasing thresholds during predicted peak business periods) or assigns differentiated scheduling priorities when resource requests are contested. This transforms resource isolation control strategies from fixed or passive responses based solely on instantaneous load to proactive and precise adjustments tailored to the inherent, continuous business needs and value differences of different tenants. Ultimately, this mechanism significantly improves the matching accuracy between resource allocation and actual business needs while ensuring overall system stability and fairness, thereby optimizing the service continuity experience of critical businesses and driving overall resource utilization towards greater efficiency and intelligence.

[0101] Furthermore, the method flow steps for initiating the priority-based and dynamic token bucket-based queuing and rate limiting mechanism in this embodiment are as follows: This process is the core execution logic of the resource isolation control module, which implements refined and differentiated traffic control for subsequent resource requests under specific triggering conditions (excessive tenant resource consumption or overall resource pool overload).

[0102] S601, configure an independent token bucket for each resource dimension of each tenant, wherein the capacity of the token bucket corresponds to the elastic resource threshold of the tenant in that resource dimension.

[0103] In the embodiments described in this specification, after the system configures elastic resource thresholds for a tenant (such as a CPU limit of 20% and an IOPS limit of 1000), the resource isolation control module instantiates an independent token bucket for each managed resource dimension of each tenant. The capacity of this bucket (i.e., the total number of tokens it can hold) is directly related to or proportionally mapped to the elastic resource threshold value of the corresponding dimension, thereby ensuring that the token bucket mechanism can accurately execute the resource limit policy set for the tenant.
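A minimal sketch of step S601 follows, assuming the thresholds arrive as a nested mapping of tenant to resource dimension to limit. The `TokenBucket` class, the `scale` factor used for the proportional mapping, and the dimension names `cpu_pct` and `iops` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class TokenBucket:
    capacity: float  # maximum number of tokens the bucket can hold
    tokens: float    # tokens currently available


def build_buckets(thresholds: Dict[str, Dict[str, float]],
                  scale: float = 1.0) -> Dict[Tuple[str, str], TokenBucket]:
    """Create one bucket per (tenant, resource dimension) pair.

    Capacity is the elastic resource threshold multiplied by an assumed
    proportionality factor; each bucket starts full.
    """
    buckets: Dict[Tuple[str, str], TokenBucket] = {}
    for tenant, dimensions in thresholds.items():
        for dimension, limit in dimensions.items():
            capacity = limit * scale
            buckets[(tenant, dimension)] = TokenBucket(capacity, capacity)
    return buckets
```

Keying the buckets by the pair of tenant and dimension keeps CPU and IOPS limits independent, so exhausting one dimension does not consume tokens from the other.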

[0104] S602, dynamically adjust the token generation rate of the token bucket based on the tenant's historical resource consumption peak and the real-time utilization rate of the resource pool.

[0105] In the embodiments described in this specification, this step is the core of the "dynamic" feature. The system continuously acquires two types of data from the resource acquisition module: first, the tenant's historical peak resource consumption (from long-term monitoring data); and second, the real-time utilization rate of the resource pool (real-time interaction pool or batch processing pool) to which the tenant belongs. A built-in adjustment algorithm (such as part of a policy optimizer) calculates and updates the token generation rate of the corresponding token bucket in real time based on these inputs. For example, when the resource pool utilization is low and there are many idle resources, the token generation rate can be appropriately increased to allow for faster request processing; conversely, when the resource pool is approaching saturation, the generation rate is reduced to tighten traffic.
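The text does not give the adjustment algorithm itself, so the following is only one possible sketch of step S602. It scales the tenant's historical peak by a factor that falls linearly as pool utilization rises; the breakpoints `low` and `high`, the factor range, and the function name `token_rate` are all assumptions.

```python
def token_rate(historical_peak: float, pool_utilization: float,
               low: float = 0.5, high: float = 0.9,
               min_factor: float = 0.25, max_factor: float = 1.0) -> float:
    """Return a token generation rate (tokens per second).

    Below `low` utilization the full historical peak is allowed; above
    `high` the rate is cut to `min_factor` of the peak; in between the
    factor is interpolated linearly.
    """
    if not 0.0 <= pool_utilization <= 1.0:
        raise ValueError("pool_utilization must be between 0 and 1")
    if pool_utilization <= low:
        factor = max_factor
    elif pool_utilization >= high:
        factor = min_factor
    else:
        span = (pool_utilization - low) / (high - low)
        factor = max_factor - span * (max_factor - min_factor)
    return historical_peak * factor
```

Calling this function on each monitoring tick and writing the result into the tenant's bucket reproduces the behavior described above: faster token generation when the pool is idle and slower generation as the pool approaches saturation.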

[0106] S603, when a tenant initiates a resource request, the request needs to obtain a token from the corresponding token bucket.

[0107] S604, if the acquisition is successful, the corresponding operation is allowed to be executed.

[0108] In the embodiments described in this specification, when a tenant initiates a resource request (such as executing a SQL statement), the system first attempts to obtain a token from the token bucket of the corresponding resource dimension. This is an immediate check. If the token is successfully obtained, it indicates that the tenant's resource consumption at the current moment is still within its elastic quota, and the system immediately allows the request to execute the corresponding database operation.

[0109] S605, if the acquisition fails, the resource request is placed in a queue, and the resource requests in the queue are sorted according to their preset priorities.

[0110] In the embodiments described in this specification, if obtaining a token fails, it indicates that the tenant has instantly exhausted its quota in the current resource dimension. In this case, the system will not abruptly reject the request, but will instead place it in a queue to wait. When placing the request in the queue, the system will determine its order position in the queue based on the preset priority label carried by the request (for example, a request labeled "payment confirmation" has a higher priority than "data export"), thereby forming a priority-ordered waiting sequence.

[0111] S606, when the token bucket generates a new token according to the token generation rate, it is preferentially allocated to the resource request with the highest priority in the queue.

[0112] In the embodiments described in this specification, the token bucket continuously generates new tokens according to the token generation rate set in S602. Once a new token becomes available, the system checks the queue and prioritizes assigning it to the highest-priority resource request within it. Once the request obtains the token, it is allowed to execute and removed from the queue. This ensures that high-priority service requests experience lower latency in resource contention scenarios.
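Steps S603 to S606 can be sketched together as a token bucket with a priority-ordered waiting queue. This is a minimal single-threaded illustration, not the disclosed implementation: the class name `PriorityTokenBucket`, the one-token-per-request cost, and the explicit `refill` call that stands in for a timer are assumptions.

```python
import heapq
import itertools
from typing import List, Tuple


class PriorityTokenBucket:
    def __init__(self, capacity: float, rate: float) -> None:
        self.capacity = capacity
        self.rate = rate              # tokens generated per second (set in S602)
        self.tokens = capacity        # bucket starts full
        self._queue: List[Tuple[int, int, str]] = []
        self._seq = itertools.count() # tie-breaker: arrival order within a priority

    def request(self, name: str, priority: int) -> bool:
        """S603-S605: take a token if available, otherwise queue by priority."""
        if self.tokens >= 1:
            self.tokens -= 1
            return True               # S604: the operation may execute now
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._queue, (-priority, next(self._seq), name))
        return False                  # S605: the request waits in the queue

    def refill(self, elapsed_seconds: float) -> List[str]:
        """S606: add tokens and hand them to the highest-priority waiters."""
        self.tokens = min(self.capacity, self.tokens + self.rate * elapsed_seconds)
        released = []
        while self._queue and self.tokens >= 1:
            _, _, name = heapq.heappop(self._queue)
            self.tokens -= 1
            released.append(name)
        return released
```

With a bucket of capacity 1, a queued "payment confirmation" request with priority 9 is released before a "data export" request with priority 1, even if the export request arrived first. A concurrent implementation would also need locking around the token count and the queue.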

[0113] It's important to note that this queuing and rate limiting mechanism achieves fine-grained and precise resource management by independently configuring token buckets for each resource dimension of each tenant and mapping bucket capacity to elastic resource thresholds. Its core innovation lies in combining dynamic rate control with static priority scheduling, forming an intelligent and differentiated traffic shaping and queue management strategy. The mechanism dynamically adjusts the token generation rate based on historical load and real-time system utilization, enabling traffic control to adapt to changes in overall system load and smooth out sudden traffic surges. When a request cannot immediately obtain a token, it is not simply rejected but instead placed in a queue ordered by preset priority. This design ensures that high-priority business requests (such as real-time transactions) receive priority waiting positions in the queue during resource-scarce periods. Finally, when system resources (in token form) become available again, they are allocated strictly according to the priority order in the queue, thus guaranteeing the access latency and service continuity of high-priority requests. Therefore, while effectively preventing system overload and achieving graceful rate limiting, this mechanism solves the problem that the traditional token bucket algorithm's "first-come, first-served" mode cannot distinguish the importance of businesses by introducing priority scheduling. This enables critical businesses to still obtain deterministic service quality assurance in resource contention scenarios, achieving a balance between the fairness of resource isolation and the differentiation of business assurance.

[0114] Figure 2 is a schematic diagram of a tenant profile generation device for a cloud-native database provided by one or more embodiments of this specification. The device includes: at least one processor and a bus; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to: collect, in real time, multi-dimensional resource consumption data of each tenant running a cloud-native database in a multi-tenant environment, and associate the resource consumption data with the resource pool identifier to which the resource consumption request belongs, wherein the resource pool identifier includes at least a real-time interaction pool for high-priority real-time interactive services and a batch processing pool for low-priority batch processing tasks; configure elastic resource thresholds for each tenant based on the resource consumption data, and logically partition and manage the overall resources of the cloud-native database according to the resource pool identifier to generate the real-time interaction pool and the batch processing pool; in response to detecting that the resource utilization rate of the real-time interaction pool reaches a first preset threshold, automatically borrow quotas from the idle resources of the batch processing pool to ensure the execution of real-time interaction services; in response to detecting that the resource utilization rate of the real-time interaction pool falls back to a second preset threshold, automatically return the borrowed quotas to the batch processing pool; when it is detected that the resource consumption of a specified tenant reaches or exceeds the corresponding elastic resource threshold, or that the overall utilization rate of the resource pool reaches a third preset threshold, initiate a queuing and rate limiting mechanism based on priority and a dynamic token bucket for the subsequent resource requests of the specified tenant, to obtain execution result data; and analyze the resource consumption data and the execution result data to extract the resource usage behavior characteristics of each tenant, and generate a profile tag for each tenant based on the resource usage behavior characteristics.

[0115] The various embodiments in this specification are described in a progressive manner. Similar or identical parts between embodiments can be referred to mutually. Each embodiment focuses on describing the differences from other embodiments. In particular, the embodiments of apparatus, devices, and non-volatile computer storage media are basically similar to the method embodiments, so the descriptions are relatively simple; relevant parts can be referred to the descriptions of the method embodiments.

[0116] The various embodiments in this specification are described in a progressive manner. Similar or identical parts between embodiments can be referred to mutually. Each embodiment focuses on describing the differences from other embodiments. In particular, the apparatus embodiments are basically similar to the method embodiments, so the description is relatively simple; relevant parts can be referred to the descriptions of the method embodiments.

[0117] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0118] In the embodiments provided in this application, it should be understood that the disclosed apparatus / network devices and methods can be implemented in other ways. For example, the apparatus / network device embodiments described above are merely illustrative. For instance, the division of modules or units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical, or other forms.

[0119] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0120] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The aforementioned units can be implemented in hardware or software.

[0121] If the integrated module / unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments can also be implemented by a computer program instructing related hardware. The computer program can be stored in a computer-readable storage medium, and when executed by a processor, it can implement the steps of the various method embodiments described above. The computer program includes computer program code, which can be in the form of source code, object code, executable files, or certain intermediate forms. The computer-readable medium can include: any entity or device capable of carrying the computer program code, recording media, USB flash drives, portable hard drives, magnetic disks, optical disks, computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signals, telecommunication signals, and software distribution media, etc. It should be noted that the content included in the computer-readable medium can be appropriately added or removed according to the requirements of legislation and patent practice in the jurisdiction. For example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.

[0122] The above-described embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application, and should all be included within the protection scope of this application.

Claims

1. A tenant profiling method of a cloud-native database, characterized in that the method includes: collecting, in real time, multi-dimensional resource consumption data of each tenant running a cloud-native database in a multi-tenant environment, and associating the resource consumption data with the resource pool identifier to which the resource consumption request belongs, wherein the resource pool identifier includes at least a real-time interaction pool for high-priority real-time interactive services and a batch processing pool for low-priority batch processing tasks; configuring elastic resource thresholds for each tenant based on the resource consumption data, and logically partitioning and managing the overall resources of the cloud-native database according to the resource pool identifier to generate the real-time interaction pool and the batch processing pool; in response to detecting that the resource utilization rate of the real-time interaction pool reaches a first preset threshold, automatically borrowing quotas from the idle resources of the batch processing pool to ensure the execution of real-time interaction services; in response to detecting that the resource utilization rate of the real-time interaction pool falls back to a second preset threshold, automatically returning the borrowed quotas to the batch processing pool; when it is detected that the resource consumption of a specified tenant reaches or exceeds the corresponding elastic resource threshold, or that the overall utilization rate of the resource pool reaches a third preset threshold, initiating a queuing and rate limiting mechanism based on priority and a dynamic token bucket for the subsequent resource requests of the specified tenant, to obtain execution result data; and analyzing the resource consumption data and the execution result data to extract the resource usage behavior characteristics of each tenant, and generating a profile tag for each tenant based on the resource usage behavior characteristics.

2. The method of claim 1, wherein the configuration of elastic resource thresholds for each tenant includes: configuring differentiated elastic resource thresholds for the same resource dimension of the same tenant based on different time dimensions.

3. The method of claim 1, wherein the step of automatically borrowing quotas from the idle resources of the batch processing pool in response to detecting that the resource utilization rate of the real-time interaction pool reaches the first preset threshold includes: when the resource utilization rate of the real-time interaction pool reaches the first preset threshold, determining the idle resource quota in the batch processing pool, and performing borrowing based on the idle resource quota; wherein the borrowing process also includes locking the preset basic resource quota in the batch processing pool to prevent batch processing tasks from overflowing into the real-time interaction pool.

4. The method of claim 1, wherein the step of automatically returning the borrowed quota to the batch processing pool in response to detecting that the resource utilization rate of the real-time interaction pool falls back to the second preset threshold includes: when the resource utilization rate of the real-time interaction pool falls back to the second preset threshold, determining the quota amount to be returned to the batch processing pool, and performing the return based on the quota amount; wherein the return process also includes releasing the borrowed quota from the quota of the real-time interaction pool and restoring it to the available resources of the batch processing pool.

5. The method of claim 1, wherein the execution result data includes at least one of the following quantifiable data: the actual available quota data for tenants across various resource dimensions; event records of a tenant triggering rate limiting, entering a queue, or having a request rejected; resource usage distribution data of the real-time interaction pool and the batch processing pool; and the log of resource quota borrowing and reclamation behavior between the real-time interaction pool and the batch processing pool.

6. The method of claim 5, wherein extracting the resource usage behavior characteristics includes extracting at least one of the following features: a resource consumption intensity feature, determined from the frequency of events in which the tenant triggers rate limiting or queuing; a resource pool preference feature, determined from the ratio of the tenant's historical resource usage distribution data between the real-time interaction pool and the batch processing pool; a load fluctuation feature, determined from the time periods of the events, within the event log set, in which the tenant's resource consumption reaches or exceeds its elastic resource threshold; and an elastic demand feature, determined from the frequency of log entries in which the tenant's behavior triggers resource pool borrowing or is restricted from expansion.
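
As a rough sketch of how the four features in claim 6 could be derived from an event log, consider the function below. The event kinds, the `(kind, hour)` tuple layout, and the output field names are hypothetical; the claim only states which data each feature is based on, not how it is computed.

```python
from collections import Counter


def extract_features(events, rt_usage, batch_usage):
    """Derive per-tenant features from a list of (kind, hour) event tuples.

    Assumed event kinds: "rate_limited", "queued", "threshold_exceeded",
    "borrow", "expansion_denied".
    """
    kinds = Counter(kind for kind, _ in events)
    total_usage = rt_usage + batch_usage
    return {
        # Frequency of rate-limiting and queuing events.
        "consumption_intensity": kinds["rate_limited"] + kinds["queued"],
        # Share of historical usage that landed in the real-time pool.
        "rt_pool_preference": rt_usage / total_usage if total_usage else 0.0,
        # Hours of day in which the elastic threshold was reached or exceeded.
        "fluctuation_hours": sorted(
            {hour for kind, hour in events if kind == "threshold_exceeded"}
        ),
        # Frequency of borrowing or denied-expansion log entries.
        "elastic_demand": kinds["borrow"] + kinds["expansion_denied"],
    }
```

A profile tag could then be assigned by applying cut-offs to these values, although the source does not state which cut-offs are used.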

7. The method of claim 1, wherein the method further includes: feeding the profile tags of each tenant back to the steps of configuring elastic resource thresholds for each tenant and initiating the queuing and rate limiting mechanism, so as to optimize the resource isolation control strategy.

8. The method of claim 7, wherein optimizing the resource isolation control strategy includes at least one of the following: dynamically adjusting the elastic resource threshold of the corresponding tenant; and adjusting the priority of the corresponding tenant's resource requests in the queuing and rate limiting mechanism.

9. The method of claim 1, wherein initiating the queuing and rate limiting mechanism based on priority and dynamic token buckets includes: configuring an independent token bucket for each resource dimension of each tenant, the capacity of the token bucket corresponding to the tenant's elastic resource threshold in that resource dimension; dynamically adjusting the token generation rate of the token bucket based on the tenant's historical peak resource consumption and the real-time utilization rate of the resource pool; when a tenant initiates a resource request, requiring the request to obtain a token from the corresponding token bucket; if the token is obtained, executing the corresponding operation; if the token is not obtained, placing the resource request in a queue in which resource requests are sorted by their preset priority; and when the token bucket generates new tokens at the token generation rate, allocating them preferentially to the highest-priority resource requests in the queue.
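
The token bucket with priority queuing described in claim 9 might look like the sketch below, with one `TenantBucket` per tenant and resource dimension. The class name, the tick-based refill, and especially the rate formula in `adjust_rate` are assumptions made for illustration; the claim names the inputs to the rate adjustment but not the formula.

```python
import heapq
import itertools


class TenantBucket:
    def __init__(self, capacity, rate):
        self.capacity = capacity   # elastic resource threshold for this dimension
        self.rate = rate           # tokens generated per tick
        self.tokens = capacity
        self._queue = []           # min-heap of (-priority, arrival order, request)
        self._seq = itertools.count()

    def adjust_rate(self, historical_peak, pool_utilization):
        # Assumed formula: scale the tenant's historical peak by the pool's headroom.
        self.rate = historical_peak * max(0.0, 1.0 - pool_utilization)

    def request(self, name, priority):
        """Return True if a token was obtained; otherwise queue the request."""
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        heapq.heappush(self._queue, (-priority, next(self._seq), name))
        return False

    def tick(self):
        """Generate tokens, then serve queued requests from highest priority down."""
        self.tokens = min(self.capacity, self.tokens + self.rate)
        served = []
        while self._queue and self.tokens >= 1:
            _, _, name = heapq.heappop(self._queue)
            self.tokens -= 1
            served.append(name)
        return served
```

Negating the priority turns Python's min-heap into a highest-priority-first queue, and the arrival counter keeps requests of equal priority in first-come order.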

10. A tenant profile generation device for a cloud-native database, comprising: at least one processor and a bus; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor which, when executed by the at least one processor, enable the at least one processor to: collect, in real time, multi-dimensional resource consumption data of each tenant in a multi-tenant environment while the cloud-native database is running, and associate the resource consumption data with the resource pool identifier to which the resource consumption request belongs, the resource pool identifier including at least a real-time interaction pool for high-priority real-time interactive services and a batch processing pool for low-priority batch processing tasks; configure elastic resource thresholds for each tenant based on the resource consumption data, and logically partition and manage the overall resources of the cloud-native database according to the resource pool identifier to generate the real-time interaction pool and the batch processing pool; in response to detecting that the resource utilization rate of the real-time interaction pool reaches the first preset threshold, automatically borrow quotas from the idle resources of the batch processing pool to ensure the execution of real-time interaction services; in response to detecting that the resource utilization rate of the real-time interaction pool falls back to the second preset threshold, automatically return the borrowed quotas to the batch processing pool; when it is detected that the resource consumption of a specified tenant reaches or exceeds the corresponding elastic resource threshold, or that the overall utilization rate of the resource pool reaches the third preset threshold, initiate a queuing and rate limiting mechanism based on priority and dynamic token buckets for the subsequent resource requests of the specified tenant to obtain execution result data; and analyze the resource consumption data and the execution result data to extract the resource usage behavior characteristics of each tenant, and generate a profile tag for each tenant based on the resource usage behavior characteristics.