A database write amplification optimization method based on frozen page identification and isolation
By identifying and isolating frozen pages in the database, partitioning pages according to their popularity information and lifecycle, and combining this with a trained model, the method reduces write amplification in the database and thereby improves database performance.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HUAZHONG UNIV OF SCI & TECH
- Filing Date
- 2025-04-17
- Publication Date
- 2026-06-19
Figure CN120429282B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of database and data storage technology, and more specifically, relates to a database write amplification optimization method based on frozen page identification and isolation.
Background Technology
[0002] Solid-State Drives (SSDs) based on NAND flash memory have seen continuous performance improvements and cost reductions, making them the primary persistent storage medium for database systems. With the growth in database size and transaction processing frequency, the I/O workload of modern database systems typically involves a large number of read and write operations. When a database runs on a NAND flash-based SSD, write operations cause a common and serious problem: write amplification. The write amplification factor (WAF) is defined as the ratio of the amount of physical data actually written to the underlying flash media to the amount of user data written by the host system (i.e., the amount of data written by the database). A high write amplification factor shortens the lifespan of the SSD and leads to limited SSD bandwidth, unpredictable performance, and high tail latency, which further degrades database performance. According to relevant research, continuous TPC-C testing on SSDs can cause the SSD's write amplification factor to increase to more than four times its initial value. The continuously increasing WAF reduces the number of I/O operations per second (IOPS) and causes the number of transactions per minute (tpmC) to fall to one-third of its initial value.
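The WAF definition above can be written as a one-line calculation. The following is a minimal sketch; the function name and the byte counts in the usage lines are hypothetical illustrations, not measurements from the source.

```python
def write_amplification_factor(flash_bytes_written: int, host_bytes_written: int) -> float:
    """WAF = bytes physically written to flash / bytes written by the host (the database)."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return flash_bytes_written / host_bytes_written


# Hypothetical numbers: the host wrote 100 GiB, while garbage collection
# caused the device to write 250 GiB to flash in total.
GIB = 1024 ** 3
waf = write_amplification_factor(250 * GIB, 100 * GIB)
print(waf)  # 2.5
```

A WAF of 1.0 would mean that garbage collection copied no valid pages at all; every relocated valid page pushes the value above 1.0.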
[0003] The physical characteristics of NAND flash memory prevent it from being updated in place like traditional hard drives. Before new data can be written to flash pages (the small read/write units), the flash must first be erased at the larger "erase block" granularity (also known as a superblock). Due to this "erase before write" limitation, SSDs implement a complex, host-invisible FTL (Flash Translation Layer) module to translate in-place modifications into append writes and to maintain the mapping between logical page addresses and physical page addresses. When available space is insufficient, the SSD needs to perform garbage collection: it copies the valid pages of a superblock to a new location before erasing the superblock and writing new data, which causes write amplification in the SSD.
[0004] In databases based on B+ tree index structures, the properties of "write skew" and "temporal locality" often result in a high degree of hot-cold data mixing. Because SSDs write data that arrives at the same time to the same erase block (regardless of whether the data is hot or cold), frequently updated hot data and long-unchanged cold data may be mixed and stored in the same erase block. During garbage collection, the SSD must relocate the valid cold data to free up space for writing new data. This mixing of hot and cold data further exacerbates the write amplification problem.
[0005] A key technique for reducing write amplification caused by garbage collection is to isolate data based on its expiration time, writing pages with similar expiration times into the same erase block so they expire around the same time, reducing the proportion of valid pages within a block during garbage collection and thus lowering write amplification. However, existing solutions (rule-based data placement schemes) are designed for general block I/O loads and lack specific adaptation to the characteristics of database operation. Specifically, database loads suffer from the "frozen page" problem, where a large number of pages are frequently rewritten to the SSD within a short time window, but will not be updated after this window. These pages that will not be updated are called "frozen pages." Existing solutions rely on page popularity for placement, and even after a page becomes frozen, it is still partitioned and placed according to its popularity, resulting in unnecessary migration. In addition, existing technologies lack effective methods for identifying frozen pages.
Summary of the Invention
[0006] In view of the above-mentioned defects or improvement needs of the existing technology, the present invention provides a database write amplification optimization method based on frozen page identification and isolation. Its purpose is to reduce write amplification when the database is stored on SSD, and at the same time reduce unnecessary page migration, so as to improve database performance.
[0007] To achieve the above objectives, this invention provides a database write amplification optimization method based on frozen page identification and isolation, comprising:
[0008] S1. During database operation, record the popularity information of each page in the corresponding page;
[0009] S2. Divide the different partitions of ZNS SSD into frozen partitions and popularity type partitions; when a page is evicted from the database, adopt a rule-based data placement scheme to write the evicted page into the corresponding popularity type partition according to its popularity.
[0010] S3. When the ZNS SSD is running out of space, the ZNS SSD performs a garbage collection operation, reading the valid pages written to the popularity type partitions into memory;
[0011] S4. Input the popularity information of each valid page read into memory into the trained frozen page recognition model. The model outputs whether each page is a frozen page. If it is a frozen page, write the page directly into the frozen partition. Otherwise, adopt a rule-based data placement scheme to write the page into the corresponding popularity type partition or frozen partition.
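The garbage-collection step S4 can be sketched as a small dispatch routine. This is an illustrative sketch only: `place_on_gc`, the partition labels, and the two callables (stand-ins for the trained frozen page recognition model and for a rule-based scheme such as SepBIT or DAC) are hypothetical names, not part of the source.

```python
from typing import Callable, Dict, List, Sequence

FROZEN = "frozen"  # label of the frozen partition (illustrative name)


def place_on_gc(
    valid_pages: Sequence[dict],
    is_frozen: Callable[[dict], bool],
    rule_based_place: Callable[[dict], str],
) -> Dict[str, List[int]]:
    """Return a mapping from partition label to the page ids relocated there."""
    placement: Dict[str, List[int]] = {}
    for page in valid_pages:
        # S4: pages predicted frozen bypass the rule-based scheme entirely;
        # all other pages follow the rule-based placement.
        target = FROZEN if is_frozen(page) else rule_based_place(page)
        placement.setdefault(target, []).append(page["id"])
    return placement


# Stand-ins for the trained model and for the rule-based scheme (demo assumptions).
pages = [{"id": 1, "writes": 0}, {"id": 2, "writes": 7}, {"id": 3, "writes": 0}]
result = place_on_gc(
    pages,
    is_frozen=lambda p: p["writes"] == 0,
    rule_based_place=lambda p: "popularity-3",
)
print(result)  # {'frozen': [1, 3], 'popularity-3': [2]}
```

Passing the model and the placement rule as callables mirrors the statement that the isolation mechanism is layered on top of an existing rule-based scheme rather than replacing it.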
[0012] Furthermore, the rule-based data placement scheme is SepBIT, a data isolation method based on block expiration time; SepBIT divides the ZNS SSD into six categories of partitions from high to low based on the popularity value, and sets the sixth category partition as the frozen partition, while the first to fifth categories of partitions are the popularity type partitions.
[0013] In S2, SepBIT is used to write pages that will be evicted from the database into corresponding popularity type partitions based on their popularity, including:
[0014] S21. Calculate the lifecycle of each page evicted from the database, where the lifecycle of the page first written to ZNS SSD is set to infinity; and dynamically maintain a global lifecycle threshold lifespan_threshold based on the lifecycle of each page written to ZNS SSD within a preset time.
[0015] S22. If the lifespan of the page currently being evicted from the database is less than lifespan_threshold, then write the page to the first type of partition; otherwise, write the page to the second type of partition.
[0016] Correspondingly, in S4, SepBIT is used to write the page to the corresponding popularity type partition, including:
[0017] If the page comes from the first type of partition, it is written to the third type of partition; if the page comes from the second type of partition, the page age AGE is calculated; the page age AGE is the difference between the time when the ZNS SSD performs garbage collection and the time when the page was last written to the ZNS SSD.
[0018] If AGE < lifespan_threshold * 4, then write it to the fourth partition, where * represents a multiplication operation; if lifespan_threshold * 4 ≤ AGE < lifespan_threshold * 16, then write it to the fifth partition; if AGE ≥ lifespan_threshold * 16, then write it to the frozen partition.
[0019] Furthermore, the rule-based data placement scheme is the Dynamic Data Clustering Method (DAC). DAC divides the ZNS SSD into six partitions from low to high popularity value, and sets the first partition as the frozen partition, while the second to sixth partitions are popularity type partitions.
[0020] In S2, DAC is used to write pages that will be evicted from the database into corresponding popularity type partitions based on their popularity, including:
[0021] Each page written to ZNS SSD maintains a popularity value. The popularity value of a page written to ZNS SSD for the first time is 2. The popularity value of a page is incremented by one each time it is written to ZNS SSD, and decremented by one each time a page is garbage collected. For each page written to ZNS SSD, the page with a popularity value of i is written to the i-th partition, and its popularity value is incremented by one; where i∈{2,3,4,5,6}.
[0022] Correspondingly, in S4, DAC is used to write the page to the corresponding popularity type partition, including:
[0023] Based on the page's popularity value, write the page with a popularity value of i to the i-th partition and decrement its popularity value by one.
[0024] Furthermore, the trained frozen page recognition model is obtained in the following way:
[0025] A training sample set is constructed, and the frozen page recognition model is trained using the training sample set to obtain the trained frozen page recognition model; wherein, the training samples in the training sample set are the popularity information of the page when it is currently written to ZNS SSD and the popularity information of the page when it was last written to ZNS SSD, and the label is whether the page is a frozen page.
[0026] Furthermore, the construction of the training sample set includes:
[0027] Perform database load testing, recording the heat information of each page when it is written to ZNS SSD, and the expiration time of the page after it is written to ZNS SSD. If the page will not be written to SSD again, the expiration time is a predefined invalid value. The heat information includes: the time WT when the page is written to ZNS SSD, the amount of valid data on the page VD, the average read interval AI of the page within time T, the average write interval MI of the page within time T, the number of reads AC of the page within time T, and the number of writes MC of the page within time T. The time T is the time from when the page is read into memory to when the page is written to ZNS SSD.
[0028] The page is determined as frozen based on the expiration time: if the expiration time is invalid, the page is frozen; otherwise, it is a normal page. Each page is tagged as frozen based on this.
[0029] After normalizing the popularity information of each page, the popularity information HR_current when each page is currently written to ZNS SSD and the popularity information HR_last when the page was last written to ZNS SSD are extracted as training samples to obtain the training sample set; wherein, HR_last when the page is first written to ZNS SSD is 0.
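As an illustration of the six-feature heat record (WT, VD, AI, MI, AC, MC) and its normalization, the sketch below derives a record from per-page access timestamps. Several details are assumptions that the source does not specify: the average interval is taken as the mean gap between consecutive accesses, it falls back to the window length T when fewer than two accesses exist, and normalization is column-wise min-max scaling. All names are hypothetical.

```python
from dataclasses import astuple, dataclass
from typing import List, Sequence


@dataclass
class HeatRecord:
    wt: float  # WT: time the page is written to the ZNS SSD
    vd: float  # VD: amount of valid data on the page
    ai: float  # AI: average read interval within T
    mi: float  # MI: average write interval within T
    ac: float  # AC: number of reads within T
    mc: float  # MC: number of writes within T


def mean_interval(times: Sequence[float], window: float) -> float:
    """Mean gap between consecutive events; falls back to the window length
    when there are fewer than two events (an assumption, not from the source)."""
    if len(times) < 2:
        return window
    ordered = sorted(times)
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def build_record(load_time: float, write_time: float, valid_bytes: float,
                 reads: Sequence[float], writes: Sequence[float]) -> HeatRecord:
    window = write_time - load_time  # T: from read into memory to write to the SSD
    return HeatRecord(write_time, valid_bytes,
                      mean_interval(reads, window), mean_interval(writes, window),
                      len(reads), len(writes))


def min_max_normalize(records: Sequence[HeatRecord]) -> List[List[float]]:
    """Scale every feature to [0, 1] column by column (assumed normalization)."""
    rows = [astuple(r) for r in records]
    lows = [min(col) for col in zip(*rows)]
    highs = [max(col) for col in zip(*rows)]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)] for row in rows]


r1 = build_record(0.0, 10.0, 4096, reads=[1.0, 3.0, 7.0], writes=[2.0, 8.0])
r2 = build_record(5.0, 30.0, 8192, reads=[6.0], writes=[10.0, 20.0, 29.0])
print(r1.ai, r1.mi)  # 3.0 6.0
```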
[0030] Furthermore, the frozen page identification model is a logistic regression model.
[0031] Furthermore, the frozen page recognition model is a long short-term memory neural network model.
[0032] The present invention also provides a database write amplification optimization system based on frozen page identification and isolation, including a computer-readable storage medium and a processor;
[0033] The computer-readable storage medium is used to store executable instructions;
[0034] The processor is used to read executable instructions stored in the computer-readable storage medium and execute the database write amplification optimization method based on frozen page identification and isolation described above.
[0035] The present invention also provides a computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the database write amplification optimization method based on frozen page identification and isolation as described in any of the preceding claims.
[0036] The present invention also provides a computer program product, including a computer program that, when the computer program is run on a computer, causes the computer to execute the database write amplification optimization method based on frozen page identification and isolation as described above.
[0037] In summary, the above-described technical solutions conceived in this invention can achieve the following beneficial effects:
[0038] (1) The database write amplification optimization method based on frozen page identification and isolation of the present invention divides different partitions of ZNS SSD into frozen partitions for storing frozen pages and hot-type partitions for storing normal pages. When a page is written to the SSD, it is first written to the corresponding hot-type partition of ZNS SSD according to the existing rule-based data placement scheme. When the SSD performs garbage collection, it uses a frozen page identification model to identify valid pages in memory based on the page's hotness information. If it is identified as a frozen page, it is directly placed in the frozen partition dedicated to frozen pages. Otherwise, it is placed in the corresponding hot-type partition of ZNS SSD according to the existing rule-based data placement scheme. In this way, effective frozen page isolation is achieved, unnecessary page migration is avoided, and the write amplification caused by garbage collection is reduced while improving database performance.
[0039] (2) Preferably, the database write amplification optimization method based on frozen page identification and isolation in the embodiments of the present invention improves the existing rule-based data placement schemes SepBIT and DAC, realizes active isolation and placement of frozen pages, and seamlessly integrates with the rule-based data placement scheme, which can effectively reduce write amplification and improve database performance.
[0040] (3) Furthermore, in the process of constructing the training sample set, the present invention collects six feature variables that are strongly correlated with frozen pages to form the heat information, so that the constructed data samples fully reflect the characteristics of each page. Moreover, the page's expiration time determines whether the page has become a frozen page, so the page features can be fully used to identify frozen pages effectively.
[0041] (4) As a preferred option, the logistic regression model is used as the frozen page identification model. As a low-overhead and efficient classification algorithm model, the logistic regression model can efficiently classify frozen pages.
[0042] In summary, the learning-based frozen page identification and isolation scheme (SepFrozen) of this invention, based on an effective frozen page identification model, uses this model to extract frozen pages from valid pages during system garbage collection and places them in an isolated area to avoid mixing with normal pages. This reduces write amplification when the database is stored on SSD and improves database performance.
Attached Figure Description
[0043] Figure 1 is a schematic diagram of the frozen page isolation mechanism provided in an embodiment of the present invention;
[0044] Figure 2 is a schematic diagram illustrating a data placement method for database frozen page awareness provided in an embodiment of the present invention;
[0045] Figure 3 is a schematic diagram illustrating another data placement method for database frozen page awareness provided in an embodiment of the present invention;
[0046] Figure 4 is a schematic diagram of the structure of a learning-based database frozen page recognition model provided in an embodiment of the present invention.
Detailed Implementation
[0047] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the invention. Furthermore, the technical features involved in the various embodiments of this invention described below can be combined with each other as long as they do not conflict with each other.
[0048] In this invention, the terms "first," "second," etc., used in the invention and accompanying drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.
[0049] Example 1
[0050] As shown in Figure 1, this embodiment of the invention provides a database write amplification optimization method based on frozen page identification and isolation, mainly including:
[0051] S1. During database operation, the popularity information of each page is recorded in the corresponding page; in this embodiment of the invention, the database is a database based on a B+ tree index structure.
[0052] S2. Divide the different partitions of ZNS SSD (Zoned Namespace SSD) into frozen partitions for storing frozen pages and hot-type partitions for storing normal pages; when a page is evicted from the database, a rule-based data placement scheme (such as SepBIT, DAC) is adopted to write the evicted page into the corresponding hot-type partition of ZNS SSD according to its hotness.
[0053] S3. When the SSD runs out of space, the SSD performs garbage collection, reading the valid pages written to the different popularity type partitions of the ZNS SSD into memory. A valid page stores the data content that the system currently recognizes, in contrast to an invalid page, which has been overwritten or deleted. When the system updates data, the new data is written to a new physical page, and the page in the original storage location is automatically marked as invalid. The page newly written to the SSD is a valid page. A page that is never written to the SSD again after it has been written also remains a valid page.
[0054] S4. Input the heat information of each valid page read into memory into the trained frozen page recognition model. The frozen page recognition model is used to determine whether each page is a frozen page. If it is a frozen page, the page is directly written to the frozen partition of the SSD. Otherwise, a rule-based data placement scheme is adopted to write the page to the corresponding heat type partition or frozen partition of the SSD.
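The notion of valid and invalid pages used in S3 can be illustrated with a toy logical-to-physical map. This is a simplified sketch under the assumption of purely append-only physical pages; the class `FlashMap` and its methods are hypothetical and do not model zones, erase blocks, or a real FTL.

```python
class FlashMap:
    """Toy logical-to-physical map with append-only physical pages."""

    def __init__(self) -> None:
        self.l2p = {}    # logical page id -> physical page index
        self.valid = []  # valid[i] is True while physical page i holds live data

    def write(self, logical_id: int) -> int:
        old = self.l2p.get(logical_id)
        if old is not None:
            self.valid[old] = False  # the previous copy becomes invalid
        self.valid.append(True)      # the new copy is appended and is valid
        self.l2p[logical_id] = len(self.valid) - 1
        return self.l2p[logical_id]

    def valid_pages(self):
        """Physical pages that garbage collection would have to copy."""
        return [i for i, ok in enumerate(self.valid) if ok]


fm = FlashMap()
for lid in [10, 11, 10, 12, 10]:
    fm.write(lid)
print(fm.valid_pages())  # [1, 3, 4]
```

Logical pages 11 and 12 are written once and never again, so their physical copies stay valid indefinitely; these are exactly the pages that garbage collection keeps relocating unless they are isolated.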
[0055] As a preferred implementation, the rule-based data placement scheme is an improved SepBIT (data isolation technology based on block expiration time). The improved SepBIT divides the ZNS SSD into six partitions, from the first to the sixth, based on the popularity value from high to low. The sixth partition is set as a frozen partition, and the first to fifth partitions are popular type partitions used to store normal pages.
[0056] As shown in Figure 2, in S2, when a page is evicted from the database, SepBIT is used to write the evicted page to the corresponding popularity type partition of the ZNS SSD according to its popularity, including:
[0057] S21. Calculate the lifecycle of pages evicted from the database. The method for calculating the page lifecycle is existing technology. The page lifecycle represents the popularity of the page: the longer the lifecycle, the lower the popularity. If the page is being written to the SSD for the first time, its lifecycle is set to infinity. In addition, dynamically maintain a global lifecycle threshold lifespan_threshold based on the lifecycle of each page written to the SSD within a preset time.
[0058] S22. If the lifespan of the page currently being evicted from the database is less than lifespan_threshold, then write the page to the first type of partition; otherwise, write the page to the second type of partition.
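Steps S21 and S22 can be sketched as follows. The source treats the lifecycle calculation as existing technology and does not say how lifespan_threshold is maintained, so this sketch makes two assumptions: the lifecycle is the elapsed time since the page's previous write to the SSD, and the threshold is the mean of the most recent finite lifecycles in a fixed-size window. The class name and the window size are hypothetical.

```python
import math
from collections import deque


class SepBitWriter:
    def __init__(self, window: int = 4) -> None:
        self.last_write = {}                # page id -> time of last write to the SSD
        self.recent = deque(maxlen=window)  # recent finite lifecycles (assumed window)
        self.lifespan_threshold = math.inf  # no history yet

    def place(self, page_id: int, now: float) -> int:
        """Return the partition class (1 or 2) for a page evicted at time `now`."""
        prev = self.last_write.get(page_id)
        # S21: a page written for the first time has an infinite lifecycle.
        lifecycle = math.inf if prev is None else now - prev
        # S22: short-lived pages go to class 1, everything else to class 2.
        target = 1 if lifecycle < self.lifespan_threshold else 2
        if lifecycle != math.inf:
            self.recent.append(lifecycle)
            self.lifespan_threshold = sum(self.recent) / len(self.recent)
        self.last_write[page_id] = now
        return target


w = SepBitWriter()
trace = [(1, 0.0), (2, 1.0), (1, 2.0), (1, 3.0), (2, 21.0)]
print([w.place(pid, t) for pid, t in trace])  # [2, 2, 1, 1, 2]
```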
[0059] Correspondingly, in S4, an improved SepBIT is used to write pages to the corresponding hot-type partition of the SSD, including:
[0060] If the current page comes from the first type of partition, it is placed in the third type of partition; if the current page comes from the second type of partition, the age of the current page is calculated (the difference between the current garbage collection time and the time when the page was last written to the SSD); if the age is less than the threshold lifespan_threshold*4, it is placed in the fourth type of partition, where the symbol * indicates a multiplication operation; if the age is greater than or equal to the threshold lifespan_threshold*4 but less than the threshold lifespan_threshold*16, it is placed in the fifth type of partition; if the age is greater than or equal to the threshold lifespan_threshold*16, it is placed in the sixth type of partition.
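The garbage-collection placement rule above maps directly onto a small function. This is a sketch with hypothetical names; the source only describes pages coming from the first and second partition classes, so applying the age rule to pages from any other class is an assumption of this sketch.

```python
FROZEN_CLASS = 6  # the sixth partition class is the frozen partition


def sepbit_gc_class(source_class: int, gc_time: float,
                    last_write_time: float, lifespan_threshold: float) -> int:
    """Return the partition class a valid page is relocated to during GC."""
    if source_class == 1:
        return 3
    # Age: time of this garbage collection minus the page's last write to the SSD.
    age = gc_time - last_write_time
    if age < 4 * lifespan_threshold:
        return 4
    if age < 16 * lifespan_threshold:
        return 5
    return FROZEN_CLASS


# With a threshold of 10 time units, the boundaries sit at ages 40 and 160.
print([sepbit_gc_class(2, age, 0.0, 10.0) for age in (39, 40, 159, 160)])  # [4, 5, 5, 6]
```

Under this rule alone a frozen page reaches the frozen partition only after it has aged past 16 times the threshold, typically after several relocations; the recognition model in S4 is what lets such a page skip those intermediate migrations.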
[0061] As a preferred implementation, the rule-based data placement scheme is an improved dynamic data clustering (DAC) method. The improved DAC divides the ZNS SSD into six partitions, from the first type to the sixth type, based on the popularity value from low to high. The first type of partition is set as the frozen partition, and the second to sixth types of partitions are popularity type partitions used to store normal pages.
[0062] As shown in Figure 3, in S2, when a page is evicted from the database, DAC is used to write the evicted page to the corresponding popularity type partition of the ZNS SSD according to its popularity, including:
[0063] S21. Each page written to ZNS SSD maintains a popularity value. The popularity value of the page written to ZNS SSD for the first time is 2. The popularity value increases once every time the page is written to ZNS SSD. The popularity value decreases once every time the page is garbage collected.
[0064] S22. For each page written to the ZNS SSD, the page with a heat value of i is written to the i-th partition, and its heat value is incremented once; where i∈{2,3,4,5,6}.
[0065] Correspondingly, in S4, DAC is used to write pages to the corresponding hotspot type partition of the SSD, including:
[0066] Based on the page's popularity value, the page with a popularity value of i is written to the i-th partition, and its popularity is decreased by one.
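The DAC-based placement can be sketched as a promote-on-write, demote-on-GC counter. The text leaves the relative order of "write to the i-th partition" and "update the popularity value" ambiguous; this sketch assumes the value is updated first and the page is then written to the partition indexed by the updated value, and it clamps values so that host writes stay within partitions 2 to 6. Both choices, and all names, are assumptions rather than details confirmed by the source.

```python
FROZEN_PARTITION = 1       # coldest partition, reserved for frozen pages
MIN_HEAT, MAX_HEAT = 2, 6  # popularity type partitions


class DacPlacer:
    def __init__(self) -> None:
        self.heat = {}  # page id -> current popularity value

    def on_host_write(self, page_id: int) -> int:
        """Promote on write; a first write starts at popularity value 2."""
        if page_id not in self.heat:
            self.heat[page_id] = MIN_HEAT
        else:
            # Clamp into [2, 6] so host writes never target the frozen partition.
            self.heat[page_id] = min(max(self.heat[page_id] + 1, MIN_HEAT), MAX_HEAT)
        return self.heat[page_id]

    def on_gc(self, page_id: int) -> int:
        """Demote on garbage collection; value 1 means the frozen partition."""
        self.heat[page_id] = max(self.heat[page_id] - 1, FROZEN_PARTITION)
        return self.heat[page_id]


d = DacPlacer()
writes = [d.on_host_write(7) for _ in range(3)]
collections = [d.on_gc(7) for _ in range(4)]
print(writes, collections)  # [2, 3, 4] [3, 2, 1, 1]
```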
[0067] As a preferred implementation, the training method for the frozen page recognition model in S4 includes:
[0068] (1) Perform database load testing, record the hotness information of each page when it is written to the SSD, and the expiration time after the page is written to the SSD. If the page will not be written to the SSD again, the expiration time is a predefined invalid value. In this embodiment of the invention, the hotness information includes six characteristic variables of the page, namely the time when the page is written to the SSD (WT), the effective data volume of the page (VD), the average read interval of the page in time T (AI), the average write interval of the page in time T (MI), the number of reads of the page in time T (AC), and the number of writes of the page in time T (MC). Among them, time T is the time from when the page is read into memory to when the page is written to the SSD. The tuple (WT, VD, AI, MI, AC, MC) composed of the six characteristic variables is called a hotness record, that is, the hotness information. The expiration time after the page is written to the SSD is the time when the page is written to the SSD again.
[0069] (2) Determine whether a page has become a frozen page based on its expiration time: if the expiration time is a normal value (not an invalid value), then the page is a normal page; if the expiration time is an invalid value, then the page has become a frozen page. Label each page as frozen or normal accordingly.
[0070] (3) After normalizing the heat information of each page when it is written to SSD, extract the heat information (HR_current) of each page when it is currently written to SSD and the heat information (HR_last) of the page when it was last written to SSD. If the page is being written to SSD for the first time, set its HR_last to all 0. This will give us the dataset for training the frozen page recognition model. In the dataset, a sample is the heat information of a page when it is currently written to SSD and the heat information of the page when it was last written to SSD. The label is whether the page is a frozen page.
[0071] (4) Randomly shuffle the samples in the dataset and divide them into a training set and a test set in a 3:1 ratio: 75% of the data is allocated to the training set to train the frozen page recognition model, and the remaining 25% forms the test set used to test the model's performance.
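Steps (2) to (4) can be sketched as follows, assuming the heat records have already been normalized into 6-element lists. The sentinel value `INVALID`, the write-log layout, and the function names are hypothetical; the source only states that the invalid expiration time is predefined and that the split is 3:1.

```python
import random

INVALID = -1.0         # assumed sentinel for "never written to the SSD again"
ZERO_HR = [0.0] * 6    # HR_last for a page written for the first time


def build_samples(write_log):
    """write_log: (page_id, normalized 6-feature heat record, expiration time)
    tuples in write order. Returns (12-feature vector, label) pairs."""
    last_hr = {}
    samples = []
    for page_id, hr_current, expiration in write_log:
        hr_last = last_hr.get(page_id, ZERO_HR)
        label = 1 if expiration == INVALID else 0  # 1 = frozen, 0 = normal
        samples.append((hr_last + hr_current, label))
        last_hr[page_id] = hr_current
    return samples


def split_3_to_1(samples, seed=0):
    """Shuffle and split into 75% training data and 25% test data."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 3 // 4
    return shuffled[:cut], shuffled[cut:]


log = [
    (1, [0.1] * 6, 5.0),      # page 1 is rewritten later -> normal
    (2, [0.2] * 6, INVALID),  # page 2 is never rewritten -> frozen
    (1, [0.3] * 6, INVALID),  # last write of page 1 -> frozen
    (3, [0.4] * 6, 9.0),
]
samples = build_samples(log)
train, test = split_3_to_1(samples)
print(len(samples[2][0]), samples[2][1], len(train), len(test))  # 12 1 3 1
```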
[0072] During model training, the popularity information is used to determine whether a page is a frozen page. This is a binary classification problem that can be solved using a logistic regression (LR) model. The logistic regression model computes the dot product of the popularity information and the weight vector determined by the model, and then uses the sigmoid function to map the result to a probability value between 0 and 1. If the probability value is greater than a set threshold (usually 0.5), the model outputs the category "frozen"; otherwise, the output category is "normal". The LR model uses Maximum Likelihood Estimation (MLE) to solve for the weight vector. To further enhance the model's fitting performance, in this embodiment, the stochastic gradient descent (SGD) algorithm is used instead of MLE to solve for the weight vector, and the logistic regression model using the stochastic gradient descent algorithm is called the SGD model. Since logistic regression does not support time-series feature input, in this embodiment, HR_last and HR_current are stacked to form a single 12-dimensional vector, as shown in Figure 4. The SGD model is trained using the training set. In other embodiments, the frozen page recognition model can also employ a network model that supports temporal feature inputs, such as a long short-term memory neural network, to achieve classification.
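A logistic regression classifier trained with stochastic gradient descent on the stacked 12-dimensional input can be sketched in plain Python as below. This is an illustrative sketch, not the patent's implementation: the learning rate, epoch count, function names, and the toy data (in which only the last feature, the current write count MC, differs between classes) are all assumptions.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_sgd(samples, epochs=200, lr=0.5, seed=0):
    """samples: list of (feature vector, label) with label 1 = frozen, 0 = normal."""
    rng = random.Random(seed)
    weights, bias = [0.0] * len(samples[0][0]), 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            p = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict_frozen(weights, bias, x, threshold=0.5) -> bool:
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) > threshold


def make_x(mc_current: float):
    """Toy 12-dimensional input: HR_last followed by HR_current, all zeros
    except the current write count MC (index 11)."""
    return [0.0] * 11 + [mc_current]


toy = [(make_x(v), 1) for v in (0.0, 0.05, 0.1, 0.15)] + \
      [(make_x(v), 0) for v in (0.8, 0.85, 0.9, 1.0)]
weights, bias = train_sgd(toy)
print(predict_frozen(weights, bias, make_x(0.05)), predict_frozen(weights, bias, make_x(0.95)))
```

On this separable toy data the learned weight for the current write count becomes negative, so inputs with few recent writes are classified as frozen.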
[0073] Online inference is used to determine the frozen page nature of a page during actual database runtime. The model's input includes the page's current popularity information when it was written to the SSD and its last popularity information when it was written to the SSD. The output is whether the page is frozen. The model has two misclassification scenarios: false negatives (misclassifying frozen pages as normal pages) and false positives (misclassifying normal pages as frozen pages). False negative pages will follow the existing rule-based data placement scheme, undergoing multiple redundant migrations before finally reaching the frozen partition. For false positive pages, if the page is modified and rewritten to the SSD, the original page address becomes invalid, and the newly written page will be written to the corresponding non-frozen partition (popularity-type partition).
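The two misclassification cases described above can be counted on a held-out test set with a small helper. The function name and the example labels are hypothetical; the sketch only illustrates how false negatives and false positives are distinguished.

```python
def confusion_counts(labels, predictions):
    """Count outcomes, where 1 means frozen and 0 means normal."""
    counts = {"true_frozen": 0, "false_positive": 0,
              "false_negative": 0, "true_normal": 0}
    for actual, predicted in zip(labels, predictions):
        if actual and predicted:
            counts["true_frozen"] += 1
        elif actual and not predicted:
            counts["false_negative"] += 1  # frozen page treated as normal
        elif predicted:
            counts["false_positive"] += 1  # normal page treated as frozen
        else:
            counts["true_normal"] += 1
    return counts


labels = [1, 1, 0, 0, 1]
predictions = [1, 0, 0, 1, 1]
print(confusion_counts(labels, predictions))
# {'true_frozen': 2, 'false_positive': 1, 'false_negative': 1, 'true_normal': 1}
```

As the paragraph above explains, the two error types cost differently: a false negative still reaches the frozen partition through the rule-based scheme after redundant migrations, while a false positive is corrected when the page is next modified and rewritten to a popularity type partition.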
[0074] The machine learning-based frozen page identification model in this invention can extract frozen pages from pages to be recycled efficiently and with low overhead. The learning-based frozen page identification and isolation mechanism can be seamlessly integrated with various existing rule-based data placement schemes.
[0075] In this embodiment of the invention, by combining a learning-based frozen page identification and isolation mechanism, two frozen page-aware data placement methods are obtained by improving two existing efficient rule-based data placement algorithms. These methods can effectively isolate and place frozen pages, reduce redundant migration of frozen pages during garbage collection, thereby effectively reducing write amplification and improving database performance.
[0076] Example 2
[0077] This invention provides a database write amplification optimization system based on frozen page identification and isolation, including a memory and a processor. The memory stores a computer program, and the processor executes the computer program to implement the steps of the database write amplification optimization method based on frozen page identification and isolation in Embodiment 1 above.
[0078] The relevant technical solutions are the same as above, and will not be repeated here.
[0079] Example 3
[0080] This invention provides a computer-readable storage medium storing a computer program thereon. When the computer program is executed by a processor, it implements the steps of the database write amplification optimization method based on frozen page identification and isolation in Embodiment 1 above.
[0081] The relevant technical solutions are the same as above, and will not be repeated here.
[0082] Example 4
[0083] This application provides a computer program product, including a computer program that, when run on a computer, causes the computer to perform the steps of the database write amplification optimization method based on frozen page identification and isolation in Embodiment 1 above.
[0084] The relevant technical solutions are the same as above, and will not be repeated here.
[0085] Those skilled in the art will readily understand that the above description is merely a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A database write amplification optimization method based on frozen page identification and isolation, characterized in that it comprises:
S1. During database operation, record the popularity information of each page in the corresponding page;
S2. Divide the different partitions of the ZNS SSD into frozen partitions and popularity type partitions; when a page is evicted from the database, adopt a rule-based data placement scheme to write the evicted page into the corresponding popularity type partition according to its popularity;
S3. When the ZNS SSD is running out of space, the ZNS SSD performs a garbage collection operation, reading the valid pages written to the popularity type partitions into memory;
S4. Input the heat information of each valid page read into memory into the trained frozen page recognition model, and the model outputs whether each page is a frozen page; if it is a frozen page, the page is directly written to the frozen partition; otherwise, a rule-based data placement scheme is adopted to write the page to the corresponding popularity type partition or frozen partition;
wherein the trained frozen page recognition model is obtained in the following way: a training sample set is constructed, and the frozen page recognition model is trained using the training sample set to obtain the trained frozen page recognition model; wherein the training samples in the training sample set are the popularity information of the page when it is currently written to the ZNS SSD and the popularity information of the page when it was last written to the ZNS SSD, and the label is whether the page is a frozen page;
the construction of the training sample set includes:
performing database load testing, recording the heat information of each page when it is written to the ZNS SSD, and the expiration time of the page after it is written to the ZNS SSD; if the page will not be written to the SSD again, the expiration time is a predefined invalid value; the heat information includes: the time WT when the page is written to the ZNS SSD, the amount of valid data on the page VD, the average read interval AI of the page within time T, the average write interval MI of the page within time T, the number of reads AC of the page within time T, and the number of writes MC of the page within time T; the time T is the time from when the page is read into memory to when the page is written to the ZNS SSD;
determining whether the page is frozen based on the expiration time: if the expiration time is invalid, the page is frozen; otherwise, it is a normal page; each page is labeled accordingly;
after normalizing the popularity information of each page, extracting the popularity information HR_current when each page is currently written to the ZNS SSD and the popularity information HR_last when the page was last written to the ZNS SSD as training samples to obtain the training sample set; wherein HR_last when the page is first written to the ZNS SSD is 0.
2. The method of claim 1, wherein the rule-based data placement scheme is SepBIT, a data isolation method based on block expiration time. SepBIT divides the ZNS SSD into six types of partitions ordered from high to low popularity value, and sets the sixth type of partition as the frozen partition; the first to fifth types of partitions are the popularity type partitions. In S2, using SepBIT to write pages evicted from the database into the corresponding popularity type partitions according to their popularity includes: S21. Calculating the lifespan of each page evicted from the database, where the lifespan of a page written to the ZNS SSD for the first time is set to infinity; and dynamically maintaining a global lifespan threshold lifespan_threshold based on the lifespans of the pages written to the ZNS SSD within a preset time. S22. If the lifespan of the page currently being evicted from the database is less than lifespan_threshold, writing the page to the first type of partition; otherwise, writing the page to the second type of partition. Correspondingly, in S4, using SepBIT to write the page to the corresponding popularity type partition includes: if the page comes from the first type of partition, writing it to the third type of partition; if the page comes from the second type of partition, calculating the page age AGE, where AGE is the difference between the time at which the ZNS SSD performs garbage collection and the time at which the page was last written to the ZNS SSD. If AGE < lifespan_threshold * 4, the page is written to the fourth type of partition, where * represents a multiplication operation; if lifespan_threshold * 4 ≤ AGE < lifespan_threshold * 16, the page is written to the fifth type of partition; if AGE ≥ lifespan_threshold * 16, the page is written to the frozen partition.
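The placement rules of claim 2 can be summarized in a short sketch. The function names, the `is_frozen` flag (standing in for the model output of S4), and the way the threshold is passed in are hypothetical. The claim only states the garbage collection rule for pages coming from the first and second types of partitions; applying the age rule to pages from the third to fifth types is an assumption made here so that the function is total.

```python
import math

FROZEN_PARTITION = 6  # sixth type of partition is the frozen partition


def place_on_eviction(lifespan: float, lifespan_threshold: float) -> int:
    """S22: choose the partition for a page evicted from the database.

    A page written for the first time has lifespan math.inf and therefore
    always lands in the second type of partition.
    """
    return 1 if lifespan < lifespan_threshold else 2


def place_on_gc(source_partition: int,
                gc_time: float,
                last_write_time: float,
                lifespan_threshold: float,
                is_frozen: bool = False) -> int:
    """S4: choose the partition for a valid page rewritten during GC."""
    if is_frozen:
        # Pages flagged by the frozen page model bypass the rules.
        return FROZEN_PARTITION
    if source_partition == 1:
        return 3
    # Assumption: partitions 3-5 are treated like partition 2 here.
    age = gc_time - last_write_time
    if age < lifespan_threshold * 4:
        return 4
    if age < lifespan_threshold * 16:
        return 5
    return FROZEN_PARTITION
```

For example, with `lifespan_threshold = 100`, a page from the second type of partition whose age is 300 goes to the fourth type, one whose age is 400 goes to the fifth type, and one whose age is 1600 goes to the frozen partition.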
3. The method of claim 1, wherein the rule-based data placement scheme is the dynamic data clustering (DAC) method. DAC divides the ZNS SSD into six partitions ordered from low to high popularity value, and sets the first partition as the frozen partition, while the second to sixth partitions are the popularity type partitions. In S2, using DAC to write pages evicted from the database into the corresponding popularity type partitions according to their popularity includes: maintaining a popularity value for each page written to the ZNS SSD, where the popularity value of a page written to the ZNS SSD for the first time is 2, the popularity value of a page is incremented by one each time the page is written to the ZNS SSD, and decremented by one each time the page is garbage collected; for each page written to the ZNS SSD, a page with a popularity value of i is written to the i-th partition, and its popularity value is incremented by one, where i∈{2,3,4,5,6}. Correspondingly, in S4, using DAC to write the page to the corresponding popularity type partition includes: based on the page's popularity value, writing a page with a popularity value of i to the i-th partition and decrementing its popularity value by one.
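The popularity counter of claim 3 can be sketched as below. The class and function names are hypothetical. The claim does not say what happens when the value would leave the range of partitions, so clamping it to [1, 6] and sending a database write of a page whose value has decayed to 1 to the second partition are assumptions of this sketch.

```python
FROZEN_PARTITION = 1          # first partition is the frozen partition
MIN_HOT, MAX_HOT = 2, 6       # popularity type partitions


class DacPage:
    """Per-page state: a popularity value that starts at 2."""

    def __init__(self) -> None:
        self.popularity = MIN_HOT


def place_on_eviction(page: DacPage) -> int:
    """S2: write the page to partition i, then increment its value."""
    # Assumption: database writes never target the frozen partition.
    target = max(page.popularity, MIN_HOT)
    # Assumption: the value is capped at the hottest partition.
    page.popularity = min(target + 1, MAX_HOT)
    return target


def place_on_gc(page: DacPage, is_frozen: bool = False) -> int:
    """S4: write the page to partition i, then decrement its value."""
    if is_frozen:
        # Pages flagged by the frozen page model go straight to partition 1.
        return FROZEN_PARTITION
    target = page.popularity
    # Assumption: the value never drops below the frozen partition index.
    page.popularity = max(target - 1, FROZEN_PARTITION)
    return target
```

Under these assumptions a page that is repeatedly evicted climbs toward the sixth partition, while a page that only survives garbage collections drifts down and eventually reaches the frozen partition even without the model flagging it.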
4. The method of claim 1, wherein the frozen page identification model is a logistic regression model.
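A logistic regression classifier of the kind named in claim 4 can be sketched in plain Python. The training procedure (batch gradient descent), the learning rate, the epoch count, the 0.5 decision threshold, and all function names are assumptions for illustration; the claim fixes only the model family. In the claimed method each input vector would hold the normalized HR_current and HR_last values, whereas the toy data in the usage note has just two features.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(samples: Sequence[Sequence[float]],
          labels: Sequence[int],
          lr: float = 0.5,
          epochs: int = 500) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(samples[0])
    m = len(samples)
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(n):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / m for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / m
    return w, b


def is_frozen(x: Sequence[float], w: Sequence[float], b: float,
              threshold: float = 0.5) -> bool:
    """Classify a page as frozen when the predicted probability is high."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= threshold
```

As a usage illustration, training on the four toy points `[[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]` with labels `[0, 0, 1, 1]` yields a model that separates the two groups. A logistic regression model is cheap to evaluate, which matters because the classifier runs on every valid page read during garbage collection.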
5. A database write amplification optimization system based on frozen page identification and isolation, the system comprising a computer-readable storage medium and a processor; the computer-readable storage medium is used to store executable instructions; the processor is used to read the executable instructions stored in the computer-readable storage medium and to execute the database write amplification optimization method based on frozen page identification and isolation as described in any one of claims 1-4.
6. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, it implements the database write amplification optimization method based on frozen page identification and isolation as described in any one of claims 1-4.
7. A computer program product, characterized in that it includes a computer program that, when run on a computer, causes the computer to perform the database write amplification optimization method based on frozen page identification and isolation as described in any one of claims 1-4.