Effective data bitmap generation method and bitmap generator
By combining hardware circuitry and P2L tables, effective data bitmaps can be generated quickly, solving the problems of long generation time and high resource consumption in existing technologies, and improving the performance of storage devices.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HEFEI YIXIN ELECTRONIC TECH CO LTD
- Filing Date
- 2024-12-30
- Publication Date
- 2026-06-30
AI Technical Summary
In existing technologies, the process of creating a valid data bitmap is costly, time-consuming, and consumes a large amount of CPU resources, thus affecting the performance of storage devices.
Using dedicated hardware circuits and P2L tables, the hardware circuits quickly generate valid data bitmaps, and the P2L table entries record the mapping relationship between the LBA value of the DU and the FTL table, generating bit values that indicate the validity of the DU data.
It significantly reduces the time required to generate valid data bitmaps, lowers CPU resource consumption, and improves storage device performance.
Figure CN122308707A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of storage technology, and in particular to a valid data bitmap generation method and a bitmap generator.
Background Technology
[0002] Figure 1A shows a block diagram of a storage device. Storage device 102 is coupled to a host computer to provide storage capabilities. The host computer and storage device 102 can be coupled in various ways, including but not limited to storage protocols such as SATA (Serial Advanced Technology Attachment), SCSI (Small Computer System Interface), SAS (Serial Attached SCSI), IDE (Integrated Drive Electronics), USB (Universal Serial Bus), PCIe (Peripheral Component Interconnect Express), NVMe (NVM Express), Ethernet, Fibre Channel, and wireless communication networks. The host computer can be any information processing device capable of communicating with the storage device in the above ways, such as a personal computer, tablet computer, server, laptop computer, network switch, router, cellular phone, or personal digital assistant. Storage device 102 includes interface 103, control unit 104, one or more NVM chips 105, and DRAM (Dynamic Random Access Memory) 110.
[0003] NAND flash memory, phase change memory, FeRAM (Ferroelectric RAM), MRAM (Magnetic Random Access Memory), RRAM (Resistive Random Access Memory), XPoint memory, etc. are common NVMs.
[0004] Interface 103 is adapted to exchange data with the host via, for example, SATA, IDE, USB, PCIe, NVMe, SAS, Ethernet, or Fibre Channel.
[0005] The control unit 104 controls data transfer between the interface 103, the NVM chip 105, and the DRAM 110. It is also used for memory management, mapping of host logical addresses to flash physical addresses, wear leveling, bad block management, etc. The control unit 104 can be implemented in various ways, including software, hardware, firmware, or a combination thereof. For example, the control unit 104 can be in the form of an FPGA (Field-Programmable Gate Array), an ASIC (Application-Specific Integrated Circuit), or a combination thereof. The control unit 104 may also include a processor or controller, in which software executes to manipulate the hardware of the control unit 104 to process I/O (Input/Output) commands. The control unit 104 can also be coupled to the DRAM 110 and can access the data in the DRAM 110. FTL tables and/or cached I/O command data can be stored in the DRAM.
[0006] The control unit 104 includes a flash interface controller (or media interface controller, flash channel controller), which is coupled to the NVM chip 105 and issues commands to the NVM chip 105 in accordance with the interface protocol of the NVM chip 105 to operate the NVM chip 105, and receives the command execution results output from the NVM chip 105. Known NVM chip interface protocols include "Toggle", "ONFI", etc.
[0007] Figure 1B shows a detailed block diagram of the control components of the storage device.
[0008] The host accesses the storage device using I/O commands that conform to the storage protocol. The control unit generates one or more media interface commands based on the I/O commands from the host and provides them to the media interface controller. The media interface controller generates storage media access commands (e.g., programming commands, read commands, erase commands) that conform to the NVM chip's interface protocol based on the media interface commands. The control unit also tracks the completion of all media interface commands generated from a single I/O command and indicates the processing results of the I/O commands to the host.
[0009] See Figure 1B. The control components include, for example, a host interface, a host command processing unit, a storage command processing unit, a media interface controller, and a storage media management unit. The host interface receives I/O commands from the host and generates storage commands, which are then provided to the storage command processing unit. Each storage command accesses a storage space of the same size, for example 4KB. The data unit recorded in the NVM chip corresponding to the data accessed by a storage command is called a data frame. A physical page records one or more data frames. For example, if the size of a physical page is 17664 bytes and the size of a data frame is 4KB, then one physical page can store four data frames.
[0010] The storage media management unit maintains the translation from LBA (Logical Block Address) to physical address (PPA) for each storage command. For example, the storage media management unit includes an FTL (Flash Translation Layer) table. The FTL table maintains mapping information from logical addresses to physical addresses, as shown in Figure 2. Logical addresses constitute the storage space of the storage device as perceived by upper-level software such as the operating system, while physical addresses are the addresses used to access the physical storage units of the solid-state storage device. Typically, FTL entries record the address mapping relationships within the storage device in units of storage units of a specified size (e.g., 512 bytes, 2KB, 4KB, also called Data Units, DUs). There is a one-to-one correspondence between addresses (LBAs) in the logical address space and entries in the FTL table, so an FTL entry can be indexed by its LBA to obtain the PPA recorded in that entry. FTL entries are arranged contiguously in memory, and LBAs are also arranged contiguously in the LBA space, so the memory address of the corresponding FTL entry can be computed from the LBA. Therefore, FTL entries do not need to record their LBAs; rather, the LBA represented by an entry is implied by its position in the FTL table. Figure 2 shows multiple LBAs in the LBA space with values from 0 to (2^m-1), and the FTL table entries corresponding to each LBA.
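The position-implied indexing described above can be illustrated with a minimal sketch. The entry size, the table base address, and the function names are assumptions made for illustration only; they do not come from the application.

```python
from typing import List, Optional

ENTRY_SIZE = 4  # assumed size of one FTL entry in bytes (hypothetical)


def ftl_entry_address(table_base: int, lba: int, entry_size: int = ENTRY_SIZE) -> int:
    """Memory address of the FTL entry for `lba`.

    Entries are contiguous, so the address follows directly from the LBA
    and the entry itself does not need to store the LBA.
    """
    return table_base + lba * entry_size


def ftl_lookup(ftl: List[Optional[int]], lba: int) -> Optional[int]:
    """Return the PPA recorded for `lba`, or None if the LBA is unmapped."""
    return ftl[lba]


# A toy FTL table covering 8 LBAs; only LBA 5 is mapped.
ftl: List[Optional[int]] = [None] * 8
ftl[5] = 0x66AA
```

In this sketch the list index plays the role of the LBA, which mirrors the statement that the LBA of an entry is implied by its position in the table.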
[0011] For read commands, the storage media management unit outputs the physical address corresponding to the logical address accessed by the storage command. For write commands, the storage media management unit allocates an available physical address and records the mapping relationship between the accessed logical address and the allocated physical address. The storage media management unit also maintains functions required for managing the NVM chip, such as garbage collection and wear leveling.
[0012] The storage command processing unit, based on the physical address provided by the storage media management unit, operates the media interface controller to issue storage media access commands to the NVM chip. For clarity, commands sent from the host to the storage device are called I/O commands, commands sent from the host command processing unit to the storage command processing unit are called storage commands, commands sent from the storage command processing unit to the media interface controller are called media interface commands, and commands sent from the media interface controller to the NVM chip are called storage media access commands. Storage media access commands conform to the NVM chip's interface protocol.
[0013] SSDs (Solid State Drives) consist of multiple NVM chips. Each NVM chip includes one or more LUNs (Logical Units), and each logical unit includes multiple physical blocks. As storage capacity increases, so does the number of NVM chips/LUNs/blocks, which also increases the likelihood of storage media failure. To ensure the reliability of the stored data delivered to users, enterprise-grade SSDs use technologies similar to RAID (Redundant Arrays of Independent Disks) to construct data protection units across NVM chips/logical units. This ensures that even if a single NVM chip/logical unit fails, data is not lost. This also addresses the occasional data errors that may occur during SSD operation.
[0014] A large block comprises physical blocks from each of multiple logical units (LUNs). The multiple logical units that provide physical blocks for a large block are called a group of logical units. Each logical unit in a group of logical units can provide a physical block for the large block. For example, in Figure 3, large blocks are constructed on every 16 logical units (LUNs). Each large block comprises 16 physical blocks, each originating from one of the 16 logical units (LUNs). In the example of Figure 3, large block 0 comprises physical block 0 from each of the 16 logical units (LUNs), while large block 2 comprises physical block 2 from each logical unit (LUN). Large blocks can also be constructed in a variety of other ways. In Figure 3, physical blocks are indicated by reference numerals of the form Bb-a, where 'a' indicates that the physical block is provided by logical unit a (LUN a), and 'b' indicates that the physical block has block number 'b' within the logical unit. Large blocks store user data and checksum data. Checksum data for the large block is calculated based on the user data stored in the large block. For example, checksum data is stored in the last physical block of the large block. Alternatively, other physical blocks of the large block can be selected to store checksum data. As another example, Chinese patent application number 201710752321.0 provides other construction methods for the large blocks.
[0015] Page stripes are data protection units within an SSD, constructed using RAID technology. Page stripes are constructed within large blocks. See also Figure 4A and Figure 4B. A large block comprises multiple page stripes, and each page stripe comprises multiple physical pages, which originate from different physical blocks within the same large block. For example, a page stripe has P pages storing user data and Q pages storing checksum data (in Figure 4A and Figure 4B, Q = 1). The data written to page i is denoted as D(i). The checksum data D(Q) is generated from the user data of the P pages according to the specified error correction algorithm. For example, if the checksum is determined by the XOR operation, then D(Q) = D(0) XOR D(1) XOR … XOR D(P-1). Therefore, when writing data to the page stripe, D(Q) can only be calculated once D(0) to D(P-1) are known. The typical size of D(i) is, for example, 2KB, 4KB, 16KB, etc.
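As a minimal sketch of the XOR case above, the following computes the checksum page from P user-data pages and shows how one lost page can be rebuilt from the others. The function name and the tiny 2-byte pages are illustrative assumptions; real pages would be 2KB, 4KB, or 16KB.

```python
from functools import reduce
from typing import List


def xor_pages(pages: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized pages: D(0) XOR D(1) XOR ... XOR D(P-1)."""
    if not pages:
        raise ValueError("at least one page is required")
    size = len(pages[0])
    if any(len(page) != size for page in pages):
        raise ValueError("all pages must have the same size")
    # zip(*pages) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))


# Three user-data pages (P = 3) and their checksum page.
user_pages = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
checksum = xor_pages(user_pages)

# If page 1 is lost, XOR-ing the checksum with the surviving pages restores it.
recovered = xor_pages([checksum, user_pages[0], user_pages[2]])
```

Because XOR is its own inverse, the same function serves both for generating the checksum and for recovering a single missing page.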
[0016] For a Logical Unit (LUN), each LUN can include multiple planes. Each plane includes multiple physical blocks. Each physical block includes multiple physical pages. Each physical page includes multiple Data Units (DUs). DUs have a specified size, such as 4KB. If the physical page size is 8KB, then it includes two DUs. As an example, a LUN may include 4 planes, each plane may include 500 physical pages, and each physical page may include 2 DUs.
[0017] In TLC (Triple-Level Cell) flash memory, a one-shot programming mode is typically used, requiring the programming of three physical pages on the same word line in a single ONFI command. To improve the efficiency of read and write operations on NVM chips, multiple physical pages with the same physical page number in four planes are also operated in a single programming or read command. Thus, a single programming or read command can operate on 24 DUs (each LUN includes 4 planes, each plane includes 3 physical pages, and each physical page includes 2 DUs, 24 = 4 * 3 * 2). The physical address of a DU is called a PPA, allowing these 24 DUs to have consecutive physical addresses.
[0018] To facilitate writing data to page stripes, physical addresses are allocated to the data being written to the storage device on a page stripe basis. For example, as shown in Figure 4C, data is first written to the 24 DUs in LUN 0 of page stripe 0. After they are full, data is written to the 24 DUs in LUN 1, and so on, until the entire page stripe 0 is full. Next, data is written to the other page stripes. Thus, the data in the NVM chip of the storage device is written in a specific order.
[0019] When writing data to a storage device, the host provides I/O commands that specify the logical block address (LBA) of the data being written. The storage device allocates a Data Unit (DU) for the data being written, and each DU has a physical address. When data is written to a DU, the LBA of that data is also recorded in association with it. The LBA can be written into the DU itself, or a specified amount of storage space can be allocated within a page stripe to record the LBAs of all DUs in that page stripe. Alternatively, the LBA of each DU can be recorded in storage space outside the page stripes of the storage device.
[0020] By recording the LBA of a DU, it is possible to identify whether the data recorded in that DU has expired. For example, if a DU records an LBA of 100 and the DU's own PPA is 0x66AA, while querying the FTL table with LBA=100 yields a PPA of 0x55AA, then the data recorded in that DU has expired. If querying the FTL table with LBA=100 also yields a PPA of 0x66AA, then the data recorded in that DU is valid.
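The check in this example can be written as a short sketch. Modeling the FTL table as a dictionary and the name `is_du_valid` are assumptions for illustration; the numeric values are the ones used in the paragraph above.

```python
from typing import Dict


def is_du_valid(ftl: Dict[int, int], du_ppa: int, du_lba: int) -> bool:
    """A DU holds valid data only if the FTL still maps its LBA to its own PPA."""
    return ftl.get(du_lba) == du_ppa


# Case 1: the FTL maps LBA 100 elsewhere, so the DU at 0x66AA is expired.
expired = is_du_valid({100: 0x55AA}, du_ppa=0x66AA, du_lba=100)

# Case 2: the FTL maps LBA 100 to the DU's own PPA, so the data is valid.
valid = is_du_valid({100: 0x66AA}, du_ppa=0x66AA, du_lba=100)
```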
[0021] Figure 5A illustrates the valid data bitmap, which records the status of the data stored in each DU. In the valid data bitmap, each DU corresponds to one bit; if the DU records valid data, the corresponding bit value is 1, and if the DU records invalid data, the corresponding bit value is 0. Because the valid data bitmap records the status of the data in each DU, it can be used to identify the amount of valid and invalid data in each large block. This allows large blocks to be selected for reclamation during GC (Garbage Collection), or invalid data to be identified for reclamation.
[0022] Figure 5B shows the flowchart for creating a valid data bitmap. Based on the DU's PPA, the LBA corresponding to the DU is obtained from the page stripe. The obtained LBA is used to look up the FTL table to determine PPA'. Then, it is checked whether PPA and PPA' are the same. If they are the same, the bit value corresponding to PPA is set to 1 in the valid data bitmap; otherwise, the bit value corresponding to PPA is set to 0 in the valid data bitmap. Then, the PPA of the next DU is obtained, and the above process is repeated to create the valid data bitmap.
Summary of the Invention
[0023] In existing technologies, creating a valid data bitmap is a costly and time-consuming process. Taking a 4TB storage device as an example, there are over 2^30 Data Units (DUs). To create a valid data bitmap, each DU needs to be scanned, and for each DU's corresponding LBA, the following operations need to be performed: reading the FTL table, comparing the PPA' recorded in the FTL with the PPA corresponding to the DU, and generating a 1-bit value in the bitmap. Assuming that each read from a DU to generating a 1-bit value takes approximately 10µs, scanning 2^30 DUs would take about 10,000 seconds (approximately 3 hours). This time consumption is unacceptable for storage devices. Furthermore, if this process is controlled by the CPU, it will consume a significant amount of CPU processing resources and impact the performance of the storage device.
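The time estimate above is a simple product, reproduced here as a small calculation. The 10µs per-DU cost and the 2^30 DU count are the figures from the text; the function name is an illustrative assumption.

```python
def scan_time_seconds(num_dus: int, per_du_microseconds: float) -> float:
    """Total time for a sequential, one-DU-at-a-time scan, in seconds."""
    return num_dus * per_du_microseconds / 1_000_000


# 4TB device with 4KB DUs: about 2^30 DUs, each costing roughly 10 microseconds.
total_seconds = scan_time_seconds(2**30, 10)
total_hours = total_seconds / 3600
```

With these inputs the result is a little over 10,700 seconds, or just under 3 hours, consistent with the rounded figures quoted above.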
[0024] Therefore, embodiments of this application aim to provide dedicated hardware circuitry to quickly construct a valid data bitmap.
[0025] It is worth noting that the physical address describes the LUN number, physical block number, physical page number, DU number, etc., and the numerical relationship between the PPAs of adjacent DUs cannot be calculated by simply adding 1 using a conventional addition circuit. Therefore, some embodiments of this application also aim to provide a hardware circuit for calculating the next adjacent PPA based on the current PPA.
[0026] In a first aspect, embodiments of this application provide a valid data bitmap generation method, including:
[0027] Obtain the LBA value of the DU recorded in the entry of the P2L table. The P2L table includes multiple consecutively arranged entries. The first physical address corresponding to each entry of the P2L table represents the physical address of the DU associated with that entry in the NVM chip. The P2L table is constructed based on the arrangement of the DUs in the NVM chip.
[0028] The second physical address of the DU is obtained by looking up the FTL table based on the LBA value of the DU obtained from the P2L table.
[0029] Based on the comparison between the first physical address and the second physical address of the DU, a bit value is generated to indicate whether the data recorded by the DU is valid.
[0030] A valid data bitmap is generated based on the bit values corresponding to each entry in the P2L table.
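The four steps of the first aspect can be sketched in software as follows. This is only an illustration of the data flow, not the hardware circuit: the P2L table is modeled as a list of LBA values, the FTL table as a list indexed by LBA, and `first_ppa_of` is a hypothetical callback that returns the first physical address implied by an entry's position.

```python
from typing import Callable, List, Optional


def generate_valid_bitmap(
    p2l: List[Optional[int]],
    ftl: List[Optional[int]],
    first_ppa_of: Callable[[int], int],
) -> List[int]:
    """Return one bit per P2L entry: 1 if the DU holds valid data, else 0."""
    bits = []
    for index, lba in enumerate(p2l):
        first_ppa = first_ppa_of(index)  # address implied by entry position
        # Look up the second physical address; unwritten DUs have no LBA.
        if lba is None or not 0 <= lba < len(ftl):
            second_ppa = None
        else:
            second_ppa = ftl[lba]
        bits.append(1 if second_ppa == first_ppa else 0)
    return bits


# Toy data: 4 DUs at PPAs 0x100..0x103; LBA 3 was written twice.
p2l = [3, 1, 3, None]
ftl = [None, 0x101, None, 0x102]
bitmap = generate_valid_bitmap(p2l, ftl, lambda i: 0x100 + i)
```

In the toy data, the first copy of LBA 3 (entry 0) is stale because the FTL points at the later copy (entry 2), so only entries 1 and 2 are marked valid.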
[0031] Optionally, generating a bit value indicating whether the data recorded by the DU is valid based on the comparison between the first physical address and the second physical address of the DU includes:
[0032] In response to the first physical address and the second physical address of the DU being the same, a first bit value indicating that the data recorded by the DU is valid data is generated;
[0033] In response to the first physical address and the second physical address of the DU being different, a second bit value indicating that the data recorded by the DU is invalid data is generated.
[0034] Optionally, the NVM chip includes N LUNs, and physical pages from the N LUNs construct at least one page stripe, wherein each physical page includes J DUs;
[0035] Among the P2L table entries corresponding to a single page stripe, each LUN provides a set of DUs for that page stripe, and the set corresponds to a set of entries in the P2L table. The DUs in the set provided by each LUN have different DU numbers, and the DU number is a component of the first physical address.
[0036] For different LUNs in the same page stripe, the sets of DUs provided by the LUNs have the same DU numbers.
[0037] Optionally, the P2L table is organized according to a first table organization method: the P2L table includes at least one P2L sub-table, the P2L sub-tables correspond one-to-one with the page stripes, and the number of entries in a P2L sub-table is equal to the number of DUs included in the corresponding page stripe;
[0038] In the P2L sub-table, the P2L table entries of a group of DUs from the same LUN are arranged consecutively. For adjacent LUNs, the P2L table entries of the last DU from the previous LUN and the P2L table entries of the first DU from the next LUN are arranged consecutively.
[0039] In the P2L table, adjacent P2L sub-tables are arranged such that the last P2L entry of the preceding P2L sub-table and the first P2L entry of the following P2L sub-table are arranged consecutively. The consecutively arranged P2L entries are continuous in the storage space storing the P2L table. The number of entries between the first DU of each of the two adjacent P2L sub-tables is a preset value; or, the arrangement of the last P2L entry of the preceding P2L sub-table and the first P2L entry of the following P2L sub-table has a specified interval.
[0040] Optionally, each LUN includes K Planes, where K is greater than or equal to 2;
[0041] In a single page stripe, each Plane of a LUN provides M physical pages for that page stripe, where M is greater than or equal to 1. A single LUN provides a set of Q DUs for that page stripe, where Q = K*M*J.
[0042] The Q DUs from a single LUN are arranged in the single LUN according to the first arrangement based on the DU number.
[0043] Optionally, the P2L sub-table corresponding to a single page stripe includes Q*N entries;
[0044] A single LUN provides a set of Q DUs, and the Q entries corresponding to them are arranged consecutively in the P2L sub-table according to the DU number of their corresponding DUs;
[0045] Two LUNs with adjacent LUN numbers provide two sets of Q DUs. The entry corresponding to the last DU in the first set of Q DUs in the P2L sub-table is adjacent to the entry corresponding to the first DU in the second set of Q DUs in the P2L sub-table.
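Under the first table organization method, the position of an entry follows from the page stripe, the LUN, and the DU's position within the LUN's set of Q DUs. The sketch below is one reading of that layout; the function name, the zero-based position `du_pos`, and the optional `stripe_stride` parameter (modeling the preset entry count between the first entries of adjacent sub-tables) are assumptions for illustration.

```python
from typing import Optional


def p2l_entry_index(
    stripe: int,
    lun: int,
    du_pos: int,
    n_luns: int,
    q: int,
    stripe_stride: Optional[int] = None,
) -> int:
    """Index of a P2L entry when sub-tables are laid out one per page stripe.

    Within a sub-table, the Q entries of one LUN are contiguous, and the
    entries of LUN k+1 follow directly after those of LUN k.
    """
    entries_per_subtable = q * n_luns
    stride = entries_per_subtable if stripe_stride is None else stripe_stride
    if stride < entries_per_subtable:
        raise ValueError("stride must cover a whole sub-table")
    if not (0 <= lun < n_luns and 0 <= du_pos < q):
        raise ValueError("lun or du_pos out of range")
    return stripe * stride + lun * q + du_pos


# Example geometry: K = 4 planes, M = 3 pages, J = 2 DUs per page, N = 16 LUNs.
Q = 4 * 3 * 2  # 24 DUs per LUN per page stripe
first_of_lun1 = p2l_entry_index(0, 1, 0, n_luns=16, q=Q)
first_of_stripe1 = p2l_entry_index(1, 0, 0, n_luns=16, q=Q)
```

Passing a `stripe_stride` larger than Q*N models the variant in which adjacent sub-tables are separated by a specified interval instead of being packed back to back.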
[0046] Optionally, each LUN includes 1 Plane, a single LUN includes at least two physical blocks, each physical block includes P+1 physical pages, N physical pages from N LUNs with the same physical page number constitute a page stripe, physical blocks from N LUNs with the same physical block number constitute a large block, and physical blocks from N LUNs constitute at least two large blocks, each large block includes P+1 page stripes;
[0047] In a single page stripe, a single LUN provides a set of J DUs, and the J DUs from the single LUN are arranged according to the second arrangement based on the DU number;
[0048] The first DUs of adjacent large blocks have the same DU number.
[0049] Optionally, the P2L sub-table corresponding to a single page stripe includes J*N entries;
[0050] The J entries corresponding to a set of J DUs provided by a single LUN are arranged consecutively in the P2L sub-table according to the DU number of their corresponding DUs;
[0051] Two LUNs with adjacent LUN numbers provide two groups of J DUs. The entry corresponding to the last DU in the first group of J DUs in the P2L sub-table is adjacent to the entry corresponding to the first DU in the second group of J DUs in the P2L sub-table.
[0052] The P2L table entries of the last DU in the preceding block and the P2L table entries of the first DU in the following block are either consecutive or have an interval of a specified size.
[0053] Optionally, the NVM chip includes N LUNs, each LUN includes G physical pages with different physical page numbers, and each physical page includes J DUs;
[0054] In different LUNs in the same row, N physical pages with the same physical page number correspond to J DUs with the same DU number;
[0055] In a single LUN, for adjacent physical pages with consecutive physical page numbers, the DU number of the last DU of the preceding physical page is consecutive to the DU number of the first DU of the following physical page.
[0056] In each physical page of a single LUN, J DUs are arranged according to the third arrangement based on the DU number.
[0057] Optionally, within a single physical page, the P2L table entries corresponding to J DUs are arranged consecutively in the order of their corresponding DU numbers;
[0058] For adjacent physical pages in a single LUN, the P2L table entry of the last DU of the preceding physical page is adjacent to or has a specified size gap with the P2L table entry of the first DU of the following physical page.
[0059] For adjacent LUNs, the P2L table entry of the last DU of the preceding LUN is adjacent to the P2L table entry of the first DU of the following LUN.
[0060] Optionally, the P2L table is organized according to a second table organization method. The P2L table includes N P2L sub-tables, and the N P2L sub-tables correspond one-to-one with the N LUNs. The number of entries in each P2L sub-table is equal to the number of DUs included in a single LUN.
[0061] For two adjacent P2L sub-tables, the LUN number corresponding to the first P2L sub-table is consecutive to the LUN number corresponding to the second P2L sub-table.
[0062] Optionally, L out of the N LUNs are not used to carry user-written data, and the remaining (N-L) LUNs are used to carry user-written data, where the value of L is greater than or equal to 1 and less than N;
[0063] For the (N-L) LUNs, the P2L table entries corresponding to the J DUs of a single physical page are arranged consecutively in the P2L table according to the DU number of the corresponding DU. Among the (N-L) physical pages with the same physical page number, for adjacent physical pages in adjacent LUNs, the P2L table entry of the last DU of the previous physical page is adjacent to the P2L table entry of the first DU of the next physical page.
[0064] In the (N-L) LUNs, the P2L table entry of the last DU of the last physical page in the previous row of (N-L) physical pages is adjacent to, or separated by a specified gap from, the P2L table entry of the first DU of the first physical page in the next row of (N-L) physical pages.
[0065] Optionally, the P2L table is organized according to a third table organization method. The P2L table includes G P2L sub-tables. Each P2L sub-table corresponds to a row of (N-L) physical pages with the same physical page number, which are used to carry data written by the user. The number of entries in the P2L sub-table is equal to the number of DUs included in the (N-L) physical pages.
[0066] The entries in the P2L sub-table are arranged consecutively in the order of their corresponding DU numbers;
[0067] In two adjacent P2L sub-tables, the physical page number of the (N-L) physical pages corresponding to the first P2L sub-table is consecutive to the physical page number of the (N-L) physical pages corresponding to the second P2L sub-table.
[0068] Optionally, in the N LUNs, the first R physical pages of each of L LUNs are not used to carry user-written data, and the remaining (G-R) physical pages are used to carry user-written data, where L is greater than or equal to 1 and less than N, and R is greater than or equal to 1 and less than G.
[0069] For a row of physical pages with physical page number i, the preceding row includes N or (N-L) physical pages with physical page number i-1 that are used to carry data written by the user, and the following row includes N or (N-L) physical pages with physical page number i+1 that are used to carry data written by the user, where the value of i is greater than or equal to 0 and less than or equal to G-1.
[0070] For two adjacent rows of physical pages, the P2L table entry of the last DU of the last physical page in the previous row that carries user-written data is adjacent to or has a specified interval with the P2L table entry of the first DU of the first physical page in the next row that carries user-written data.
[0071] Optionally, the P2L table is organized according to a fourth table organization method. The P2L table includes G P2L sub-tables. Each P2L sub-table corresponds to a row of physical pages with the same physical page number used to carry user-written data. The number of entries in the P2L sub-table is equal to the number of DUs included in a row of physical pages with the same physical page number used to carry user-written data.
[0072] Each P2L sub-table corresponds to N or (N-L) physical pages used to carry user-written data. The entries corresponding to the J DUs of a single physical page are arranged consecutively in the P2L table according to the DU number of the corresponding DU. For adjacent physical pages corresponding to the P2L sub-table, the P2L table entry of the last DU of the previous physical page and the P2L table entry of the first DU of the next physical page are arranged consecutively.
[0073] The physical page numbers corresponding to two adjacent P2L sub-tables are consecutive.
[0074] Optionally, the method further includes:
[0075] The P2L table is generated by scanning the DUs arranged in the NVM chip; or
[0076] At least one P2L sub-table is read from a specified location on the NVM chip, and the P2L table is generated based on the at least one P2L sub-table.
[0077] Optionally, each entry in the P2L table is read sequentially to obtain the LBA value of the entry record, and the first physical address of the DU corresponding to the next entry is obtained sequentially based on the first physical address of the DU corresponding to each entry;
[0078] Access the FTL table based on the LBA value of the entry to obtain the second physical address corresponding to the LBA value;
[0079] The second physical address is compared with the first physical address to obtain the comparison result of the first physical address and the second physical address of the DU.
[0080] Optionally, during the process of sequentially reading each entry of the P2L table, the LBA value is obtained based on the entry value, and the first physical address of the DU corresponding to the next entry is obtained by incrementing by 1 based on the first physical address of the DU corresponding to the current entry.
[0081] Optionally, the first physical address includes a field indicating the DU number, a field indicating the LUN number, and a field indicating the physical block number. When performing the increment calculation, the value of the field indicating the DU number is incremented by 1.
[0082] Specifically, the field value indicating the DU number carries over to the field value indicating the LUN number, and the field value indicating the LUN number carries over to the field value indicating the physical block number.
[0083] Optionally, in the first table organization mode, after the field value indicating the DU number has been incremented to its maximum value, the next add-1 calculation wraps it back to the minimum value and generates a carry signal; after the field value indicating the LUN number has been incremented to its maximum value, the next carry signal wraps it back to the minimum value and generates a further carry signal.
[0084] After both the field value indicating the DU number and the field value indicating the LUN number have generated a carry signal, the field value indicating the DU number is updated based on an offset;
[0085] Based on the carry signal, the recorded physical page number is incremented by 1. The field value indicating the physical block number is incremented by 1 when the recorded physical page number has been updated to the largest physical page number in the physical block.
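The carry chain of the first table organization mode can be modeled in software as below. This is a sketch under stated assumptions, not the circuit of the application: the field names, the `Geometry` parameters, and the interpretation of the "offset" as Q times the page-stripe number (so that each page stripe uses its own range of DU numbers within a LUN) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Geometry:
    q: int                  # DUs one LUN contributes to one page stripe
    n_luns: int             # N
    stripes_per_block: int  # page stripes per large block (assumed name)


@dataclass(frozen=True)
class PPA:
    block: int
    page: int  # recorded page (stripe) number within the block
    lun: int
    du: int    # DU number within the LUN, including the stripe offset


def next_ppa(p: PPA, g: Geometry) -> PPA:
    """Return the PPA of the next DU in write order (a "PPA+1" step)."""
    base = p.page * g.q  # assumed offset of this stripe's DU numbers
    if p.du + 1 < base + g.q:
        return PPA(p.block, p.page, p.lun, p.du + 1)  # plain increment
    if p.lun + 1 < g.n_luns:
        # DU field wraps to its minimum for this stripe; carry into LUN.
        return PPA(p.block, p.page, p.lun + 1, base)
    if p.page + 1 < g.stripes_per_block:
        # DU and LUN both carried: next stripe, DU updated by the offset.
        return PPA(p.block, p.page + 1, 0, base + g.q)
    # Last page of the block: carry into the physical block number.
    return PPA(p.block + 1, 0, 0, 0)
```

The point of the sketch is that the next address is not the numeric successor of the current one: each field has its own range, and a carry out of one field drives the update of the next.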
[0086] Optionally, in the second table organization mode, when the field value indicating the DU number reaches the maximum count value, the field value indicating the DU number is updated based on the offset.
[0087] Optionally, in the third table organization mode, after the field value indicating the DU number has been incremented to its maximum value, the next add-1 calculation wraps it back to the minimum value and generates a carry signal. The carry signal is provided to the next LUN used to carry the data written by the user.
[0088] The increment range of the field value indicating the LUN number corresponds to the arrangement range of the (N-L) LUNs among the N LUNs. Among the N LUNs, the L LUNs other than the (N-L) LUNs are not used to carry user-written data. After the field value indicating the LUN number has been incremented to its maximum value, the next carry signal wraps it back to the minimum value and generates a further carry signal.
[0089] Optionally, in the fourth table organization mode, after the field value indicating the DU number has been incremented to its maximum value, the next add-1 calculation wraps it back to the minimum value and generates a carry signal. The carry signal is provided to the next LUN used to carry the data written by the user.
[0090] The timing of a carry in the field value indicating the LUN number is related to the arrangement of L LUNs within N LUNs and the arrangement of R physical pages within L LUNs.
[0091] Optionally, multiple entries in the P2L table can be read in a single operation;
[0092] Based on the LBA value of each entry record in the multiple entries, the FTL table is accessed sequentially to obtain the second physical address of the FTL table entry record.
[0093] Optionally, upon power-up, the P2L table is generated, and the valid data bitmap is generated based on the P2L table.
[0094] In a second aspect, embodiments of this application provide a bitmap generator, including: a PPA+1 calculator, a first read unit, a second read unit, a comparator, and a write unit;
[0095] The output of the PPA+1 calculator is connected to the input of the comparator, the output of the first read unit is connected to the input of the second read unit, the output of the second read unit is connected to the input of the comparator, and the output of the comparator is connected to the write unit.
[0096] The first read unit receives the first address of a P2L table entry, reads the P2L table entry according to the first address, and provides the LBA value of the DU recorded in that entry to the second read unit. The address of the entry that follows the one just read is then fed back to the first read unit.
[0097] The second read unit reads the FTL table entry from the FTL table according to the received LBA value, and provides the second physical address recorded in the read FTL table entry to the comparator;
[0098] The PPA+1 calculator receives a first physical address, increments the first physical address by 1, and outputs the updated first physical address. The first physical address output by the PPA+1 calculator is provided to the comparator and the input terminal of the PPA+1 calculator.
[0099] The comparator compares the first physical address output by the PPA+1 calculator with the second physical address output by the second read unit to generate a bit value indicating whether the data recorded by the DU is valid.
[0100] The write unit writes the bit value output by the comparator into the valid data bitmap.
[0101] Optionally, during the initial calculation, the PPA+1 calculator outputs the first physical address before the increment calculation.
[0102] Optionally, if the first physical address and the second physical address of DU are the same, the comparator outputs the first bit value indicating that the data recorded by DU is valid data;
[0103] If the first physical address and the second physical address of the DU are different, the comparator outputs a second bit value indicating that the data recorded by the DU is invalid.
[0104] Optionally, adjacent entries in the P2L table represent DUs that are adjacent in the writing order, and the two first physical addresses corresponding to two adjacent DUs are addresses for accessing the contiguous storage space of the NVM chip.
[0105] In a third aspect, embodiments of this application provide a bitmap generator, including: a PPA+1 calculator, a first read unit, a first FIFO, a second read unit, a second FIFO, a comparator, a third FIFO, and a write unit;
[0106] The output of the PPA+1 calculator is connected to the input of the comparator. The output of the first read unit is connected to the input of the first FIFO. The output of the first FIFO is connected to the input of the second read unit. The output of the second read unit is connected to the input of the second FIFO. The output of the second FIFO is connected to the input of the comparator. The output of the comparator is connected to the input of the third FIFO. The output of the third FIFO is connected to the write unit.
[0107] The first read unit receives the first address of a P2L table entry, reads multiple consecutive entries from the P2L table according to the first address, and stores the multiple LBA values obtained from those entries consecutively in the first FIFO. The address of the entry that follows the entries just read is then fed back to the first read unit.
[0108] The second read unit reads the FTL table based on multiple LBA values obtained from the first FIFO, and obtains multiple second physical addresses to store in the second FIFO;
[0109] The PPA+1 calculator receives a first physical address, increments the first physical address by 1, and outputs the updated first physical address. The first physical address output by the PPA+1 calculator is provided to the comparator and the input terminal of the PPA+1 calculator.
[0110] The comparator compares the second physical address output by the second FIFO with the first physical address output by the PPA+1 calculator, and generates a bit value indicating whether the data recorded by the DU is valid, which is then stored in the third FIFO.
[0111] When the number of bit values stored in the third FIFO meets the condition, the write unit writes multiple bit values in batches into the valid data bitmap.
[0112] Optionally, during the initial calculation, the PPA+1 calculator outputs the first physical address before the increment calculation.
[0113] Optionally, the second read unit reads the FTL table each time based on one LBA value from the first FIFO to obtain a single second physical address;
[0114] Each time, the comparator generates a bit value based on a second physical address from the second FIFO and a first physical address from the PPA+1 calculator.
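The data flow of the three-FIFO generator can be modeled in software as a minimal Python sketch. The hardware operates concurrently, whereas this model runs the stages one after another; the burst size, the batch threshold used as the "condition" on the third FIFO, the final flush, the bit values 1 (valid) and 0 (invalid), and the names `run_pipeline` and `next_ppa` are all assumptions made for illustration.

```python
from collections import deque


def run_pipeline(p2l, ftl, first_ppa, next_ppa, burst=4, batch=8):
    """Model of the read unit -> FIFO -> comparator -> write unit chain.

    p2l      : list of LBA values, one per P2L entry, in table order
    ftl      : dict mapping LBA -> physical address recorded in the FTL table
    next_ppa : callable standing in for the PPA+1 calculator
    """
    fifo1, fifo2, fifo3 = deque(), deque(), deque()
    bitmap = []
    ppa = first_ppa          # initial calculation outputs the un-incremented PPA
    addr = 0
    while addr < len(p2l):
        # First read unit: one burst read of consecutive P2L entries.
        chunk = p2l[addr:addr + burst]
        fifo1.extend(chunk)
        addr += len(chunk)   # successor entry address fed back to the read unit
        # Second read unit: one FTL lookup per LBA taken from the first FIFO.
        while fifo1:
            fifo2.append(ftl.get(fifo1.popleft()))
        # Comparator: one bit per (second address, first address) pair.
        while fifo2:
            fifo3.append(1 if fifo2.popleft() == ppa else 0)
            ppa = next_ppa(ppa)
        # Write unit: batch write once enough bits have accumulated.
        while len(fifo3) >= batch:
            bitmap.extend(fifo3.popleft() for _ in range(batch))
    bitmap.extend(fifo3)     # assumption: leftover bits are flushed at the end
    return bitmap
```

The result does not depend on `burst` or `batch`; those parameters only change how many table reads and bitmap writes are issued.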
[0115] In a fourth aspect, embodiments of this application provide a bitmap generator, including: a PPA+1 calculator, a third read unit, a first FIFO, a second FIFO, a selector, a comparator, a third FIFO, and a write unit;
[0116] The output of the PPA+1 calculator is connected to the input of the comparator; the output of the selector is connected to the input of the third read unit; the output of the third read unit is connected to the inputs of the first FIFO and the second FIFO; the output of the first FIFO is connected to the input of the selector; the output of the second FIFO is connected to the input of the comparator; the output of the comparator is connected to the input of the third FIFO; and the output of the third FIFO is connected to the write unit.
[0117] The selector selects the first address of the P2L table entry and provides it to the third read unit. The third read unit reads the P2L table entry according to the first address and provides the LBA value of the DU recorded in that entry to the first FIFO.
[0118] The first FIFO provides the LBA value to the selector, the selector selects to provide the LBA value to the third read unit, and the third read unit reads the second physical address of the FTL table entry based on the LBA value and records it in the second FIFO;
[0119] The PPA+1 calculator receives a first physical address, increments the first physical address by 1, and outputs the updated first physical address. The first physical address output by the PPA+1 calculator is provided to the comparator and the input terminal of the PPA+1 calculator.
[0120] The comparator compares the second physical address output by the second FIFO with the first physical address output by the PPA+1 calculator, generates a bit value indicating whether the data recorded by the DU is valid, and stores it in the third FIFO;
[0121] When the number of bit values stored in the third FIFO meets the condition, the write unit writes multiple bit values in batches into the valid data bitmap;
[0122] The entry address of the successor entry of the P2L table entry read by the third read unit is then provided to the selector.
[0123] Optionally, during the initial calculation, the PPA+1 calculator outputs the first physical address before the increment calculation.
[0124] Optionally, the third read unit reads multiple consecutive entries from the P2L table at a single time based on the first address, and obtains multiple LBA values which are then stored consecutively in the first FIFO;
[0125] The selector selects to receive one LBA value from the first FIFO each time, and multiple LBA values are provided to the selector by the first FIFO multiple times.
[0126] The selector outputs the received LBA value to the third read unit, and the third read unit reads the FTL table based on the LBA value to obtain the second physical address and stores it in the second FIFO;
[0127] The comparator compares the second physical address from the second FIFO and the first physical address from the PPA+1 calculator each time, and obtains the bit value to store in the third FIFO;
[0128] When the number of bit values in the third FIFO meets the condition, the write unit writes multiple bit values in batches into the valid data bitmap.
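The selector-based variant, in which a single read unit serves both the P2L table and the FTL table, can be sketched in Python as follows. The arbitration rule (pending LBAs in the first FIFO take priority over a new P2L burst), the batch threshold, the final flush, and the names `run_shared_reader` and `third_read_unit` are assumptions of this sketch rather than details stated in this application.

```python
from collections import deque


def run_shared_reader(p2l, ftl, first_ppa, next_ppa, burst=4, batch=8):
    """Model of the bitmap generator with one shared (third) read unit."""
    fifo1, fifo2, fifo3 = deque(), deque(), deque()
    bitmap, ppa, p2l_addr = [], first_ppa, 0
    reads = {"p2l": 0, "ftl": 0}          # counts requests issued by the read unit

    def third_read_unit(kind, key):
        reads[kind] += 1
        if kind == "p2l":
            return p2l[key:key + burst]   # burst of consecutive P2L entries
        return ftl.get(key)               # single FTL lookup by LBA

    while p2l_addr < len(p2l) or fifo1:
        if fifo1:
            # Selector forwards one LBA from the first FIFO to the read unit.
            fifo2.append(third_read_unit("ftl", fifo1.popleft()))
        else:
            # Selector forwards the next P2L entry address to the read unit.
            lbas = third_read_unit("p2l", p2l_addr)
            fifo1.extend(lbas)
            p2l_addr += len(lbas)
        while fifo2:                      # comparator
            fifo3.append(1 if fifo2.popleft() == ppa else 0)
            ppa = next_ppa(ppa)
        if len(fifo3) >= batch:           # write unit, batch write
            bitmap.extend(fifo3)
            fifo3.clear()
    bitmap.extend(fifo3)                  # assumption: flush leftover bits
    return bitmap, reads
```

The `reads` counter makes the effect of burst reads visible: five P2L entries read with `burst=2` need three P2L requests but still five FTL lookups.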
[0129] Optionally, the first physical address includes a field indicating the DU number, a field indicating the LUN number, and a field indicating the physical block number. When the PPA+1 calculator increments the first physical address by 1, the value of the field indicating the DU number is incremented by 1.
[0130] Specifically, the field value indicating the DU number carries over to the field value indicating the LUN number, and the field value indicating the LUN number carries over to the field value indicating the physical block number.
[0131] Optionally, in the first table organization mode, after the value of the field indicating the DU number has been incremented to its maximum value, the PPA+1 calculator wraps it back to the minimum value on adding 1 and generates a carry signal; after the value of the field indicating the LUN number has been incremented to its maximum value, the PPA+1 calculator wraps it back to the minimum value based on the carry signal and generates a further carry signal.
[0132] After both the field indicating the DU number and the field indicating the LUN number have generated a carry signal, the value of the field indicating the DU number is updated based on the offset;
[0133] Based on the carry signal, the field indicating the physical block number increments the recorded physical page number by 1. The value of the field indicating the physical block number is incremented by 1 when the recorded physical page number has been updated to the largest physical page number in the physical block.
[0134] Optionally, in the second table organization mode, when the value of the field indicating the DU number reaches the maximum count value, the value of the field indicating the DU number is updated based on the offset.
[0135] Optionally, in the third table organization mode, after the value of the field indicating the DU number has been incremented to its maximum value, the PPA+1 calculator wraps it back to the minimum value on adding 1 and generates a carry signal, which is provided to the next LUN used to carry user-written data;
[0136] The increment range of the field indicating the LUN number corresponds to the arrangement range of the (N-L) LUNs among the N LUNs. Among the N LUNs, the L LUNs other than the (N-L) LUNs are not used to carry user-written data. After the value of the field indicating the LUN number has been incremented to its maximum value, it wraps back to the minimum value based on the carry signal and generates a carry signal.
[0137] Optionally, in the fourth table organization mode, after the value of the field indicating the DU number has been incremented to its maximum value, the PPA+1 calculator wraps it back to the minimum value on adding 1 and generates a carry signal, which is provided to the next LUN used to carry user-written data;
[0138] The timing of a carry in the field value indicating the LUN number is related to the arrangement of L LUNs within N LUNs and the arrangement of R physical pages within L LUNs.
[0139] Optionally, the first read unit or the third read unit reads multiple entries of the P2L table in a single read.
[0140] The second read unit or the third read unit accesses the FTL table sequentially based on the LBA of each entry record in the multiple entries to obtain the second physical address of the FTL table entry record.
[0141] According to the embodiments of this application, a P2L table is constructed based on the LBA recorded by each DU in the NVM chip, and an effective data bitmap is created using the P2L table and the FTL table. The creation of the effective data bitmap can be completed in a short time and with less overhead. This process does not require the participation of the CPU and the creation of the effective data bitmap is completed based on the hardware circuit, which can avoid occupying CPU processing resources and ensure the performance of the storage device.
[0142] In a further embodiment, by batch reading P2L table entries for LBA caching, sequentially caching PPAs read multiple times based on LBA, and storing multiple bit values in registers, the number of reads of the P2L table and FTL table, as well as the number of data writes to the valid data bitmap, can be reduced while making full use of bus bandwidth; hardware costs can be saved by simplifying the structure of the bitmap generator.
[0143] In a further embodiment, by providing a PPA+1 calculator for incremental PPA calculation, the next adjacent PPA can be calculated based on the relationship between PPAs of adjacent DUs, given the current PPA. By providing a universal PPA+1 calculator, the order of field values can be adjusted according to the carry relationship between fields to adapt to different PPA formats. Regardless of the organization of the P2L table, carry can be generated at the appropriate time based on the organization of the P2L table, thereby using the PPA+1 calculator to perform PPA+1 calculation.
Description of the Drawings
[0144] Figure 1A shows a block diagram of a storage device;
[0145] Figure 1B shows a detailed block diagram of the control unit of the storage device;
[0146] Figure 2 shows a schematic diagram of an FTL table;
[0147] Figure 3 shows a schematic diagram of a large block;
[0148] Figure 4A shows a first schematic diagram of page stripes;
[0149] Figure 4B shows a second schematic diagram of page stripes;
[0150] Figure 4C shows a third schematic diagram of page stripes;
[0151] Figure 5A shows a valid data bitmap used to record the status of the data stored in the DUs;
[0152] Figure 5B shows a flowchart for creating a valid data bitmap;
[0153] Figure 6A shows a schematic diagram of the DU arrangement in an NVM chip according to an embodiment of this application;
[0154] Figure 6B shows a schematic diagram of the P2L table constructed based on the DU arrangement of Figure 6A;
[0155] Figure 7 shows a flowchart for creating a valid data bitmap based on a P2L table according to an embodiment of this application;
[0156] Figure 8 shows a schematic diagram of the PPA address format provided in an embodiment of this application;
[0157] Figure 9 shows a hardware structure diagram of a PPA+1 calculator provided in one embodiment of this application;
[0158] Figure 10 shows a hardware structure diagram of a bitmap generator provided in one embodiment of this application;
[0159] Figure 11 shows a hardware structure diagram of a bitmap generator provided in another embodiment of this application;
[0160] Figure 12 shows a hardware structure diagram of a bitmap generator provided in another embodiment of this application;
[0161] Figure 13A shows a schematic diagram of the DU arrangement in an NVM chip according to another embodiment of this application;
[0162] Figure 13B shows a schematic diagram of the P2L table constructed based on the DU arrangement of Figure 13A;
[0163] Figure 14A shows a schematic diagram of the write sequence for writing data to an NVM chip according to an embodiment of this application;
[0164] Figure 14B shows a schematic diagram of the write sequence for writing data to an NVM chip according to another embodiment of this application;
[0165] Figure 14C shows a schematic diagram of the write sequence for writing data to an NVM chip according to another embodiment of this application;
[0166] Figure 15A shows a hardware block diagram of a general-purpose PPA+1 calculator according to an embodiment of this application;
[0167] Figure 15B shows a hardware block diagram of a general-purpose PPA+1 calculator according to another embodiment of this application.
Detailed Description
[0168] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0169] Figure 6A shows a schematic diagram of the DU arrangement in an NVM chip according to an embodiment of this application.
[0170] In Figure 6A, large block 0 comprises physical blocks from N logical units (LUNs), and a single page stripe comprises physical pages provided by the N LUNs. Each LUN comprises 4 planes. For a given page stripe, the 4 planes of a single LUN provide 12 physical pages (3 physical pages per plane) and 24 DUs (2 DUs per physical page). The 24 DUs can be arranged as shown in Figure 6A, with 4 DUs per row forming a group and the 6 groups of DUs distributed across 6 rows within the physical pages. The last DU in one row is consecutive to the first DU in the next row. Of course, the example shown in Figure 6A is just one specific arrangement. The 24 DUs can also be arranged in other ways, which will not be elaborated on here.
[0171] When writing data to block 0, data is first written to page stripe 0. After page stripe 0 is full, data is written to page stripe 1. After page stripe 1 is full, data is written to page stripe 2. Specifically, when writing data to page stripes, data is first written to LUN0. After the 24 DUs in LUN0 are full, data is written to LUN1. This sequence is followed to complete the data writing of page stripes.
[0172] Within the same page stripe, DUs corresponding to different LUNs have the same numbering; that is, DUs within different LUNs are numbered in the same way. For example, in each page stripe of Figure 6A, the 24 DUs corresponding to LUN0 are numbered the same as the 24 DUs in LUN1, LUN2, ..., LUN N-1.
[0173] Within the same LUN, the DU numbers differ across different page stripes. For two adjacent page stripes, the difference between the DU numbers at the same position in the later and earlier page stripes is called the offset; equivalently, the offset is the difference between the first DU numbers of adjacent page stripes. In the example of Figure 6A, the offset value is 24: the first DU number of page stripe 0 is 0, that of page stripe 1 is 24, and that of page stripe 2 is 48, so the difference between the first DU numbers of adjacent page stripes is 24. It can be understood that the offset value depends on the organization of DUs in the NVM chip and is also related to the specifications and model of the NVM chip; therefore, the offset value varies in different implementations.
[0174] Figure 6B shows a schematic diagram of the P2L table constructed based on the DU arrangement of Figure 6A.
[0175] The P2L table shown in Figure 6B is constructed based on the LBAs recorded by each DU in the NVM chip. Each entry in the P2L table records the LBA corresponding to a DU, with the DU's PPA serving as the entry index and the LBA corresponding to the DU as the entry value. The DU's PPA is an attribute of the DU itself and is known when the DU is accessed. Although the PPAs of the DUs could additionally be recorded in the P2L table, the large number of entries would make the table large; to save memory, only the LBAs corresponding to the DUs are recorded in the P2L table.
[0176] The P2L table shown in Figure 6B comprises multiple P2L sub-tables, each corresponding to a page stripe. A single P2L sub-table contains multiple entries, and each entry in a P2L sub-table corresponds one-to-one with a DU in the matching page stripe. The entries of a P2L sub-table record the LBAs of the DUs belonging to the same page stripe. The entries in the P2L sub-table are arranged contiguously in the SRAM (Static Random-Access Memory) or DRAM (hereinafter referred to as main memory) of the storage device, with no gaps between entries. Therefore, knowing the starting address of the P2L sub-table in main memory, each entry in the P2L sub-table can be indexed by the PPA of the DU corresponding to that entry.
[0177] For example, P2L sub-table 0 corresponds to page stripe 0. In Figure 6B, PPA denotes the physical address of the corresponding DU. The PPA serves as the entry index for the entry corresponding to the DU, and the entry value is the LBA recorded in that DU. For example, PPA0-PPA23 represent the physical addresses of DU0-DU23, respectively, in the storage space provided by LUN0 for page stripe 0. It should be understood that, in Figure 6B, the i in DUi denotes the combined number of the physical page and the DU within the physical block, while the j in PPAj is used only to distinguish different physical addresses and does not represent the value of the physical address.
[0178] In the P2L sub-table, the entries corresponding to the 24 DUs from adjacent LUNs are also arranged consecutively. For example, PPA24 represents the physical address of DU0 of LUN1, and the entry corresponding to DU0 of LUN1 and the entry corresponding to DU23 of LUN0 are consecutive in the P2L sub-table. Similarly, for the same page stripe, the entries corresponding to DU0 of LUN i and DU23 of LUN i-1 are also arranged consecutively in the P2L sub-table.
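Because the entries of a sub-table are contiguous, locating an entry reduces to simple address arithmetic. The following Python sketch illustrates this for the 24-DU-per-LUN example; the 4-byte entry size, the base address used in the example, and the name `entry_address` are assumptions for illustration only.

```python
DUS_PER_LUN = 24   # DUs one LUN contributes to a page stripe (Figure 6A example)
ENTRY_BYTES = 4    # assumption: each P2L entry holds one 4-byte LBA


def entry_address(subtable_base, lun, du_in_stripe):
    """Main-memory address of the P2L entry for a DU within one page stripe.

    du_in_stripe is the DU position (0..23) inside the LUN's share of the
    stripe; entries of LUN k follow directly after those of LUN k-1.
    """
    if not 0 <= du_in_stripe < DUS_PER_LUN:
        raise ValueError("du_in_stripe out of range")
    index = lun * DUS_PER_LUN + du_in_stripe
    return subtable_base + index * ENTRY_BYTES
```

For a hypothetical base address of 0x1000, the entry for the last DU of LUN0 sits at 0x105C and the entry for the first DU of LUN1 at 0x1060, exactly one entry apart.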
[0179] For adjacent P2L sub-tables within a P2L table, the two sub-tables are adjacent end to end, meaning the first entry of the latter sub-table is consecutive to the last entry of the former sub-table (the last entry of the former sub-table is stored in memory adjacent to the first entry of the latter sub-table). In Figure 6B, the entries corresponding to PPA24*N-1 and PPA24*N are arranged consecutively. In this case, the entry corresponding to DU23 of LUN N-1 in page stripe 0 (the DU matching PPA24*N-1) and the entry corresponding to DU24 of LUN0 in page stripe 1 (the DU matching PPA24*N) are arranged consecutively.
[0180] In another implementation, for adjacent P2L sub-tables within the P2L table, the two P2L sub-tables are not adjacent end to end; that is, the first entry of the latter P2L sub-table is not consecutive with the last entry of the former P2L sub-table (the memory location of the last entry of the former P2L sub-table is spaced apart from the memory location of the first entry of the latter P2L sub-table). For example, in Figure 6B, the interval between the entries corresponding to PPA24*N-1 and PPA24*N is represented by the offset. Therefore, the physical address of PPA24*N can be calculated based on the physical address of PPA24*N-1 and the offset.
[0181] In this embodiment, data identical to the content of the P2L sub-table can be recorded separately in a page stripe. Optionally, the P2L sub-tables of all page stripes in a large block can be recorded using a designated storage space. In this case, the data of each P2L sub-table is read from the NVM chip, and a P2L table is formed in memory. As another option, the P2L sub-table is not recorded separately on the NVM chip, but is constructed by scanning each DU, and a P2L table is formed in memory. The process of creating the P2L table can occur when the storage device is powered on or during the operation of the storage device.
[0182] With the P2L table constructed, the FTL table can be read based on the LBA recorded in each P2L table entry to obtain the PPA' recorded in the FTL table, and the corresponding bit value can be generated by comparing PPA' with the PPA corresponding to the DU. In this way, the valid data bitmap is created using the P2L table.
[0183] Figure 7 shows a flowchart for creating a valid data bitmap based on a P2L table in an embodiment of this application.
[0184] Step 701: Obtain the DU of the starting position of the P2L table. Next, execute steps 702 and 703.
[0185] Step 702: Obtain the LBA corresponding to the DU from the P2L table. After step 702, proceed to step 704.
[0186] Step 703: Obtain the PPA corresponding to the DU (the PPA is an attribute of the DU itself). Step 705 is executed after step 703, and steps 702 and 703 are executed concurrently, without a strict order between them.
[0187] Step 704: Based on the LBA obtained in step 702, query the FTL table to obtain PPA'.
[0188] Step 705: Check whether the PPA obtained in step 703 is the same as the PPA' obtained in step 704. If they are the same, proceed to step 706; otherwise, proceed to step 707.
[0189] Step 706: Set the bit value corresponding to PPA to 1 in the valid data bitmap, and then execute step 708.
[0190] Step 707: Set the bit value corresponding to PPA to 0 in the valid data bitmap, and then execute step 708.
[0191] Step 708: Obtain the next consecutive DU from the P2L table, and then return to steps 702 and 703.
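Steps 701 to 708 can be expressed as a minimal software sketch in Python. For simplicity it treats the PPA as the position of the entry in a flat P2L list and packs bits least-significant-bit first into bytes; both choices, and the name `build_valid_bitmap`, are assumptions of this sketch, and the embodiments perform the equivalent work in a hardware circuit.

```python
def build_valid_bitmap(p2l, ftl):
    """Walk the P2L table and mark each DU whose data is still valid.

    p2l : list of LBAs; the list index stands in for the DU's PPA
    ftl : dict mapping LBA -> PPA' currently recorded in the FTL table
    """
    bitmap = bytearray((len(p2l) + 7) // 8)
    for index, lba in enumerate(p2l):   # steps 701/708 walk entries; step 702 reads the LBA
        ppa = index                     # step 703: the PPA is implied by the DU itself
        ppa_in_ftl = ftl.get(lba)       # step 704: query the FTL table for PPA'
        if ppa_in_ftl == ppa:           # step 705: compare PPA with PPA'
            bitmap[index // 8] |= 1 << (index % 8)   # step 706: mark valid
        # step 707: the bit stays 0 for invalid data
    return bitmap
```

An LBA that was written twice appears in two P2L entries, but the FTL table points only at the most recent location, so only that entry's bit is set.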
[0192] By constructing a P2L table based on the LBAs recorded by each DU in the NVM chip, and using the P2L table to create a valid data bitmap, the creation of a valid data bitmap can be completed in a short time and with minimal overhead, without consuming a large amount of CPU processing resources, thus ensuring the performance of the storage device.
[0193] The PPA in the P2L table serves as an entry index, used to index the LBA as the entry value. The storage device uses the PPA' recorded in the FTL table to access the storage space provided by the NVM chip. The physical address describes the LUN number, physical block number, physical page number, DU number, etc. Since the storage device has multiple NVM chips, the PPA' stored in the FTL table differs in form from the physical address sent to the NVM chip when accessing it. For example, the physical address sent to the NVM chip does not need to describe the LUN number, but it does need to describe the physical page number. In the PPA' of the FTL table, all DUs provided by physical pages within a physical block are uniformly numbered. For example, if a physical block has 300 physical pages, and each physical page provides 4 DUs, the corresponding DU number ranges from 0 to 1199. Therefore, when accessing the NVM chip, the DU number needs to be converted into a physical page number that the NVM chip can accept.
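The conversion from a uniformly numbered DU to a physical page can be sketched in Python for the 300-page, 4-DU example. The sketch assumes DUs are numbered page by page in ascending order, which this application does not state explicitly; `du_to_page` is a hypothetical helper name.

```python
PAGES_PER_BLOCK = 300   # from the example above
DUS_PER_PAGE = 4        # from the example above


def du_to_page(du_number):
    """Split a block-wide DU number into (physical page, DU within page).

    Assumes DUs are numbered page by page: page 0 holds DUs 0-3,
    page 1 holds DUs 4-7, and so on.
    """
    if not 0 <= du_number < PAGES_PER_BLOCK * DUS_PER_PAGE:
        raise ValueError("DU number outside the physical block")
    return divmod(du_number, DUS_PER_PAGE)
```

Under this assumption, DU number 1199 maps to the last DU of physical page 299.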
[0194] The PPAs in the P2L table have the same address format as the PPAs in the FTL table. Each DU corresponds one-to-one with a physical address; a DU number can be used to represent a physical address. The address format of the PPA is illustrated below with an example. As shown in Figure 8, the PPA includes three fields, denoted as Field 0, Field 1, and Field 2, each with a specified number of bits. Field 0 (e.g., 12 bits) indicates the DU number, with a value of 0-(24*W-1), where W is the number of word lines or physical pages in the physical block; Field 1 (e.g., 4 bits) indicates the LUN number, with a value of 0-7; and Field 2 (e.g., 10 bits) indicates the block number, with a value of 0-700. Based on the address format shown in Figure 8, the multiple physical addresses in Figure 6B are represented as follows:
[0195] PPA0 = [block 0][DU0][LUN0]
[0196] PPA1 = [block 0][DU1][LUN0]
[0197] PPA23 = [block 0][DU23][LUN0]
[0198] PPA24 = [block 0][DU0][LUN1]
[0199] ...
[0200] PPA47 = [block 0][DU23][LUN1]
[0201] PPA48 = [block 0][DU0][LUN2]
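The three-field address format can be illustrated with a short Python sketch that packs and unpacks a PPA using the example widths (12-bit DU number, 4-bit LUN number, 10-bit block number). Placing Field 0 in the least-significant bits is an assumption of this sketch; the actual bit positions are defined by Figure 8. The names `pack_ppa` and `unpack_ppa` are hypothetical.

```python
DU_BITS, LUN_BITS, BLOCK_BITS = 12, 4, 10   # example widths of Field 0, 1, 2


def pack_ppa(block, lun, du):
    """Combine the three fields into one integer (Field 0 lowest, assumed)."""
    if not (0 <= du < 1 << DU_BITS and 0 <= lun < 1 << LUN_BITS
            and 0 <= block < 1 << BLOCK_BITS):
        raise ValueError("field value does not fit its bit width")
    return (block << (DU_BITS + LUN_BITS)) | (lun << DU_BITS) | du


def unpack_ppa(ppa):
    """Split a packed PPA back into (block, lun, du)."""
    du = ppa & ((1 << DU_BITS) - 1)
    lun = (ppa >> DU_BITS) & ((1 << LUN_BITS) - 1)
    block = ppa >> (DU_BITS + LUN_BITS)
    return block, lun, du
```

Under this layout PPA23 packs to 23 while PPA24 packs to 4096, so consecutive P2L entries do not have numerically consecutive addresses.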
[0202] In the P2L table, the difference between the first DU numbers of the two page stripes corresponding to two adjacent P2L sub-tables is called the offset. In Figure 6A, the physical addresses of the first DU in different page stripes are represented as follows:
[0203] The physical address of the first DU in page stripe 0 is PPA0 = [block 0][DU0][LUN0].
[0204] The physical address of the first DU in page stripe 1 is PPA24*N = [block 0][DU0+offset][LUN0].
[0205] The physical address of the first DU in page stripe 2 is PPA48*N = [block 0][DU0+2*offset][LUN0].
[0206] In the above example, the offset values between page stripe 0 and page stripe 1, and between page stripe 1 and page stripe 2, are the same. As an alternative implementation, for page stripe groups composed of different adjacent page stripes, the corresponding offset values can be different. For example, page stripe 0 and page stripe 1 form page stripe group 0, and page stripe 1 and page stripe 2 form page stripe group 1. The offset values corresponding to page stripe group 0 are different from those corresponding to page stripe group 1.
[0207] Based on the physical address format shown in Figure 8, the physical addresses of adjacent DUs in Figure 6B are introduced below.
[0208] For example, the current DU is DU0 of LUN0 in page stripe 0, and the physical address corresponding to DU0 is PPA0 = [block 0][DU0][LUN0]; the next DU in the P2L table is DU1 of LUN0 in page stripe 0, and the physical address corresponding to DU1 is PPA1 = [block 0][DU1][LUN0].
[0209] If the current DU is DU1 of LUN0 in page stripe 0, the physical address PPA1 corresponding to DU1 is [block 0][DU1][LUN0]; the next DU in the P2L table is DU2 of LUN0 in page stripe 0, and the physical address PPA2 corresponding to DU2 is [block 0][DU2][LUN0].
[0210] If the current DU is DU23 of LUN N-1 in page stripe 0, the physical address corresponding to DU23 is PPA24*N-1 = [block 0][DU23][LUN N-1]; the next DU in the P2L table is DU24 of LUN0 in page stripe 1, and the corresponding physical address is PPA24*N = [block 0][DU0+offset][LUN0].
[0211] From the examples above, it can be seen that the numerical relationship between adjacent PPAs (PPAj and PPAj+1) cannot be computed by a typical addition circuit that simply adds 1. Therefore, this application provides a hardware circuit (also called a PPA+1 calculator) that computes PPAj+1 from PPAj, so that the computed PPAj+1 can be compared with the physical address recorded in the FTL table.
[0212] Figure 9 shows a hardware structure diagram of the PPA+1 calculator provided in one embodiment of this application.
[0213] In Figure 9, the PPA+1 calculator includes a block number register, a DU number register, and a LUN number register. The block number register stores the physical block number represented by the PPA (see Field 2 in Figure 8, which indicates the block number); the DU number register stores the DU number represented by the PPA (see Field 0 in Figure 8, which indicates the DU number); and the LUN number register stores the LUN number represented by the PPA (see Field 1 in Figure 8, which indicates the LUN number). The PPA value input to the PPA+1 calculator is stored in the block number register, DU number register, and LUN number register according to Field 2, Field 0, and Field 1, respectively.
[0214] The PPA+1 calculator also includes three incrementers (Incrementer 0, Incrementer 1, and Incrementer 2). Incrementer 2 receives the value from the block number register, Incrementer 1 receives the value from the LUN number register, and Incrementer 0 receives the value from the DU number register. For Incrementer 0, its inputs also receive a constant 1 (used to increment by 1), an offset value, and the value of register D. Register D is used to cache the DU number in the PPA corresponding to the first entry of each P2L sub-table, and the offset value is used when updating the page stripe number. The inputs of Incrementer 1 also receive the carry output from Incrementer 0, and the inputs of Incrementer 2 also receive the carry output from Incrementer 1.
[0215] The values of the DU number register, LUN number register, and block number register received by incrementers 0, 1, and 2 serve as the incrementers' own values. When incrementer 0 receives an increment signal, its own value is incremented by 1. When incrementers 1 and 2 receive a carry signal, their own values are incremented by 1. This is the ordinary +1 operation, as distinct from the PPA+1 operation performed by the calculator as a whole.
[0216] In the structure shown in Figure 9, incrementer 0 performs, for example, base-24 addition. The initial value of incrementer 0 is D. Every 24 increments relative to D produce a carry, and the value of incrementer 0 returns to D. Until 24 increments relative to D have been completed, the output of incrementer 0 is the previous value + 1. The number 24 comes from the DU arrangement shown in Figure 6A, in which one LUN provides 24 DUs for one page stripe. This value depends on the construction method of the page stripe, the construction method of the P2L table, and the number of DUs written simultaneously by a multiplane programming operation supported by the NVM chip.
[0217] In the DU arrangement shown in Figure 6A, for page stripe 0, register D caches the DU number in the PPA of the first entry of P2L sub-table 0, so the initial value of incrementer 0 is 0; for page stripe 1, register D caches the DU number in the PPA of the first entry of P2L sub-table 1, so the initial value of incrementer 0 is 24; for page stripe 2, register D caches the DU number in the PPA of the first entry of P2L sub-table 2, so the initial value of incrementer 0 is 48.
[0218] For any P2L sub-table, when calculating the PPAs corresponding to the DUs provided by each LUN, incrementer 0 performs the increment operation (+1), and a carry from incrementer 0 indicates that calculation should begin for the PPAs of the DUs provided by the next LUN. After the calculation for all DUs corresponding to a P2L sub-table (one page stripe) is complete, D is updated to D+offset, and incrementer 0 is initialized with the updated D to obtain the DU number in the PPA of the first entry of the next P2L sub-table; the calculation of the PPA for each DU of the next P2L sub-table then begins.
[0219] In the DU layout shown in Figure 6A, the offset value between page stripe 0 and page stripe 1 is 24. The offset values for two adjacent page stripes may be the same or different, depending on the word line numbering of the NVM chip (the order in which word lines or physical pages are operated on across multiple programming operations within a physical block). For a specific NVM chip, the order in which word lines / physical pages are programmed within a physical block is fixed. Based on this order, the offset for each page stripe is calculated and provided to the PPA+1 calculator.
[0220] In the structure shown in Figure 9, the initial value of incrementer 1 is provided by the LUN number register. Each time a carry signal is received from incrementer 0, incrementer 1 increments its own value by 1. Incrementer 1 performs base-N addition, where N is the number of LUNs providing DUs for the page stripe. For example, if 16 LUNs provide DUs for the page stripe, N is 16 and the initial value of incrementer 1 is 0. After incrementer 0 has generated 16 carry signals, the value of incrementer 1 wraps from 15 back to its initial value 0. At this point, the processing of the first page stripe is complete, and incrementer 1 generates one carry.
[0221] In the structure shown in Figure 9, the initial value of incrementer 2 is provided by the block number register. Each time a carry signal is received from incrementer 1, the physical page number recorded by incrementer 2 is incremented by 1; incrementer 2 does not need to generate a carry. The physical block number is updated once incrementer 1 has generated enough carry signals that the value of incrementer 2 has reached the largest physical page number in the physical block. In the arrangement shown in Figure 6A, for example, when incrementer 1 has generated 64 carries, the physical page number recorded by incrementer 2 wraps from the maximum physical page number 63 back to 0, and the physical block number is incremented by 1.
[0222] Optionally, incrementer 0 also has a maximum value. The maximum value of incrementer 0 is, for example, 24*P, where 24 is the 24 DUs provided by a LUN for a page stripe, and P is the number of page stripes within the block. When incrementer 0 reaches its maximum value and incrementer 1 generates a carry, incrementer 2 is incremented by 1, thus eliminating the need for incrementer 2 to record the physical page number value itself.
[0223] The PPA+1 calculator also includes a block number output register, a DU output register, and a LUN output register. The block number output register receives the output of incrementer 2, the DU output register receives the output of incrementer 0, and the LUN output register receives the output of incrementer 1. These three registers are combined to form the PPA output register. The content of the PPA output register serves as the output of the PPA+1 calculator, which calculates the next physical address.
[0224] The PPA+1 calculator provided in Figure 9 takes the PPA corresponding to the previous entry in the P2L sub-table as input and outputs the PPA corresponding to the next entry in the P2L sub-table. The PPAs it provides have the same address format as the PPA' values in the FTL table, so that physical addresses in identical formats can be compared.
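The behavior of the three incrementers described above can be approximated in software. The following is a minimal sketch, not the circuit of Figure 9 itself: the class name `PpaPlus1`, its field names, the use of a single constant offset for every page stripe, and the reset of the DU number to 0 on a block switch are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PpaPlus1:
    """Software model of the PPA+1 calculator (illustrative names only)."""
    block: int = 0               # block number register
    du: int = 0                  # DU number register (incrementer 0)
    lun: int = 0                 # LUN number register (incrementer 1)
    d: int = 0                   # register D: DU number of the first entry of the sub-table
    stripe: int = 0              # page stripes completed in the current block
    dus_per_lun: int = 24        # DUs one LUN provides for one page stripe
    num_luns: int = 16           # N, the number of LUNs in a page stripe
    offset: int = 24             # assumed constant for every page stripe
    stripes_per_block: int = 64  # page stripes per block

    def step(self):
        """Advance to the next PPA and return it as (block, du, lun)."""
        if self.du - self.d + 1 < self.dus_per_lun:
            self.du += 1                      # ordinary +1, no carry
        else:
            self.du = self.d                  # incrementer 0 returns to D and carries
            self.lun += 1
            if self.lun == self.num_luns:     # incrementer 1 wraps and carries
                self.lun = 0
                self.stripe += 1
                if self.stripe == self.stripes_per_block:
                    self.stripe = 0           # block exhausted: assumed reset to DU 0
                    self.d = 0
                    self.block += 1
                else:
                    self.d += self.offset     # D is updated to D + offset
                self.du = self.d
        return (self.block, self.du, self.lun)


calc = PpaPlus1()
first_stripe = [calc.step() for _ in range(24 * 16)]
print(first_stripe[23], first_stripe[-1])
```

With the default parameters, the 24th step moves from LUN 0 to LUN 1 with the DU number back at D, and the last step of the first page stripe lands on DU 24 of LUN 0. A per-stripe offset table could replace the constant if the word line order of the chip requires it.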
[0225] Figure 10 shows a hardware structure diagram of a bitmap generator provided in one embodiment of this application.
[0226] The bitmap generator is used to generate valid data bitmaps based on the P2L table and FTL table, thereby reducing the workload of the CPU in generating valid data bitmaps.
[0227] In Figure 10, the bitmap generator includes a PPA+1 calculator, two read units (read unit 0 and read unit 1), a comparator, and a write unit. Read units 0 and 1 are, for example, bus read units or read units of DDR (Double Data Rate SDRAM) memory, and read the corresponding data according to an address. The length of each piece of data is, for example, the length of a PPA. The P2L table, FTL table, and valid data bitmap are located, for example, in DDR memory.
[0228] The first entry in the P2L table stores the LBA corresponding to the first DU in the P2L table, and its corresponding PPA is the initial PPA, which is input into the bitmap generator. The initial PPA is input to the PPA+1 calculator, and the starting address of the P2L table is input to read unit 0. In some cases, the initial PPA and the starting address of the P2L table have the same value or partially the same value. Optionally, the starting address of the P2L table can be mapped from the initial PPA. The PPA+1 calculator is used to calculate the next PPA; read unit 0 is used to read the P2L table. Read unit 0 obtains the value (LBA) of each entry in the P2L table by reading the P2L table. Read unit 0 reads one entry at a time, and after reading one entry, it reads the next adjacent entry, and so on, until all entries in the P2L table have been read. It should be noted that the first value output by the PPA+1 calculator is the initial PPA, and the subsequent outputs are the PPA values after each increment by 1; the first value read from read unit 0 is the LBA value of the first entry, and the subsequent values are the LBA values of the next consecutive entry.
[0229] Each LBA value read by read unit 0 is provided to read unit 1. Read unit 1 reads the FTL table based on the acquired LBA value to obtain the PPA' corresponding to the LBA value recorded in the FTL table. A comparator is connected to the PPA+1 calculator and read unit 1, used to compare whether the PPA' read by read unit 1 is the same as the PPA provided by the PPA+1 calculator. If they are the same, a 1-bit value of 1 is output; if they are different, a 1-bit value of 0 is output. A write unit is connected to the output of the comparator, used to sequentially write the bit values 0 or 1 output by the comparator into the valid data bitmap.
[0230] The bitmap generator shown in Figure 10 works as follows to generate valid data bitmaps:
[0231] The initial PPA value is input into the PPA+1 calculator, and the starting address of the P2L table is input into read unit 0. The initial output of the PPA+1 calculator is the initial PPA. Read unit 0 reads the first entry of the P2L table based on the received starting address and obtains the LBA value corresponding to the first entry, which is then provided to read unit 1. Read unit 1 reads the FTL table based on the obtained LBA value to obtain PPA'. The comparator compares the PPA' provided by read unit 1 with the PPA provided by the PPA+1 calculator, outputs a 1-bit value, and the write unit writes the 1-bit value to the valid data bitmap.
[0232] Next, the PPA+1 calculator calculates the next PPA based on the initial PPA and outputs it. Read unit 0 reads the next entry in the P2L table to obtain the corresponding LBA value and provides it to read unit 1. Read unit 1, the comparator, and the write unit perform the corresponding operation to write the corresponding 1-bit value to the valid data bitmap.
[0233] By repeating the above process, a valid data bitmap can be generated after the P2L table is read.
[0234] Throughout the implementation, for each entry in the P2L table processed, a 1-bit value is generated and written to the valid data bitmap. The bitmap generator continues to operate until all entries in the P2L table have been traversed, generating a complete valid data bitmap. This allows for the creation of valid data bitmaps using the bitmap generator without CPU intervention. The bitmap generator is a hardware circuit; creating the valid data bitmap based on this hardware circuit avoids consuming CPU processing resources, ensuring the performance of the storage device.
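The data flow of Figure 10 can be summarized by a short software model. This is only a sketch under stated assumptions: the function name `generate_valid_bitmap` is hypothetical, the P2L table is represented as a Python list of LBAs in entry order, the FTL table as a dict from LBA to PPA', and the PPA+1 calculator as any iterable that yields one PPA per P2L entry (plain consecutive integers are used here as a stand-in).

```python
def generate_valid_bitmap(p2l_lbas, ftl, ppas):
    """Return one bit per P2L entry: 1 if the FTL still maps the LBA to this PPA."""
    bitmap = []
    for lba, ppa in zip(p2l_lbas, ppas):   # read unit 0 and the PPA+1 calculator advance together
        ppa_prime = ftl.get(lba)           # read unit 1: look up PPA' in the FTL table
        bit = 1 if ppa_prime == ppa else 0 # comparator
        bitmap.append(bit)                 # write unit
    return bitmap


# LBA 10 was written twice; only its most recent copy (PPA 102) is still valid.
p2l = [10, 11, 10, 12]
ftl = {10: 102, 11: 101, 12: 200}
print(generate_valid_bitmap(p2l, ftl, range(100, 104)))
```

In this example the first copy of LBA 10 and the entry for LBA 12 (which the FTL maps elsewhere) are marked invalid.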
[0235] Figure 11 shows a hardware structure diagram of a bitmap generator provided in another embodiment of this application.
[0236] Compared with the structure shown in Figure 10, the bitmap generator provided in Figure 11 adds three FIFO (First In First Out) registers: a FIFO register for storing LBAs (hereinafter FIFO register 1), a FIFO register for storing PPA' values (hereinafter FIFO register 2), and a FIFO register for storing 1-bit values (hereinafter FIFO register 3).
[0237] FIFO register 1 is connected to the output of read unit 0. Read unit 0 reads multiple consecutive P2L table entries (e.g., 8 / 16 / 128) from the P2L table each time to fully utilize the bus bandwidth. The multiple entry values (LBAs) read by read unit 0 are provided to FIFO register 1, which buffers the multiple LBAs sequentially.
[0238] Read unit 1 reads the FTL table each time based on 1 LBA provided by FIFO register 1, and obtains 1 PPA'. This process continues, and the multiple PPA' read by read unit 1 are cached in FIFO register 2 in sequence.
[0239] Each time, the comparator retrieves one PPA from the PPA+1 calculator and one PPA' from FIFO register 2. By comparing PPA with PPA', it generates a 1-bit value and provides it to FIFO register 3. After a set of values (such as 64 or 256 1-bit values) has accumulated in FIFO register 3, the write unit writes that set of values into the valid data bitmap.
[0240] In the structure shown in Figure 11, the initial PPA is input to the PPA+1 calculator, and the starting address of the P2L table is input to read unit 0. After the PPA+1 calculator outputs the initial PPA, it performs the PPA+1 operation and outputs multiple PPAs sequentially. Starting from the first entry at the starting address of the P2L table, read unit 0 reads entries sequentially and stores the batch-read LBAs in FIFO register 1. Read unit 1 retrieves one LBA from FIFO register 1 each time to read the FTL table, obtains PPA', and caches the PPA' values sequentially in FIFO register 2. The comparator retrieves one PPA' from FIFO register 2 each time, compares it with the PPA provided by the PPA+1 calculator, and generates a 1-bit value that is stored in FIFO register 3. When the number of values stored in FIFO register 3 meets the condition, the write unit writes the values stored in FIFO register 3 in batches into the valid data bitmap.
[0241] By reading P2L table entries in batches and caching the LBAs in FIFO register 1, sequentially caching the PPA' values read using those LBAs in FIFO register 2, and accumulating multiple 1-bit values in FIFO register 3, the number of reads of the P2L table and FTL table, as well as the number of writes to the valid data bitmap, can be reduced while making full use of bus bandwidth.
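The effect of the three FIFO registers can be illustrated with a software model that counts bus transactions. This is a rough sketch, not the hardware of Figure 11: the function name, the `stats` counters, and the batch sizes are assumptions, and the stages run one after another here, whereas in hardware they would operate concurrently.

```python
from collections import deque


def generate_bitmap_batched(p2l_lbas, ftl, ppas, read_batch=8, write_batch=64):
    """Model of the FIFO-buffered bitmap generator; returns (bitmap, stats)."""
    fifo1, fifo2, fifo3 = deque(), deque(), deque()
    bitmap = []
    stats = {"p2l_reads": 0, "bitmap_writes": 0}
    ppa_iter = iter(ppas)
    for start in range(0, len(p2l_lbas), read_batch):
        # Read unit 0: one bus read fetches several consecutive P2L entries.
        fifo1.extend(p2l_lbas[start:start + read_batch])
        stats["p2l_reads"] += 1
        # Read unit 1: one FTL lookup per buffered LBA.
        while fifo1:
            fifo2.append(ftl.get(fifo1.popleft()))
        # Comparator: one PPA' from FIFO 2 against one PPA from the calculator.
        while fifo2:
            fifo3.append(1 if fifo2.popleft() == next(ppa_iter) else 0)
            if len(fifo3) == write_batch:
                # Write unit: flush a full group of bits in one write.
                bitmap.extend(fifo3)
                fifo3.clear()
                stats["bitmap_writes"] += 1
    if fifo3:
        # Flush the remaining bits once the P2L table has been traversed.
        bitmap.extend(fifo3)
        fifo3.clear()
        stats["bitmap_writes"] += 1
    return bitmap, stats


lbas = list(range(20))
ftl = {lba: 100 + lba for lba in range(0, 20, 2)}   # only even LBAs still map here
bits, stats = generate_bitmap_batched(lbas, ftl, range(100, 120), read_batch=8, write_batch=16)
print(bits, stats)
```

For 20 entries, a read batch of 8 needs three P2L reads, and a write batch of 16 needs two bitmap writes, instead of 20 of each in the unbuffered flow.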
[0242] Figure 12 shows a hardware structure diagram of a bitmap generator provided in another embodiment of this application.
[0243] Compared with the structure shown in Figure 11, the bitmap generator provided in Figure 12 includes only one read unit (read unit 2) and adds a selector. Read unit 2 can read both the P2L table and the FTL table. Replacing two read units with one simplifies the structure of the bitmap generator. The added selector selects whether the address provided to read unit 2 is an address of the P2L table or an address of the FTL table, so that read unit 2 reads either the P2L table or the FTL table.
[0244] In Figure 12, the starting address of the P2L table is first input to the selector, followed by either a PPA indexing the next entry in the P2L table or an LBA indexing an entry in the FTL table. The output of the selector is connected to read unit 2, providing it with either the PPA value for accessing the P2L table or the LBA value for accessing the FTL table. Read unit 2 reads data from memory based on addresses, and it has basically the same structure as read unit 0 or read unit 1 in the previous embodiments. When a PPA value is provided to read unit 2, read unit 2 reads the LBA from the P2L table entry; when an LBA value is provided to read unit 2, read unit 2 reads the PPA' from the FTL table entry.
[0245] Read unit 2 first reads the P2L table based on the PPA selected by the selector. The output of read unit 2 is connected to a FIFO register for caching LBAs (hereinafter FIFO register 1) and a FIFO register for caching PPA' values (hereinafter FIFO register 2). After reading multiple consecutive entries in the P2L table based on the PPA selected by the selector, read unit 2 stores the read LBAs in FIFO register 1.
[0246] The input of the selector is also connected to the output of FIFO register 1, so that LBAs can be obtained from FIFO register 1 and provided to read unit 2. Read unit 2 receives one LBA from FIFO register 1 each time. Using the LBA output by the selector, read unit 2 reads the FTL table one LBA at a time, obtains PPA', and caches the PPA' values in order in FIFO register 2.
[0247] The output of FIFO register 2 is connected to a comparator, and the output of the PPA+1 calculator is also connected to the comparator. Each time, the comparator retrieves a PPA' from FIFO register 2, compares it with the PPA provided by the PPA+1 calculator, and generates a 1-bit value stored in a FIFO register (hereinafter referred to as FIFO register 3) used to buffer this 1-bit value. When the number of values stored in FIFO register 3 meets the condition, the write unit writes the values stored in FIFO register 3 in batches into the valid data bitmap.
[0248] In the structure shown in Figure 12, the PPA+1 calculator takes an initial PPA as input, and the first value it outputs is the initial PPA. The PPA+1 calculator then performs the PPA+1 operation, sequentially outputting multiple PPAs. The bitmap generator shown in Figure 12 has the following workflow:
[0249] (1) After the starting address of the P2L table is provided to the selector, the selector provides the starting address of the P2L table to the read unit 2.
[0250] (2) Read unit 2 reads multiple consecutive (e.g., k) LBAs from the P2L table and caches the multiple LBAs in order in FIFO register 1.
[0251] (3) FIFO register 1 outputs 1 LBA to the selector each time.
[0252] (4) The selector selects one LBA from FIFO register 1 and sends it to read unit 2. Read unit 2 reads PPA' from the FTL table according to the LBA. Repeat steps (3) and (4) above until FIFO register 1 outputs all the buffered LBAs.
[0253] (5) The selector selects and outputs the next address of the P2L table, which is k entries after the address used in the previous pass (on the first pass, k entries after the starting address of the P2L table); the process then returns to step (2), and steps (2) to (5) are repeated.
[0254] In this process, read unit 2 reads PPA' values from the FTL table according to the LBAs and caches them sequentially in FIFO register 2. Each time, the comparator retrieves one PPA' from FIFO register 2 and one PPA from the PPA+1 calculator, compares the two, and generates a 1-bit value that is written to FIFO register 3. Whenever FIFO register 2 contains data, the comparator retrieves a PPA' from it and generates a 1-bit value based on the comparison. When the number of 1-bit values cached in FIFO register 3 meets a certain condition, the write unit writes the values in FIFO register 3 in batches to the valid data bitmap.
[0255] By replacing two read units with one read unit and adding a selector to instruct the read unit to read either the P2L table or the FTL table, the structure of the bitmap generator can be simplified and costs can be saved.
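Steps (1) to (5) above can be modeled with a single read function whose address is chosen by a selector. The sketch below is an illustration only: memory is modeled as a flat Python list, the FTL table is assumed to be indexed directly by LBA from `ftl_base`, and all names (`read_unit2`, `p2l_base`, `ftl_base`, `k`) are hypothetical.

```python
from collections import deque


def generate_bitmap_single_reader(mem, p2l_base, n_entries, ftl_base, ppas, k=4):
    """Model of the single-read-unit generator: one reader serves both tables."""

    def read_unit2(addr, count=1):
        # The only memory reader; the selector decides which address it receives.
        return mem[addr:addr + count]

    fifo1, fifo2, bits = deque(), deque(), []
    ppa_iter = iter(ppas)
    p2l_addr = p2l_base                      # step (1): selector outputs the P2L start address
    end = p2l_base + n_entries
    while p2l_addr < end:
        count = min(k, end - p2l_addr)
        fifo1.extend(read_unit2(p2l_addr, count))   # step (2): batch-read up to k LBAs
        p2l_addr += count                           # step (5): next P2L address, k entries on
        while fifo1:                                # steps (3) and (4): one LBA at a time
            lba = fifo1.popleft()
            fifo2.extend(read_unit2(ftl_base + lba))
        while fifo2:                                # comparator drains FIFO register 2
            bits.append(1 if fifo2.popleft() == next(ppa_iter) else 0)
    return bits


# FTL table (indexed by LBA 0..7) at address 0, P2L table (6 entries) at address 8.
mem = [104, 103, 102, 105, -1, -1, -1, -1,   0, 1, 2, 1, 0, 3]
print(generate_bitmap_single_reader(mem, p2l_base=8, n_entries=6, ftl_base=0, ppas=range(100, 106)))
```

Here LBA 0 and LBA 1 were each written twice, so their first copies (PPA 100 and PPA 101) are reported as invalid.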
[0256] Figure 13A shows a schematic diagram of the DU arrangement in an NVM chip according to another embodiment of this application.
[0257] In Figure 13A, the NVM chip provides 8 LUNs, and a single page stripe spans the 8 LUNs. Each LUN includes 1 plane and multiple physical blocks (such as block 0 and block 1). Each physical block includes multiple physical pages (P+1 physical pages in the figure), and each physical page includes 8 DUs.
[0258] When writing data to the NVM chip, data is first written to DU0-DU7 of physical block 0 of LUN 0 (equivalent to a full physical page). Then, instead of writing data to DU8 of LUN 0, data is written to DU0-DU7 of physical block 0 of LUN 1. Next, data is written sequentially to DU0-DU7 of physical block 0 of LUN 2 through LUN 7. After data has been written to DU0-DU7 of physical block 0 of LUN 7, the process returns to LUN 0 and writes data to DU8-DU15 of physical block 0 of LUN 0 (equivalent to a physical page). Data is then written to DU8-DU15 of physical block 0 of LUN 1 through LUN 7. The arrows in Figure 13A indicate the order in which data is written. Physical block 0 is written before physical block 1. The order in which the individual DUs in physical block 1 are written is the same as the order in which the DUs in physical block 0 are written.
[0259] In the layout of Figure 13A, within the same page stripe, the DUs belonging to different LUNs are numbered identically. For example, in any page stripe, the 8 DUs of LUN0 have the same numbers as the 8 DUs of LUN1, LUN2, ..., LUN7.
[0260] For different page stripes, the DU numbers within the same LUN differ. For two adjacent page stripes, the difference between the DU numbers at the same position in the later and the earlier page stripe is the offset. In Figure 13A, the offset value is 8: the first DU number of page stripe 0 is 0, that of page stripe 1 is 8, and that of page stripe 2 is 16, so the difference between the first DU numbers of adjacent page stripes is 8.
[0261] In Figure 13A, physical blocks 0 provided by LUN0 to LUN7 constitute large block 0, and physical blocks 1 provided by LUN0 to LUN7 constitute large block 1. For a single LUN, the DU arrangement and DU numbering in physical block 0 are the same as in physical block 1; that is, different physical blocks (such as block 0 and block 1) of each LUN have the same DU numbering and DU arrangement. When switching from physical block 0 to physical block 1, the DU number within the LUN is restored to the initial DU number of physical block 0. In Figure 13A, for each LUN, when switching from physical block 0 to physical block 1, the DU number changes from DU 8P+7 to DU0.
[0262] Figure 13B shows a schematic diagram of the P2L table constructed based on the DU arrangement of Figure 13A.
[0263] The P2L table records the LBAs of each DU, with the LBAs of the DUs arranged in the same order as the order in which data is written to each DU. It's important to understand that this consistent order is not mandatory, but it facilitates the simultaneous generation and recording of the P2L sub-table in the NVM chip during data writing. This eliminates the need for subsequent scanning of the NVM chip to construct the P2L sub-table; only a single pass through the NVM chip is required to complete the data writing and P2L table generation.
[0264] In the P2L table shown in Figure 13B, the entry following the one for DU7 of LUN 0 is the entry for DU0 of LUN 1, and the entry following the one for DU7 of LUN 7 is the entry for DU8 of LUN 0. The arrangement of entries in the P2L table determines how the PPA of the next adjacent entry is calculated from the PPA of a given entry. That is, under this P2L table organization, the calculation method of the PPA+1 calculator needs to be adjusted accordingly.
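The entry order just described can be enumerated directly. The following sketch lists the (DU number, LUN number) pair of each P2L entry of one large block in entry order, assuming the Figure 13A parameters (8 LUNs, 8 DUs per physical page, one page stripe per physical page); the function name `p2l_entry_order` is hypothetical.

```python
def p2l_entry_order(num_pages, num_luns=8, dus_per_page=8):
    """Return (du_number, lun_number) for every P2L entry of one large block, in order."""
    order = []
    for page in range(num_pages):            # one page stripe per physical page
        for lun in range(num_luns):          # LUN 0 .. LUN 7 within the stripe
            for i in range(dus_per_page):    # the 8 DUs of this LUN's physical page
                order.append((page * dus_per_page + i, lun))
    return order


order = p2l_entry_order(num_pages=2)
print(order[7], order[8], order[63], order[64])
```

Entries 7 and 8 are DU7 of LUN 0 followed by DU0 of LUN 1, and entries 63 and 64 are DU7 of LUN 7 followed by DU8 of LUN 0, matching the two adjacency cases named in the text.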
[0265] The following first uses examples to introduce the physical addresses of adjacent DUs in Figure 13A.
[0266] If the physical address PPA_a of DU0 of LUN 0 is [block 0][DU0][LUN0], then the physical address PPA_b of the next adjacent DU is [block 0][DU0+1][LUN0] = [block 0][DU1][LUN0].
[0267] If the physical address PPA_a of DU7 of LUN 0 is [block 0][DU7][LUN0], then the physical address PPA_b of the next adjacent DU is [block 0][DU7+1][LUN0] = [block 0][DU0][LUN1].
[0268] Based on the above examples, when calculating the PPA of adjacent DUs, incrementing the DU number in the PPA by 1 can generate a carry, and this carry is applied to the LUN number. Therefore, when writing data to the NVM chip, the number of DUs written to the current LUN before switching to the next LUN determines when incrementing the DU number by 1 generates a carry. In the example shown in Figure 13A, 8 DUs are written to LUN0 before switching to LUN1, so when the DU number is incremented, it carries upon reaching 8, and the carry is applied to the LUN number.
[0269] In the example of Figure 13A, when PPA+1 is calculated for a DU to obtain the PPA of the next adjacent DU, the maximum LUN number is 7. When the LUN number is incremented past 7, it wraps back to 0, and at the same time the offset value is added to the DU number. For example, the physical address of DU7 of LUN 7 is PPA_a = [block 0][DU7][LUN7]. The physical address of the next adjacent DU can be represented as: PPA_b = [block 0][DU7+1][LUN7] = [block 0][DU0][LUN7+1] = [block 0][DU0+offset][LUN0] = [block 0][DU8][LUN0].
[0270] Based on the above example, when the LUN number wraps around, in order to calculate PPA+1, the offset needs to be added to the DU number after +1. The offset value can be obtained according to the organization of DU in the NVM chip's storage space.
[0271] In another example, the physical page write order is physical page 0 first, then physical page 1. In the physical address provided to the NVM chip, 10 bits represent the page number of the physical page, and 4 bits represent the DU number within the physical page. However, the PPA in the P2L table uses 14 bits to represent the DU, instead of the physical page number. For example:
[0272] PPA_a = [block 0][0b0000000000-0111][LUN7] = [block 0][DU7][LUN7]
[0273] PPA_b = [block 0][0b0000000001-0000][LUN0] = [block 0][DU16][LUN0]
[0274] In PPA_b, "0b0000000001-0000" represents the portion of the physical address provided to the NVM chip, which is 16 in decimal. In PPA_a, "0b0000000000-0111" is 7 in decimal, and adding 1 to it yields 8. The difference between this and the target value of 16 is 8. The target value is obtained by using the offset. For example, the offset is calculated based on the target value and the initial value of the DU number, resulting in offset = 16.
[0275] Returning to Figure 13A, after physical block 0 of LUN 0 through LUN 7 has been written, the next DU to be written is DU0 of physical block 1 of LUN 0. That is, PPA_a = [block 0][DU8*P+7][LUN7], where P is the maximum physical page number in the physical block (physical pages are numbered from 0), and PPA_b = [block 1][DU0][LUN0].
[0276] Therefore, after the DU number of the PPA is incremented by 1, both the DU number and the LUN number generate a carry, and the carry generated by the LUN number is applied to the physical block number. Thus, when writing data to the NVM chip, the number of DUs written to the current physical block (physical block 0 of LUN 0 through LUN 7) before switching to the next physical block determines when a carry is generated for the physical block number. After 8*P+8 DUs have been written to each LUN, a carry is applied to the physical block number.
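The four cases worked through above (no carry, DU carry, LUN wrap with offset, and block carry) can be collected into one function. This is a hedged sketch for the Figure 13A layout only: the function name `next_ppa` and its parameters are hypothetical, register D is passed in and returned explicitly, and the end-of-block test assumes the offset equals the number of DUs per physical page.

```python
def next_ppa(block, du, lun, d, pages_per_block, dus_per_page=8, num_luns=8, offset=8):
    """Return (block, du, lun, d) of the DU written immediately after the given one."""
    if du - d + 1 < dus_per_page:
        return block, du + 1, lun, d           # ordinary +1 inside one LUN's page
    if lun + 1 < num_luns:
        return block, d, lun + 1, d            # DU carry: DU returns to D, LUN + 1
    d_next = d + offset                        # LUN wraps to 0: move to the next page stripe
    if d_next < pages_per_block * dus_per_page:
        return block, d_next, 0, d_next
    return block + 1, 0, 0, 0                  # last stripe done: carry into the block number


# With 4 physical pages per block (P = 3), the last DU of a block is DU 8*3+7 = 31.
print(next_ppa(0, 0, 0, 0, pages_per_block=4))    # [block 0][DU1][LUN0]
print(next_ppa(0, 7, 0, 0, pages_per_block=4))    # [block 0][DU0][LUN1]
print(next_ppa(0, 7, 7, 0, pages_per_block=4))    # [block 0][DU8][LUN0]
print(next_ppa(0, 31, 7, 24, pages_per_block=4))  # [block 1][DU0][LUN0]
```

The fourth element of each result is the updated value of register D, which a caller would feed back in on the next call.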
[0277] For the DU layout of Figure 13A and the P2L table organization of Figure 13B, the PPA+1 calculator provided in Figure 9 requires adjustments to its calculation logic, while the hardware structure remains unchanged. The specific adjustments are: 1. the offset value is set to 8; 2. the DU number of incrementer 0 carries when it reaches 8 (determined by the number of DUs that a single LUN contributes to a page stripe), and the LUN number of incrementer 1 carries when it reaches 8 (determined by the number of LUNs), with incrementer 1 wrapping back to 0 after reaching 8. By comparison, for the DU layout of Figure 6A and the P2L table organization of Figure 6B, the PPA+1 calculator is configured as follows: 1. the offset value is 24; 2. the DU number of incrementer 0 carries when it reaches 24, and the LUN number of incrementer 1 carries when it reaches N, with incrementer 1 wrapping back to 0 after reaching N.
[0278] The following describes the calculation process based on the adjusted PPA+1 calculator.
[0279] Incrementer 0 performs base-8 addition. Its initial value is D. Every eight increments relative to D produce a carry, and the value of incrementer 0 returns to D. Until eight increments relative to D have been completed, each increment of incrementer 0 outputs the previous value plus 1. A carry from incrementer 0 indicates that calculation should begin for the PPAs of the DUs provided by the next LUN. For Figure 13A, after the calculation for DU7 of LUN7 is complete, D is updated to D+offset, and the updated D is used to initialize incrementer 0.
[0280] The initial value of incrementer 1 is provided by the LUN number register. Each time a carry signal is received from incrementer 0, incrementer 1 increments its own value by 1. Incrementer 1 performs base-8 addition, where 8 is the number of LUNs providing DUs for the page stripe. Specifically, after the calculation for DU7 of LUN7, incrementer 1 generates a carry and the LUN number returns to 0. A carry from incrementer 1 indicates that calculation should begin for the next page stripe or physical page.
[0281] The initial value of incrementer 2 is provided by the block number register. Each time it receives a carry signal from incrementer 1, incrementer 2 increments its own value by 1. When incrementer 1 generates multiple carry signals and the value of incrementer 2 is updated to the maximum physical page number P in the physical block, the physical block number is updated. Incrementer 2 does not need to generate carry signals.
[0282] The following example illustrates the process of calculating PPA using the PPA+1 calculator.
[0283] For example, a PPA is a 64-bit number, where field 0 (DU number) is 8 bits long and occupies [13:6] of the 64 bits of the PPA, field 1 (LUN number) is 6 bits long and occupies [5:0] of the PPA, and field 2 (physical block number) is the remaining bits [63:14].
[0284] The physical address of a certain DU is PPA_0 = 64'h3212345678. The three hexadecimal digits "567" (bits [15:4] of the 64-bit PPA) contain bits [13:6], which represent the DU number. The PPA+1 operation adds 1 to the DU number and therefore affects these bits. The digit "7" occupies bits [7:4] of the 64-bit PPA; bits [7:6] belong to the DU number field, while bits [5:4] are not part of the DU number field (they belong to the LUN number field). Therefore, DU number + 1 adds 1 at bits [7:6] of "7". The binary representation of "7" is 4'b0111, and adding 1 at bits [7:6] yields 4'b1011, which is hexadecimal B.
[0285] If the DU above is the first DU of a certain page stripe, its DU number is also recorded in register D of the PPA+1 calculator. In the example above, the 8-bit DU number is 8'b01011001. When incrementing the LUN number by 1 generates a carry (that is, when a page stripe switch is required), the DU number register is set to the value of register D plus the offset, and register D is updated to the same value.
[0286] PPA_1 64'h32123456B8=PPA_0[13:6]+1 (the "+" here is for illustrative purposes and has a different meaning from the usual addition operation), where "B" is the result of adding 1 at bits [7:6] of "7".
[0287] PPA_2 64'h32123456F8=PPA_0[13:6]+2=PPA_1[13:6]+1
[0288] PPA_3 64'h3212345738=PPA_0[13:6]+3=PPA_2[13:6]+1
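The field arithmetic in this example can be checked with a few lines of code. The sketch below assumes the layout stated above (LUN number in bits [5:0], DU number in bits [13:6], block number in bits [63:14]); the helper names `split_ppa` and `du_plus` are hypothetical, and the function deliberately ignores carry, wrap, and offset handling, which the PPA+1 calculator adds on top.

```python
LUN_BITS = 6    # field 1: bits [5:0]
DU_BITS = 8     # field 0: bits [13:6]
DU_SHIFT = LUN_BITS
BLOCK_SHIFT = LUN_BITS + DU_BITS   # field 2: bits [63:14]


def split_ppa(ppa):
    """Split a PPA into (block, du, lun) according to the assumed field layout."""
    lun = ppa & ((1 << LUN_BITS) - 1)
    du = (ppa >> DU_SHIFT) & ((1 << DU_BITS) - 1)
    block = ppa >> BLOCK_SHIFT
    return block, du, lun


def du_plus(ppa, n=1):
    """Add n to the DU number field only (no carry or offset handling)."""
    block, du, lun = split_ppa(ppa)
    du = (du + n) & ((1 << DU_BITS) - 1)
    return (block << BLOCK_SHIFT) | (du << DU_SHIFT) | lun


ppa_0 = 0x3212345678
print(bin(split_ppa(ppa_0)[1]))              # DU number field of PPA_0
for n in (1, 2, 3):
    print(f"PPA_{n} = {du_plus(ppa_0, n):#x}")
```

Running this reproduces the three addresses listed above: 0x32123456b8, 0x32123456f8, and 0x3212345738.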
[0289] As another example, the DU number consists of 15 bits, and when writing data to the LUN using the multiplane programming command, one command can write data to 12 DUs. Therefore, in the DU number field, 5 bits represent these 12 DUs, and the remaining 10 bits of the DU number represent the physical page number. When the PPA is incremented by 1, every 12 increments require the DU number to wrap around and generate a carry over to the LUN number. The 5 bits representing the DU represent a decimal range of 0-31 (>12), meaning the range of values that the 5 bits of the DU can represent is not fully utilized. In other words, the DU number requires a carry-over when it reaches 12, which differs from the carry rule of 5-bit binary addition (5-bit binary addition generates a carry when it reaches 32). This situation leads to the use of offsets. Other reasons for using offsets include reflecting changes in physical page numbers.
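The difference between a carry at 12 and the natural carry of a 5-bit field can be shown with a small sketch. It assumes, purely for illustration, that the low 5 bits of the 15-bit DU number hold the DU index within one multiplane programming command and the high 10 bits hold the physical page number; the text does not state which bits are which, and the name `du_increment` is hypothetical.

```python
def du_increment(du_number, dus_per_command=12, index_bits=5):
    """Increment a DU number whose low field wraps at 12 instead of 32.

    Returns (new_du_number, carry_to_lun).
    """
    mask = (1 << index_bits) - 1
    page = du_number >> index_bits     # assumed: high 10 bits are the physical page number
    index = du_number & mask           # assumed: low 5 bits index the DU within one command
    if index + 1 < dus_per_command:
        return (page << index_bits) | (index + 1), False
    # Index 11 is the last DU of the command: wrap to 0 and carry into the LUN number,
    # even though plain 5-bit binary addition would simply produce 12.
    return page << index_bits, True


print(du_increment(5))                # ordinary increment, no carry
print(du_increment((3 << 5) | 11))    # wraps within page 3 and carries to the LUN number
```

Values 12 through 31 of the 5-bit field are never produced, which is why the DU numbers of adjacent page stripes differ by an offset rather than by a plain binary increment.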
[0290] When incrementing the DU number by 1 requires a carry but the LUN number does not need to carry, the DU number wraps around, and the value it wraps back to is recorded in register D. For example, in the DU arrangement shown in Figure 13A, after the calculation for DU7 of LUN0 is complete, incrementing the DU number by 1 requires a carry, while the LUN number (which becomes 1) does not itself need to carry, and the DU number wraps back to 0. If incrementing the DU number by 1 generates a carry and the LUN number also needs to carry, the DU number register is updated with the value of register D plus the offset (e.g., 8), and register D is updated as well; at the same time, the LUN number wraps back. For example, in the DU arrangement shown in Figure 13A, after the calculation for DU7 of LUN7 is complete, incrementing the DU number by 1 generates a carry, and the LUN number (7+1) also needs to carry. The DU number register and register D are updated with the value of register D + offset (e.g., 8), while the LUN number wraps back to 0.
[0291] When a LUN number needs a carry (i.e., wrapback), the physical page number of the next DU being written may change. In some cases, the relationship between the physical page numbers of the previous DU and the next DU is also +1, but in other cases, the relationship between the two physical page numbers is defined by the NVM chip and may not be regular. Therefore, the offset value reflects the impact of the change between the two physical page numbers on the DU number. Thus, it can be understood that the offset value used each time a LUN number carries (i.e., wraps back) can be the same or different.
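The wrap-back rule of paragraphs [0290] and [0291] can be sketched in software as follows. This is only an illustrative model, not the hardware circuit: the function name `next_du_lun` and its parameters are hypothetical, and the default values (8 DUs per LUN per round, 8 LUNs, offset 8) are taken from the Figure 13A example. A fixed offset is assumed here, although the text notes that the offset may differ from one wrap-back to the next.

```python
def next_du_lun(du, lun, reg_d, dus_per_lun=8, num_luns=8, offset=8):
    """Return (du, lun, reg_d) after one PPA+1 step.

    reg_d models register D: the DU value to which the DU number wraps back.
    """
    if du - reg_d < dus_per_lun - 1:
        # No carry out of the DU field: plain +1.
        return du + 1, lun, reg_d
    if lun < num_luns - 1:
        # DU carries, LUN does not: DU wraps back to the value held in D.
        return reg_d, lun + 1, reg_d
    # Both DU and LUN carry: D advances by the offset, LUN wraps back to 0.
    reg_d += offset
    return reg_d, 0, reg_d
```

Passing a different `offset` on each call would model the case in which the NVM chip defines an irregular relationship between consecutive physical page numbers.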
[0292] As can be seen from the examples of Figures 6A, 6B, 13A, and 13B, various factors influence the organization of the P2L table, including, for example, the number of DUs (1, 2, 4, 8, 16, 48, etc.) that can be accessed by a single programming or read operation, the number of planes in a LUN, the number of physical blocks in a plane, the number of physical pages in a physical block, and the position and number of bits of each field (DU number field, LUN number field, physical block number field) in the PPA. Of course, there are other factors that affect the organization of the P2L table. The following examples illustrate how other DU arrangements affect the organization of the P2L table.
[0293] Figure 14A shows a schematic diagram of the write sequence for writing data to an NVM chip according to an embodiment of this application.
[0294] In Figure 14A, the arrows indicate the order in which DUs are written to the NVM chip. The difference from Figure 13A is that after a physical page (physical page 0) of LUN0 is filled, physical page 0 of the next LUN (LUN1) is not written next; instead, the next physical page (physical page 1) of LUN0 is written. After physical page 1 of LUN0 is filled, the other physical pages of LUN0 continue to be written in order of increasing physical page number.
[0295] Taking a physical page containing 48 DUs as an example, when PPA is incremented by 1 to make DU number = 47, and PPA is incremented by 1 again to make DU number = 48, the DU number does not wrap back to 0 and generate a carry to the LUN number field. Instead, the value of the DU number is updated, specifically by adding the offset value to DU number 0.
[0296] After the several physical pages of LUN0 (one vertical column in Figure 14A) are filled, when the PPA is incremented by 1, the DU number wraps back and is updated to 0. At this point, a carry is applied to the LUN number, which is updated to 1. Data is then written to the physical pages of LUN1 in ascending order of physical page number. After the physical pages of LUN1 are filled, when the PPA is incremented by 1, the DU number wraps back and is updated to 0, and the LUN number is updated to 2. Data continues to be written to the physical pages of LUN2 in ascending order of physical page number, and so on until the physical pages of LUN5 are filled.
[0297] The order in which data is written to the NVM chip in Figure 14A corresponds to a different structure of the P2L table. Adjacent entries in the P2L table represent DUs that are adjacent in write order.
[0298] Because the data writing methods of Figure 14A and Figure 13A differ, the organization of the corresponding P2L tables also differs. In the P2L table corresponding to Figure 14A, each LUN can correspond to one P2L sub-table, and six P2L sub-tables constitute the P2L table. Within each P2L sub-table, the DU numbers of its DUs increase sequentially. For two adjacent P2L sub-tables, the LBAs indicated by the last entry of the previous P2L sub-table and the first entry of the next P2L sub-table are contiguous in storage space. When a P2L sub-table switch occurs, the LUN number is updated, and the DU number wraps back (the DU number is updated to 0).
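The Figure 14A write order and its per-LUN P2L sub-tables can be illustrated with a short sketch. It rests on stated assumptions: 6 LUNs and 48 DUs per physical page follow the text, while the value of 4 physical pages per LUN and the names `figure_14a_order` and `p2l_index` are hypothetical and chosen only for illustration.

```python
def figure_14a_order(num_luns=6, dus_per_page=48, pages_per_lun=4):
    """Yield (lun, du_number) in write order: all pages of one LUN, then the next LUN."""
    for lun in range(num_luns):
        # Within one LUN the DU number simply keeps increasing across pages.
        for du in range(dus_per_page * pages_per_lun):
            yield lun, du


def p2l_index(lun, du, dus_per_page=48, pages_per_lun=4):
    """Position of a DU's entry when the per-LUN sub-tables are stored back to back."""
    return lun * dus_per_page * pages_per_lun + du
```

Under these assumptions, entry 47 is the last DU of physical page 0 of LUN0, entry 48 is the first DU of physical page 1 of LUN0, and the first entry of the LUN1 sub-table directly follows the last entry of the LUN0 sub-table.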
[0299] Figure 14B shows a schematic diagram of the write sequence for writing data to an NVM chip according to another embodiment of this application.
[0300] In Figure 14B, LUN0 and LUN1 are not used; therefore, their contents are not included in the P2L table, FTL table, or valid data bitmap. The arrows indicate the order in which DUs are written to the NVM chip. When writing data to the NVM chip, the first DU is located in physical page 0 of LUN2, and after physical page 0 of LUN2 is full, data is written to physical page 0 of LUN3. After physical pages 0 of LUN4 and LUN5 are filled in turn, the process wraps back to LUN2 to continue writing data to its physical page 1.
[0301] The data writing method shown in Figure 14B is similar to that of Figure 13A: after a physical page of one LUN is filled, writing continues with the physical page of the next LUN. Once the same physical page of the multiple LUNs is full, writing switches to the next physical page of the first LUN. The difference is that in Figure 14B, LUN0 and LUN1 do not take part in this data writing and are used for other purposes.
[0302] In Figure 14B, the LUN number field has a value range of 2 to 5. When the LUN number is 5 and a carry occurs, the LUN number wraps back, not to LUN0 but to LUN2.
[0303] Because the data writing method shown in Figure 14B differs from that of Figure 13A, the organization of the corresponding P2L table also differs. In the P2L table corresponding to Figure 14B, the first entry corresponds to DU0 of LUN2: its entry index is PPA96 and its entry value is the LBA corresponding to DU0 of LUN2. The next entry corresponds to DU1 of LUN2: its entry index is PPA97 and its entry value is the LBA corresponding to DU1 of LUN2. Physical page 0 of LUN2 to LUN5 corresponds to one P2L sub-table, while physical pages 1, 2, and 3 of LUN2 to LUN5 correspond to the other three P2L sub-tables. When a P2L sub-table switch occurs, the LUN number is updated from LUN5 to LUN2.
[0304] For the case shown in Figure 14B, the initial PPA value in the PPA+1 calculator is set based on PPA96, and the LUN register is set to 2. The value range of the LUN number is 2 to 5. When the LUN number is 5 and a carry occurs, it wraps back to LUN2.
[0305] Figure 14C shows a schematic diagram of the write sequence for writing data to an NVM chip according to another embodiment of this application.
[0306] In Figure 14C, physical page 0 of LUN0 and LUN1 is not used to carry user-written data. For example, physical page 0 of LUN0 and LUN1 is used for system firmware storage, while the other physical pages of LUN0 and LUN1 are used to carry user-written data. Therefore, the P2L table, FTL table, and valid data bitmap do not include content related to physical page 0 of LUN0 and LUN1.
[0307] The arrows indicate the order in which DUs are written to the NVM chip. When writing data to the NVM chip, the first DU is located in physical page 0 of LUN 2. After physical page 0 of LUN 2 is full, data is then written to physical page 0 of LUN 3. After physical pages 0 of LUN 4 and LUN 5 are filled in sequence, the process loops back to LUN 0 to continue writing data to its physical page 1. Then, data is written to physical pages 1 of LUN 1, LUN 2, LUN 3, LUN 4, and LUN 5. Finally, the process loops back to LUN 0 to continue writing data to its physical page 2.
[0308] Compared with Figure 14B, in Figure 14C only physical page 0 of LUN0 and LUN1 is unused; their other physical pages can be used normally. In the P2L table corresponding to Figure 14C, the first entry corresponds to DU0 of LUN2: its entry index is PPA96 and its entry value is the LBA corresponding to DU0 of LUN2. The next entry corresponds to DU1 of LUN2: its entry index is PPA97 and its entry value is the LBA corresponding to DU1 of LUN2. Physical page 0 of LUN2 to LUN5 corresponds to one P2L sub-table, while physical pages 1, 2, and 3 of LUN0 to LUN5 correspond to the other three P2L sub-tables. When a P2L sub-table switch occurs, the LUN number is updated to LUN0.
[0309] For the case shown in Figure 14C, the initial PPA value in the PPA+1 calculator is set based on PPA96, and the LUN register is set to 2. The value range of the LUN number is 0 to 5. When the LUN number is 5 and a carry occurs, it wraps back to LUN0.
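The difference between the Figure 14B and Figure 14C configurations comes down to the initial LUN value and the wrap-back target. The sketch below lists the order in which (physical page, LUN) pairs are visited; the function name `lun_page_order` and its parameters are hypothetical, and the individual DUs inside each page are omitted for brevity.

```python
def lun_page_order(start_lun, min_lun, max_lun, num_pages):
    """Yield (page, lun) pairs: LUNs advance first, then the physical page."""
    page, lun = 0, start_lun
    while page < num_pages:
        yield page, lun
        if lun == max_lun:
            lun = min_lun   # wrap back to the configured minimum LUN
            page += 1       # the wrap carries into the physical page
        else:
            lun += 1
```

With `start_lun=2, min_lun=2, max_lun=5` the sequence matches the Figure 14B description (LUN5 is followed by LUN2), and with `start_lun=2, min_lun=0, max_lun=5` it matches Figure 14C (LUN5 is followed by LUN0). Only the configuration values change, not the logic.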
[0310] Therefore, the PPA+1 calculator according to the embodiments of this application can be adapted to various different DU organization methods on the NVM chip and their corresponding P2L tables by modifying the register values, without modifying the hardware structure of the PPA+1 calculator.
[0311] With technological advancements, future storage devices may require one or more additional fields in their PPA. The order in which multiple physical pages are written to the NVM chip (which affects the offset) and other factors can influence the configuration of the PPA+1 calculator, including the bit width of each register, its initial value, the carry threshold, the offset value, and the object affected by a carry (that is, which field the carry is added to after it is generated). To support PPA+1 calculations under these various conditions, this application provides a general-purpose PPA+1 calculator that accommodates different usage scenarios (such as some LUNs being used for other purposes and not managed by the FTL table, P2L table, and valid data bitmap) and the physical address calculations of different types of NVM chips.
[0312] Figure 15A shows a hardware block diagram of a general-purpose PPA+1 calculator according to an embodiment of this application.
[0313] As shown in Figure 15A, the PPA+1 calculator includes multiple field registers, each used to store one field of the PPA. Depending on the organization of the P2L table, the PPA includes several fields, and the meaning and bit width of each field may vary. By providing multiple field registers in the PPA+1 calculator to process each field of the PPA, the calculator can be used with different PPA formats. Even if the number of fields in the PPA increases in the future, the PPA+1 calculator already has field registers available to record the additional fields.
[0314] The input of the PPA+1 calculator is PPA_j, and the output is PPA_{j+1}. The PPA input to and the PPA output by the PPA+1 calculator both include multiple fields, denoted Field 0, Field 1, Field 2, Field 3, and Field 4. The fields of the PPA have a first order, which is arbitrary. For example, in the input PPA of Figure 15A, field 4 is listed first, followed by fields 2, 3, 0, and 1. The position of each field in the input PPA_j represents its place in the first order. The PPA+1 calculator of this embodiment places no restriction on the order of the fields of the input PPA_j, so an input PPA_j of any such format can be incremented by 1 using the embodiments of this application.
[0315] Each field of the input PPA_j has a meaning. For example, field 0 indicates the DU number, field 1 indicates the LUN number, field 2 indicates the block number, and fields 4 and 3 indicate other content described in the PPA; or, field 0 indicates the LUN number, field 1 indicates the physical block number, field 4 indicates the DU number, and fields 2 and 3 indicate other content described in the PPA. The PPA+1 calculator of this embodiment places no restriction on the meaning of each field of the input PPA_j, so an input PPA_j of any such format can be incremented by 1 using the embodiments of this application.
[0316] In the example of Figure 15A, the content indicated by field 4 is input to field register A; field 2 indicates the block number, and the physical block number is input to field register B; the content indicated by field 3 is input to field register C; field 0 indicates the DU number, and the DU number is input to field register D; field 1 indicates the LUN number, and the LUN number is input to field register E. Field registers A, B, C, D, and E all belong to the first row of field registers. The arrangement of the five field registers in the first row corresponds to the first order of the fields in the PPA.
[0317] The first row of field registers in the PPA+1 calculator provides multiple field registers to accommodate different PPA formats. The number of fields in the PPA and the bit width of each field can vary. The field registers have large bit widths (e.g., 16 bits or 32 bits) to accommodate a single field from the PPA. This also results in the sum of the bit widths of all field registers being greater than the bit width of the PPA. The positions of the fields within the PPA also vary, with each field retrieved from the PPA by bit field (e.g., [a:b]) and written to its corresponding field register.
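Retrieving each field from the PPA by bit range and loading it into its own, wider field register can be sketched as below. The 24-bit layout in `LAYOUT` and the names `extract_field` and `load_field_registers` are hypothetical and serve only to illustrate the [a:b] bit-field notation; the source does not specify a concrete PPA layout.

```python
def extract_field(ppa, high, low):
    """Return bits [high:low] (inclusive) of ppa as an integer."""
    width = high - low + 1
    return (ppa >> low) & ((1 << width) - 1)


# Hypothetical 24-bit layout, most significant field first:
# block [23:14], DU [13:4], LUN [3:0]
LAYOUT = {"block": (23, 14), "du": (13, 4), "lun": (3, 0)}


def load_field_registers(ppa, layout=LAYOUT):
    """Model the first row of field registers as a name -> value mapping."""
    return {name: extract_field(ppa, hi, lo) for name, (hi, lo) in layout.items()}
```

Changing only the `layout` mapping adapts the sketch to a different PPA format, which mirrors the idea of reconfiguring registers rather than changing the hardware.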
[0318] In Figure 15A, the first row of field registers includes 5 field registers. This number is only an example; more or fewer field registers can be provided. Correspondingly, the first, second, and third rows of field registers have the same number of field registers.
[0319] When performing calculations using the PPA+1 calculator, the carry relationships between the fields of the input PPA give the fields a second order. For example, the DU number field is first, followed by the LUN number field, and the physical block number field is last. The first row of field registers is connected to sequence adjustment circuit 1, which in turn is connected to the second row of field registers. The field registers in the second row are also arranged in the second order. This ensures that the values stored in the second row of field registers reflect the carry relationships between the fields of the input PPA.
[0320] The sequence adjustment circuit 1 adjusts the data provided by the first row of field registers according to a second order based on the carry relationship between each field. Then, it writes the output values of each field register in the first row of field registers into the second row of field registers (field registers 0 to 4) in the adjusted order. This ensures that the second row of field registers outputs, for example, DU number, LUN number, and physical block number, in an order determined by the carry relationship. It should be noted that if the number of field registers in one row provided by the PPA+1 calculator exceeds the number of fields in the PPA, the extra field registers can be shut down (i.e., power is not supplied). Since the PPA+1 calculator provided in this embodiment includes the sequence adjustment circuit 1, which can adjust the data provided by the first row of field registers according to the carry relationship between each field, it can adapt to different PPA formats.
[0321] The second row of field registers (Field Register 0 to Field Register 4) corresponds one-to-one with the incrementers. Therefore, Field Registers 0 to 4 also have a second order. Field Register 0 is connected to Incrementer 0, providing its initial value; Field Register 1 is connected to Incrementer 1, providing its initial value; Field Register 2 is connected to Incrementer 2, providing its initial value; Field Register 3 is connected to Incrementer 3, providing its initial value; and Field Register 4 is connected to Incrementer 4, providing its initial value. Specifically, Incrementer 0 connects to Incrementer 1, carrying to Incrementer 1; Incrementer 1 connects to Incrementer 2, carrying to Incrementer 2; Incrementer 2 connects to Incrementer 3, carrying to Incrementer 3; Incrementer 3 connects to Incrementer 4, carrying to Incrementer 4; and Incrementer 4 does not generate a carry. The arrangement of Incrementers 0 to Incrementer 4 also reflects the carry relationships between the fields of the input PPA. Since the carry relationship between incrementers is determined, regardless of the organization of the P2L table, a carry can be generated at the appropriate time based on the organization of the P2L table, so that the PPA+1 calculator can be used to perform PPA+1 calculation.
[0322] The incrementers of the PPA+1 calculator in Figure 15A receive the values from the corresponding field registers and perform increment operations. Incrementers 1 through 4 increment in response to a received carry signal, while incrementer 0 increments in response to an externally provided "add 1" signal, a 1-bit signal that triggers the operation of the PPA+1 calculator.
[0323] In this embodiment, each incrementer is associated with a set of registers (Min register and Max register). The Min register indicates the minimum value of the increment range for the corresponding incrementer, and the Max register indicates the maximum value of the increment range for the corresponding incrementer. For incrementers 0 to 4, after incrementing to the Max value, upon receiving a carry signal, a wrapback occurs, wrapping back to the Min value, and a carry signal is output. Optionally, the carry signals of some incrementers are not used, or they are used after further logical operations.
[0324] For example, in the scenario shown in Figure 14B, the LUN number ranges from 2 to 5. Therefore, for the corresponding incrementer (e.g., incrementer 1), the minimum value of the increment range is 2 and the maximum value is 5. When incrementer 1 has reached 5 and receives a new carry signal, a wrapback occurs: it wraps back to 2 and carries into incrementer 2. As another example, in the scenario shown in Figure 14C, the initial value of the field register corresponding to the LUN number (e.g., field register 1 in the second row of field registers) is 2, and the value range of the LUN number is 0 to 5. For the corresponding incrementer (e.g., incrementer 1), the minimum value of the increment range is 0 and the maximum value is 5. When the LUN number has reached 5 and a carry occurs, it wraps back to LUN0 and carries into incrementer 2. As a further example, if the block numbers of physical blocks start from 1 rather than 0, the minimum value of the increment range of the corresponding incrementer (e.g., incrementer 2) is 1, not 0.
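The behaviour of a single incrementer with its Min and Max registers can be modelled in a few lines. This is a software sketch under the assumption that one call to `step` corresponds to one received carry (or "add 1") signal; the class name `Incrementer` and its attribute names are illustrative, not taken from the source.

```python
class Incrementer:
    """Counter that starts at an initial value and wraps from Max back to Min."""

    def __init__(self, initial, min_value, max_value):
        self.value = initial          # loaded from the second-row field register
        self.min_value = min_value    # Min register
        self.max_value = max_value    # Max register

    def step(self):
        """Advance by one; return True if a carry is produced."""
        if self.value == self.max_value:
            self.value = self.min_value   # wrapback
            return True
        self.value += 1
        return False
```

For the Figure 14B scenario, `Incrementer(2, 2, 5)` carries on its fourth step and returns to 2. For the Figure 14C scenario, `Incrementer(2, 0, 5)` also carries on its fourth step but returns to 0, which shows why the initial value is kept separate from the Min value.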
[0325] For the incrementer, during the first calculation, the incrementer adds 1 to the value of the corresponding field register from the second row of field registers (instead of the Min value), thus enabling the PPA+1 operation under different P2L table organization methods. For incrementer 0 (such as the incrementer used for DU numbers), at appropriate times (e.g., when incrementer 0 wraps around and incrementer 1 generates a carry), the D value and the incrementer value are updated using D+offset.
[0326] The carry signals of the incrementers are also connected to a wrapback control circuit, which outputs a wrapback indication to incrementer 0, instructing incrementer 0 to update the value of D and the value of incrementer 0 with D+offset at the appropriate time. For example, in the scenarios shown in Figure 6A and Figure 13A, when the incrementer used for the LUN number (e.g., incrementer 1) generates a carry, the wrapback control circuit instructs incrementer 0 (the incrementer used for the DU number) to wrap back, updating the value of register D and of incrementer 0 by adding the offset. In the scenario shown in Figure 14A, the moment when physical page 0 of LUN0 is full and data begins to be written to physical page 1 of LUN0 is the time to update the D value and the incrementer value with D+offset. At this time, incrementer 0 generates a carry signal and provides it to the wrapback control circuit, which instructs incrementer 0 to add the offset.
[0327] It should be noted that before using the general PPA+1 calculator, the wrapback control circuit needs to be configured according to the format of the P2L table to determine which carry signal generates the wrapback indication so that the value of the control register D is updated based on the offset.
[0328] In Figure 15A, the outputs of the five incrementers are connected to sequence adjustment circuit 2, which in turn is connected to the third row of field registers. The third row of field registers includes field registers a through e. Sequence adjustment circuit 2 adjusts the order of the values output by the incrementers corresponding to the second row of field registers and writes them into the third row of field registers (field registers a through e) in the adjusted order. This ensures that field register a stores the same type of data as field register A, field register b the same type as field register B, field register c the same type as field register C, field register d the same type as field register D, and field register e the same type as field register E. In other words, sequence adjustment circuit 2 restores the outputs of incrementers 0 through 4 to their order in the PPA and provides them to field registers a through e. The output values of field registers a through e are then concatenated to obtain the output value PPA_{j+1} of the PPA+1 calculator. PPA_{j+1} has the same format as PPA_j.
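The two sequence adjustment circuits can be viewed as a permutation and its inverse. The sketch below uses the Figure 15A field order (field 4, 2, 3, 0, 1) as the first order and the carry order (field 0 to field 4) as the second order; the names `adjust_1` and `adjust_2` are hypothetical, and real hardware would implement this as configurable multiplexing rather than dictionary lookups.

```python
# Field names in the order they appear in the input PPA (first order), per Figure 15A.
FIRST_ORDER = ["field4", "field2", "field3", "field0", "field1"]
# Carry order (second order): field0 carries into field1, and so on.
SECOND_ORDER = ["field0", "field1", "field2", "field3", "field4"]


def adjust_1(first_row):
    """Reorder first-row register values (A..E) into carry order (registers 0..4)."""
    named = dict(zip(FIRST_ORDER, first_row))
    return [named[name] for name in SECOND_ORDER]


def adjust_2(second_row):
    """Reorder incrementer outputs (0..4) back into PPA order (registers a..e)."""
    named = dict(zip(SECOND_ORDER, second_row))
    return [named[name] for name in FIRST_ORDER]
```

Because `adjust_2` undoes `adjust_1`, the output PPA keeps the same field layout as the input PPA regardless of which carry order was configured.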
[0329] The following example illustrates the implementation process of calculating PPA+1 using a general PPA+1 calculator.
[0330] For the DU layout shown in Figure 6A, field registers B, D, and E of Figure 15A need to be enabled, and field registers A and C are optionally disabled. The format of the PPA input to the PPA+1 calculator is: block number (field 2), DU number (field 0), LUN number (field 1). In the first row of field registers, field register B stores the block number (and is called the block number register), field register D stores the DU number (the DU number register), and field register E stores the LUN number (the LUN number register). Sequence adjustment circuit 1 writes the output values of these three field registers to three field registers in the second row in the adjusted order. The first three field registers in the second row, from left to right, are field register 0, field register 1, and field register 2. Field register 0 stores the DU number from field register D in the first row, field register 1 stores the LUN number from field register E in the first row, and field register 2 stores the block number from field register B in the first row.
[0331] The three field registers in the second row provide the corresponding output values to incrementer 0 for DU number, incrementer 1 for LUN number, and incrementer 2 for physical block number. The minimum value of the increment range corresponding to incrementer 0 is 0 and the maximum value is 23. The minimum value of the increment range corresponding to incrementer 1 is 0 and the maximum value is N-1.
[0332] If the PPA_j input to the PPA+1 calculator is represented as [block0][DU0][LUN0], the output values of the three field registers in the second row are all 0. Incrementer 0 performs an increment operation and outputs 1. Since no carry is generated, incrementer 1 and incrementer 2 both output 0. After adjustment by sequence adjustment circuit 2, the outputs become block number 0, DU number 1, and LUN number 0. Therefore, the PPA_{j+1} output by the PPA+1 calculator is represented as [block0][DU1][LUN0].
[0333] If the PPA_j input to the PPA+1 calculator is represented as [block0][DU23][LUN N-1], the output values of the three field registers in the second row are 23, N-1, and 0 respectively. Incrementer 0 performs an increment operation, generates a carry, and outputs 0. In response to the carry of incrementer 0, incrementer 1 performs an increment operation, outputs 0, and generates a carry. At this time, the value of incrementer 0 needs to be updated to DU0+24 (the value 24 is provided by the offset). Based on the carry signal received from incrementer 0 and the carry signal from incrementer 1, the wrapback control circuit instructs incrementer 0 to update to the value DU0+offset.
[0334] Incrementer 0 outputs DU0+24, incrementer 1 outputs 0, and incrementer 2 outputs 0. After adjustment by sequence adjustment circuit 2, the outputs are 0, DU0+24, 0. Sequence adjustment circuit 2 writes 0 to field register b, DU0+24 to field register d, and 0 to field register e. Therefore, the PPA_{j+1} output by the PPA+1 calculator is represented as [block0][DU0+24][LUN0].
[0335] For the DU layout shown in Figure 13A, the enabled field registers are the same as for Figure 6A.
[0336] If the PPA_j input to the PPA+1 calculator is represented as [block0][DU7][LUN7], the output values of the first three field registers in the second row are 7, 7, and 0 respectively. Incrementer 0 performs an increment operation and generates a carry, and incrementer 1 outputs 0 and generates a carry. At this time, the value of incrementer 0 needs to be updated to DU0+8. Incrementer 0 outputs DU0+8, incrementer 1 outputs 0, and incrementer 2 outputs 0. After adjustment by sequence adjustment circuit 2, the outputs of the three incrementers become 0, DU0+8, and 0. Sequence adjustment circuit 2 writes 0 to field register b, DU0+8 to field register d, and 0 to field register e. Therefore, the PPA_{j+1} output by the PPA+1 calculator is represented as [block0][DU8][LUN0].
[0337] For the DU layout shown in Figure 14A, if the PPA_j input to the PPA+1 calculator is represented as [block0][DU47][LUN0], the output values of the first three field registers in the second row are 47, 0, and 0 respectively. Incrementer 0 performs an increment operation, generates a carry, and outputs 0. At this time, the wrapback control circuit indicates that the value of incrementer 0 is to be updated to DU0+48. The value of incrementer 0 then represents DU48, which is the first DU in physical page 1 of LUN 0, block 0. Incrementer 1 does not need to increment at this time. For this reason, incrementer 0 also outputs, based on its maximum count value (e.g., 48*P, where P is the number of physical pages in the physical block), a signal indicating whether it has counted to the maximum value. This signal is distinct from the carry signal output by incrementer 0. Incrementer 1 increments by 1 in response to the signal indicating that incrementer 0 has counted to the maximum value. The carry signal generated by incrementer 1 is provided to incrementer 2, causing incrementer 2 to increment by 1.
[0338] After adjustment by sequence adjustment circuit 2, the outputs of the three incrementers are 0, DU0+48, 0. Therefore, the PPA_{j+1} output by the PPA+1 calculator is represented as [block0][DU48][LUN0].
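The Figure 14A case uses two distinct signals from incrementer 0: its carry (a physical page is full, so the offset is added) and a separate counted-to-maximum signal (the LUN is full, so the LUN number advances). The sketch below models this under stated assumptions: P = 4 physical pages, 6 LUNs, and a constant offset of 48 are illustrative values, the name `step_14a` is hypothetical, and the block number field is omitted.

```python
def step_14a(du, lun, reg_d=0, dus_per_page=48, pages_per_block=4,
             num_luns=6, offset=48):
    """Return (du, lun, reg_d) after one PPA+1 step for a Figure 14A style layout."""
    max_count = dus_per_page * pages_per_block      # 48*P in the text
    if du + 1 == max_count:
        # "Counted to maximum" signal: the DU number wraps back to 0, LUN advances.
        return 0, (lun + 1) % num_luns, 0
    if du - reg_d == dus_per_page - 1:
        # Carry of incrementer 0: the page is full, so D and the DU value take D + offset.
        reg_d += offset
        return reg_d, lun, reg_d
    # Ordinary increment inside a physical page.
    return du + 1, lun, reg_d
```

Starting from DU47 of LUN0 the sketch yields DU48 of LUN0, as in the worked example, and only the last DU of the last physical page moves on to the next LUN.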
[0339] Optionally, each incrementer of the PPA+1 calculator can output both a carry signal and a signal indicating whether it has counted to the maximum value. Depending on the configuration, each incrementer uses either the carry signal or the counted-to-maximum signal output by the preceding incrementer to determine when to increment itself.
[0340] Alternatively, the PPA+1 calculator does not need to handle changes in the physical block number caused by PPA+1. Instead, whenever the physical block number changes, an external circuit (e.g., the CPU) provides the PPA+1 calculator with an initial PPA value. The PPA+1 calculator then updates the field registers with the initial PPA value and performs the subsequent increment calculation.
[0341] Optionally, the PPA+1 value output by the PPA+1 calculator can also be provided to the input terminal of the PPA+1 calculator, so that the PPA+1 calculator can increment the PPA+1 value by 1 again.
[0342] Figure 15B shows a hardware block diagram of a general-purpose PPA+1 calculator according to another embodiment of this application.
[0343] Compared with the structure shown in Figure 15A, incrementer 0 is changed in Figure 15B: incrementer 0 is now largely the same as incrementers 1 to 4. The operation of accumulating the offset is performed on the field corresponding to incrementer 0 during the second sequence adjustment (in sequence adjustment circuit 2), which simplifies the design of the incrementer.
[0344] In the embodiment corresponding to Figure 15A, register D is updated each time incrementer 0 wraps back in response to the indication of the wrapback control circuit, and incrementer 0 is updated to the new value. This is equivalent to accumulating the offset value each time such a wrapback is performed, and the offset value may be different each time. In the structure shown in Figure 15B, incrementer 0 returns to the Min value every time it carries (including on wrapback). In sequence adjustment circuit 2, the value of incrementer 0 is added to the value of D and output to the corresponding field register. When incrementer 0 undergoes a wrapback associated with the indication of the wrapback control circuit, D is also updated to D+offset.
[0345] For example, for the layout shown in Figure 6A, if the PPA_j input to the PPA+1 calculator is represented as [block0][DU23][LUN N-1], the output values of the first three field registers in the second row are 23, N-1, and 0 respectively. Incrementer 0 performs an increment operation, generates a carry, and outputs 0. Incrementer 1 outputs 0 and generates a carry. After adjustment by sequence adjustment circuit 2, the output result is 0, 0, 0. At this point, D (with a value of 0) needs to be updated to D+24. The PPA_{j+1} output by the PPA+1 calculator is then represented as [block0][DU0+24][LUN0].
[0346] For the layout shown in Figure 13A, if the PPA_j input to the PPA+1 calculator is represented as [block0][DU7][LUN7], the output values of the first three field registers in the second row are 7, 7, and 0 respectively. Incrementer 0 performs an increment operation, generates a carry, and outputs 0. Incrementer 1 outputs 0 and generates a carry. After adjustment by sequence adjustment circuit 2, the output result is 0, 0, 0. At this point, D (with a value of 0) needs to be updated to D+8. The PPA_{j+1} output by the PPA+1 calculator is then represented as [block0][DU8][LUN0].
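The Figure 15B variant can be sketched as follows: incrementer 0 is an ordinary counter that always returns to its Min value (assumed to be 0 here), and the output stage adds register D to it. The function name `step_15b` and its parameter names are hypothetical; `du_span` denotes the number of DUs per LUN before the DU field carries, and the block number field is omitted.

```python
def step_15b(inc0, lun, reg_d, du_span, num_luns, offset):
    """Return (inc0, lun, reg_d, du_out) after one PPA+1 step.

    inc0 is the plain incrementer 0 (range 0..du_span-1); du_out = reg_d + inc0
    is the DU number formed at the output stage.
    """
    carry0 = inc0 == du_span - 1
    inc0 = 0 if carry0 else inc0 + 1           # always returns to Min (0) on carry
    if carry0:
        carry1 = lun == num_luns - 1
        lun = 0 if carry1 else lun + 1
        if carry1:
            reg_d += offset                    # wrapback indication: D <- D + offset
    return inc0, lun, reg_d, reg_d + inc0
```

With `du_span=24, offset=24` and an example value of N = 4 LUNs, the input DU23 on the last LUN produces DU24 on LUN0; with `du_span=8, num_luns=8, offset=8`, DU7 on LUN7 produces DU8 on LUN0. Both agree with the worked examples above.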
[0347] Although preferred embodiments of this application have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of this application. Clearly, those skilled in the art can make various alterations and variations to this application without departing from its spirit and scope. Thus, if such modifications and variations fall within the scope of the claims of this application and their equivalents, this application also intends to include such modifications and variations.
Claims
1. A valid data bitmap generation method, characterized in that the method comprises: obtaining the LBA value of a DU recorded in an entry of a P2L table, wherein the P2L table includes multiple consecutively arranged entries, the first physical address corresponding to each entry of the P2L table represents the physical address, in the NVM chip, of the DU associated with that entry, and the P2L table is constructed based on the arrangement of the DUs in the NVM chip; obtaining a second physical address of the DU by looking up the FTL table based on the LBA value of the DU obtained from the P2L table; generating, based on a comparison between the first physical address and the second physical address of the DU, a bit value indicating whether the data recorded by the DU is valid; and generating the valid data bitmap based on the bit values corresponding to the entries of the P2L table.
2. The method of claim 1, wherein: the NVM chip includes N LUNs, and physical pages from the N LUNs form at least one page stripe, each physical page including J DUs; among the P2L table entries corresponding to a single page stripe, each LUN provides a set of DUs for that page stripe, the set corresponding to a group of entries in the P2L table; the DUs in the set that a LUN provides for the page stripe have different DU numbers, the DU number being a component of the first physical address; and across the different LUNs of the same page stripe, the sets of DUs provided by the LUNs have the same DU numbers.
3. The method of claim 2, wherein: the P2L table uses a first table organization; the P2L table includes at least one P2L sub-table, and the P2L sub-tables correspond one-to-one with the page stripes; the number of entries in a P2L sub-table is equal to the number of DUs included in the corresponding page stripe; within a P2L sub-table, the P2L table entries of the group of DUs from the same LUN are arranged consecutively, and, for adjacent LUNs, the P2L table entry of the last DU from the preceding LUN and the P2L table entry of the first DU from the following LUN are arranged consecutively; within the P2L table, the last P2L table entry of the preceding P2L sub-table and the first P2L table entry of the following P2L sub-table are arranged consecutively, consecutively arranged P2L table entries being contiguous in the storage space that stores the P2L table, and the number of entries between the entries of the first DU of two adjacent P2L sub-tables being a preset value; or, alternatively, a specified interval exists between the last P2L table entry of the preceding P2L sub-table and the first P2L table entry of the following P2L sub-table.
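For illustration only, the contiguous variant of the first table organization implies a simple index calculation, sketched below in Python. The function name `p2l_entry_index` and its parameters are hypothetical; `du_offset` denotes the position of a DU within the group its LUN provides for the page stripe. The alternative in which a specified interval separates adjacent sub-tables is not covered.

```python
def p2l_entry_index(stripe: int, lun: int, du_offset: int,
                    num_luns: int, dus_per_lun: int) -> int:
    """Index of a DU's entry when sub-tables, LUN groups, and DUs are packed contiguously."""
    if not (0 <= lun < num_luns and 0 <= du_offset < dus_per_lun):
        raise ValueError("lun or du_offset out of range")
    entries_per_subtable = num_luns * dus_per_lun  # one sub-table per page stripe
    return stripe * entries_per_subtable + lun * dus_per_lun + du_offset


# With 8 LUNs and 8 DUs per LUN, the last entry of stripe 0 is followed
# directly by the first entry of stripe 1.
print(p2l_entry_index(0, 7, 7, num_luns=8, dus_per_lun=8))
print(p2l_entry_index(1, 0, 0, num_luns=8, dus_per_lun=8))
```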
4. The method of claim 3, wherein: each LUN includes K Planes, where K is greater than or equal to 2; in a single page stripe, each Plane of a LUN provides M physical pages for that page stripe, where M is greater than or equal to 1; a single LUN provides a set of Q DUs for that page stripe, where Q = K*M*J; and the Q DUs from a single LUN are arranged within that LUN according to a first arrangement based on the DU number.
5. The method according to any one of claims 1 to 4, characterized by further comprising: generating the P2L table by scanning the DUs arranged in the NVM chip; or reading at least one P2L sub-table from a specified location on the NVM chip and generating the P2L table based on the at least one P2L sub-table.
6. The method according to any one of claims 1 to 5, characterized in that: the entries of the P2L table are read sequentially to obtain the LBA value of each entry, and the first physical address of the DU corresponding to the next entry is obtained from the first physical address of the DU corresponding to the current entry; the FTL table is accessed based on the LBA value of the entry to obtain the second physical address corresponding to that LBA value; and the second physical address is compared with the first physical address to obtain the comparison result for the first physical address and the second physical address of the DU.
7. A bitmap generator, characterized by comprising: a PPA+1 calculator, a first read unit, a second read unit, a comparator, and a write unit; wherein the output of the PPA+1 calculator is connected to an input of the comparator, the output of the first read unit is connected to the input of the second read unit, the output of the second read unit is connected to an input of the comparator, and the output of the comparator is connected to the write unit; the first read unit receives the first address of a P2L table entry, reads the P2L table entry according to the first address, and provides the LBA value of the DU recorded in the read P2L table entry to the second read unit, and the entry address of the entry that follows the P2L table entry read by the first read unit is in turn provided to the first read unit; the second read unit reads an FTL table entry from the FTL table according to the received LBA value and provides the second physical address recorded in the read FTL table entry to the comparator; the PPA+1 calculator receives a first physical address, increments the first physical address by 1, and outputs the updated first physical address, which is provided both to the comparator and to the input of the PPA+1 calculator; the comparator compares the first physical address output by the PPA+1 calculator with the second physical address output by the second read unit to generate a bit value indicating whether the data recorded by the DU is valid; and the write unit writes the bit value output by the comparator into the valid data bitmap.
8. A bitmap generator, characterized by comprising: a PPA+1 calculator, a first read unit, a first FIFO, a second read unit, a second FIFO, a comparator, a third FIFO, and a write unit; wherein the output of the PPA+1 calculator is connected to an input of the comparator, the output of the first read unit is connected to the input of the first FIFO, the output of the first FIFO is connected to the input of the second read unit, the output of the second read unit is connected to the input of the second FIFO, the output of the second FIFO is connected to an input of the comparator, the output of the comparator is connected to the input of the third FIFO, and the output of the third FIFO is connected to the write unit; the first read unit receives the first address of a P2L table entry, reads multiple consecutive entries from the P2L table according to the first address, and stores the multiple LBA values obtained from those entries consecutively in the first FIFO, and the entry address of the entry that follows the P2L table entries read by the first read unit is in turn provided to the first read unit; the second read unit reads the FTL table based on the multiple LBA values obtained from the first FIFO and stores the multiple second physical addresses thus obtained in the second FIFO; the PPA+1 calculator receives a first physical address, increments the first physical address by 1, and outputs the updated first physical address, which is provided both to the comparator and to the input of the PPA+1 calculator; the comparator compares the second physical address output by the second FIFO with the first physical address output by the PPA+1 calculator, generates a bit value indicating whether the data recorded by the DU is valid, and stores the bit value in the third FIFO; and when the number of bit values stored in the third FIFO meets a condition, the write unit writes multiple bit values into the valid data bitmap in a batch.
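For illustration only, the data flow of claim 8 can be approximated by a sequential software model. The Python sketch below is not the claimed hardware: the stages run one after another rather than concurrently, the burst length of 4 entries and the flush threshold of 8 bits are assumed values, bits are packed least-significant-bit first, physical addresses are treated as consecutive integers, and `ppa` is taken to be the first physical address of the entry currently being compared. All names (`run_pipeline`, `BATCH`, `_flush`) are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List

BATCH = 8  # assumed condition: flush the third FIFO once it holds 8 bit values


def _flush(fifo3: Deque[int], bitmap: bytearray) -> None:
    """Write unit: pack the queued bits (LSB first) into one byte of the bitmap."""
    byte = 0
    for k in range(len(fifo3)):
        byte |= fifo3.popleft() << k
    bitmap.append(byte)


def run_pipeline(p2l: List[int], ftl: Dict[int, int],
                 first_ppa: int, burst: int = 4) -> bytearray:
    fifo1: Deque[int] = deque()   # LBA values from the first read unit
    fifo2: Deque = deque()        # second physical addresses from the second read unit
    fifo3: Deque[int] = deque()   # bit values from the comparator
    bitmap = bytearray()
    ppa = first_ppa               # first physical address of the current entry
    addr = 0                      # address (index) of the next P2L entry to read

    while addr < len(p2l):
        fifo1.extend(p2l[addr:addr + burst])   # first read unit: burst read
        addr += burst                          # successor entry address fed back
        while fifo1:                           # second read unit: FTL lookups
            fifo2.append(ftl.get(fifo1.popleft()))
        while fifo2:                           # comparator
            fifo3.append(1 if fifo2.popleft() == ppa else 0)
            ppa += 1                           # PPA+1 calculator (linear model)
            if len(fifo3) == BATCH:
                _flush(fifo3, bitmap)
    if fifo3:                                  # flush a final partial batch
        _flush(fifo3, bitmap)
    return bitmap


p2l_table = list(range(10))                            # entry i records LBA i
ftl_table = {lba: 100 + lba for lba in range(0, 10, 2)}  # only even LBAs still mapped here
print(run_pipeline(p2l_table, ftl_table, first_ppa=100).hex())
```

Batching the bit values means the bitmap is written one byte at a time rather than one bit at a time, which is the purpose the third FIFO serves in this model.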
9. A bitmap generator, characterized by comprising: a PPA+1 calculator, a third read unit, a first FIFO, a second FIFO, a selector, a comparator, a third FIFO, and a write unit; wherein the output of the PPA+1 calculator is connected to an input of the comparator, the output of the selector is connected to the input of the third read unit, the output of the third read unit is connected to the inputs of the first FIFO and the second FIFO, the output of the first FIFO is connected to an input of the selector, the output of the second FIFO is connected to an input of the comparator, the output of the comparator is connected to the input of the third FIFO, and the output of the third FIFO is connected to the write unit; the selector selects the first address of a P2L table entry and provides it to the third read unit, and the third read unit reads the P2L table entry according to the first address and provides the LBA value of the DU recorded in the read P2L table entry to the first FIFO; the first FIFO provides the LBA value to the selector, the selector selects the LBA value and provides it to the third read unit, and the third read unit reads, based on the LBA value, the second physical address recorded in the FTL table entry and stores it in the second FIFO; the PPA+1 calculator receives a first physical address, increments the first physical address by 1, and outputs the updated first physical address, which is provided both to the comparator and to the input of the PPA+1 calculator; the comparator compares the second physical address output by the second FIFO with the first physical address output by the PPA+1 calculator, generates a bit value indicating whether the data recorded by the DU is valid, and stores the bit value in the third FIFO; when the number of bit values stored in the third FIFO meets a condition, the write unit writes multiple bit values into the valid data bitmap in a batch; and the entry address of the entry that follows the P2L table entry read by the third read unit is in turn provided to the selector.
10. The bitmap generator according to claim 8 or 9, characterized in that: the first read unit or the third read unit reads multiple entries from the P2L table in a single read; and the second read unit or the third read unit accesses the FTL table sequentially, based on the LBA recorded in each of the multiple entries, to obtain the second physical address recorded in the corresponding FTL table entry.