Read threshold voltage estimation system and method for parameterized PV level modeling

By using parameterized PV level modeling with deep neural networks, the read error caused by the distortion of the read threshold voltage distribution in the memory system is solved, enabling more efficient and accurate read operations and improving the performance of the memory system.

CN115691624B · Active · Publication Date: 2026-06-30 · SK HYNIX INC

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: SK HYNIX INC
Filing Date: 2022-03-02
Publication Date: 2026-06-30

AI Technical Summary

Technical Problem

Existing technologies for determining the read threshold voltage in a memory system suffer from distorted or overlapping threshold voltage distributions, leading to read errors, especially after programming and erasing cycles. This makes it difficult to accurately distinguish the state of memory cells, resulting in read failures.

Method used

A parameterized PV level model based on a deep neural network is adopted. A set of skew normal probability distribution parameters is estimated for each PV level of the memory cells, and the optimal read threshold voltage is determined as the voltage at which the probability density function values of the first PV level and the second PV level are the same, thereby improving read accuracy.

Benefits of technology

It effectively solves the read error problem, improves the read accuracy and reliability of the memory system, reduces the number of read retries, and enhances the stability and efficiency of data storage.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention provides a read threshold voltage estimation system and method for parameterized PV level modeling. An embodiment provides a scheme using a deep neural network (DNN) with reduced processing times to estimate the optimal read threshold voltage. A controller receives a first programming voltage (PV) level and a second PV level associated with a read operation for a cell. The controller estimates a first probability distribution parameter set and a second probability distribution parameter set representing skewed normal distributions of the first and second PV levels, respectively. The controller estimates the optimal read threshold voltage based on the first and second probability distribution parameter sets. The optimal read threshold voltage is a read threshold voltage such that the first probability density function (PDF) value of the skewed normal distribution of the first PV level is the same as the second PDF value of the skewed normal distribution of the second PV level.

Description

Technical Field

[0001] Embodiments of this disclosure relate to a scheme for determining the optimal read threshold voltage in a memory system.

Background Technology

[0002] The computing environment paradigm has evolved into ubiquitous computing systems that can be used anytime, anywhere. Consequently, the use of portable electronic devices such as mobile phones, digital cameras, and laptops has increased rapidly. These portable electronic devices typically use memory systems with memory devices (i.e., data storage devices). Data storage devices serve as either the main memory or auxiliary memory devices in portable electronic devices.

[0003] Because memory devices have no moving parts, memory systems using memory devices offer excellent stability, durability, high data access speeds, and low power consumption. Examples of memory systems with these advantages include Universal Serial Bus (USB) memory devices, memory cards with various interfaces such as Universal Flash Storage (UFS), and Solid State Drives (SSDs). The optimal read threshold voltage can be determined using various schemes within a memory system.

Summary of the Invention

[0004] Aspects of the present invention include a system and method for estimating the optimal read threshold voltage using parameterized PV level modeling based on a deep neural network (DNN) with reduced processing times.

[0005] In one aspect of the invention, a memory system includes a memory device having a plurality of cells and a controller. The controller receives a first programming voltage (PV) level and a second programming voltage (PV) level associated with a read operation for the plurality of cells; estimates a first probability distribution parameter set and a second probability distribution parameter set representing skewed normal distributions of the first PV level and the second PV level, respectively; and estimates an optimal read threshold voltage based on the first probability distribution parameter set and the second probability distribution parameter set, the optimal read threshold voltage being a read threshold voltage such that the first probability density function (PDF) value of the skewed normal distribution of the first PV level is the same as the second PDF value of the skewed normal distribution of the second PV level.

[0006] In another aspect of the invention, a method of operating a memory system including a memory device having a plurality of cells and a controller includes: receiving a first programming voltage (PV) level and a second PV level associated with a read operation for the plurality of cells; estimating a first probability distribution parameter set and a second probability distribution parameter set representing skewed normal distributions of the first PV level and the second PV level, respectively; and estimating an optimal read threshold voltage based on the first probability distribution parameter set and the second probability distribution parameter set, the optimal read threshold voltage being a read threshold voltage such that the first probability density function (PDF) value of the skewed normal distribution of the first PV level is the same as the second PDF value of the skewed normal distribution of the second PV level.
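As a rough illustration of the equal-PDF criterion described above, the following Python sketch evaluates two skew normal PDFs and searches for the voltage at which they match. The (location, scale, shape) parameterization, the function names, the numeric values, and the use of bisection are assumptions made for illustration only. In the disclosure the parameter sets are estimated by a DNN, which is not modeled here.

```python
import math


def skew_normal_pdf(v, loc, scale, shape):
    """PDF of a skew normal distribution at voltage v."""
    z = (v - loc) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal PDF
    cdf = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))  # standard normal CDF at shape*z
    return 2.0 / scale * phi * cdf


def estimate_read_threshold(params_lo, params_hi, iters=60):
    """Bisect between the two locations for the voltage where both PDFs are equal.

    params_lo and params_hi are hypothetical (loc, scale, shape) tuples for the
    lower and upper PV levels.
    """
    def diff(v):
        return skew_normal_pdf(v, *params_lo) - skew_normal_pdf(v, *params_hi)

    lo, hi = params_lo[0], params_hi[0]
    if not (diff(lo) > 0.0 > diff(hi)):
        raise ValueError("no sign change between the two locations")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0.0:
            lo = mid   # lower-level PDF still dominates, so the crossing is to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Illustrative parameter sets; real values would come from the estimator.
    pv1 = (1.0, 0.3, 2.0)
    pv2 = (2.0, 0.3, -2.0)
    vt = estimate_read_threshold(pv1, pv2)
    print(vt, skew_normal_pdf(vt, *pv1), skew_normal_pdf(vt, *pv2))
```

Bisection is used here only because it is simple and robust when the two PDFs cross once between the two locations; any one-dimensional root finder would serve the same purpose.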

[0007] Additional aspects of the invention will become apparent from the following description.

Description of the Drawings

[0008] Figure 1 is a block diagram illustrating a data processing system according to an embodiment of the present invention.

[0009] Figure 2 is a block diagram illustrating a memory system according to an embodiment of the present invention.

[0010] Figure 3 is a circuit diagram illustrating a memory block of a memory device according to an embodiment of the present invention.

[0011] Figure 4 is a diagram illustrating the state distribution of different types of cells in a memory device according to an embodiment of the present invention.

[0012] Figure 5A is a diagram illustrating an example of encoding for a multi-level cell (MLC) according to an embodiment of the present invention.

[0013] Figure 5B is a diagram illustrating the state distribution of pages in a multi-level cell (MLC) according to an embodiment of the present invention.

[0014] Figure 6A is a diagram illustrating an example of Gray coding for a triple-level cell (TLC) according to an embodiment of the present invention.

[0015] Figure 6B is a diagram illustrating the state distribution of pages in a triple-level cell (TLC) according to an embodiment of the present invention.

[0016] Figure 7 is a flowchart illustrating the error recovery algorithm in a memory system according to an embodiment of the present invention.

[0017] Figure 8 is a diagram illustrating the operation of estimating the optimal read threshold voltage using various eBoost algorithms according to embodiments of the present invention.

[0018] Figure 9A and Figure 9B show the distribution of the read threshold voltage (Vt) of memory cells according to an embodiment of the present invention.

[0019] Figure 10 is a diagram illustrating a memory system according to an embodiment of the present invention.

[0020] Figure 11 is a diagram illustrating a neural network according to an embodiment of the present invention.

[0021] Figure 12 is a diagram illustrating a training component according to an embodiment of the present invention.

[0022] Figure 13 is a diagram illustrating an inference component according to an embodiment of the present invention.

[0023] Figure 14 illustrates the operation for determining the optimal read threshold voltage according to an embodiment of the present invention.

[0024] Figure 15 is a diagram illustrating an optimal read threshold determination device according to an embodiment of the present invention.

[0025] Figure 16 is a graph showing an example optimal read threshold voltage estimated by a voltage determiner according to an embodiment of the present invention.

Detailed Description

[0026] Various embodiments of the invention are described in more detail below with reference to the accompanying drawings. However, the invention may be implemented in different forms and therefore should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided to make this disclosure thorough and complete, and to fully convey the scope of the invention to those skilled in the art. Furthermore, references herein to “embodiment,” “another embodiment,” etc., are not necessarily directed to only one embodiment, and different references to any such phrases are not necessarily directed to the same embodiment. The term “embodiment” as used herein does not necessarily refer to all embodiments. Throughout this disclosure, the same reference numerals refer to the same parts in the drawings and embodiments of the invention.

[0027] This invention can be implemented in many ways, including: processes; apparatus; systems; computer program products implemented on computer-readable storage media; and/or processors, such as processors adapted to execute instructions stored on and/or provided by memory coupled to the processor. In this specification, these embodiments or any other forms that the invention may take may be referred to as techniques. Generally, the order of operation of the disclosed processes can be varied within the scope of this invention. Unless otherwise stated, components described as suitable for performing a task, such as processors or memory, may be implemented as general-purpose components temporarily configured to perform that task at a given time or manufactured as dedicated components to perform that task. As used herein, the term "processor" and similar terms refer to one or more devices, circuits, and/or processing cores suitable for processing data such as computer program instructions.

[0028] The methods, processes, and/or operations described herein can be executed by code or instructions to be run by a computer, processor, controller, or other signal processing device. The computer, processor, controller, or other signal processing device can be those described herein or those other than the elements described herein. Because the algorithm (or operation of the computer, processor, controller, or other signal processing device) upon which the method is based is described in detail, the code or instructions for implementing the methods can transform a computer, processor, controller, or other signal processing device into a dedicated processor for executing the methods herein.

[0029] When implemented at least in part in software, controllers, processors, devices, modules, units, multiplexers, generators, logic, interfaces, decoders, drivers, and other signal generation and signal processing functions may include, for example, memory or other storage devices for storing code or instructions to be executed by, for example, a computer, processor, microprocessor, controller, or other signal processing device.

[0030] The following provides a detailed description of embodiments of the invention, along with accompanying drawings illustrating various aspects of the invention. The invention is described in conjunction with these embodiments, but is not limited to any particular embodiment. The scope of the invention is defined only by the claims. The invention encompasses many alternatives, modifications, and equivalents within the scope of the claims. Numerous specific details are set forth in the following description to provide a thorough understanding of the invention. These details are provided for illustrative purposes; the invention may be practiced without some or all of these specific details. For clarity, technical materials known in the art related to the invention have not been described in detail so as not to unnecessarily obscure the invention.

[0031] Figure 1 is a block diagram illustrating a data processing system 2 according to an embodiment of the present invention.

[0032] Referring to Figure 1, the data processing system 2 may include a host device 5 and a memory system 10. The memory system 10 may receive requests from the host device 5 and operate in response to the received requests. For example, the memory system 10 may store data to be accessed by the host device 5.

[0033] The host device 5 can be implemented using any of a variety of electronic devices. In various embodiments, the host device 5 may include electronic devices such as: a desktop computer, a workstation, a 3D television, a smart television, a digital audio recorder, a digital audio player, a digital picture recorder, a digital picture player, and/or a digital video recorder and a digital video player. In various embodiments, the host device 5 may include portable electronic devices such as: a mobile phone, a smartphone, an e-book reader, an MP3 player, a portable multimedia player (PMP), and/or a portable game console.

[0034] The memory system 10 can be implemented using any of a variety of storage devices such as solid-state drives (SSDs) and memory cards. In various embodiments, the memory system 10 can be configured as a component of a variety of electronic devices such as: computers, ultra-mobile personal computers (PCs) (UMPCs), workstations, netbook computers, personal digital assistants (PDAs), portable computers, network tablet PCs, wireless phones, mobile phones, smartphones, e-book readers, portable multimedia players (PMPs), portable gaming devices, navigation devices, black boxes, digital cameras, digital multimedia broadcasting (DMB) players, 3D televisions, smart televisions, digital audio recorders, digital audio players, digital picture recorders, digital picture players, digital video recorders, digital video players, data center storage devices, devices capable of receiving and transmitting information in a wireless environment, radio frequency identification (RFID) devices, and a variety of electronic devices for home networks, a variety of electronic devices for computer networks, a variety of electronic devices for telematics networks, or a variety of components for computing systems.

[0035] The memory system 10 may include a controller 100 and a memory device 200. The controller 100 can control all operations of the memory device 200.

[0036] The memory device 200 can perform one or more erase, program, and read operations under the control of the controller 100. The memory device 200 can receive commands (CMD), addresses (ADDR), and data (DATA) via input/output lines. The memory device 200 can receive power (PWR) via power lines and control signals (CTRL) via control lines. Depending on the design and configuration of the memory system 10, the control signal CTRL may include command latch enable signals, address latch enable signals, chip enable signals, write enable signals, read enable signals, and other operation signals.

[0037] The controller 100 and the memory device 200 can be integrated into a single semiconductor device such as a solid-state drive (SSD). The SSD may include a storage device for storing data therein. When the memory system 10 is used in an SSD, the operating speed of a host device (e.g., the host device 5 of Figure 1) coupled to the memory system 10 can be significantly improved.

[0038] The controller 100 and memory device 200 can be integrated into a single semiconductor device such as a memory card. For example, the controller 100 and memory device 200 can be integrated to configure PC cards, compact flash (CF) cards, smart media (SM) cards, memory sticks, multimedia cards (MMC), reduced-size multimedia cards (RS-MMC), micro-sized versions of MMC (micro MMC), secure digital cards (SD cards), mini secure digital cards (mini SD cards), micro secure digital cards (micro SD cards), high-capacity secure digital cards (SDHC), and/or universal flash storage (UFS).

[0039] Figure 2 is a block diagram illustrating a memory system according to an embodiment of the present invention. For example, the memory system of Figure 2 may depict the memory system 10 shown in Figure 1.

[0040] Referring to Figure 2, the memory system 10 may include a controller 100 and a memory device 200. The memory system 10 can operate in response to a request from a host device (e.g., the host device 5 of Figure 1) and, in particular, store data to be accessed by the host device.

[0041] The memory device 200 can store data to be accessed by the host device.

[0042] The memory device 200 may be implemented using volatile memory devices such as dynamic random access memory (DRAM) and/or static random access memory (SRAM), or non-volatile memory devices such as read-only memory (ROM), mask ROM (MROM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), ferroelectric random access memory (FRAM), phase change RAM (PRAM), magnetoresistive RAM (MRAM), and/or resistive RAM (RRAM).

[0043] The controller 100 can control the storage of data in the memory device 200. For example, the controller 100 can control the memory device 200 in response to a request from the host device. The controller 100 can provide data read from the memory device 200 to the host device, and can store data provided from the host device into the memory device 200.

[0044] The controller 100 may include a storage device 110, a control component 120, an error correction code (ECC) component 130, a host interface (I/F) 140, and a memory interface (I/F) 150 connected via a bus 160. The control component 120 may be implemented as a processor such as a central processing unit (CPU).

[0045] Storage device 110 can be used as working memory for memory system 10 and controller 100, and stores data for driving memory system 10 and controller 100. When controller 100 controls the operation of memory device 200, storage device 110 can store data used by controller 100 and memory device 200 for operations such as read operations, write operations, programming operations and erase operations.

[0046] Storage device 110 may be implemented using volatile memory such as static random access memory (SRAM) or dynamic random access memory (DRAM). As described above, storage device 110 can store data used by the host device in the memory device 200 for read and write operations. To store data, storage device 110 may include program memory, data memory, a write buffer, a read buffer, a mapping buffer, and the like.

[0047] Control component 120 can control the general operation of memory system 10 and control write or read operations on memory device 200 in response to write or read requests from the host device. Control component 120 can drive firmware called a Flash Translation Layer (FTL) to control the general operation of memory system 10. For example, the FTL can perform operations such as logical-to-physical (L2P) mapping, wear leveling, garbage collection, and/or bad block handling. L2P mapping is known as logical block addressing (LBA).
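To make the L2P mapping idea concrete, here is a minimal Python sketch of a lookup table from logical block addresses to physical page addresses. The class name, the append-only allocator, and the stale-page list are hypothetical and chosen only for illustration; the source does not describe how the FTL implements its mapping.

```python
class L2PTable:
    """Toy logical-to-physical map with out-of-place writes (illustrative only)."""

    def __init__(self):
        self.mapping = {}      # logical block address -> physical page address
        self.next_free = 0     # hypothetical append-only page allocator
        self.stale = []        # superseded physical pages, candidates for garbage collection

    def write(self, lba):
        """Map lba to a fresh physical page and mark any older copy as stale."""
        if lba in self.mapping:
            self.stale.append(self.mapping[lba])
        ppa = self.next_free
        self.next_free += 1
        self.mapping[lba] = ppa
        return ppa

    def lookup(self, lba):
        """Return the current physical page for lba, or None if it was never written."""
        return self.mapping.get(lba)


if __name__ == "__main__":
    table = L2PTable()
    table.write(7)
    table.write(7)             # rewriting the same LBA lands on a new physical page
    print(table.lookup(7), table.stale)
```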

[0048] During a read operation, the ECC component 130 can detect and correct errors in the data read from the memory device 200. When the number of error bits is greater than or equal to the threshold number of correctable error bits, the ECC component 130 may not correct the error bits, but may instead output an error correction failure signal indicating that the correction of the error bits has failed.

[0049] In various embodiments, ECC component 130 may perform error correction operations based on coded modulation such as low-density parity-check (LDPC) codes, Bose-Chaudhuri-Hocquenghem (BCH) codes, turbo codes, turbo product codes (TPC), Reed-Solomon (RS) codes, convolutional codes, recursive systematic codes (RSC), trellis-coded modulation (TCM), or block-coded modulation (BCM). However, error correction is not limited to these techniques. Therefore, ECC component 130 may include any and all circuitry, systems, or devices suitable for error correction operations.

[0050] The host interface 140 can communicate with the host device through one or more of the following communication standards or interfaces: Universal Serial Bus (USB), Multimedia Card (MMC), Peripheral Component Interconnect Express (PCI-e or PCIe), Small Computer System Interface (SCSI), Serial-Attached SCSI (SAS), Serial Advanced Technology Attachment (SATA), Parallel Advanced Technology Attachment (PATA), Enhanced Small Disk Interface (ESDI), and Integrated Drive Electronics (IDE).

[0051] Memory interface 150 provides an interface between controller 100 and memory device 200, allowing controller 100 to control memory device 200 in response to requests from host device. Memory interface 150 can generate control signals for memory device 200 and process data under the control of control component 120. When memory device 200 is flash memory such as NAND flash memory, memory interface 150 can generate control signals for memory and process data under the control of control component 120.

[0052] Memory device 200 may include a memory cell array 210, control circuitry 220, voltage generation circuitry 230, row decoder 240, page buffer 250 (which may be in the form of a page buffer array), column decoder 260, and input/output (I/O) circuitry 270. The memory cell array 210 may include multiple memory blocks 211 capable of storing data. The voltage generation circuitry 230, row decoder 240, page buffer array 250, column decoder 260, and I/O circuitry 270 may form peripheral circuitry for the memory cell array 210. The peripheral circuitry may perform programming, reading, or erasing operations on the memory cell array 210. The control circuitry 220 may control the peripheral circuitry.

[0053] The voltage generation circuit 230 can generate operating voltages of various levels. For example, in an erase operation, the voltage generation circuit 230 can generate operating voltages of various levels, such as erase voltage and pass voltage.

[0054] The row decoder 240 can communicate electrically with the voltage generation circuit 230 and a plurality of memory blocks 211. The row decoder 240 can select at least one memory block among the plurality of memory blocks 211 in response to a row address generated by the control circuit 220, and transmit the operating voltage supplied from the voltage generation circuit 230 to the selected memory block.

[0055] The page buffer 250 can be connected to the memory cell array 210 via bit lines BL (shown in Figure 3). The page buffer 250 can precharge the bit lines BL with a positive voltage in response to a page buffer control signal generated by the control circuit 220, transfer data to and receive data from the selected memory block during programming and reading operations, or temporarily store the transferred data.

[0056] The column decoder 260 can transmit data to and receive data from the page buffer 250, and can also transmit data to and receive data from the input/output circuit 270.

[0057] The input/output circuit 270 can transmit commands and addresses received from an external device (e.g., the controller 100 of Figure 1) to the control circuit 220, transmit data from the external device to the column decoder 260, or output data from the column decoder 260 to the external device.

[0058] The control circuit 220 can control the peripheral circuits in response to commands and addresses.

[0059] Figure 3 is a circuit diagram illustrating a memory block of a memory device according to an embodiment of the present invention. For example, the memory block of Figure 3 may be any one of the memory blocks 211 in the memory cell array 210 shown in Figure 2.

[0060] Referring to Figure 3, the memory block 211 may include multiple word lines WL0 to WLn-1, a drain select line DSL, and a source select line SSL connected to the row decoder 240. These lines may be arranged in parallel, with the word lines between the DSL and the SSL.

[0061] The memory block 211 may further include multiple cell strings 221 respectively connected to bit lines BL0 to BLm-1. The cell string of each column may include one or more drain-select transistors (DSTs) and one or more source-select transistors (SSTs). In the illustrated embodiment, each cell string has one DST and one SST. In a cell string, multiple memory cells or memory cell transistors MC0 to MCn-1 may be connected in series between the DST and the SST. Each of the memory cells may be formed as a cell storing one or more bits. For example, each of the memory cells may be formed as a single-level cell (SLC) storing 1 bit of data, as a multi-level cell (MLC) storing 2 bits of data, as a triple-level cell (TLC) storing 3 bits of data, or as a quad-level cell (QLC) storing 4 bits of data.

[0062] The source of each SST in a cell string can be connected to the common source line CSL, and the drain of each DST can be connected to the corresponding bit line. The gate of an SST in a cell string can be connected to SSL, and the gate of a DST in a cell string can be connected to DSL. The gates of memory cells spanning cell strings can be connected to the corresponding word lines. That is, the gate of memory cell MC0 is connected to the corresponding word line WL0, the gate of memory cell MC1 is connected to the corresponding word line WL1, and so on. A group of memory cells connected to a specific word line can be called a physical page. Therefore, the number of physical pages in memory block 211 can correspond to the number of word lines.

[0063] Page buffer array 250 may include multiple page buffers 251 connected to bit lines BL0 to BLm-1. Page buffers 251 may operate in response to page buffer control signals. For example, during a read operation or a verification operation, page buffers 251 may temporarily store data received through bit lines BL0 to BLm-1 or sense the voltage or current of the bit lines.

[0064] In some embodiments, memory block 211 may include NAND flash memory cells. However, memory block 211 is not limited to this cell type, but may include NOR flash memory cells. Memory cell array 210 may be implemented as a hybrid flash memory combining two or more types of memory cells, or as a 1-NAND flash memory with the controller embedded within the memory chip.

[0065] Figure 4 is a diagram illustrating the state distribution or programming voltage (PV) level distribution of different types of cells in a memory device according to an embodiment of the present invention.

[0066] Referring to Figure 4, each memory cell can be implemented using a specific type of cell, such as a single-level cell (SLC) for storing 1 bit of data, a multi-level cell (MLC) for storing 2 bits of data, a triple-level cell (TLC) for storing 3 bits of data, or a quad-level cell (QLC) for storing 4 bits of data. Typically, all memory cells in a particular memory device are of the same type, but this is not required.

[0067] An SLC can include two states, P0 and P1. P0 can represent the erase state, and P1 can represent the programmed state. Because an SLC can be set to one of two different states, each SLC can be programmed to store one bit according to a set encoding method. An MLC can include four states, P0, P1, P2, and P3. Among these states, P0 can represent the erase state, and P1 through P3 can represent programmed states. Because an MLC can be set to one of four different states, each MLC can be programmed to store two bits according to a set encoding method. A TLC can include eight states, P0 through P7. Among these states, P0 can represent the erase state, and P1 through P7 can represent programmed states. Because a TLC can be set to one of eight different states, each TLC can be programmed to store three bits according to a set encoding method. A QLC can include 16 states, P0 through P15. Among these states, P0 can represent the erase state, and P1 through P15 can represent programmed states. Because a QLC can be set to one of sixteen different states, each QLC can be programmed to store four bits according to a set encoding method.
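The relationship between bits per cell and the number of states above follows a power of two, which the short Python sketch below spells out. The dictionary and helper names are hypothetical. The threshold count is not stated in this paragraph; it rests on the general observation that separating N adjacent states takes N - 1 boundaries.

```python
# Bits stored per cell for each cell type named in the text.
CELL_BITS = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def state_names(bits):
    """State labels P0..P(2^bits - 1); P0 is the erase state."""
    return [f"P{i}" for i in range(2 ** bits)]


def read_threshold_count(bits):
    """Boundaries needed to separate all adjacent states (assumed: states - 1)."""
    return 2 ** bits - 1


if __name__ == "__main__":
    for cell_type, bits in CELL_BITS.items():
        names = state_names(bits)
        print(cell_type, len(names), names[0], names[-1], read_threshold_count(bits))
```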

[0068] Referring again to Figure 2 and Figure 3, the memory device 200 may include a plurality of memory cells (e.g., NAND flash memory cells). The memory cells are arranged in an array of rows and columns as shown in Figure 3. Cells in each row are connected to word lines (e.g., WL0), while cells in each column are connected to bit lines (e.g., BL0). These word lines and bit lines are used for read and write operations. During a write operation, when the word line is asserted, the data to be written ("1" or "0") is provided to the bit line. During a read operation, the word line is asserted again, and the threshold voltage for each cell can then be obtained from the bit line. Multiple pages can share the memory cells belonging to (i.e., connected to) the same word line. When the memory cell is implemented using MLC, the multiple pages include a most significant bit (MSB) page and a least significant bit (LSB) page. When the memory cell is implemented using TLC, the multiple pages include an MSB page, a center significant bit (CSB) page, and an LSB page. When the memory cell is implemented using QLC, the multiple pages include an MSB page, a center most significant bit (CMSB) page, a center least significant bit (CLSB) page, and an LSB page. Encoding schemes (e.g., Gray encoding) can be used to program memory cells in order to increase the capacity of a memory system 10 such as an SSD.

[0069] Figure 5A is a diagram illustrating an example of encoding for a multi-level cell (MLC) according to an embodiment of the present invention.

[0070] Referring to Figure 5A, the MLC can be programmed using a specified type of encoding. The MLC can have four programming states: an erase state E (or PV0) and a first programming state PV1 through a third programming state PV3. The erase state E (or PV0) can correspond to "11". The first programming state PV1 can correspond to "10". The second programming state PV2 can correspond to "00". The third programming state PV3 can correspond to "01".

[0071] In an MLC, as shown in Figure 5B, there are two types of pages: LSB pages and MSB pages. One or two thresholds can be applied to retrieve data from the MLC. For MSB pages, the single threshold is VT1. VT1 distinguishes between the first programming state PV1 and the second programming state PV2. For LSB pages, the two thresholds are threshold VT0 and threshold VT2. VT0 distinguishes between the erase state E and the first programming state PV1. VT2 distinguishes between the second programming state PV2 and the third programming state PV3.
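The MLC page reads described above can be sketched in Python as follows. The numeric threshold values and function names are hypothetical, and the sketch assumes that the first character of each two-bit code is the MSB and the second is the LSB, which is the reading consistent with the threshold assignment in the text.

```python
from bisect import bisect_right

# Codes for PV0 (erase), PV1, PV2, PV3, written as MSB then LSB (assumed bit order).
MLC_CODE = ["11", "10", "00", "01"]

# Hypothetical read thresholds VT0, VT1, VT2 in volts.
THRESHOLDS = [1.0, 2.0, 3.0]


def cell_state(vt, thresholds=THRESHOLDS):
    """PV level index: the number of read thresholds at or below the cell voltage."""
    return bisect_right(thresholds, vt)


def read_msb(vt, thresholds=THRESHOLDS):
    """MSB page read: a single comparison against VT1."""
    return 1 if vt < thresholds[1] else 0


def read_lsb(vt, thresholds=THRESHOLDS):
    """LSB page read: compare against VT0 and VT2; the bit is 0 only between them."""
    return 0 if thresholds[0] <= vt < thresholds[2] else 1


if __name__ == "__main__":
    for vt in (0.5, 1.5, 2.5, 3.5):
        print(vt, cell_state(vt), read_msb(vt), read_lsb(vt))
```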

[0072] Figure 6A This is a diagram illustrating an example of Gray coding for a three-layer cell (TLC) according to an embodiment of the present invention.

[0073] Reference Figure 6AGray coding can be used to program a TLC. A TLC can have eight programming states, including an erase state E (or PV0) and first programming states PV1 through seventh programming states PV7. Eraser state E (or PV0) can correspond to "111". First programming state PV1 can correspond to "011". Second programming state PV2 can correspond to "001". Third programming state PV3 can correspond to "000". Fourth programming state PV4 can correspond to "010". Fifth programming state PV5 can correspond to "110". Sixth programming state PV6 can correspond to "100". Seventh programming state PV7 can correspond to "101".

[0074] In TLC, as shown in Figure 6B, there are three types of pages: LSB pages, CSB pages, and MSB pages. Two or three thresholds can be applied to retrieve data from the TLC. For MSB pages, the two thresholds include threshold VT0, which distinguishes between the erase state E and the first programming state PV1, and threshold VT4, which distinguishes between the fourth programming state PV4 and the fifth programming state PV5. For CSB pages, the three thresholds include VT1, VT3, and VT5. VT1 distinguishes between the first programming state PV1 and the second programming state PV2. VT3 distinguishes between the third programming state PV3 and the fourth programming state PV4. VT5 distinguishes between the fifth programming state PV5 and the sixth programming state PV6. For LSB pages, the two thresholds include VT2 and VT6. VT2 distinguishes between the second programming state PV2 and the third programming state PV3. VT6 distinguishes between the sixth programming state PV6 and the seventh programming state PV7.
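The page-to-threshold assignments for MLC and TLC follow directly from the Gray codes: a page needs threshold VTk exactly when its bit differs between adjacent states PVk and PV(k+1). The following is a minimal sketch of that derivation. The function name `page_thresholds` is hypothetical, and the sketch assumes that the leftmost bit of each code word belongs to the MSB page.

```python
# Code words copied from paragraphs [0070] and [0073];
# list index i is programming state PVi (PV0 is the erase state E).
MLC_CODES = ["11", "10", "00", "01"]
TLC_CODES = ["111", "011", "001", "000", "010", "110", "100", "101"]


def page_thresholds(codes, page_names):
    """Return {page: [k, ...]}, where k stands for threshold VTk, the boundary
    between states PVk and PV(k+1) at which that page's bit changes."""
    result = {}
    for bit, name in enumerate(page_names):  # bit 0 is the leftmost bit (assumed MSB)
        result[name] = [
            k for k in range(len(codes) - 1)
            if codes[k][bit] != codes[k + 1][bit]
        ]
    return result


mlc_pages = page_thresholds(MLC_CODES, ["MSB", "LSB"])
tlc_pages = page_thresholds(TLC_CODES, ["MSB", "CSB", "LSB"])
print(mlc_pages)  # {'MSB': [1], 'LSB': [0, 2]}
print(tlc_pages)  # {'MSB': [0, 4], 'CSB': [1, 3, 5], 'LSB': [2, 6]}
```

Because adjacent Gray code words differ in exactly one bit, every threshold VTk is assigned to exactly one page, which is why the seven TLC thresholds split into groups of two, three, and two.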

[0075] After a memory array comprising multiple memory cells is programmed as shown in Figure 5A and Figure 6A, when a read operation is performed on the memory array using a specific reference voltage such as a read threshold voltage (also referred to as a "read voltage level" or "read threshold"), the charge level of each memory cell (e.g., the threshold voltage level of the memory cell's transistor) is compared with one or more reference voltages to determine the state of each memory cell. When a specific read threshold is applied to the memory array, those memory cells with threshold voltage levels higher than the specific reference voltage are turned on and detected as "on" cells, while those memory cells with threshold voltage levels lower than the specific reference voltage are turned off and detected as "off" cells. Therefore, each read threshold is set between adjacent threshold voltage distribution windows corresponding to different programming states, such that each read threshold can distinguish these programming states by turning the memory cell transistors on or off.
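Because each read threshold sits between two adjacent distribution windows, comparing a cell's threshold voltage against the ordered thresholds identifies its programming state. The following is a minimal sketch of that comparison for an MLC. The threshold values in volts and the function name `state_index` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical read thresholds VT0..VT2 in volts, in ascending order.
MLC_READ_THRESHOLDS = [0.5, 1.5, 2.5]
# Code words for E/PV0..PV3 as given in paragraph [0070].
MLC_CODES = ["11", "10", "00", "01"]


def state_index(cell_vt, thresholds):
    """Count the read thresholds at or below the cell's threshold voltage.
    With ascending thresholds, that count is the index k of state PVk."""
    return bisect_right(thresholds, cell_vt)


for vt in (-1.0, 1.0, 2.0, 3.0):
    k = state_index(vt, MLC_READ_THRESHOLDS)
    print(f"Vt={vt:+.1f} V -> PV{k} -> data {MLC_CODES[k]}")
```

When adjacent distributions overlap, a cell in the overlap region lands on the wrong side of a threshold and this lookup returns the neighboring state, which is the read error discussed in the following paragraph.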

[0076] When a read operation is performed on a memory cell in a data storage device using MLC technology, the threshold voltage level of the memory cell is compared with more than one read threshold level to determine the state of the memory cell. Distorted or overlapping threshold voltage distributions can cause read errors. For example, the ideal memory cell threshold voltage distribution can be significantly distorted or overlapped due to program/erase (P/E) cycles, inter-cell interference, and/or data retention errors. For instance, as the number of program/erase cycles increases, the margin between adjacent threshold voltage distributions of different programming states decreases, and the distributions eventually overlap. Therefore, a memory cell whose threshold voltage falls within the overlapping region of adjacent distributions may be read as having been programmed to a value different from the original target value, thus causing a read error. In most cases, such read errors can be managed by using error correction codes (ECC). When the number of bit errors in a read operation exceeds the ECC correction capability of the data storage device, the read operation using a set read threshold voltage fails. The set read threshold voltage can be a previously used read threshold voltage (i.e., a historical read threshold voltage). The historical read threshold voltage can be the read threshold voltage used in the most recent successful decoding, that is, the read voltage used in the read operation performed before the read retry operation. When a read operation using the set read threshold voltage fails, the controller 120 can control an error recovery algorithm as shown in Figure 7.

[0077] Referring to Figure 7, the controller 120 may perform one or more read retry operations on memory cells using one or more read threshold voltages applied in a set order (S100). For example, the read threshold voltages may include N (e.g., N is 5 or 10) read threshold voltages (or read voltage levels), from a first read threshold voltage to an Nth read threshold voltage. The first read threshold voltage may be a previously used read threshold voltage (i.e., a historical read threshold voltage). A historical read threshold voltage may be the read threshold voltage used in the most recent successful decoding, i.e., the read voltage used in a read operation that passed before the read retry operation. The controller 120 may perform read retry operations until decoding associated with a corresponding read retry operation is determined to be successful.

[0078] When all read retry operations using the read threshold voltages fail, the controller 120 may perform additional recovery operations. For example, the additional recovery operations may include an optimal read threshold voltage search (S200), soft decoding using an error correction code (ECC) (S300), and redundant array of independent disks (RAID) recovery (S400).

[0079] As described above, in memory systems such as NAND flash memory systems, after receiving a read command, a series of data recovery steps are run to retrieve noise-free data from the memory device (i.e., the NAND flash memory device). In the first attempt, a read operation (i.e., a historical read) is performed using historical read threshold voltages. Historical reads can be maintained separately for each physical block, and can be updated if decoding associated with a historical read fails. If a historical read fails, a read retry attempt is performed, known as a high-priority read retry (HRR). HRR is a series of read threshold voltages (i.e., Vt) that remain constant. The read retry threshold voltage does not change based on NAND conditions or the physical location of the data to be read. Typically, there are 5 to 10 HRR read attempts. If all HRR reads fail, an optimal read threshold voltage is found through an optimal read level search (i.e., the eBoost algorithm), and soft read and soft decode operations are performed. The eBoost algorithm can perform multiple reads to find the optimal intermediate Vt for the soft read. There are many different eBoost algorithms, such as Gaussian modeling (GM), cumulative cell counting search (CCS), and advanced valley search (AVA).
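The recovery sequence described above (historical read, fixed HRR entries, then further recovery stages) can be sketched as a simple ordered fallback. This is only an illustration of the control flow under stated assumptions: `recover_read`, `toy_read`, and all voltage values are hypothetical, and the NAND read plus ECC decode is replaced by a toy stand-in that returns `None` on decoding failure.

```python
from typing import Callable, Optional, Sequence, Tuple


def recover_read(
    read_and_decode: Callable[[float], Optional[bytes]],
    history_vt: float,
    hrr_vts: Sequence[float],
    fallbacks: Sequence[Callable[[], Optional[bytes]]],
) -> Tuple[str, Optional[bytes]]:
    """Try the historical read, then each fixed HRR voltage in order, then each
    fallback stage (e.g., optimal Vt search with soft decoding, RAID recovery).
    Returns (name of the stage that succeeded, decoded data)."""
    data = read_and_decode(history_vt)
    if data is not None:
        return "history", data
    for i, vt in enumerate(hrr_vts):
        data = read_and_decode(vt)
        if data is not None:
            return f"hrr[{i}]", data
    for i, stage in enumerate(fallbacks):
        data = stage()
        if data is not None:
            return f"fallback[{i}]", data
    return "failed", None


def toy_read(vt: float) -> Optional[bytes]:
    """Toy stand-in for read + ECC decode: succeeds only near 1.30 V."""
    return b"payload" if abs(vt - 1.30) < 0.03 else None


stage, data = recover_read(toy_read, 1.00, [1.10, 1.20, 1.28, 1.40], [lambda: None])
print(stage, data)  # hrr[2] b'payload'
```

The sketch makes the cost structure visible: every failed stage adds at least one more page read, which is why a more accurate read threshold estimate early in the sequence reduces overall read latency.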

[0080] Figure 8 is a diagram illustrating the operation of eBoost algorithms, such as the GM, CCS, and AVA algorithms, to estimate the optimal read threshold voltage according to an embodiment of the present invention.

[0081] Referring to Figure 8, the table shows the read threshold voltages obtained by each of the GM, CCS, and AVA algorithms, compared to the optimal read threshold voltage (OPT). The GM algorithm attempts to find a read threshold voltage (Vt) assuming all PV states (i.e., PV11, PV12) follow a Gaussian distribution with known/constant variance and unknown mean. The CCS algorithm attempts to find a Vt that results in an equal number of cells falling on both sides. The AVA algorithm attempts to find the lowest point around the valley in the overall distribution as Vt. As shown in Figure 8, when one of the two PV states (e.g., PV12) is asymmetric and has a heavier tail than the other, all of these algorithms (i.e., GM, CCS, AVA) may estimate a read threshold voltage that deviates from the optimal read threshold voltage OPT.

[0082] Figure 9A and Figure 9B show examples of threshold voltage (Vt) distributions of programming states for a triple-level cell (TLC) and a quad-level cell (QLC), respectively, according to embodiments of the present invention.

[0083] In the examples shown in Figure 9A and Figure 9B, one or more of the various PV states are asymmetric and have a heavier tail than the other states. In the example of Figure 9A, three states PV25 to PV27 among states PV21 to PV27 are asymmetric due to heavy tails. In the example of Figure 9B, state PV315 among states PV301 to PV315 is asymmetric due to its heavy tail. In this case, all three types of existing algorithms mentioned above may give an offset estimate of Vt. Therefore, it is desirable to provide a scheme for estimating the read threshold voltage that overcomes the shortcomings of the existing algorithms.

[0084] The embodiments use deep learning and provide a parameterized framework for modeling programming voltage or programming verification (PV) levels and estimating the optimal read threshold voltage (Vt).

[0085] Figure 10 is a diagram illustrating a memory system 10 according to an embodiment of the present invention.

[0086] Referring to Figure 10, the memory system 10 includes a controller 100 and a memory device 200. The memory device 200 may include a plurality of memory cells 210 (e.g., NAND flash memory cells). The memory cells are arranged in an array of rows and columns as shown in Figure 3. Cells in each row are connected to a word line (e.g., WL0), while cells in each column are connected to a bit line (e.g., BL0). These word lines and bit lines are used for read and write operations. During a write operation, when the word line is asserted, the data to be written ("1" or "0") is provided to the bit line. During a read operation, the word line is asserted again, and the threshold voltage of each cell can then be obtained from the bit line. Multiple pages can share memory cells belonging to (i.e., connected to) the same word line. When the memory cells are implemented using MLC, the multiple pages include a most significant bit (MSB) page and a least significant bit (LSB) page. When the memory cells are implemented using TLC, the multiple pages include an MSB page, a center significant bit (CSB) page, and an LSB page. When the memory cells are implemented using QLC, the multiple pages include an MSB page, a center most significant bit (CMSB) page, a center least significant bit (CLSB) page, and an LSB page. Coding schemes (e.g., Gray coding) can be used to program the memory cells in order to increase the capacity of a memory system 10 such as an SSD.

[0087] The controller 100 may include a read processor 1010, a decoder 1020, and an optimal read threshold determiner 1030. Although the components of the controller 100 are shown as being implemented separately, these components may be implemented with internal components (i.e., firmware (FW)) of the control component 120 in Figure 2. Although not shown in Figure 10, the controller 100 and the memory device 200 may include various other components as shown in Figure 2.

[0088] The read processor 1010 can control one or more read operations on the memory device 200 in response to a read request from a host (e.g., the host device 5 of Figure 1). The read processor 1010 can use various read thresholds to control the read operations. The decoder 1020 can decode the data associated with the read operations.

[0089] In some embodiments, the read processor 1010 can use a read threshold selected from a set read level table to control read operations on memory cells. In some embodiments, the read level table may include multiple read thresholds, and the selected read threshold may include a default read threshold. As shown in Figure 6B, when a read operation is performed on the MSB page of a TLC, the selected read thresholds can include a pair of first and second read thresholds [VT0, VT4]. The first read threshold VT0 is used to distinguish between the erase state (i.e., E) and the first programming state (i.e., PV1), and the second read threshold VT4 is used to distinguish between the fourth programming state (i.e., PV4) and the fifth programming state (i.e., PV5). As shown in Figure 6B, when a read operation is performed on the LSB page of the TLC, the selected read thresholds may include a pair of first and second read thresholds [VT2, VT6]. The first read threshold VT2 is used to distinguish between the second programming state (i.e., PV2) and the third programming state (i.e., PV3), and the second read threshold VT6 is used to distinguish between the sixth programming state (i.e., PV6) and the seventh programming state (i.e., PV7).

[0090] Based on the decoding result of the decoder 1020, it can be determined whether the read operation using the read threshold selected from the read threshold set succeeded or failed. When the read operation using the selected read threshold fails, the read processor 1010 can use read retry entries to control one or more read retry operations on the memory cells, as shown in Figure 7.

[0091] The optimal read threshold determiner 1030 can provide a parameterized framework for modeling programming voltage or programming verification (PV) levels and estimating the optimal read threshold voltage (Vt). The optimal read threshold determiner 1030 can be implemented using one or more deep neural networks (DNNs). For a parameterized DNN framework, the optimal read threshold determiner 1030 may include a training component 1030A and an inference component 1030B.

[0092] Figure 11 is a diagram illustrating an example of a neural network 1100 according to an embodiment of the present invention. In some embodiments, the neural network 1100 may be included in the controller 100 of the memory system 10 in Figure 10.

[0093] Referring to Figure 11, a feature map 1102 associated with one or more input conditions can be input into a neural network 1100. The neural network 1100 can then output information 1104. As shown, the neural network 1100 includes an input layer 1110, one or more hidden layers 1120, and an output layer 1130. Features from the feature map 1102 can be connected to input nodes in the input layer 1110. Information 1104 can be generated by output nodes in the output layer 1130. One or more hidden layers 1120 can exist between the input layer 1110 and the output layer 1130. The neural network 1100 can be pre-trained to process features from the feature map 1102 through the different layers 1110, 1120, and 1130 to output information 1104.

[0094] Neural network 1100 can be a multi-layer neural network representing a network of interconnected nodes, such as an artificial deep neural network, where knowledge about nodes (e.g., information about specific features represented by nodes) is shared across layers, and also retains knowledge specific to each layer. Each node represents a piece of information. Knowledge can be exchanged between nodes through interconnection. Input to neural network 1100 can activate a set of nodes. This set of nodes can then activate other nodes, thus propagating knowledge about the input. This activation process can be repeated across other nodes until a node in output layer 1130 is selected and activated.

[0095] As shown in the figure, the neural network 1100 includes a hierarchical structure of layers, which represents the hierarchical structure of nodes interconnected in a feedforward manner. The input layer 1110 can be located at the lowest level of the hierarchical structure. The input layer 1110 can include a set of nodes referred to herein as input nodes. When the feature map 1102 is input into the neural network 1100, each of the input nodes in the input layer 1110 can be connected to each feature of the feature map 1102. Each of the connections can have weights. These weights can be a set of parameters derived from the training of the neural network 1100. The input nodes can transform the features by applying activation functions to these features. The information obtained from this transformation can be passed to higher-level nodes in the hierarchical structure.

[0096] Output layer 1130 can be located at the highest hierarchical level. Output layer 1130 may include one or more output nodes. Each output node can provide a specific value for output information 1104. The number of output nodes may depend on the required amount of output information 1104. In other words, there is a one-to-one correspondence or mapping between the number of output nodes and the amount of output information 1104.

[0097] Hidden layer 1120 may be located between input layer 1110 and output layer 1130. Hidden layer 1120 may include "N" hidden layers, where "N" is an integer greater than or equal to 1. Each of hidden layers 1120 may include a set of nodes referred to herein as hidden nodes. Example hidden layers may include upsampling layers, convolutional layers, fully connected layers, and / or data transformation layers.

[0098] At the lowest level of hidden layer 1120, the hidden nodes at that level can be interconnected with the input nodes. At the highest level of hidden layer 1120, the hidden nodes at that level can be interconnected with the output nodes. Input nodes may not be directly interconnected with output nodes. If multiple hidden layers exist, the input nodes are interconnected with the hidden nodes of the lowest hidden layer. Furthermore, these hidden nodes are interconnected with the hidden nodes of the next hidden layer. Interconnections can represent a piece of information known about two interconnected nodes. Interconnections can have numerical weights that can be adjusted (e.g., based on the training dataset) to allow the neural network 1100 to adapt to the input and learn.

[0099] Typically, hidden layer 1120 allows knowledge about the input nodes of input layer 1110 to be shared among the output nodes of output layer 1130. To this end, a transformation f can be applied to the input nodes through hidden layer 1120. In this example, the transformation f is non-linear. Different non-linear transformations f can be used, including, for example, the rectifier function f(x) = max(0,x). In this example, a specific non-linear transformation f is selected based on cross-validation. For example, given known example pairs (x,y), where x∈X and y∈Y, the function f:X→Y is chosen when it produces the best match.
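The layered structure described above (input features, hidden nodes with the rectifier function, output nodes) can be sketched as a small feedforward pass. This is a minimal illustration only: the weights are hand-picked and hypothetical rather than trained, and the names `dense`, `forward`, and `LAYERS` do not come from the source.

```python
def relu(x):
    """Rectifier non-linearity f(x) = max(0, x)."""
    return max(0.0, x)


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer:
    out[j] = activation(sum_i weights[j][i] * inputs[i] + biases[j])."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def forward(features, layers):
    """Propagate features through (weights, biases, activation) layers in order."""
    values = features
    for weights, biases, activation in layers:
        values = dense(values, weights, biases, activation)
    return values


# Hypothetical weights: 2 features -> 2 hidden nodes (ReLU) -> 1 output node.
# With these values the network computes |a - b| for inputs (a, b).
LAYERS = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0], relu),
    ([[1.0, 1.0]], [0.0], None),
]

print(forward([3.0, 1.0], LAYERS))  # [2.0]
print(forward([1.0, 4.0], LAYERS))  # [3.0]
```

The example shows why the non-linearity matters: without `relu`, the two layers would collapse into a single linear map, which cannot represent the absolute difference.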

[0100] For example, neural network 1100 can be a deep learning neural network for a memory system including a NAND flash memory device. The deep learning neural network can be created using output nodes and "K" input nodes, where "K" is the number of factors (e.g., features) defining the input conditions for the memory system. The output nodes can be used to execute activation functions for specific combinations of the input conditions. The number of layers in neural network 1100 and the size of each layer can depend on the NAND flash memory device and the amount of data that the NAND flash memory device can store.

[0101] The inventors observed that in some cases (e.g., the curves indicated by PV26 and PV27 in Figure 9A), the cell level distribution for a specific threshold voltage range corresponding to a PV level can be modeled using a skewed normal distribution. Typically, a skewed normal distribution is defined as a continuous probability distribution that generalizes the normal distribution to allow for non-zero skewness. Each cumulative distribution function (CDF) value can represent a skewed normal distribution model for a specific PV level. To describe the properties of the skewed normal distribution, the probability distribution parameters can be a set of parameters including position ξ, scale ω, and shape α.

[0102] In some embodiments, the neural network 1100 can be the first deep neural network (DNN1) or the second deep neural network (DNN2) in Figure 12 and/or Figure 13. For DNN1, the one or more input conditions may be CDF values and the output information 1104 may be probability distribution parameters. For example, neural network 1100 (e.g., DNN1) may be pre-trained to process features (i.e., CDF values) from feature map 1102 through different layers 1110, 1120, and 1130 to output probability distribution parameters 1104. For DNN2, the one or more input conditions may be probability distribution parameters and the output information 1104 may be probability density function (PDF) values.

[0103] Figure 12 is a diagram illustrating a training component 1030A according to an embodiment of the present invention.

[0104] Referring to Figure 12, the training component 1030A may include a synthetic model 1210 for training a first deep neural network (DNN1) 1200. To train the DNN1 1200, a synthetic dataset (or data) may be collected by the synthetic model 1210. The synthetic dataset may be any production data applicable to a given situation that is not obtained through direct measurement. In some embodiments, the deep neural network 1200 may be used in a memory system including a NAND flash memory device. As described above, the threshold voltage (Vt) distribution (i.e., PV level) of the memory cell can be modeled by a parameterized distribution (i.e., a skewed normal distribution). The skewed normal distribution has a cumulative distribution function (CDF) as shown below:

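[0105] $$\mathrm{CDF}(x) = \Phi\!\left(\frac{x-\xi}{\omega}\right) - 2\,T\!\left(\frac{x-\xi}{\omega},\ \alpha\right)$$

where Φ(·) is the standard normal cumulative distribution function.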

[0106] T(h,a) is Owen's T function.

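[0107] $$T(h,a) = \frac{1}{2\pi}\int_{0}^{a}\frac{e^{-\frac{1}{2}h^{2}\left(1+t^{2}\right)}}{1+t^{2}}\,dt$$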

[0108] As shown in the equation above, the skewed normal distribution has three parameters: position ξ, scale ω, and shape α. In other words, for the range of read threshold voltages corresponding to each PV level, the probability distribution of each PV level can be described using three probability distribution parameters. In the equation above, x represents the sampled read threshold voltage, and T(h,a) denotes Owen's T function.
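As an illustration of how the three parameters determine the CDF, the following sketch evaluates the standard skew normal CDF, Φ(z) − 2T(z, α) with z = (x − ξ)/ω, using only the Python standard library. Owen's T function is computed here by Simpson's rule on its integral definition. The numerical method, the step count, and the function names are assumptions of this sketch and are not taken from the source.

```python
import math


def owens_t(h, a, steps=2000):
    """Owen's T function,
    T(h, a) = (1 / (2*pi)) * integral_0^a exp(-h^2 (1 + t^2) / 2) / (1 + t^2) dt,
    by composite Simpson's rule (steps must be even). A negative a gives a
    negative result, consistent with T(h, -a) = -T(h, a)."""
    if a == 0.0:
        return 0.0

    def g(t):
        return math.exp(-0.5 * h * h * (1.0 + t * t)) / (1.0 + t * t)

    dt = a / steps
    total = g(0.0) + g(a)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * g(k * dt)
    return total * dt / 3.0 / (2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def skew_normal_cdf(x, xi, omega, alpha):
    """CDF of a skew normal distribution with position xi, scale omega, shape alpha."""
    z = (x - xi) / omega
    return std_normal_cdf(z) - 2.0 * owens_t(z, alpha)


# With alpha = 0 the skew term vanishes and the ordinary normal CDF remains.
print(skew_normal_cdf(0.3, 0.0, 1.0, 0.0), std_normal_cdf(0.3))
# At x = xi the value is 1/2 - arctan(alpha)/pi, which is 0.25 for alpha = 1.
print(skew_normal_cdf(0.0, 0.0, 1.0, 1.0))
```

A positive shape α moves probability mass toward higher voltages, so the CDF at a given x is lower than for the symmetric case. A negative α does the opposite, producing the heavy lower tail seen in the asymmetric PV states.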

[0109] The synthetic model 1210 can generate a synthetic dataset (x, CDF(x)) based on the probability distribution parameter p, which collectively represents three parameters (i.e., ξ, ω, and α). The DNN1 1200 can be trained on the synthetic dataset and output the probability distribution parameter p′ as the training result. The synthetic dataset includes the CDF values CDF(x) from the parameterized distribution at each sampled voltage x. The probability distribution parameter p′ can be used to determine the characteristics of the voltage range curve of the memory cell.

[0110] The training results of DNN1 1200 can be provided to loss function component 1220. Loss function component 1220 can use a loss function (or cost function) to find the optimal solution for the trained probability distribution parameters p′. DNN1 1200 can be trained to improve the probability distribution parameters, minimizing the difference (or error) between the actual probability distribution parameters p of the synthetic model 1210 and the probability distribution parameters p′ predicted by DNN1 1200.

[0111] Therefore, the relationship between the CDF values CDF(x) and the probability distribution parameter p can be learned by the DNN1 1200. Once trained, the training results can be used by the inference component 1030B.
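The training setup described above can be illustrated by generating one synthetic (features, label) pair and evaluating a loss between true and predicted parameters. This is a sketch under several assumptions: the CDF is obtained by trapezoidal integration of the skew normal PDF, the parameter ranges and sampled voltages are hypothetical, mean squared error stands in for the unspecified loss function, and the network itself is omitted.

```python
import math
import random


def skew_normal_pdf(x, xi, omega, alpha):
    """Skew normal density with position xi, scale omega, shape alpha."""
    z = (x - xi) / omega
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))
    return 2.0 / omega * phi * big_phi


def cdf_on_grid(voltages, p, steps=2000):
    """CDF(x) at each ascending voltage x by trapezoidal integration of the PDF,
    starting 10 scale units below xi, where the density is negligible."""
    xi, omega, alpha = p
    lo, acc, out = xi - 10.0 * omega, 0.0, []
    f_lo = skew_normal_pdf(lo, xi, omega, alpha)
    for hi in voltages:
        h = (hi - lo) / steps
        for k in range(1, steps + 1):
            f_x = skew_normal_pdf(lo + k * h, xi, omega, alpha)
            acc += 0.5 * (f_lo + f_x) * h
            f_lo = f_x
        lo = hi
        out.append(acc)
    return out


def make_example(rng, voltages):
    """One training pair: features CDF(x_i), label p = (xi, omega, alpha).
    The sampling ranges below are hypothetical."""
    p = (rng.uniform(1.0, 2.0), rng.uniform(0.1, 0.3), rng.uniform(-4.0, 4.0))
    return cdf_on_grid(voltages, p), p


def mse_loss(p_true, p_pred):
    """Mean squared error between the true and predicted parameter sets."""
    return sum((t - q) ** 2 for t, q in zip(p_true, p_pred)) / len(p_true)


VOLTAGES = [0.8, 1.2, 1.6, 2.0, 2.4]  # hypothetical sampled voltages x_i
features, label = make_example(random.Random(0), VOLTAGES)
print(features, label, mse_loss(label, (1.5, 0.2, 0.0)))
```

Because the labels are drawn first and the features computed from them, the dataset can be made arbitrarily large without measuring any device, which matches the description of synthetic data as not obtained through direct measurement.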

[0112] Figure 13 is a diagram illustrating the inference component 1030B according to an embodiment of the present invention.

[0113] Referring to Figure 13, the inference component 1030B may include first deep neural networks (DNN1) 1200A, 1200B, second deep neural networks (DNN2) 1300A, 1300B, and a crossover point calculation component 1310. Each DNN1 1200A, 1200B can be trained as shown in Figure 12. Each of DNN1 and DNN2 can be implemented in a system on chip (SoC) or in firmware (FW), depending on the size of the DNN used.

[0114] In Figure 13, PV_A and PV_B represent two adjacent PV levels. CDF_A(x_i) denotes the CDF value of the threshold voltage distribution of PV_A sampled at voltage x_i. CDF_B(x_i) denotes the CDF value of the threshold voltage distribution of PV_B sampled at voltage x_i. p_A denotes the probability distribution parameters of PV_A, and p_B denotes the probability distribution parameters of PV_B. The optimal read threshold voltage Vt_opt denotes the voltage at the crossover point between the threshold voltage distributions of PV_A and PV_B.

[0115] The threshold voltage distribution of cells at PV levels can be modeled using a parameterized distribution (i.e., a skewed normal distribution). The parameterized distribution models the relationship between the CDF values of each PV level and the probability distribution parameter p. The skewed normal distributions of PV_A and PV_B are modeled with probability distribution parameters p_A and p_B, respectively.

[0116] For an n-bit multi-level cell NAND flash memory, the threshold voltage of each cell can be programmed to 2^n possible values. In an ideal multi-level cell NAND flash memory, each value corresponds to a non-overlapping threshold voltage range. However, in many systems, due to operating conditions, the threshold voltage ranges for the values may partially overlap. Examples of this overlap are shown in Figures 8 to 9B.

[0117] The DNN1 1200A can receive first CDF values CDF_A(x_i) representing a skewed normal distribution model of a first threshold voltage range. As shown in Figure 12, the first CDF values CDF_A(x_i) can be generated through iterative modeling to determine a parameterized representation of the threshold voltage range. The first CDF values CDF_A(x_i) can correspond to a first level of a multi-level cell NAND flash memory. Each multi-level cell has multiple levels based on how many bits are stored in the cell. In one example, each cell of a triple-level cell (TLC) memory stores three bits and has 2^n levels, or eight levels. Each of the eight levels of a three-bit TLC corresponds to a voltage range represented by first CDF values CDF_A(x_i). The DNN1 1200B can receive second CDF values CDF_B(x_i) representing a skewed normal distribution model of a second threshold voltage range. The second CDF values CDF_B(x_i) can correspond to a second level of the multi-level cell.

[0118] Each DNN1 1200A, 1200B can estimate the probability distribution parameter p based on the CDF values given by measurements of memory cells. For example, the DNN1 1200A can estimate the probability distribution parameters p_A based on the first CDF values CDF_A(x_i). The DNN1 1200B can estimate the probability distribution parameters p_B based on the second CDF values CDF_B(x_i). The probability distribution parameters are expressed as p_A = (ξ_A, ω_A, α_A) and p_B = (ξ_B, ω_B, α_B).

[0119] Each DNN2 1300A, 1300B can determine the PDF value of the distribution at each candidate read threshold voltage based on the estimated probability distribution parameter p. For example, the DNN2 1300A can receive the estimated probability distribution parameters p_A and determine the PDF values PDF_A based on p_A. The DNN2 1300B can receive the estimated probability distribution parameters p_B and determine the PDF values PDF_B based on p_B. In some embodiments, each DNN2 1300A, 1300B can determine the PDF value of the distribution at each candidate read threshold voltage based on the estimated probability distribution parameter p by using the following equation.

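[0120] $$\mathrm{PDF}(x) = \frac{2}{\omega}\,\phi\!\left(\frac{x-\xi}{\omega}\right)\Phi\!\left(\alpha\,\frac{x-\xi}{\omega}\right)$$

where φ(·) and Φ(·) are the standard normal probability density function and cumulative distribution function, respectively, ξ represents position, ω represents scale, and α represents shape, and these are the probability distribution parameters p.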

[0121] In some embodiments, each DNN2 1300A, 1300B may include a lookup table (LUT) that stores the relationship between probability distribution parameters and PDF values.

[0122] The crossover point calculation component 1310 can find two candidate read threshold voltages that produce approximately equal PDF values. Further, the crossover point calculation component 1310 can determine the crossover point of the two candidate read threshold voltages as the optimal read threshold voltage Vt_opt.
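The crossover search can be illustrated with a direct numerical root find on PDF_A(x) − PDF_B(x). The sketch below uses bisection on the closed-form skew normal PDF in place of the DNN2 or lookup-table evaluation. The function names, the parameter values, and the bracket [lo, hi] are hypothetical, and the sketch assumes PV_A is the lower level so that PDF_A dominates at lo and PDF_B dominates at hi.

```python
import math


def skew_normal_pdf(x, p):
    """Skew normal density for p = (xi, omega, alpha)."""
    xi, omega, alpha = p
    z = (x - xi) / omega
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))
    return 2.0 / omega * phi * big_phi


def crossover_voltage(p_a, p_b, lo, hi, tol=1e-9):
    """Bisection on d(x) = PDF_A(x) - PDF_B(x) over [lo, hi].
    Requires d(lo) > 0 > d(hi), i.e. a bracket between the two peaks."""
    def d(x):
        return skew_normal_pdf(x, p_a) - skew_normal_pdf(x, p_b)

    if not (d(lo) > 0.0 > d(hi)):
        raise ValueError("bracket does not straddle a crossover point")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if d(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Two symmetric levels of equal scale cross midway between their positions.
vt_sym = crossover_voltage((1.0, 0.1, 0.0), (2.0, 0.1, 0.0), 1.0, 2.0)
# A wider upper level with a heavy lower tail (alpha < 0) pulls the crossover down.
vt_skew = crossover_voltage((1.0, 0.1, 0.0), (2.0, 0.2, -3.0), 1.0, 2.0)
print(vt_sym, vt_skew)
```

The second case shows the effect that motivates the parameterized model: once the upper level is asymmetric, the equal-PDF point moves away from the midpoint that a symmetric assumption would select.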

[0123] In this way, the inference component 1030B can estimate the crossover points of the underlying PDF values based on some noisy samples of the CDF values.

[0124] Figure 14 illustrates an operation for obtaining ICMF samples to be used for estimating the optimal read threshold voltage according to an embodiment of the present invention. This operation can be performed by the read processor 1010 of the controller 100 in Figure 10. By way of example, ICMF samples can be obtained to estimate the optimal read threshold voltage for the LSB page of a TLC.

[0125] In Figure 14, 1410 denotes a parameterized distribution corresponding to the threshold voltage distribution of the PV levels of the TLC shown in Figure 6A. As shown in the figure, the TLC distribution includes 8 zones, Zone 0 to Zone 7, and each zone corresponds to a programming state or PV level.

[0126] In operation S1410, the controller 100 can read the LSB page, the MSB page, and the CSB page to generate PV state counts. Further, the controller 100 can generate a first ICMF sample set for PV2, PV3, PV6, and PV7 based on the PV state counts. Here, ICMF denotes the inverse of the cumulative mass function (CMF). In some embodiments, for each read threshold voltage, the CMF value can be determined based on the number of cells (cell count) associated with the read operation using that read threshold voltage and the number of cells among them that hold a specific binary value (1 or 0). For example, each CMF value can be determined as {the number of cells holding the specific binary value (e.g., 1) / cell count}, i.e., the percentage of 1s or 0s.
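The CMF value defined above is simply the fraction of cells in a page read that return the chosen binary value. The following is a minimal sketch. The function name `cmf_value` and the 8-cell read are hypothetical, and the read is sized so that the result matches the 75% figure used for Vt0 in the example that follows.

```python
def cmf_value(bits, counted_value=1):
    """Fraction of cells in one page read whose bit equals counted_value."""
    if not bits:
        raise ValueError("no cells were read")
    return sum(1 for b in bits if b == counted_value) / len(bits)


page_read = [1, 1, 1, 0, 1, 0, 1, 1]  # hypothetical 8-cell read at one threshold
print(cmf_value(page_read))      # 0.75 (75% 1s)
print(cmf_value(page_read, 0))   # 0.25 (25% 0s)
```

Each such fraction is one sample point of the cumulative distribution at the applied read threshold voltage, which is why a handful of page reads is enough to supply the CDF values that the first deep neural network takes as input.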

[0127] 1420 represents the first ICMF sample set for PV2, PV3, PV6, and PV7 generated at S1410. The samples for each page in the first ICMF sample set are generated according to the parameterized distribution at each read threshold voltage. In the example shown, for the MSB page, ICMF samples are generated at read threshold voltages Vt0 and Vt4. The ICMF sample at Vt0 indicates 75% 1s and 25% 0s, and the ICMF sample at Vt4 indicates 80% 1s and 20% 0s. For the CSB page, ICMF samples are generated at read threshold voltages Vt1, Vt3, and Vt5. The ICMF sample at Vt1 indicates 10% 1s and 90% 0s, the ICMF sample at Vt3 indicates 30% 0s and 70% 1s, and the ICMF sample at Vt5 indicates 30% 1s and 70% 0s. For the LSB page, ICMF samples are generated at read threshold voltages Vt2 and Vt6. The ICMF sample at Vt2 indicates 40% 1s and 60% 0s, and the ICMF sample at Vt6 indicates 40% 0s and 60% 1s.

[0128] In operation S1420, the controller 100 can read one or more LSB pages and CSB pages to generate a second ICMF sample set and a third ICMF sample set for PV2, PV3, PV6, and PV7. In the embodiment shown in 1430, two LSB pages and CSB pages can be read. This embodiment considers a distribution based on a skewed normal model (SNM). The number of LSB pages and CSB pages to be read can vary depending on the distribution model. For an improved Gaussian model (IGM) distribution, one LSB page and CSB page can be read. For a non-central t model (NCTM) distribution, three LSB pages and CSB pages can be read.

[0129] 1430 represents the second and third ICMF sample sets for PV2, PV3, PV6, and PV7 generated in S1420. The samples for each page in the second ICMF sample set are generated according to the parameterized distribution at each read threshold voltage. In the example shown, for the CSB page, the ICMF samples are generated at read threshold voltages Vt'3 and Vt'5. For the LSB page, the ICMF samples are generated at read threshold voltages Vt'2 and Vt'6. The samples for each page in the third ICMF sample set are generated according to the parameterized distribution at each read threshold voltage. In the example shown, for the CSB page, the ICMF samples are generated at read threshold voltages Vt'3 and Vt'5. For the LSB page, the ICMF samples are generated at read threshold voltages Vt'2 and Vt'6. Although not shown, each ICMF sample in the second and third ICMF sample sets indicates a percentage of 0s and a percentage of 1s.

[0130] Therefore, operations S1410 and S1420 are performed to estimate the distribution for a specific page (i.e., the LSB page) and eliminate other components (e.g., noise). In operation S1430, the controller 100 can provide the ICMF samples to the DNN1 1200A, 1200B to estimate the optimal read threshold voltage. In some embodiments, as shown in Figure 13, the CDF values corresponding to the ICMF samples (or CMF samples) can be provided to the DNN1 1200A, 1200B.

[0131] As described above, in order to estimate the optimal read threshold voltage, the controller 100 can be implemented using, for example, the components shown in Figure 13. This embodiment is described in U.S. Patent Application No. 17/233,167, entitled "Systems and Methods for Parametric PV-Level Modeling and Readthreshold Voltage Estimate," which is incorporated herein by reference in its entirety. In the implementation illustrated in Figure 13, the controller 100 performs a three-step process to estimate the optimal read threshold voltage Vt_opt. As the number of process steps increases, more errors may be introduced. Therefore, it is desirable to provide a scheme for estimating the optimal read threshold voltage while reducing the number of process steps.

[0132] Another embodiment of the controller is described in U.S. Patent Application No. 17/011,983 (hereinafter referred to as the '983 Patent Application), entitled "EFFICIENT READ-THRESHOLD CALCULATION METHOD FOR PARAMETRIC PV-LEVEL MODELING," which is incorporated herein by reference. As shown in Figure 5 of the '983 Patent Application, the '983 Patent Application provides a computer system 500 having a structure including a neural network 506, a voltage read threshold generator 510, a floating-point unit (FPU) 512, and an approximation generator 516. The voltage read threshold generator 510, the floating-point unit 512, and the approximation generator 516 correspond to firmware implemented to calculate the crossover voltage based on parameters estimated by the neural network 506. The floating-point unit 512 performs the crossover voltage calculation via floating-point digital signal processing (or floating-point multiply-accumulate (MAC) operations), in which a quantity is represented by a mantissa and an exponent (e.g., $A \times 2^{B}$, where "A" is the mantissa and "B" is the exponent). Because of the gates required for floating-point digital signal processing, the floating-point unit 512 may require a relatively large hardware area and increased peak power consumption. Therefore, it is desirable to provide a scheme for estimating the optimal read threshold voltage while reducing hardware area and peak power consumption.
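The mantissa and exponent representation mentioned above can be illustrated with Python's standard library. This is only a software illustration of the $A \times 2^{B}$ form; it says nothing about how the floating-point unit 512 is built, and the helper name `mantissa_exponent` is an assumption.

```python
import math
from typing import Tuple


def mantissa_exponent(value: float) -> Tuple[float, int]:
    """Split `value` into (A, B) such that value == A * 2**B.

    math.frexp normalizes the mantissa so that 0.5 <= |A| < 1 for nonzero input.
    """
    return math.frexp(value)


# Example: decompose a quantity and rebuild it from its two parts.
a, b = mantissa_exponent(6.0)
rebuilt = math.ldexp(a, b)  # computes a * 2**b
print(a, b, rebuilt)
```

Hardware that multiplies and accumulates such values has to handle both fields (aligning exponents, normalizing mantissas), which is the source of the extra gates the paragraph refers to.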

[0133] Figure 15 is a diagram illustrating an optimal read threshold determination device 1500 according to an embodiment of the present invention.

[0134] Referring to Figure 15, the optimal read threshold determination device 1500 may include a parameter determiner 1510 and a voltage determiner 1520.

[0135] The parameter determiner 1510 can receive the measurements $ML(x_i)$ and $MR(x_j)$ of two adjacent PV levels and the positions $x_i$ and $x_j$ corresponding to those measurements. The parameter determiner 1510 can estimate the first probability distribution parameter set and the second probability distribution parameter set (or channel parameter sets) $\Theta_L$ and $\Theta_R$ corresponding to the PV levels $ML(x_i)$ and $MR(x_j)$, respectively. The parameter determiner 1510 can be implemented using an algorithm for estimating the channel parameter sets $\Theta_L$ and $\Theta_R$ of the PV levels $ML(x_i)$ and $MR(x_j)$. As an example and not a limitation, the parameter determiner 1510 can be implemented using the two instances of DNN1 1200A and 1200B shown in Figure 13.

[0136] In some embodiments, the first probability distribution parameter set and the second probability distribution parameter set $\Theta_L$ and $\Theta_R$ can represent the skewed normal distributions of the first PV level and the second PV level, respectively. Each probability distribution parameter set can include the position $\xi$, scale $\omega$, and shape $\alpha$ of the curve associated with the skewed normal distribution of the PV level.

[0137] The voltage determiner 1520 can receive the first probability distribution parameter set and the second probability distribution parameter set (or channel parameter sets) $\Theta_L$ and $\Theta_R$ from the parameter determiner 1510. The voltage determiner 1520 can estimate the optimal read threshold voltage Vt_opt based on the first probability distribution parameter set and the second probability distribution parameter set $\Theta_L$ and $\Theta_R$. As mentioned above, the cell level distributions of two adjacent PV levels, i.e., PV levels $ML(x_i)$ and $MR(x_j)$, can be modeled as skewed normal distributions having probability distribution parameter sets $\Theta_L = (\xi_L, \omega_L, \alpha_L)$ and $\Theta_R = (\xi_R, \omega_R, \alpha_R)$. The probability density function (PDF) of the skewed normal distribution is given by:

$$f(x; \Theta) = \frac{2}{\omega}\,\phi\!\left(\frac{x-\xi}{\omega}\right)\Phi\!\left(\alpha\,\frac{x-\xi}{\omega}\right), \qquad \Theta = (\xi, \omega, \alpha)$$

[0138] In the equation above, $\phi(\cdot)$ and $\Phi(\cdot)$ are the PDF and the cumulative distribution function (CDF) of the standard normal distribution, respectively.

[0139] The voltage determiner 1520 can estimate, as the optimal read threshold voltage Vt_opt, the read threshold voltage at which the first PDF value of the skewed normal distribution of the first PV level $ML(x_i)$ is the same as the second PDF value of the skewed normal distribution of the second PV level $MR(x_j)$. In other words, the optimal read threshold voltage Vt_opt can be the value $x^*$ such that $f(x^*; \Theta_L) = f(x^*; \Theta_R)$. Thus, the optimal read threshold voltage Vt_opt can be estimated by calculating the crossover point between the first PDF value of the skewed normal distribution of the first PV level $ML(x_i)$ and the second PDF value of the skewed normal distribution of the second PV level $MR(x_j)$.
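The crossover condition $f(x^*; \Theta_L) = f(x^*; \Theta_R)$ can be checked numerically with a short floating-point reference. The sketch below is not the embodiment's method (the embodiment uses a neural network); it uses the standard skew normal PDF and plain bisection, and it assumes that the left distribution dominates at $\xi_L$ and the right distribution dominates at $\xi_R$, so that a crossover lies between the two positions. The function names are hypothetical.

```python
import math
from typing import Tuple

Theta = Tuple[float, float, float]  # (xi, omega, alpha): position, scale, shape


def skew_normal_pdf(x: float, xi: float, omega: float, alpha: float) -> float:
    """Skew normal PDF: (2/omega) * phi(z) * Phi(alpha * z), z = (x - xi)/omega."""
    z = (x - xi) / omega
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))
    return 2.0 / omega * phi * big_phi


def crossover_voltage(theta_l: Theta, theta_r: Theta,
                      tol: float = 1e-9, max_iter: int = 200) -> float:
    """Bisection for x* in [xi_L, xi_R] where the two PDFs are equal."""
    def gap(x: float) -> float:
        return skew_normal_pdf(x, *theta_l) - skew_normal_pdf(x, *theta_r)

    lo, hi = theta_l[0], theta_r[0]
    # Assumption: the PDF difference changes sign between the two positions.
    if gap(lo) <= 0.0 or gap(hi) >= 0.0:
        raise ValueError("no sign change between xi_L and xi_R")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid  # left PDF still larger: crossover is to the right
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Symmetric sanity check: equal scales and zero shape cross at the midpoint.
print(crossover_voltage((-1.0, 0.5, 0.0), (1.0, 0.5, 0.0)))
```

When the shape parameters are large, the modes move away from $\xi_L$ and $\xi_R$ and the bracketing assumption can fail; a wider search interval would then be needed.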

[0140] In other embodiments, a distribution other than a skewed normal distribution (e.g., a Gaussian distribution) can be used to model the cell level distribution of the PV level. When using a Gaussian distribution, each probability distribution parameter set $\Theta$ can include the mean and variance of the PV level. In this embodiment, the estimated parameters (i.e., the means and variances of $PV_L$ and $PV_R$) can be used as inputs to determine the optimal crossover point, which avoids, for example, the LUT shown in Figure 13.
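For the Gaussian case, the crossover point has a closed form, because equating the two log densities gives a quadratic in x. The following sketch solves that quadratic from the means and variances. It is a floating-point illustration under the assumption that the crossover of interest lies between the two means; the function name `gaussian_crossover` is hypothetical and this is not presented as the embodiment's implementation.

```python
import math


def gaussian_crossover(mean_l: float, var_l: float,
                       mean_r: float, var_r: float) -> float:
    """Return x between the means where N(mean_l, var_l) and N(mean_r, var_r) are equal.

    Equating log densities gives a*x**2 + b*x + c = 0 with the coefficients below.
    """
    a = 0.5 / var_l - 0.5 / var_r
    b = mean_r / var_r - mean_l / var_l
    c = (0.5 * mean_l ** 2 / var_l - 0.5 * mean_r ** 2 / var_r
         + 0.5 * math.log(var_l / var_r))
    if abs(a) < 1e-12:
        # Equal variances: the quadratic degenerates to the midpoint of the means.
        return -c / b
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        raise ValueError("the two densities do not cross")
    roots = [(-b + sign * math.sqrt(disc)) / (2.0 * a) for sign in (1.0, -1.0)]
    between = [r for r in roots if mean_l <= r <= mean_r]
    if not between:
        raise ValueError("no crossover between the two means")
    return between[0]


# Hypothetical parameters: a narrow left level and a wider right level.
print(gaussian_crossover(-1.0, 0.16, 1.0, 0.49))
```

With unequal variances the crossover shifts toward the narrower distribution, which is why the midpoint of the means is generally not the optimal threshold.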

[0141] Figure 16 is a graph 1600 illustrating an example voltage read threshold estimated by the voltage determiner 1520 according to an embodiment of the present invention.

[0142] Referring to Figure 16, graph 1600 depicts a first curve 1602 (i.e., $f(x; \Theta_L = (\xi_L, \omega_L, \alpha_L))$) representing the probability distribution of a first voltage range 1608 and a second curve 1604 (i.e., $f(x; \Theta_R = (\xi_R, \omega_R, \alpha_R))$) representing the probability distribution of a second voltage range 1610. The optimal read threshold voltage (i.e., -0.29151) 1606 is determined as the crossover point between the first voltage range 1608 and the second voltage range 1610, indicated by the intersection of the first curve 1602 and the second curve 1604. In the example shown in Figure 16, the first curve 1602 and the second curve 1604 depict the logarithm of the probability distribution with parameters $p = (\xi, \omega, \alpha)$ over the two voltage ranges. Specifically, the first curve 1602 corresponds to $f(x; \Theta_L = (\xi_L = -1.0, \omega_L = 0.4, \alpha_L = 0.0))$, and the second curve 1604 corresponds to $f(x; \Theta_R = (\xi_R = 1.0, \omega_R = 0.7, \alpha_R = -2.0))$. Each of the first voltage range 1608 and the second voltage range 1610 is modeled by the skewed normal distribution parameter set shown in the legend of graph 1600.
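The numbers in this example can be cross-checked in a few lines. The sketch below evaluates the logarithm of the standard skew normal PDF for both legend parameter sets at the stated threshold of -0.29151 and reports the gap between the two log values, which should be close to zero at a crossover. The helper name `log_skew_normal_pdf` is an assumption, and the code is a floating-point check rather than the voltage determiner 1520 itself.

```python
import math


def log_skew_normal_pdf(x: float, xi: float, omega: float, alpha: float) -> float:
    """Natural log of the skew normal PDF (2/omega) * phi(z) * Phi(alpha * z)."""
    z = (x - xi) / omega
    log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    big_phi = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))
    # Note: big_phi can underflow to 0 far in the tail; not an issue at this point.
    return math.log(2.0 / omega) + log_phi + math.log(big_phi)


# Parameter sets (xi, omega, alpha) from the legend of graph 1600.
THETA_L = (-1.0, 0.4, 0.0)
THETA_R = (1.0, 0.7, -2.0)
VT_OPT = -0.29151  # crossover value stated for element 1606

log_left = log_skew_normal_pdf(VT_OPT, *THETA_L)
log_right = log_skew_normal_pdf(VT_OPT, *THETA_R)
gap = abs(log_left - log_right)
print(log_left, log_right, gap)
```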

[0143] Referring back to Figure 15, the voltage determiner 1520 can be implemented using a deep neural network (DNN) that calculates the optimal read threshold voltage Vt_opt from the probability distribution parameter sets $\Theta_L$ and $\Theta_R$ using fixed-point multiply-accumulate (MAC) operations. The DNN can include hardware blocks of extremely small size: the inventors estimate that, at 1 GHz operation with the DNN performing 32 MAC operations per clock cycle, the runtime of the DNN is approximately 1 µs. If the parameter determiner 1510 is implemented using DNN1 of Figure 13, the MAC hardware block (i.e., gates) used for DNN1 can be reconfigured for the DNN of the voltage determiner 1520. The weights of the DNN in the voltage determiner 1520 can be reconfigured based on the model used in DNN1. Compared to the floating-point unit (FPU) 512 used for floating-point arithmetic (or signal processing) in the '983 Patent Application, the MAC hardware block in the DNN used for the voltage determiner 1520 can be implemented to perform fixed-point arithmetic, in which quantities are represented by a fixed number of digits. Because fixed-point arithmetic is less complex than floating-point arithmetic, the gate count and peak power consumption of the voltage determiner 1520 can be reduced with minimal loss of accuracy in the crossover point calculation.
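To make the fixed-point MAC idea concrete, the sketch below computes one neuron's weighted sum using only integer arithmetic. The Q-format with 8 fractional bits, the function names, and the example weights are all assumptions chosen for illustration; the description does not state the bit widths or the network structure. The last helper turns the quoted figures into a MAC budget, reading the 32 MAC operations as a per-clock-cycle rate (an interpretation).

```python
from typing import Sequence

FRAC_BITS = 8  # hypothetical format: 8 fractional bits


def to_fixed(value: float, frac_bits: int = FRAC_BITS) -> int:
    """Quantize a real value to a fixed-point integer."""
    return int(round(value * (1 << frac_bits)))


def to_float(value: int, frac_bits: int = FRAC_BITS) -> float:
    """Convert a fixed-point integer back to a real value."""
    return value / (1 << frac_bits)


def fixed_mac(weights: Sequence[int], inputs: Sequence[int], bias: int,
              frac_bits: int = FRAC_BITS) -> int:
    """Integer multiply-accumulate: sum(w * x) + bias, all in fixed point."""
    acc = bias << frac_bits  # align bias with products (2 * frac_bits fraction)
    for w, x in zip(weights, inputs):
        acc += w * x  # one MAC operation per weight
    return acc >> frac_bits  # rescale to frac_bits fractional bits


def mac_budget(clock_hz: float, macs_per_cycle: int, runtime_s: float) -> int:
    """Total MAC operations available in `runtime_s` at the given rate."""
    return round(clock_hz * runtime_s) * macs_per_cycle


w = [to_fixed(0.5), to_fixed(-0.25)]
x = [to_fixed(2.0), to_fixed(4.0)]
y = fixed_mac(w, x, to_fixed(0.125))
print(to_float(y))                 # weighted sum recovered as a real value
print(mac_budget(1e9, 32, 1e-6))   # MAC operations in about 1 µs
```

Under that reading, 1 µs at 1 GHz is 1,000 cycles, which corresponds to a budget of 32,000 MAC operations for the whole crossover network.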

[0144] As described above, the embodiments provide a scheme for estimating the optimal read threshold voltage using a one-step process by directly applying the predicted probability distribution parameters to the crossover voltage calculation. A fixed-point MAC hardware block of a deep neural network (DNN) can be used for the crossover voltage calculation. Therefore, when implementing a DNN for crossover voltage calculation, the embodiments reduce hardware area (i.e., gate count) and peak power consumption.

[0145] While the foregoing embodiments have been shown and described in considerable detail for clarity and understanding, the invention is not limited to the details provided. As those skilled in the art will understand from the foregoing disclosure, many alternative ways of carrying out the invention exist. Therefore, the disclosed embodiments are illustrative and not restrictive. The invention is intended to cover all modifications and alternatives falling within the scope of the claims. Furthermore, embodiments may be combined to form additional embodiments.

Claims

1. A memory system, comprising: a memory device including a plurality of cells; and a controller configured to: receive a first programming voltage level (i.e., a first PV level) and a second programming voltage level (i.e., a second PV level) associated with a read operation on the plurality of cells; estimate a first probability distribution parameter set and a second probability distribution parameter set representing skewed normal distributions of the first PV level and the second PV level, respectively; and estimate an optimal read threshold voltage based on the first probability distribution parameter set and the second probability distribution parameter set, wherein the optimal read threshold voltage is a read threshold voltage at which a first probability density function value (i.e., a first PDF value) of the skewed normal distribution of the first PV level is the same as a second probability density function value (i.e., a second PDF value) of the skewed normal distribution of the second PV level, wherein the first PDF value and the second PDF value are each expressed as a combination of the PDF and the cumulative distribution function (CDF) of a normal distribution using the first probability distribution parameter set and the second probability distribution parameter set, respectively, and wherein the optimal read threshold voltage is estimated by a neural network that receives the first probability distribution parameter set and the second probability distribution parameter set, determines the first PDF value and the second PDF value using the first probability distribution parameter set and the second probability distribution parameter set, and estimates the optimal read threshold voltage using the first PDF value and the second PDF value.

2. The memory system of claim 1, wherein the controller further uses the optimal read threshold voltage to perform a next read operation on the plurality of cells.

3. The memory system of claim 1, wherein the first probability distribution parameter set and the second probability distribution parameter set respectively include the position, scale, and shape of curves associated with the skewed normal distributions of the first PV level and the second PV level.

4. The memory system of claim 1, wherein the optimal read threshold voltage is estimated as the voltage at the intersection of the first PDF value and the second PDF value.

5. The memory system according to claim 1, wherein the neural network performs fixed-point multiplication and accumulation operations, i.e., fixed-point MAC operations.

6. The memory system of claim 1, wherein each of the first probability distribution parameter set and the second probability distribution parameter set is estimated by another neural network, the other neural network being trained to output each of a plurality of probability distribution parameter sets corresponding to each of the first PV level and the second PV level.

7. The memory system according to claim 6, wherein the neural network performs fixed-point multiplication and accumulation operations, i.e., fixed-point MAC operations.

8. A method for operating a memory system, the memory system including a memory device and a controller, the memory device including a plurality of cells, the method comprising: receiving a first programming voltage level (i.e., a first PV level) and a second programming voltage level (i.e., a second PV level) associated with a read operation on the plurality of cells; estimating a first probability distribution parameter set and a second probability distribution parameter set representing skewed normal distributions of the first PV level and the second PV level, respectively; and estimating an optimal read threshold voltage based on the first probability distribution parameter set and the second probability distribution parameter set, wherein the optimal read threshold voltage is a read threshold voltage at which a first probability density function value (i.e., a first PDF value) of the skewed normal distribution of the first PV level is the same as a second probability density function value (i.e., a second PDF value) of the skewed normal distribution of the second PV level, wherein the first PDF value and the second PDF value are each expressed as a combination of the PDF and the cumulative distribution function (CDF) of a normal distribution using the first probability distribution parameter set and the second probability distribution parameter set, respectively, and wherein the optimal read threshold voltage is estimated by a neural network that receives the first probability distribution parameter set and the second probability distribution parameter set, determines the first PDF value and the second PDF value using the first probability distribution parameter set and the second probability distribution parameter set, and estimates the optimal read threshold voltage using the first PDF value and the second PDF value.

9. The method of claim 8, further comprising: performing a next read operation on the plurality of cells using the optimal read threshold voltage.

10. The method of claim 8, wherein the first probability distribution parameter set and the second probability distribution parameter set respectively include the position, scale, and shape of curves associated with skewed normal distributions of the first PV level and the second PV level.

11. The method of claim 8, wherein the optimal read threshold voltage is estimated as the crossover voltage of the first PDF value and the second PDF value.

12. The method according to claim 8, wherein the neural network performs fixed-point multiplication and accumulation operations, i.e., fixed-point MAC operations.

13. The method of claim 8, wherein each of the first probability distribution parameter set and the second probability distribution parameter set is estimated by another neural network, the other neural network being trained to output each of a plurality of probability distribution parameter sets corresponding to each of the first PV level and the second PV level.

14. The method of claim 13, wherein the neural network performs fixed-point multiplication and accumulation operations, i.e., fixed-point MAC operations.