Flash memory controller and its data processing method

A flash memory controller performs matrix multiplication-addition or convolution operations directly in the memory cell array and then applies error correction to the results. This addresses the high energy consumption of traditional storage systems and achieves efficient, energy-saving data processing.

CN122308736A · Pending · Publication Date: 2026-06-30 · UNITED MEMORY TECHNOLOGY (JIANGSU) LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
UNITED MEMORY TECHNOLOGY (JIANGSU) LTD
Filing Date
2026-03-26
Publication Date
2026-06-30

Abstract

This application discloses a flash memory controller and its data processing method. The method includes: acquiring input data; encoding the input data into a pulse sequence based on a preset pulse interval modulation strategy; wherein the width and duty cycle of the pulse sequence are within preset ranges; performing preset operations on a preset memory array based on the pulse sequence and outputting the operation results; wherein the preset operations are matrix multiplication-addition or convolution operations; and performing error correction processing on the operation results based on a preset error prediction model and outputting the target calculation result. Therefore, by encoding the input data into a pulse sequence that conforms to preset parameters and directly performing preset operations such as matrix multiplication-addition or convolution on a preset memory array, in-situ data computation is achieved, effectively reducing the frequent data transfer between storage units and computing units in traditional architectures, thereby significantly reducing energy consumption caused by data transmission.

Description

Technical Field

[0001] This application relates to the field of semiconductor storage technology, and in particular to a flash memory controller and its data processing method.

Background Technology

[0002] With the rapid development and widespread adoption of edge AI applications, the issue of data transfer between storage and computing units has become increasingly prominent, representing a key bottleneck restricting the overall system's energy efficiency. In traditional von Neumann architectures, especially in NAND Flash-based storage systems, data cannot be processed directly within the storage device; instead, it requires frequent migration between NAND storage chips and the control chip.

[0003] Based on the above architecture, the data transfer process not only increases the complexity of the system but also leads to significant energy consumption. Studies have shown that in such systems, up to 65% of the total energy consumption is not used for actual computing tasks but is consumed in the back-and-forth data transfer process.

Summary of the Invention

[0004] The main objective of this application is to propose a flash memory controller and its data processing method to solve the problem of high energy consumption in control chips.

[0005] The first aspect of this application provides a data processing method for a flash memory controller, comprising: acquiring input data; encoding the input data into a pulse sequence based on a preset pulse interval modulation strategy; wherein the width and duty cycle of the pulse sequence are respectively within preset ranges; performing a preset operation in a preset memory cell array based on the pulse sequence and outputting the operation result; wherein the preset operation is a matrix multiplication-addition operation or a convolution operation; and performing error correction processing on the operation result based on a preset error prediction model and outputting the target calculation result.

[0006] In some embodiments, a preset operation is performed in a preset memory cell array based on a pulse sequence and the operation result is output, including: inputting the pulse sequence to a word line voltage modulator and outputting a word line voltage signal based on a preset word line voltage range; applying the word line voltage signal to the word lines of a preset transistor array to control the preset transistor array to perform matrix multiplication or convolution operations; and determining the operation result based on the conduction current generated by the preset transistor array using a bit line charge integrator.

[0007] In some embodiments, the calculation results are corrected and the target calculation result is output based on a preset error prediction model, including: processing the calculation results based on the preset error prediction model to determine the error location distribution; and controlling the output of the target calculation result based on the preset error correction strategy and the error location distribution.

[0008] In some embodiments, the computation results are processed based on a preset error prediction model to determine the error location distribution, including: extracting features from the computation results through a preset convolutional layer to determine a first feature map; processing the first feature map through a preset long short-term memory network layer to determine a probability distribution, and determining the error location distribution based on the probability distribution; wherein the error location distribution includes preset high-probability error positions and preset low-probability error positions.

[0009] In some embodiments, the output target calculation result is controlled based on a preset error correction strategy and error location distribution, including: adjusting the symbol allocation of a preset BCH code based on the error location distribution, so that the number of first error correction symbols corresponding to preset high-probability error bits is greater than the number of second error correction symbols corresponding to preset low-probability error bits; correcting the calculation result based on the adjusted preset BCH code, and controlling the output target calculation result.

[0010] In some embodiments, the method further includes: obtaining an external instruction; querying a mapping table in a preset two-layer mapping structure based on the logical address in the external instruction to determine the physical address corresponding to the logical address; wherein the mapping table in the preset two-layer mapping structure includes a first-layer mapping table and a second-layer mapping table, the first-layer mapping table is a fine-grained cache maintained based on a preset LRU policy, and the second-layer mapping table is a fixed mapping table from physical blocks to logical blocks, with a mapping granularity of a preset coarse-grained granularity.

[0011] In some embodiments, querying a mapping table in a preset two-layer mapping structure based on the logical address in an external instruction to determine the physical address corresponding to the logical address includes: querying the first-layer mapping table in the preset two-layer mapping structure based on the logical address in the external instruction; if the first-layer mapping table contains a physical address corresponding to the logical address, the physical address is directly determined; if the first-layer mapping table does not contain a physical address corresponding to the logical address, the second-layer mapping table is queried; the first-layer mapping table is updated based on the query result, and the physical address is determined.

[0012] A second aspect of this application provides a flash memory controller, comprising: an interface module for acquiring input data; an encoding module connected to the interface module for encoding the input data into a pulse sequence based on a preset pulse interval modulation strategy, wherein the width and duty cycle of the pulse sequence are within preset ranges; a storage and computation module connected to the encoding module for performing preset operations in a preset storage and computation unit array based on the pulse sequence and outputting the operation results; wherein the preset operations are matrix multiplication and addition operations or convolution operations; and an error correction module connected to the storage and computation module for performing error correction processing on the operation results based on a preset error prediction model and outputting the target calculation result.

[0013] In some embodiments, the memory computing module includes: a word line driving circuit for receiving a pulse sequence and outputting a word line voltage signal based on a preset word line voltage range; a transistor array for applying the word line voltage signal to the word lines of the preset transistor array to control the preset transistor array to perform matrix multiplication and addition operations or convolution operations; and a bit line integrating circuit for determining the operation result based on the conduction current generated by the preset transistor array.

[0014] In some embodiments, the flash memory controller further includes an address mapping module, which has a preset two-layer mapping structure. The address mapping module is used to query the mapping table in the preset two-layer mapping structure based on the logical address in the external instruction to determine the physical address corresponding to the logical address. The mapping table in the preset two-layer mapping structure includes a first-layer mapping table and a second-layer mapping table. The first-layer mapping table is a fine-grained cache maintained based on a preset LRU policy, and the second-layer mapping table is a fixed mapping table from physical blocks to logical blocks, with a preset coarse-grained mapping granularity.

[0015] The beneficial technical effects of this application are as follows: Based on the flash memory controller and data processing method provided in this application, the method includes: acquiring input data; encoding the input data into a pulse sequence based on a preset pulse interval modulation strategy; wherein the width and duty cycle of the pulse sequence are within preset ranges; performing preset operations in a preset memory cell array based on the pulse sequence and outputting the operation results; wherein the preset operations are matrix multiplication and addition or convolution; and performing error correction processing on the operation results based on a preset error prediction model and outputting the target calculation result. Therefore, by encoding the input data into a pulse sequence that conforms to preset parameters and directly performing preset operations such as matrix multiplication and addition or convolution in the preset memory cell array, in-situ data computation is realized, effectively reducing the frequent data transfer between storage units and computing units in traditional architectures, thereby significantly reducing energy loss caused by data transmission.

Attached Figure Description

[0016] Figure 1 is a schematic flowchart of an embodiment of the data processing method for a flash memory controller provided in this application;
Figure 2 is a schematic flowchart of another embodiment of the data processing method for the flash memory controller provided in this application;
Figure 3 is a flowchart illustrating another embodiment of the data processing method for the flash memory controller provided in this application;
Figure 4 is a flowchart illustrating another embodiment of the data processing method for the flash memory controller provided in this application;
Figure 5 is a flowchart illustrating another embodiment of the data processing method for the flash memory controller provided in this application;
Figure 6 is a flowchart illustrating another embodiment of the data processing method for the flash memory controller provided in this application;
Figure 7 is a flowchart illustrating another embodiment of the data processing method for the flash memory controller provided in this application;
Figure 8 is a structural block diagram of an embodiment of the flash memory controller provided in this application;
Figure 9 is a structural block diagram of an embodiment of the flash memory controller provided in this application;
Figure 10 is a structural block diagram of another embodiment of the flash memory controller provided in this application.

Detailed Implementation

[0017] The solutions in the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments in this application, and not all of the embodiments. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.

[0018] It should be noted that all directional indicators (such as up, down, left, right, front, back, etc.) in the embodiments of this application are only used to explain the relative positional relationship and movement of each component in a certain specific posture (as shown in the figure). If the specific posture changes, the directional indicator will also change accordingly.

[0019] It should also be noted that when a component is referred to as "fixed to" or "set on" another component, it can be directly on the other component or an intervening component can be present simultaneously. When a component is referred to as "connected to" another component, it can be directly connected to the other component or an intervening component can be present simultaneously.

[0020] Furthermore, the use of terms such as "first" and "second" in this application is for descriptive purposes only and should not be construed as indicating or implying their relative importance or implicitly specifying the number of technical features indicated. Therefore, features defined as "first" or "second" explicitly or implicitly include at least one of those features. Additionally, the technical solutions of the various embodiments can be combined with each other, but only on the basis of being achievable by those skilled in the art. If the combination of technical solutions is contradictory or impossible to implement, such a combination of technical solutions should be considered non-existent and not within the scope of protection claimed in this application.

[0021] The first aspect of this application provides a data processing method for a flash memory controller. This method can be applied to the NAND Flash controller chip of a NAND Flash storage system. The NAND Flash controller chip is the core processor that manages the NAND Flash storage system, i.e., the flash memory controller. Figure 1 is a flowchart illustrating an embodiment of the data processing method for a flash memory controller provided in this application. With reference to Figure 1, this method includes the following steps:

S101: Obtain input data.

[0022] The input data can be data from an external host, such as feature data in artificial intelligence algorithms, user data to be stored, etc. The interface module of the flash controller can receive this data through the communication protocol with the external device.

[0023] In some application scenarios, the flash memory controller's interface module can be a hybrid optoelectronic high-speed interface module, capable of receiving external 25 Gbps optical signals and converting them into electrical signals, while also supporting reverse electro-optical output. This interface module can integrate a low-jitter clock recovery circuit and a protocol parsing unit, enabling direct parsing of computation task descriptors, weight data, and control parameters, ensuring high-bandwidth, low-latency data injection and result output.

[0024] S102: Based on a preset pulse interval modulation strategy, the input data is encoded into a pulse sequence; wherein the width and duty cycle of the pulse sequence are within preset ranges.

[0025] Specifically, the preset pulse interval modulation strategy can be determined based on the preset ranges of the pulse sequence width and duty cycle, thereby achieving accurate encoding of the input data. By ensuring that the pulse sequence width and duty cycle are within preset ranges, computational errors in the memory cell array caused by abnormal pulse parameters can be effectively avoided.

[0026] In some applications, the pulse width can be adjusted to 5-15 nanoseconds, and the duty cycle (defined as the ratio of pulse width to pulse interval) can be controlled within the range of 10%-30%. This step can be performed by a pulse interval modulation encoder, which encodes the input data into a series of pulses. For example, a pulse interval modulation encoder can compress and encode 32-bit floating-point data into a sequence of 1-4 sparse pulses. In this case, the numerical information is encoded in the number of pulses and the time interval between adjacent pulses; a longer interval indicates a larger numerical component, and a shorter interval indicates a smaller numerical component.
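The parameter ranges in this paragraph can be illustrated with a minimal sketch. It assumes the duty cycle is the ratio of pulse width to pulse interval, as stated above, and that each numerical component has already been normalized to [0, 1]. How a 32-bit floating-point value is split into 1-4 components is not described in the source, so the function names and the linear mapping below are hypothetical.

```python
# Illustrative pulse interval modulation; not the encoder from the application.
PULSE_WIDTH_NS = (5.0, 15.0)  # pulse width range given in the text
DUTY_CYCLE = (0.10, 0.30)     # duty cycle range, defined as width / interval


def interval_bounds(width_ns):
    """Return (shortest, longest) interval in ns allowed for a given width."""
    lo_w, hi_w = PULSE_WIDTH_NS
    if not lo_w <= width_ns <= hi_w:
        raise ValueError("pulse width outside the preset range")
    # A larger duty cycle means a shorter interval, and vice versa.
    return width_ns / DUTY_CYCLE[1], width_ns / DUTY_CYCLE[0]


def encode_pulses(components, width_ns=10.0):
    """Map 1-4 normalized components to (width_ns, interval_ns) pairs.

    Assumption: a linear mapping in which a longer interval encodes a
    larger component, matching the qualitative rule in the text.
    """
    if not 1 <= len(components) <= 4:
        raise ValueError("the text describes sequences of 1-4 sparse pulses")
    shortest, longest = interval_bounds(width_ns)
    pulses = []
    for x in components:
        if not 0.0 <= x <= 1.0:
            raise ValueError("components must be normalized to [0, 1]")
        pulses.append((width_ns, shortest + x * (longest - shortest)))
    return pulses
```

With a 10 ns pulse width, the allowed interval runs from about 33.3 ns (30% duty cycle) to 100 ns (10% duty cycle), so every encoded pulse stays inside both preset ranges by construction.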

[0027] S103: Based on the pulse sequence, perform preset operations in the preset storage unit array and output the operation results; wherein, the preset operation is matrix multiplication and addition or convolution operation.

[0028] It should be understood that the preset in-memory computing unit array includes a NAND transistor array, which deeply integrates storage units and computing units, enabling data to be processed without being moved out. The preset operations are matrix multiplication-addition or convolution operations. In the in-memory computing unit array, a weight matrix is pre-programmed and stored in the NAND transistor array (each transistor's threshold voltage corresponds to a weight value). A pulse sequence is applied to the NAND transistor array to complete the matrix multiplication-addition or convolution operations.

[0029] S104: Based on the preset error prediction model, perform error correction processing on the calculation results and output the target calculation results.

[0030] The architecture of the preset error prediction model can be set according to actual needs. For example, a hybrid model architecture based on deep learning can be adopted, which can combine the local feature extraction capability of convolutional neural networks with the ability of long short-term memory networks to capture sequence dependencies.

[0031] In some application scenarios, error prediction models can employ lightweight convolutional-LSTM cascaded networks, which may include feature extraction layers with 3×3 convolutional kernels and LSTM layers with 64 hidden units. The 3×3 convolutional kernel feature extraction layer can perform sliding window-style local feature extraction on the computation results, capturing subtle error patterns in the data, such as local data deviations caused by transistor threshold voltage drift. The 64-dimensional hidden unit LSTM layer, through a gating mechanism, performs temporal modeling on the output of the convolutional layers, learning the temporal correlation and long-range dependency of the error distribution in the computation results, thereby generating the probability distribution of error occurrence.

[0032] In summary, the data processing method based on the flash memory controller provided in this application encodes the input data into a pulse sequence conforming to preset parameters and directly performs preset operations such as matrix multiplication or convolution in a preset memory cell array. This achieves in-situ data computation, effectively reducing the frequent data transfer between storage and computation units in traditional architectures, thereby significantly reducing energy consumption caused by data transmission. Furthermore, error correction processing can accurately predict and specifically repair potential errors in the computation results, further improving the reliability of the computation results.

[0033] Figure 2 is a flowchart illustrating another embodiment of the data processing method for the flash memory controller provided in this application.

[0034] With reference to Figure 2, in some embodiments, performing a preset operation in a preset memory array based on a pulse sequence and outputting the operation result includes:

S201: Input the pulse sequence to the word line voltage modulator and output the word line voltage signal based on the preset word line voltage range.

[0035] Specifically, the preset word line voltage range can be set from 1.8V to 3.3V, with an adjustment step of 0.1V, for a total of 16 discrete voltage levels. By precisely adjusting the word line voltage (i.e., the analog representation of the input data), in conjunction with the weighted threshold voltages stored in the transistors, the conduction current generated by each transistor satisfies a specific approximate relationship. The word line voltage adjustment range of 1.8-3.3V covers the input encoding requirements of different numerical values, and the 0.1V step ensures approximately 16 levels of quantization accuracy, ensuring that the overall calculation accuracy loss does not exceed 8%.
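The level count follows from the stated range and step: (3.3 - 1.8) / 0.1 = 15 steps, which gives 16 levels. The sketch below enumerates those levels and quantizes a normalized input onto them. The mapping from a pulse sequence to a level code is not specified in the source, so `quantize` and its normalized input are assumptions for illustration only.

```python
# Illustrative word line voltage levels: 1.8 V to 3.3 V in 0.1 V steps.
V_MIN = 1.8
V_STEP = 0.1
LEVELS = 16


def word_line_voltage(code):
    """Return the voltage in volts for a level code in 0..15."""
    if not 0 <= code < LEVELS:
        raise ValueError("level code out of range")
    # Round to one decimal to hide binary floating-point noise.
    return round(V_MIN + code * V_STEP, 1)


def quantize(x):
    """Map a normalized input in [0, 1] to the nearest level code.

    Assumption: a uniform mapping; the source does not define one.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("input must be normalized to [0, 1]")
    return round(x * (LEVELS - 1))


ALL_LEVELS = [word_line_voltage(c) for c in range(LEVELS)]
```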

[0036] After the pulse sequence is input to the word line voltage modulator, the word line voltage modulator will output a word line voltage signal based on the preset word line voltage range. This signal can dynamically match the time characteristics and amplitude information of the pulse sequence to achieve precise driving of the transistors in the memory cell array.

[0037] S202: Apply word line voltage signals to the word lines of the preset transistor array to control the preset transistor array to perform matrix multiplication or convolution operations.

[0038] The transistor array consists of 128 MLC transistors configured as a 4×32 computing unit array, with each MLC transistor capable of storing 2 bits of data. After outputting the word line voltage signal, this step further applies the word line voltage signal to the word lines of the preset transistor array. When the word line voltage is applied, the conduction state of the transistors is determined by both the word line voltage and the stored threshold voltage: when the word line voltage is higher than the threshold voltage, the transistors conduct and generate a conduction current related to the voltage difference; when the word line voltage is lower than the threshold voltage, the transistors are cut off, and the current is zero. In matrix multiplication and addition mode, the conduction current of each column of transistors is summed through the bit lines, and the sum is the dot product of the input vector and the weight vector.
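The conduction rule and bit line summation described here can be modeled in a few lines. The source only says the conduction current is "related to the voltage difference", so the linear current model and the `gain` parameter below are assumptions. The layout with one row per word line and one column per bit line is also an assumed reading of the 4×32 arrangement.

```python
# Illustrative model of threshold-controlled conduction and bit line summation.
def cell_current(v_wl, v_th, gain=1.0):
    """Current of one cell: zero when cut off, else proportional to the overdrive.

    Assumption: a linear dependence on (v_wl - v_th); the source does not
    give the actual current-voltage relation.
    """
    return gain * (v_wl - v_th) if v_wl > v_th else 0.0


def bit_line_currents(word_line_v, thresholds, gain=1.0):
    """Sum cell currents down each bit line (column).

    thresholds[r][c] is the programmed threshold voltage of the cell on
    word line r and bit line c.
    """
    n_rows = len(thresholds)
    n_cols = len(thresholds[0])
    if len(word_line_v) != n_rows:
        raise ValueError("one word line voltage is required per row")
    return [
        sum(cell_current(word_line_v[r], thresholds[r][c], gain)
            for r in range(n_rows))
        for c in range(n_cols)
    ]
```

For example, `bit_line_currents([2.0, 3.0], [[1.5, 2.5], [2.0, 3.5]])` returns `[1.5, 0.0]`: both cells on the first bit line conduct, while both cells on the second are cut off.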

[0039] S203: Based on the bit line charge integrator, the calculation result is determined according to the conduction current generated by the preset transistor array.

[0040] The bit line charge integrator, connected to the end of the bit line, is responsible for collecting the accumulated value of the conduction current of all transistors on the same bit line, converting the current signal into a voltage signal, and performing a vector inner product summation operation. When a conduction current is generated on a bit line in the preset transistor array, the bit line charge integrator integrates the current, converting the current signal into a voltage signal. After integration, the 12-bit ADC quantizes the integrated voltage, and the quantization result is filtered and calibrated by the digital signal processing unit to finally obtain the digital operation result corresponding to matrix multiplication or convolution.
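As a rough sketch of this readout chain, an ideal integrator converts a constant bit line current into a voltage V = I·t / C, and a 12-bit ADC maps that voltage onto codes 0 to 4095. The integration time, capacitance, and reference voltage are not given in the source; the parameters below are hypothetical, and the filtering and calibration stage is omitted.

```python
# Illustrative ideal integrator followed by a 12-bit ADC.
ADC_BITS = 12


def integrate(current_a, time_s, cap_f):
    """Voltage on an ideal integrating capacitor for a constant current."""
    return current_a * time_s / cap_f


def adc_code(voltage, v_ref):
    """Quantize a voltage to a 12-bit code, clamping to [0, v_ref]."""
    full_scale = (1 << ADC_BITS) - 1  # 4095
    clamped = min(max(voltage, 0.0), v_ref)
    return round(clamped / v_ref * full_scale)
```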

[0041] Figure 3 is a flowchart illustrating another embodiment of the data processing method for the flash memory controller provided in this application.

[0042] With reference to Figure 3, in some embodiments, performing error correction processing on the calculation results based on a preset error prediction model and outputting the target calculation result includes:

S301: Process the calculation results based on the preset error prediction model to determine the error location distribution.

[0043] Based on the above embodiments, if the error prediction model uses a lightweight convolutional-LSTM cascaded network, including a feature extraction layer with 3×3 convolutional kernels (stride 2) and an LSTM layer with 64 hidden units, then the computation results can be processed first through the convolutional layer, and then through the LSTM layer to determine the error location distribution.

[0044] S302: Based on the preset error correction strategy and error location distribution, control the output target calculation result.

[0045] The preset error correction strategy can be: using a correlation-aware adaptive BCH error correction code, dynamically adjusting the code allocation according to the error location distribution, and then correcting errors based on the adjusted code, thereby outputting the target calculation result.

[0046] Figure 4 is a flowchart illustrating another embodiment of the data processing method for the flash memory controller provided in this application.

[0047] With reference to Figure 4, in some embodiments, processing the computation results based on a preset error prediction model to determine the error location distribution includes:

S401: Extract features from the operation results through a preset convolutional layer to determine the first feature map.

[0048] Specifically, the computation results are first processed by a convolutional layer to extract features, determining a first feature map that captures spatial correlation patterns of errors. For example, when there are local data deviations in the computation results due to transistor threshold voltage drift, the convolutional layer can scan these local areas using a sliding window to extract feature information such as the location of the error and the degree of difference between the error value and the surrounding normal data, forming the first feature map. Each pixel in the first feature map corresponds to the error features of a specific region in the computation results, providing basic data for subsequent determination of the error location distribution.

[0049] S402: The first feature map is processed by a preset long short-term memory network layer to determine the probability distribution, and the error location distribution is determined based on the probability distribution; wherein, the error location distribution includes preset high-probability error positions and preset low-probability error positions.

[0050] In this step, the time-series features of the first feature map are processed by an LSTM layer to determine the probability distribution, and the error location distribution is determined based on the probability distribution. The error location distribution includes preset high-probability error locations (e.g., probability > 0.7) and preset low-probability error locations (e.g., probability < 0.3). Preset high-probability error locations indicate that the probability of an error occurring at that location is extremely high, requiring priority for precise error correction; preset low-probability error locations indicate a lower risk of error occurrence, allowing for simplified verification mechanisms or fault-tolerant processing in subsequent data applications.
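The split into high-probability and low-probability positions can be sketched as a simple thresholding step over the model's output probabilities. The thresholds 0.7 and 0.3 are the example values from the text. The source does not say how probabilities between the two thresholds are treated, so the separate "middle" group below is an assumption.

```python
# Illustrative thresholding of per-position error probabilities.
HIGH_THRESHOLD = 0.7  # example value from the text
LOW_THRESHOLD = 0.3   # example value from the text


def classify_positions(probabilities):
    """Group position indices by predicted error probability."""
    groups = {"high": [], "low": [], "middle": []}
    for index, p in enumerate(probabilities):
        if p > HIGH_THRESHOLD:
            groups["high"].append(index)
        elif p < LOW_THRESHOLD:
            groups["low"].append(index)
        else:
            # Not covered by the source; kept separate here as an assumption.
            groups["middle"].append(index)
    return groups
```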

[0051] Figure 5 is a flowchart illustrating another embodiment of the data processing method for the flash memory controller provided in this application.

[0052] With reference to Figure 5, in some embodiments, controlling the output of the target calculation result based on a preset error correction strategy and the error location distribution includes:

S501: Adjust the symbol allocation of the preset BCH code based on the error location distribution, so that the number of first error correction symbols corresponding to preset high-probability error bits is greater than the number of second error correction symbols corresponding to preset low-probability error bits.

[0053] Specifically, this embodiment allocates a larger number of first error correction symbols (for example, 3 redundant symbols per bit) to preset high-probability error bits and a smaller number of second error correction symbols (for example, 1 redundant symbol per bit) to preset low-probability error bits, thereby achieving non-uniform error correction protection. The improved BCH code supports an error correction capability of 8 bits per 1 KB, which improves error correction capability in storage-and-computation scenarios compared to traditional fixed BCH codes, while reducing the overhead of redundant check bits.
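The non-uniform allocation can be illustrated by counting redundancy per bit position. This is only a bookkeeping sketch, not a BCH encoder: it uses the example figures of 3 and 1 redundant symbols per bit from the text and assumes every position not flagged as high-probability is treated as low-probability.

```python
# Illustrative redundancy budget; this does not implement BCH coding.
def allocate_redundancy(n_bits, high_positions, high_symbols=3, low_symbols=1):
    """Return the number of redundant symbols assigned to each bit position."""
    high = set(high_positions)
    if any(not 0 <= i < n_bits for i in high):
        raise ValueError("high-probability position out of range")
    return [high_symbols if i in high else low_symbols for i in range(n_bits)]


def total_overhead(allocation):
    """Total redundant symbols across all positions."""
    return sum(allocation)
```

For an 8-bit result with two high-probability positions, the allocation costs 2·3 + 6·1 = 12 redundant symbols, compared with 24 if every position received the stronger protection.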

[0054] S502: Correct the calculation results based on the adjusted preset BCH code and control the output of the target calculation result.

[0055] After adjusting the symbol allocation of the preset BCH code based on the error location distribution, this step further utilizes the adjusted preset BCH code to perform error correction on the calculation results. Specifically, for regions marked as preset high-probability error bits, a larger number of first error correction symbols are allocated, enabling precise location and repair of multiple possible errors within these regions. For regions marked as preset low-probability error bits, although fewer second error correction symbols are allocated, it is sufficient to handle their lower error risk, effectively controlling overall redundancy while ensuring basic error correction capabilities. After error correction, the verified target calculation result is output, thus ensuring the accuracy and reliability of data processing.

[0056] Figure 6 is a flowchart illustrating another embodiment of the data processing method for the flash memory controller provided in this application.

[0057] With reference to Figure 6, in some embodiments, this method further includes:

S601: Obtain an external instruction.

[0058] The external instructions can be computation instructions, which may include computation task type, operation configuration, etc., such as specifying to perform matrix multiplication and addition operations or convolution operations, as well as parameter settings during the operation process, such as the width range of the pulse sequence, the duty cycle range, the word line voltage adjustment step size, etc.

[0059] S602: Based on the logical address in the external instruction, query the mapping table in the preset two-layer mapping structure to determine the physical address corresponding to the logical address; wherein, the mapping table in the preset two-layer mapping structure includes a first-layer mapping table and a second-layer mapping table. The first-layer mapping table is a fine-grained cache maintained based on a preset LRU policy, and the second-layer mapping table is a fixed mapping table from physical blocks to logical blocks, and its mapping granularity is a preset coarse-grained granularity.

[0060] The first-level mapping table is a fine-grained cache maintained based on the LRU strategy, for example, with a cache granularity of 4KB pages, a capacity of 8MB, and a hit rate of no less than 98%, recording the dynamic mapping from logical blocks to application virtual addresses. The system maintains a sliding window counter on each logical block mapping entry, recording the number of times that address has been accessed in the last 100 FTL queries. If an address has been accessed at least 5 times in the last 100 queries, it is marked as hot data and preferentially resides in the LRU cache. The second-level mapping table is a fixed mapping table from physical blocks to logical blocks, for example, with a mapping granularity of 128KB, using a static mapping relationship. This step queries the mapping table in the preset two-level mapping structure based on the logical address in the external instruction to determine the physical address corresponding to the logical address. This can be done by sequentially utilizing the first-level and second-level mapping tables to determine the physical address corresponding to the logical address.
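The hot-data rule described here (at least 5 accesses within the last 100 FTL queries) can be sketched with a bounded queue and a counter. The class and method names are hypothetical. For simplicity the sketch keeps one shared window of recent queries rather than a counter on each mapping entry.

```python
# Illustrative sliding-window hot-data tracker.
from collections import Counter, deque


class HotTracker:
    def __init__(self, window=100, hot_threshold=5):
        self.window = window                # number of recent queries tracked
        self.hot_threshold = hot_threshold  # accesses needed to count as hot
        self.recent = deque()
        self.counts = Counter()

    def record(self, logical_addr):
        """Register one FTL query and expire the oldest one if needed."""
        self.recent.append(logical_addr)
        self.counts[logical_addr] += 1
        if len(self.recent) > self.window:
            oldest = self.recent.popleft()
            self.counts[oldest] -= 1
            if self.counts[oldest] == 0:
                del self.counts[oldest]

    def is_hot(self, logical_addr):
        """True if the address was queried often enough within the window."""
        return self.counts[logical_addr] >= self.hot_threshold
```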

[0061] Figure 7 is a flowchart illustrating another embodiment of the data processing method for the flash memory controller provided in this application.

[0062] With reference to Figure 7, in some embodiments, querying the mapping table in the preset two-layer mapping structure based on the logical address in the external instruction to determine the physical address corresponding to the logical address includes: S701: Query the first-level mapping table in the preset two-layer mapping structure based on the logical address in the external instruction.

[0063] This step begins by querying the first-level mapping table because it serves as a fine-grained cache maintained based on the LRU policy, enabling rapid responses to frequently accessed hot data address queries. Its 4KB cache granularity closely matches the fine-grained data operations typically involved in external instructions, and its 8MB capacity and hit rate of at least 98% ensure that the logical addresses of most hot data can directly find their corresponding physical address mappings at this level, significantly reducing address translation latency.

[0064] S702: If the physical address corresponding to the logical address exists in the first-level mapping table, then the physical address is determined directly.

[0065] It should be understood that if the query hits the first-level mapping table, that is, the mapping relationship corresponding to the logical address exists in the fine-grained cache at the 4KB page level, and it is determined to be hot data according to the sliding window counter, then the corresponding physical address is directly read from the first-level mapping table.

[0066] S703: If the physical address corresponding to the logical address does not exist in the first-level mapping table, then query the second-level mapping table.

[0067] If the physical address corresponding to a logical address does not exist in the first-level mapping table, it indicates that the logical address corresponds to cold data or data being accessed for the first time. In this case, it is necessary to further query the second-level mapping table. The second-level mapping table, as a fixed mapping table from physical blocks to logical blocks, uses a 128KB coarse-grained static mapping relationship and can cover the entire logical address space. By querying the second-level mapping table, the coarse-grained physical block information where the logical address resides can be obtained, thus providing the physical address basis for subsequent data access or computational operations.

[0068] S704: Update the first-level mapping table based on the query results and determine the physical address.

[0069] After retrieving the coarse-grained physical block information corresponding to the logical address from the second-level mapping table, the system allocates fine-grained physical pages matching the logical address from that physical block according to preset rules, and updates the mapping relationship between this logical address and the fine-grained physical page to the first-level mapping table. The update process follows an LRU (Least Recently Used) strategy; when the capacity of the first-level mapping table reaches its 8MB limit, the mapping entry with the fewest recent accesses is evicted to free up space for the new mapping relationship. After the update is complete, the physical address corresponding to the logical address is determined, and subsequent accesses to that logical address will prioritize a fast lookup through the first-level mapping table, thereby optimizing address translation efficiency.
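Steps S701 to S704 can be sketched as a two-level lookup. The sketch below rests on stated assumptions: the class `TwoLevelMap` and its fields are hypothetical, the cache capacity is given as an entry count standing in for the 8MB limit, the physical page is taken at the same page offset inside the physical block, and plain least-recently-used eviction replaces the counter-based victim selection described in the text.

```python
from collections import OrderedDict

PAGE_SIZE = 4 * 1024                        # first-level granularity (from the text)
BLOCK_SIZE = 128 * 1024                     # second-level granularity (from the text)
PAGES_PER_BLOCK = BLOCK_SIZE // PAGE_SIZE   # 32 pages per block


class TwoLevelMap:
    """Hypothetical two-level logical-to-physical address translator."""

    def __init__(self, block_table, capacity):
        self.block_table = block_table  # static table: logical block -> physical block
        self.capacity = capacity        # assumed max number of first-level entries
        self.cache = OrderedDict()      # logical page -> physical page, LRU first

    def translate(self, logical_addr):
        """Return (physical address, first_level_hit)."""
        page, offset = divmod(logical_addr, PAGE_SIZE)
        # S701/S702: query the first-level table and return directly on a hit.
        if page in self.cache:
            self.cache.move_to_end(page)
            return self.cache[page] * PAGE_SIZE + offset, True
        # S703: on a miss, query the coarse-grained second-level table.
        block, page_in_block = divmod(page, PAGES_PER_BLOCK)
        phys_page = self.block_table[block] * PAGES_PER_BLOCK + page_in_block
        # S704: update the first-level table, evicting one entry if it is full.
        if len(self.cache) >= self.capacity:
            self.cache.popitem(last=False)
        self.cache[page] = phys_page
        return phys_page * PAGE_SIZE + offset, False
```

A second access to the same 4KB page is then served from the first-level table without touching the second-level table.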

[0070] A second aspect of this application provides a flash memory controller 20. Figure 8 is a structural block diagram of an embodiment of the flash memory controller 20 provided in this application.

[0071] With reference to Figure 8, the flash memory controller 20 includes an interface module 21, an encoding module 22, a storage-and-computation module 23, and an error correction module 24. The interface module 21 is connected to the encoding module 22, the encoding module 22 is connected to the storage-and-computation module 23, and the storage-and-computation module 23 is connected to the error correction module 24. The interface module 21 is used to acquire input data. The encoding module 22 is used to encode the input data into a pulse sequence based on a preset pulse interval modulation strategy, wherein the width and duty cycle of the pulse sequence are within preset ranges. The storage-and-computation module 23 is used to perform preset operations in a preset storage-and-computation unit array based on the pulse sequence and to output the operation results, wherein the preset operations are matrix multiplication and addition operations or convolution operations. The error correction module 24 is used to perform error correction processing on the operation results based on a preset error prediction model and to output the target calculation result.

[0072] Specifically, the interface module 21 can be a hybrid optoelectronic high-speed interface module, capable of receiving external 25Gbps optical signals and converting them into electrical signals, while also supporting reverse electro-optical output. The encoding module 22 can employ a pulse-interval modulation encoder, which integrates a 10-bit DAC (digital-to-analog converter) and a timing controller. The storage-and-computation module 23 incorporates a NAND transistor array, through which the in-memory computing function is primarily implemented. The array consists of 128 MLC (multi-level cell) transistors configured in a 4x32 column structure as the computing unit array. Each MLC transistor can store 2 bits of data, giving the array both basic storage and parallel computing capabilities. The error correction module 24 integrates a lightweight convolutional-LSTM cascaded network error prediction model and a correlation-aware adaptive BCH error correction code unit, and is capable of predicting error locations and accurately correcting errors in the computation results output by the storage-and-computation module 23.

[0073] Figure 9 is a structural block diagram of another embodiment of the flash memory controller 20 provided in this application.

[0074] With reference to Figure 9, in some embodiments, the storage-and-computation module 23 includes: a word line driving circuit 231, for receiving the pulse sequence and outputting a word line voltage signal based on a preset word line voltage range; a transistor array 232, to whose word lines the word line voltage signal is applied to control the preset transistor array to perform matrix multiplication and addition operations or convolution operations; and a bit line integrating circuit 233, for determining the operation result based on the conduction current generated by the preset transistor array.

[0075] Based on the above, the word line driving circuit 231 can receive the pulse sequence output by the encoding module 22 and dynamically adjust the output word line voltage signal within a preset word line voltage range according to the pulse sequence width and duty cycle parameters. This circuit can internally include a high-precision voltage regulator and a high-speed switching transistor, capable of voltage switching and stabilization within 10ns to meet the fast timing control requirements of the memory cell array. When a pulse signal indicating a high level is received, the word line driving circuit 231 outputs the corresponding word line voltage. The transistor array 232 is the core computing component of the memory cell module, composed of 128 MLC transistors arranged in a 4x32 column structure. When the word line voltage signal output by the word line driving circuit 231 is applied to a specific word line of the transistor array 232, the transistors on that word line determine whether to conduct and the degree of conduction based on their stored threshold voltage states. For example, if a transistor stores the data 11, its threshold voltage is low, allowing it to conduct at a low word line voltage, resulting in a large conduction current. Conversely, if the stored data is 00, the threshold voltage is high, requiring a higher word line voltage to conduct, resulting in a smaller conduction current. The bit line integrator circuit 233 integrates the conduction current on each column of the bit lines in the transistor array 232, converting the current signal into a voltage signal corresponding to the calculation result. This circuit may include a low-noise operational amplifier and a high-precision integrating capacitor, capable of accumulating the conduction current within a preset integration time. 
For instance, when the transistor array 232 performs matrix multiplication and addition, the sum of the conduction currents of multiple transistors on each column of bit lines is converted into an analog voltage value by the bit line integrator circuit 233. The magnitude of this voltage value is proportional to the partial sum of the matrix multiplication and addition operation for that column. Subsequently, the bit line integrator circuit 233 outputs this analog voltage signal to the subsequent analog-to-digital converter module, where it is quantized to obtain a digital calculation result, thus completing the calculation process of the storage-and-computation module 23.
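The accumulation on the bit lines can be illustrated with an idealized model. Everything numeric in this sketch is an assumption: the cell current is taken to be linear in the stored 2-bit value, the constants `UNIT_CURRENT` and `INTEGRATION_CAP` are hypothetical, and the "4x32" arrangement is read as 4 word lines by 32 bit lines. Real cell currents depend nonlinearly on threshold voltage and word line voltage, so this shows only why the integrated voltage tracks a multiply-accumulate result.

```python
ROWS, COLS = 4, 32        # assumed reading of "4x32": 4 word lines, 32 bit lines
UNIT_CURRENT = 1e-6       # hypothetical current per stored level, in amperes
INTEGRATION_CAP = 1e-12   # hypothetical integrating capacitance, in farads


def bitline_voltages(weights, pulse_widths):
    """Idealized bit line integration.

    weights[i][j]   : 2-bit value (0..3) stored in the cell on word line i, bit line j
    pulse_widths[i] : time in seconds for which word line i is driven high
    Returns one voltage per bit line: V_j = (1/C) * sum_i I(w_ij) * t_i.
    """
    voltages = []
    for j in range(len(weights[0])):
        # Charge collected on bit line j is the sum of current times on-time per cell.
        charge = sum(weights[i][j] * UNIT_CURRENT * pulse_widths[i]
                     for i in range(len(weights)))
        voltages.append(charge / INTEGRATION_CAP)
    return voltages
```

In this model each bit line voltage equals a dot product of the input pulse widths with one column of stored weights, scaled by a constant, which is the quantity the analog-to-digital converter then quantizes.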

[0076] Figure 10 is a structural block diagram of another embodiment of the flash memory controller 20 provided in this application.

[0077] With reference to Figure 10, in some embodiments, the flash memory controller 20 further includes an address mapping module 25, which has a preset two-layer mapping structure. The address mapping module 25 is used to query the mapping table in the preset two-layer mapping structure based on the logical address in the external instruction to determine the physical address corresponding to the logical address. The mapping table in the preset two-layer mapping structure includes a first-layer mapping table and a second-layer mapping table. The first-layer mapping table is a fine-grained cache maintained based on a preset LRU policy, and the second-layer mapping table is a fixed mapping table from physical blocks to logical blocks, with a preset coarse-grained mapping granularity.

[0078] Specifically, when the interface module 21 receives an external instruction containing a logical address, it transmits the logical address to the address mapping module 25. The address mapping module 25 first queries the first-level mapping table. The first-level mapping table acts as a fine-grained cache; its data structure uses the page number of the logical address as an index to store the corresponding physical page address, and each entry is associated with a sliding window counter to record the access count in the last 100 queries. If the query hits, it is determined to be hot data, and the corresponding physical address is returned directly. If it misses, the address mapping module 25 queries the second-level mapping table. The second-level mapping table uses the logical block number as an index to store the corresponding physical block number; its mapping relationship is static and does not change with access frequency. After finding the physical block number, the address mapping module 25 determines the specific physical page within that physical block based on the page offset in the logical address, thus obtaining the complete physical address. Simultaneously, this new mapping relationship between the logical address and the physical page is inserted into the first-level mapping table. If the first-level mapping table has reached its capacity limit when inserting, the LRU eviction mechanism is activated, removing the entry with the fewest recent accesses recorded by the sliding window counter from the cache to ensure that the new mapping relationship can be stored. This achieves address caching after the first access to cold data, improving the efficiency of subsequent possible accesses.
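Two details of this paragraph lend themselves to a short sketch: splitting a logical address into the fields used by the two tables, and choosing the eviction victim by the sliding window counter. The function names are hypothetical, the 4KB and 128KB sizes are the example values given earlier, and the tie-breaking rule is an assumption of the sketch.

```python
PAGE_SIZE = 4 * 1024                        # example page granularity from the text
BLOCK_SIZE = 128 * 1024                     # example block granularity from the text
PAGES_PER_BLOCK = BLOCK_SIZE // PAGE_SIZE   # 32 pages per block


def split_address(logical_addr):
    """Return (logical block number, page offset within block, byte offset within page)."""
    page, byte_offset = divmod(logical_addr, PAGE_SIZE)
    block, page_in_block = divmod(page, PAGES_PER_BLOCK)
    return block, page_in_block, byte_offset


def pick_victim(window_counts):
    """Return the cached logical page with the fewest accesses in the sliding window.

    window_counts maps logical page -> access count in the last 100 queries.
    Ties go to the entry inserted earliest (dict order), an assumption of this sketch.
    """
    return min(window_counts, key=window_counts.get)
```

For example, the logical address 0x21234 falls in logical block 1, at page offset 1 within that block, 0x234 bytes into the page.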

[0079] In summary, based on the flash memory controller and data processing method provided in this application, the method includes: acquiring input data; encoding the input data into a pulse sequence based on a preset pulse interval modulation strategy; wherein the width and duty cycle of the pulse sequence are within preset ranges; performing preset operations in a preset memory cell array based on the pulse sequence and outputting the operation results; wherein the preset operations are matrix multiplication and addition or convolution; and performing error correction processing on the operation results based on a preset error prediction model and outputting the target calculation result. Therefore, by encoding the input data into a pulse sequence that conforms to preset parameters and directly performing preset operations such as matrix multiplication and addition or convolution in the preset memory cell array, in-situ data computation is achieved, effectively reducing the frequent data transfer between storage units and computing units in traditional architectures, thereby significantly reducing energy consumption caused by data transmission.

[0080] The above are only some or preferred embodiments of this application. Neither the text nor the drawings should limit the scope of protection of this application. All equivalent structural transformations made using the content of this application's specification and drawings under the overall concept of this application, or direct / indirect applications in other related technical fields, are included within the scope of protection of this application.

Claims

1. A data processing method for a flash memory controller, characterized in that the method comprises: acquiring input data; encoding the input data into a pulse sequence based on a preset pulse interval modulation strategy, wherein the width and duty cycle of the pulse sequence are within preset ranges; performing a preset operation in a preset storage unit array based on the pulse sequence and outputting an operation result, wherein the preset operation is a matrix multiplication-addition operation or a convolution operation; and performing error correction processing on the operation result based on a preset error prediction model and outputting a target calculation result.

2. The method according to claim 1, characterized in that performing the preset operation in the preset storage unit array based on the pulse sequence and outputting the operation result comprises: inputting the pulse sequence to a word line voltage modulator and outputting a word line voltage signal based on a preset word line voltage range; applying the word line voltage signal to the word lines of a preset transistor array to control the preset transistor array to perform the matrix multiplication-addition operation or the convolution operation; and determining the operation result based on a bit line charge integrator and the conduction current generated by the preset transistor array.

3. The method according to claim 1, characterized in that performing error correction processing on the operation result based on the preset error prediction model and outputting the target calculation result comprises: processing the operation result based on the preset error prediction model to determine an error location distribution; and controlling the output of the target calculation result based on a preset error correction strategy and the error location distribution.

4. The method according to claim 3, characterized in that processing the operation result based on the preset error prediction model to determine the error location distribution comprises: extracting features from the operation result through a preset convolutional layer to determine a first feature map; and processing the first feature map through a preset long short-term memory network layer to determine a probability distribution, and determining the error location distribution based on the probability distribution, wherein the error location distribution includes preset high-probability error locations and preset low-probability error locations.

5. The method according to claim 4, characterized in that controlling the output of the target calculation result based on the preset error correction strategy and the error location distribution comprises: adjusting the symbol allocation of a preset BCH code based on the error location distribution, so that the number of first error correction symbols corresponding to the preset high-probability error locations is greater than the number of second error correction symbols corresponding to the preset low-probability error locations; and correcting the operation result based on the adjusted preset BCH code and outputting the target calculation result.

6. The method according to claim 1, characterized in that the method further comprises: acquiring an external instruction; and querying a mapping table in a preset two-layer mapping structure based on a logical address in the external instruction to determine the physical address corresponding to the logical address; wherein the mapping table in the preset two-layer mapping structure includes a first-layer mapping table and a second-layer mapping table, the first-layer mapping table is a fine-grained cache maintained based on a preset LRU policy, and the second-layer mapping table is a fixed mapping table from physical blocks to logical blocks, with a preset coarse-grained mapping granularity.

7. The method according to claim 6, characterized in that querying the mapping table in the preset two-layer mapping structure based on the logical address in the external instruction to determine the physical address corresponding to the logical address comprises: querying the first-layer mapping table in the preset two-layer mapping structure based on the logical address in the external instruction; if the physical address corresponding to the logical address exists in the first-layer mapping table, determining the physical address directly; if the physical address corresponding to the logical address does not exist in the first-layer mapping table, querying the second-layer mapping table; and updating the first-layer mapping table based on the query result and determining the physical address.

8. A flash memory controller, characterized in that it comprises: an interface module, used to acquire input data; an encoding module, connected to the interface module and used to encode the input data into a pulse sequence based on a preset pulse interval modulation strategy, wherein the width and duty cycle of the pulse sequence are each within a preset range; a storage-and-computation module, connected to the encoding module and used to perform a preset operation in a preset storage-and-computation unit array based on the pulse sequence and to output an operation result, wherein the preset operation is a matrix multiplication and addition operation or a convolution operation; and an error correction module, connected to the storage-and-computation module and used to perform error correction processing on the operation result based on a preset error prediction model and to output a target calculation result.

9. The flash memory controller according to claim 8, characterized in that the storage-and-computation module comprises: a word line driving circuit, used to receive the pulse sequence and output a word line voltage signal based on a preset word line voltage range; a transistor array, to whose word lines the word line voltage signal is applied to control the transistor array to perform the matrix multiplication and addition operation or the convolution operation; and a bit line integrating circuit, used to determine the operation result based on the conduction current generated by the transistor array.

10. The flash memory controller according to claim 8, characterized in that the flash memory controller further comprises an address mapping module having a preset two-layer mapping structure; the address mapping module is used to query a mapping table in the preset two-layer mapping structure based on a logical address in an external instruction to determine the physical address corresponding to the logical address; wherein the mapping table in the preset two-layer mapping structure includes a first-layer mapping table and a second-layer mapping table, the first-layer mapping table is a fine-grained cache maintained based on a preset LRU policy, and the second-layer mapping table is a fixed mapping table from physical blocks to logical blocks, with a preset coarse-grained mapping granularity.