Storage device, storage controller, and operation method of storage controller

By providing an embedding model buffer and an accelerator inside the storage device, the disclosure addresses the data-transfer overhead and excessive resource consumption of embedding vector generation, improving the performance and resource utilization of artificial intelligence systems.

CN122219829A · Pending · Publication Date: 2026-06-16 · SAMSUNG ELECTRONICS CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SAMSUNG ELECTRONICS CO LTD
Filing Date
2025-08-27
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

In artificial intelligence systems, generating embedding vectors incurs data-transfer and communication overhead and consumes excessive compute resources. The problem is most pronounced in AI training and inference tasks, where existing approaches do not make efficient use of storage device resources.

Method used

An embedding model buffer and an accelerator are provided in the storage device so that embedding vectors are generated inside the device, reducing reliance on external I/O paths and improving data processing and resource utilization.

Benefits of technology

The approach improves the performance of the artificial intelligence system, reduces data-transfer and communication overhead, and makes more efficient use of system resources.

Abstract

A storage device, a storage controller, and an operating method of the storage controller are provided. The storage device includes a storage controller including an embedding model buffer and an accelerator; and a non-volatile memory operatively connected to the storage controller, wherein the non-volatile memory is configured to store target data and model data of an embedding model, and wherein the storage controller is configured to: based on a first request from a host, send a read command for the target data to the non-volatile memory, receive the target data from the non-volatile memory, and based on the received target data and the model data loaded into the embedding model buffer, generate an embedding vector using the accelerator; and based on a second request from the host, send the target data and the generated embedding vector to the host.

Description

[0001] Cross-reference to related applications

[0002] This application is based on and claims priority to Korean Patent Application No. 10-2024-0187474, filed with the Korean Intellectual Property Office on December 16, 2024, the disclosure of which is incorporated herein by reference in its entirety.

Technical Field

[0003] This disclosure relates to storage devices and, more specifically, to a storage device, a storage controller, and a method of operating the storage controller.

Background

[0004] With the latest advancements in artificial intelligence (AI) technology, the demand for systems equipped with AI capabilities is growing exponentially. This growth is based on various large models, including large language models (LLMs), and technologies such as vector databases (Vector DBs) are gaining attention.

[0005] In AI training and/or inference tasks, generating embedding vectors for input data is essential. Typically, the embedding model data, which can reach tens of megabytes (MB), is read and transferred to a high-speed computing device such as a graphics processing unit (GPU) or a neural processing unit (NPU), while the input data file is transferred separately from the storage device, via the host's memory, to the GPU or NPU outside the storage device. Such data transfer incurs significant communication overhead and is a major factor in reducing system performance.

[0006] Additionally, in AI training and/or inference systems, GPUs or NPUs outside the storage device are already used to process large-scale AI models, so performing the additional computation of generating embedding vectors on them can consume their resources excessively.

Summary of the Invention

[0007] This disclosure provides a storage device, a storage controller, and a method of operating the storage controller, which greatly improves the performance of artificial intelligence (AI) systems by supporting offloading through an on-device embedding vector generation operation performed in the storage device and sending the generated embedding vector to an application, thereby optimizing data processing and resource utilization.

[0008] According to one aspect of this disclosure, a storage device includes: a storage controller including an embedding model buffer and an accelerator; and a non-volatile memory operatively connected to the storage controller, wherein the non-volatile memory is configured to store target data and model data of an embedding model, and wherein the storage controller is configured to: send a read command for the target data to the non-volatile memory based on a first request from a host; receive the target data from the non-volatile memory; and generate an embedding vector using the accelerator based on the received target data and the model data loaded into the embedding model buffer; and send the target data and the generated embedding vector to the host based on a second request from the host.

[0009] According to one aspect of this disclosure, a storage controller configured to control a non-volatile memory storing target data and model data of an embedding model includes: an embedding model buffer; and an accelerator, wherein the storage controller is configured to: send a read command for the target data to the non-volatile memory based on a first request from a host, receive the target data from the non-volatile memory, and generate an embedding vector using the accelerator based on the received target data and the model data loaded into the embedding model buffer; and send the target data and the generated embedding vector to the host based on a second request from the host.

[0010] According to one aspect of this disclosure, a method of operating a storage controller that includes an embedding model buffer and an accelerator and controls a non-volatile memory includes: sending a read command for target data to the non-volatile memory based on a first request from a host; receiving the target data from the non-volatile memory, and generating an embedding vector using the accelerator based on the received target data and model data loaded into the embedding model buffer; and sending the target data and the generated embedding vector to the host based on a second request from the host.

Brief Description of the Drawings

[0011] The above and other aspects, features and advantages of certain embodiments of this disclosure will become more apparent from the following description taken in conjunction with the accompanying drawings, wherein:

[0012] Figure 1 shows a storage system according to an embodiment;

[0013] Figure 2 shows a storage device according to an embodiment;

[0014] Figure 3 shows a non-volatile memory (NVM) according to an embodiment;

[0015] Figure 4 shows a storage controller according to an embodiment;

[0016] Figure 5 shows a method of operating a storage device according to an embodiment;

[0017] Figure 6 shows a method of operating a host, a storage controller, and a non-volatile memory device according to an embodiment;

[0018] Figure 7 shows a method of operating a host, a storage controller, and a non-volatile memory device according to an embodiment;

[0019] Figure 8 shows a method of operating a host, a storage controller, and a non-volatile memory device according to an embodiment;

[0020] Figure 9 shows a method of operating a host, a storage controller, and a non-volatile memory device according to an embodiment; and

[0021] Figure 10 shows a system having a storage device according to an embodiment.

Detailed Description

[0022] In the following description, one or more embodiments are illustrated with reference to the accompanying drawings. The same reference numerals are used for the same components in the drawings, and redundant descriptions of these components are omitted.

[0023] Figure 1 shows a storage system 10 according to an embodiment.

[0024] Referring to Figure 1, the storage system 10 may include a storage device 100 and a host 200; the storage system 10 may therefore be referred to as a host-storage system.

[0025] Storage device 100 may include a storage medium for storing data upon request from host 200. As an example, storage device 100 may include at least one of a solid-state drive (SSD), embedded memory, and removable external memory. If storage device 100 is an SSD, it may be a device conforming to the Non-Volatile Memory Express (NVMe) standard. If storage device 100 is embedded memory or external memory, it may be a device conforming to the Universal Flash Storage (UFS) or Embedded MultiMedia Card (eMMC) standard. Host 200 and storage device 100 may each generate and send packets according to their respective standard protocols.

[0026] The host 200 may include a host controller 210 and a host memory 220. The host controller 210 can manage operations for storing data from a buffer area of the host memory 220 in the non-volatile memory device 120 or, conversely, for storing data from the non-volatile memory device 120 in a buffer area of the host memory 220. The host memory 220 can be used as a buffer for temporarily storing write data to be sent to the storage device 100 or read data received from the storage device 100.

[0027] As an example, host controller 210 may be one of multiple modules provided in the application processor, and the application processor may be implemented as a system on chip (SoC). Additionally, host memory 220 may be embedded memory provided within the application processor, or non-volatile memory or memory module located outside the application processor.

[0028] Storage device 100 may include storage controller 110 and non-volatile memory device 120. According to embodiments, storage controller 110 may be referred to as a controller, memory controller, or non-volatile memory controller. According to embodiments, non-volatile memory device 120 may include multiple non-volatile memories such as multiple memory chips, multiple memory dies, or multiple memory planes. This is described in more detail with reference to Figure 2.

[0029] The storage controller 110 can receive a request REQ from the host 200, control memory operations of the non-volatile memory device 120 in response to (or based on) the request REQ, and send a response to the host 200 based on the memory operation. For example, the memory operation may include a read operation, a program operation, or an erase operation.

[0030] The storage controller 110 can be connected to the non-volatile memory device 120 via channel CH. The storage controller 110 can send and receive signals with the non-volatile memory device 120 via channel CH. For example, the storage controller 110 can send commands CMD, addresses ADDR, and data to the non-volatile memory device 120, or receive data from the non-volatile memory device 120 via channel CH.

[0031] Storage controller 110 can respond to a request REQ from host 200 by sending an embedding vector of data corresponding to the request REQ to host 200.

[0032] The storage controller 110 may include an acceleration module 111 for performing embedding operations, a vector embedding module 112, and an embedding model buffer 113-1. The acceleration module 111 can perform embedding operations. The vector embedding module 112 can control the acceleration module 111, causing the acceleration module 111 to perform embedding operations on input data based on embedding model 1. Embedding model 1 can be loaded into the embedding model buffer 113-1.

[0033] In some embodiments, acceleration module 111 or vector embedding module 112 may refer to a hardware component included in storage controller 110, such as a processor or circuitry; a software component executed by a hardware component such as storage controller 110; or a combination of hardware and software components. Acceleration module 111 or vector embedding module 112 may be implemented by a program stored in an addressable storage medium and executed by a processor. For example, acceleration module 111 or vector embedding module 112 may be implemented by components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, program code segments, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and parameters. Throughout this disclosure, acceleration module 111 is interchangeable with accelerator, acceleration component, acceleration processor, acceleration code, or acceleration computer code. Likewise, vector embedding module 112 is interchangeable with vector embedding code, vector embedding computer code, vector embedding processor, or vector embedding component.

[0034] Here, embedding model 1 can refer to a model that transforms high-dimensional data (e.g., text data or image data) into a low-dimensional vector space and generates embedding vectors, enabling computers to understand and process the data. For example, embedding model 1 can correspond to a Word-to-Vector (Word2Vec) embedding model for natural language processing, a Global Vectors for Word Representation (GloVe) embedding model, a Bidirectional Encoder Representations from Transformers (BERT) embedding model, or a deep feature embedding model for image processing. In some embodiments, embedding model 1 can be stored in memory or a storage device. In some embodiments, embedding model 1 can be implemented by a dedicated processor. In some embodiments, embedding model 1 can be implemented by one or more hardware components.

[0035] In other words, since the storage controller 110 includes the acceleration module 111, the storage controller 110 (or the vector embedding module 112) can use the acceleration module 111 to generate embedding vectors based on the embedding model 1 without sending the data over an external input/output (I/O) path, and can provide the generated embedding vectors to the host 200.

[0036] As described above, according to the embodiment, the embedding vector generation operation can be performed "on the device" within the storage device 100. This is described in detail with reference to Figures 2 to 9.

[0037] According to an embodiment, since the embedding vector is generated (or provided) together with the execution of a read request (or fetch request) from host 200, the number of I/O paths used to generate (or provide) the embedding vector can be reduced. Redundant access over separate I/O paths can therefore be minimized, improving the efficiency of system resource utilization.

[0038] Figure 2 shows a storage device 100 according to an embodiment.

[0039] Referring to Figure 2, the storage device 100 can support multiple channels CH1 to CHm, and the non-volatile memory device 120 and the storage controller 110 can be interconnected through the channels CH1 to CHm, where "m" is a natural number (e.g., equal to or greater than 2). The non-volatile memory device 120 can include multiple non-volatile memories NVM11 to NVMmn, where "m" and "n" are natural numbers (e.g., equal to or greater than 2). Each of the non-volatile memories NVM11 to NVMmn can be connected to one of the channels CH1 to CHm through a corresponding way.

[0040] For example, non-volatile memories NVM11 to NVM1n can be connected to the first channel CH1 via ways W11 to W1n, and non-volatile memories NVM21 to NVM2n can be connected to the second channel CH2 via ways W21 to W2n. In embodiments, each of the non-volatile memories NVM11 to NVMmn can be implemented as any memory unit that can operate according to an individual command from the storage controller 110. For example, each of the non-volatile memories NVM11 to NVMmn can be implemented as a chip or a die, but this disclosure is not limited thereto.
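The channel and way arrangement described above can be pictured as a small lookup table. The following is a minimal sketch, assuming the naming pattern from the text in which memory NVMij sits on channel CHi and is reached through way Wij; the function name `build_topology` is hypothetical and used only for illustration.

```python
def build_topology(m: int, n: int) -> dict[str, tuple[str, str]]:
    """Map each memory name NVMij to its (channel, way) pair.

    Assumes memory NVMij sits on channel CHi and is reached through way Wij,
    following the naming used in the text. Indices are kept below 10 so that
    names such as "NVM21" stay unambiguous.
    """
    if not (1 <= m <= 9 and 1 <= n <= 9):
        raise ValueError("this sketch only handles 1..9 channels and ways")
    return {
        f"NVM{i}{j}": (f"CH{i}", f"W{i}{j}")
        for i in range(1, m + 1)
        for j in range(1, n + 1)
    }


topology = build_topology(m=2, n=3)
print(topology["NVM21"])  # ('CH2', 'W21')
```

With m channels and n ways per channel, the table holds m × n entries, one per individually addressable memory.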

[0041] The storage controller 110 can send and receive signals with the non-volatile memory device 120 through multiple channels CH1 to CHm. For example, the storage controller 110 can send commands CMDa to CMDm, addresses ADDRa to ADDRm, and data DATAa to DATAm to the non-volatile memory device 120 through multiple channels CH1 to CHm, or receive data DATAa to DATAm from the non-volatile memory device 120.

[0042] The storage controller 110 can select, via each channel, one of the non-volatile memories connected to that channel, and send and receive signals with the selected non-volatile memory. For example, the storage controller 110 can select non-volatile memory NVM11 among the non-volatile memories NVM11 to NVM1n connected to the first channel CH1. The storage controller 110 can send command CMDa, address ADDRa, and data DATAa to the selected non-volatile memory NVM11 via the first channel CH1, or it can receive data DATAa from the selected non-volatile memory NVM11.

[0043] The storage controller 110 can send and receive signals with the non-volatile memory device 120 in parallel through different channels. For example, the storage controller 110 can send the command CMDb to the non-volatile memory device 120 through the second channel CH2, and simultaneously send the command CMDa to the non-volatile memory device 120 through the first channel CH1. For example, the storage controller 110 can receive data DATAb from the non-volatile memory device 120 through the second channel CH2, and simultaneously receive data DATAa from the non-volatile memory device 120 through the first channel CH1.

[0044] The storage controller 110 can control the overall operation of the non-volatile memory device 120. The storage controller 110 can control each of the multiple non-volatile memories NVM11 to NVMmn connected to the channels CH1 to CHm by sending signals to the channels. For example, the storage controller 110 can control a selected one of the non-volatile memories NVM11 to NVM1n by sending the command CMDa and the address ADDRa to the first channel CH1.

[0045] Each of the non-volatile memories NVM11 to NVMmn can operate under the control of the storage controller 110. For example, the non-volatile memory NVM11 can be programmed with data DATAa according to the command CMDa and address ADDRa provided to the first channel CH1. For example, data DATAb can be read from the non-volatile memory NVM21 according to the command CMDb and address ADDRb provided through the second channel CH2, and the read data DATAb can be sent to the storage controller 110.

[0046] In Figure 2, the non-volatile memory device 120 communicates with the storage controller 110 through m channels and includes n non-volatile memories corresponding to each channel. However, the number of channels and the number of non-volatile memories connected to a single channel can vary depending on the embodiment.

[0047] Figure 3 shows a non-volatile memory (NVM) according to an embodiment.

[0048] Referring to Figure 3, the non-volatile memory (NVM) may include control logic circuitry 121, a memory cell array 122, page buffer circuitry 123, a voltage generator 124, and a row decoder 125. The non-volatile memory (NVM) may correspond to the non-volatile memory device 120 of Figure 1 or to one of the multiple non-volatile memories NVM11 to NVMmn of Figure 2.

[0049] The memory cell array 122 may include multiple memory blocks BLK1 to BLKz, each of which may include multiple cell strings, and each cell string may include multiple memory cells connected in series. The memory cell array 122 may be connected to the page buffer circuit 123 via the bit lines BL, and to the row decoder 125 via the word lines WL, the string select lines SSL, and the ground select lines GSL.

[0050] In an embodiment, the memory cell array 122 may include a three-dimensional memory cell array, which may contain multiple cell strings. Each cell string may include memory cells respectively connected to word lines vertically stacked on a substrate. U.S. Patent Nos. 7,679,133, 8,553,466, 8,654,587, and 8,559,235 and U.S. Patent Application Publication No. 2011/0233648 are incorporated herein by reference in their entirety.

[0051] In embodiments, the memory cell array 122 may include flash memory that can contain a 2D NAND memory array or a 3D vertical NAND (V-NAND) memory array. In embodiments, the memory cell array 122 may include magnetic RAM (MRAM), spin-transfer torque MRAM (STT-MRAM), conductive bridging RAM (CBRAM), ferroelectric RAM (FeRAM), phase-change RAM (PRAM), resistive RAM (ReRAM), and various other types of memory.

[0052] Control logic circuit 121 can control various operations within the non-volatile memory (NVM). Control logic circuit 121 can output various control signals in response to commands CMD and/or addresses ADDR. For example, control logic circuit 121 can output a voltage control signal CTRL_vol, a row address X_ADDR, and a column address Y_ADDR. Voltage generator 124 can generate various types of voltages for performing program, read, and erase operations based on the voltage control signal CTRL_vol. Row decoder 125 can select at least one of multiple word lines WL and one of multiple string select lines SSL in response to the row address X_ADDR. Page buffer circuit 123 can select at least one bit line BL in response to the column address Y_ADDR. Page buffer circuit 123 can operate as a write driver or a sense amplifier depending on the operating mode.

[0053] Figure 4 shows a storage controller 110 according to an embodiment.

[0054] Referring to Figure 4, the storage controller 110 may include an acceleration module 111, a vector embedding module 112, a buffer memory 113, a working memory 114, a host interface 115, a non-volatile memory interface 116, a central processing unit (CPU) 118, and a chunk parsing module 119, all of which can communicate with each other via a bus 117. A flash translation layer (FTL) can be loaded into the working memory 114, and data program and read operations on the non-volatile memory device 120 can be controlled by the CPU 118 running the FTL.

[0055] In some embodiments, the chunk parsing module 119 may refer to a hardware component such as a processor or circuit, a software component run by the hardware component, or a combination of hardware and software components. The chunk parsing module 119 may be implemented by a program stored in an addressable storage medium and executed by a processor. For example, the chunk parsing module 119 may be implemented by components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, program code segments, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and parameters. Throughout this disclosure, the chunk parsing module 119 is interchangeable with a chunk parser, a chunk parsing component, a chunk parsing processor, chunk parsing code, or chunk parsing computer code.

[0056] Host interface 115 can send and receive packets with host 200. Packets sent from host 200 to host interface 115 may include write data or commands to be stored in non-volatile memory device 120, and packets sent from host interface 115 to host 200 may include read data received from non-volatile memory device 120 or responses to commands.

[0057] In this embodiment, host interface 115 may sequentially receive multiple requests from host 200, and in response to the multiple requests, sequentially send multiple responses or multiple read data to host 200. For example, host interface 115 may sequentially receive multiple read requests from host 200, and in response to the multiple read requests, sequentially send multiple read data to host 200.

[0058] In this embodiment, host 200 and storage device 100 may communicate with each other based on a predefined interface. The predefined interface may support at least one of various interfaces such as Universal Serial Bus (USB), Small Computer System Interface (SCSI), PCI Express (PCIe), ATA, Parallel ATA (PATA), Serial ATA (SATA), Serial Attached SCSI (SAS), UFS, NVMe, and Compute Express Link (CXL), but the scope of this disclosure is not limited thereto.

[0059] Acceleration module 111 can perform embedding operations. That is, acceleration module 111 can generate embedding vectors by performing embedding operations on input data based on embedding model 1. Here, for example, acceleration module 111 may include specialized circuitry for high-speed data operations, such as a graphics processing unit (GPU), a neural processing unit (NPU), and/or a data processing unit (DPU). Furthermore, an embedding operation refers to an operation that transforms input data (e.g., words, sentences, images, etc.) into a vector space. In other words, an embedding operation can refer to an operation that maps input data to an embedding vector.

[0060] In an embodiment, the acceleration module 111 can generate an embedding vector based on the embedding model 1 loaded into the embedding model buffer 113-1 under the control of the vector embedding module 112.

[0061] The vector embedding module 112 can control the acceleration module 111 so that the acceleration module 111 performs embedding operations on the input data based on the embedding model 1. In other words, the vector embedding module 112 can instruct the acceleration module 111 to perform the actual computation for generating the embedding vector.

[0062] Buffer memory 113 can temporarily store write data to be written to non-volatile memory device 120 or read data read from non-volatile memory device 120. Buffer memory 113 can be provided within the storage controller 110, but it can also be placed outside the storage controller 110. For example, the storage controller 110 may also include a buffer memory interface or buffer memory manager for communicating with buffer memory 113.

[0063] Additionally, the buffer memory 113 may include static random access memory (SRAM), and since the embedding vector can have a constant size regardless of the size of the input data used for the embedding operation, the generated embedding vector can be stored in the SRAM.
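The point that an embedding vector keeps a constant size regardless of input size can be illustrated with a toy example. The sketch below is an assumption-laden stand-in: `toy_embedding` is a hypothetical hashed bag-of-words function, not Word2Vec, GloVe, BERT, or the embedding model of this disclosure, and the dimension of 8 is arbitrary.

```python
import hashlib
import math


def toy_embedding(text: str, dim: int = 8) -> list[float]:
    """Map text to a fixed-length, L2-normalized vector by hashing words.

    This is a stand-in for a real embedding model; it only illustrates that
    the output length does not depend on the input length.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0  # bucket chosen by the first hash byte
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


short_vec = toy_embedding("flash memory")
long_vec = toy_embedding("flash memory " * 1000)
print(len(short_vec), len(long_vec))  # 8 8
```

Because the length is fixed, the buffer space needed per vector is known in advance (the dimension multiplied by the element size), which is consistent with keeping the result in a small SRAM.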

[0064] Additionally, the buffer memory 113 may also include an embedding model buffer 113-1 and an embedding buffer 113-2.

[0065] The embedding model buffer 113-1 can temporarily store the model data of the embedding model 1. In an embodiment, the embedding model 1 can be loaded into the embedding model buffer 113-1, and the embedding operation of the acceleration module 111 can be controlled by the vector embedding module 112 that runs the embedding model 1.

[0066] Embedding buffer 113-2 can temporarily store intermediate embedding vectors required to generate the final embedding vector. Acceleration module 111 may need intermediate embedding vectors as intermediate results of the embedding operation to generate the embedding vector, and embedding buffer 113-2 can temporarily store the intermediate embedding vectors.
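The role of the embedding buffer can be sketched as a small accumulator. The text does not say how intermediate vectors are combined into the final vector, so the sketch below assumes mean pooling purely for illustration; the class `EmbeddingBuffer` and its methods are hypothetical names.

```python
class EmbeddingBuffer:
    """Hypothetical stand-in for embedding buffer 113-2."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self.intermediate: list[list[float]] = []

    def push(self, vec: list[float]) -> None:
        """Store one intermediate embedding vector."""
        if len(vec) != self.dim:
            raise ValueError("unexpected vector length")
        self.intermediate.append(vec)

    def finalize(self) -> list[float]:
        """Mean-pool the stored vectors (an assumption) and clear the buffer."""
        if not self.intermediate:
            raise ValueError("no intermediate vectors stored")
        count = len(self.intermediate)
        pooled = [sum(col) / count for col in zip(*self.intermediate)]
        self.intermediate.clear()
        return pooled


buf = EmbeddingBuffer(dim=2)
buf.push([1.0, 0.0])
buf.push([0.0, 1.0])
print(buf.finalize())  # [0.5, 0.5]
```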

[0067] The chunk parsing module 119 can process data read from the non-volatile memory device 120 to generate chunk data. Here, chunk data may refer to input data used for the embedding operation of the acceleration module 111. The raw data read from the non-volatile memory device 120 may be text data in page units, such as 4 KB or 8 KB, and may not be directly usable in the embedding operation.

[0068] In an embodiment, the chunk parsing module 119 can generate chunk data by segmenting raw data read from the non-volatile memory device 120 into semantic units. For example, the chunk parsing module 119 can perform semantic analysis on the read raw data and divide it into meaningful units such as words, sentences, and paragraphs as needed. That is, the chunk data can be text data in units of at least one of words, sentences, and paragraphs.

[0069] In some embodiments, the chunk parsing module 119 can convert page-unit data (the raw data read) into semantic-unit data (the chunk data).
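The conversion from page-unit data to semantic-unit data can be sketched as follows. This is a minimal illustration, assuming UTF-8 text and using a simple punctuation rule in place of the semantic analysis described above; the function name `parse_chunks` is hypothetical.

```python
import re


def parse_chunks(pages: list[bytes]) -> list[str]:
    """Join page-unit raw data and split it into sentence chunks.

    Splitting on ., ! or ? followed by whitespace is a simplification that
    stands in for the semantic analysis described in the text.
    """
    text = b"".join(pages).decode("utf-8")
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if s]


# A sentence may straddle a page boundary, so pages are joined before splitting.
pages = [b"Flash stores data. Embedding vec", b"tors map text to numbers."]
print(parse_chunks(pages))
# ['Flash stores data.', 'Embedding vectors map text to numbers.']
```

Joining the pages first is a design choice in this sketch: page boundaries are a storage artifact and carry no meaning, so they should not cut a word or sentence in two.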

[0070] The non-volatile memory interface 116 can send write data to the non-volatile memory device 120 or receive read data from the non-volatile memory device 120. Such a non-volatile memory interface 116 can be implemented in accordance with standard protocols such as Toggle or Open NAND Flash Interface (ONFI).

[0071] Figure 5 shows a method of operating a storage device 100 according to an embodiment.

[0072] Referring to Figure 5, the method of operating the storage device according to this embodiment may include, for example, operations performed in time series in the storage device 100 shown in Figure 1. The details described above with reference to Figures 1 to 4 also apply to the example shown in Figure 5.

[0073] In operation S110, storage device 100 may open the model data of embedding model 1 stored in non-volatile memory device 120 in response to (or based on) a model open request from host 200. This operation is described in detail with reference to Figure 6.

[0074] In operation S120, storage device 100 may, in response to (or based on) a model read request from host 200, load the model data of embedding model 1 stored in non-volatile memory device 120 into embedding model buffer 113-1. This operation is described in detail with reference to Figure 7.

[0075] In operation S130, storage device 100 may read target data and generate an embedding vector for the target data in response to (or based on) a read request from host 200 for the target data and an embedding vector. Additionally, storage device 100 may provide host 200 with the read target data and the generated embedding vector for the target data in response to (or based on) a request from host 200 for obtaining the embedding vector. This operation is described in detail with reference to Figure 8.

[0076] In operation S140, storage device 100 may close the model data of embedding model 1 in response to a request from host 200. This operation is described in detail with reference to Figure 9.
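The sequence of operations S110 to S140 can be sketched from the host's point of view with an in-memory mock. Everything here is an illustrative assumption: the class `MockStorageDevice`, its method names, the file paths, and the placeholder "embedding" are hypothetical and do not reflect an actual device interface or the accelerator's computation.

```python
class MockStorageDevice:
    """Hypothetical in-memory model of operations S110 to S140."""

    def __init__(self, files: dict[str, bytes]) -> None:
        self.files = files         # stands in for non-volatile memory device 120
        self.model_buffer = None   # stands in for embedding model buffer 113-1
        self.pending = {}          # path -> (target data, embedding vector)

    def open_model(self, path: str) -> dict:           # S110
        return {"path": path, "size": len(self.files[path])}

    def load_model(self, descriptor: dict) -> None:    # S120
        self.model_buffer = self.files[descriptor["path"]]

    def read_with_embedding(self, path: str) -> None:  # S130, first request
        if self.model_buffer is None:
            raise RuntimeError("model data is not loaded")
        data = self.files[path]
        # Placeholder "embedding": two numbers derived from the data and model.
        vector = [float(len(data)), float(len(self.model_buffer))]
        self.pending[path] = (data, vector)

    def get_embedding(self, path: str):                # S130, second request
        return self.pending.pop(path)

    def close_model(self) -> None:                     # S140
        self.model_buffer = None


dev = MockStorageDevice({"/models/m1.bin": b"\x00" * 16, "/docs/a.txt": b"hello"})
fd = dev.open_model("/models/m1.bin")
dev.load_model(fd)
dev.read_with_embedding("/docs/a.txt")
data, vector = dev.get_embedding("/docs/a.txt")
dev.close_model()
print(data, vector)  # b'hello' [5.0, 16.0]
```

The mock keeps the two-request shape of S130: the first request triggers the read and the embedding computation inside the device, and the second request returns both the target data and the vector together.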

[0077] Figure 6 shows a method of operating a host, a storage controller, and a non-volatile memory device according to an embodiment.

[0078] Referring to Figure 6, the operation method according to this embodiment can be executed, for example, in the host 200, storage controller 110, and non-volatile memory device 120 of Figure 1. The open operation for the model data of embedding model 1 (operation S110 of Figure 5) is described in detail with reference to Figure 6.

[0079] Here, the model data of embedding model 1 has been pre-stored in the non-volatile memory device 120. In addition, the model data may correspond to file data in a file system, and the metadata of the model data is also pre-stored in the non-volatile memory device 120.

[0080] In operation S210, host 200 may send a model open request to storage controller 110. Here, the model open request may include file path information used in a directory-structured file system, and the model open request from host 200 may be a request for storage device 100 to check the metadata of the model data corresponding to the file path. The metadata may include information about the logical location of the model data, file size, and access permissions.

[0081] In other words, host 200 can request the metadata (or a file descriptor) of the model data of embedding model 1 from storage device 100 based on a file path of the operating system (OS).

[0082] In operation S220, the storage controller 110 may send a command to read the metadata of the model data to the non-volatile memory device 120 based on the model open request.

[0083] In operation S230, the non-volatile memory device 120 can perform a read operation on the metadata of the model data in response to a read command for the metadata of the model data.

[0084] In operation S240, the non-volatile memory device 120 can send the read metadata to the storage controller 110.

[0085] In operation S250, the storage controller 110 can generate a file descriptor based on the metadata. Here, a file descriptor is a structure used by the host 200 to identify and access a file, and the host 200 can reference the file descriptor when requesting a specific operation (e.g., a read and/or write request for file data) from the storage device 100. In other words, the storage controller 110 can generate a file descriptor for the model data based on the metadata.

[0086] In operation S260, storage controller 110 may send the file descriptor to host 200. Since the file descriptor is generated based on the metadata, it may include information about the logical location (e.g., logical block address (LBA)) of the model data of embedding model 1, file size, and access permissions.
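The model open flow of operations S210 to S260 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the class names, the single `readable` permission flag, the starting descriptor number, and the path `/models/embed.bin` are all hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelMetadata:
    """Metadata pre-stored in the non-volatile memory (fields follow [0080])."""
    lba: int          # logical location of the model data
    size_bytes: int   # file size
    readable: bool    # access permission, simplified to one flag (assumption)


@dataclass(frozen=True)
class FileDescriptor:
    """Handle returned to the host in S260 (hypothetical layout)."""
    fd: int
    lba: int
    size_bytes: int
    readable: bool


class NonVolatileMemory:
    def __init__(self, metadata_by_path):
        self._metadata_by_path = dict(metadata_by_path)

    def read_metadata(self, path):
        # S230/S240: read the metadata and return it to the controller.
        return self._metadata_by_path[path]


class StorageController:
    def __init__(self, nvm):
        self._nvm = nvm
        self._next_fd = 3  # arbitrary starting value for this sketch

    def open_model(self, path):
        metadata = self._nvm.read_metadata(path)  # S220 to S240
        descriptor = FileDescriptor(              # S250
            self._next_fd, metadata.lba, metadata.size_bytes, metadata.readable
        )
        self._next_fd += 1
        return descriptor                         # S260


nvm = NonVolatileMemory(
    {"/models/embed.bin": ModelMetadata(lba=4096, size_bytes=1 << 20, readable=True)}
)
controller = StorageController(nvm)
descriptor = controller.open_model("/models/embed.bin")
```

In this sketch the host would later pass `descriptor.lba` in the model read request, which matches the role the disclosure gives to the logical address carried by the file descriptor.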

[0087] Figure 7 illustrates a method of operating a host, a storage controller, and a non-volatile memory device according to an embodiment.

[0088] Referring to Figure 7, the operating method according to this embodiment may be performed, for example, in host 200, storage controller 110, and non-volatile memory device 120 of Figure 1. The operation of loading the model data of embedding model 1 (operation S120 of Figure 5) is described in detail with reference to Figure 7.

[0089] In operation S310, host 200 may send a model read request to storage controller 110. Here, the model read request may include information about the logical address of the model data, and the model read request from host 200 may be a request for storage device 100 to load the model data of embedding model 1 corresponding to the logical address from non-volatile memory device 120 into embedding model buffer 113-1.

[0090] In other words, host 200 can request storage device 100 to load model data corresponding to the logical address included in the model read request from non-volatile memory device 120 into embedding model buffer 113-1.

[0091] In operation S320, the storage controller 110 can send a read command for model data to the non-volatile memory device 120 based on the model read request. That is, the storage controller 110 can send a read command for model data to the non-volatile memory device 120 based on the logical address included in the model read request.

[0092] In operation S330, the non-volatile memory device 120 can perform a read operation on the model data in response to a read command for the model data.

[0093] In operation S340, the non-volatile memory device 120 can send the model data of the embedding model 1 that has been read to the storage controller 110.

[0094] In operation S350, the storage controller 110 can load the model data of the embedding model 1 into the embedding model buffer 113-1. That is, the storage controller 110 can store the received model data in the embedding model buffer 113-1.

[0095] In operation S360, when the model data of embedding model 1 has been loaded into embedding model buffer 113-1, storage controller 110 can send a model data loading completion response to host 200.

[0096] Here, unlike a typical read request, in response to (or based on) a model read request, storage device 100 may only load the model data corresponding to the logical address from non-volatile memory device 120 into embedding model buffer 113-1, and may not return the loaded model data to host 200.

[0097] In other words, storage device 100 may load model data into embedding model buffer 113-1 in response to (or based on) a model read request, without returning the model data to host 200.
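The difference between a model read request and a typical read request can be sketched as follows. This is an illustrative sketch only: the handler names, the dictionary-shaped responses, the status strings, and the address `0x100` are assumptions made for the example.

```python
class NonVolatileMemory:
    def __init__(self, blocks):
        self._blocks = dict(blocks)  # logical address -> stored bytes

    def read(self, lba):
        return self._blocks[lba]


class StorageController:
    def __init__(self, nvm):
        self._nvm = nvm
        self.embedding_model_buffer = None  # stands in for buffer 113-1

    def handle_model_read(self, lba):
        model_data = self._nvm.read(lba)          # S320 to S340
        self.embedding_model_buffer = model_data  # S350
        # S360: only a completion response goes back; no model payload.
        return {"status": "LOAD_COMPLETE"}

    def handle_normal_read(self, lba):
        # A typical read returns the data itself to the host.
        return {"status": "OK", "data": self._nvm.read(lba)}


nvm = NonVolatileMemory({0x100: b"model-weights"})
controller = StorageController(nvm)
response = controller.handle_model_read(0x100)
```

Here `response` carries no `data` field, while the model bytes remain in `controller.embedding_model_buffer` inside the device.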

[0098] Figure 8 illustrates a method of operating a host, a storage controller, and a non-volatile memory device according to an embodiment.

[0099] Referring to Figure 8, the operating method according to this embodiment may be performed, for example, in host 200, storage controller 110, and non-volatile memory device 120 of Figure 1. The operation of generating and providing an embedding vector using embedding model 1 (operation S130 of Figure 5) is described in detail with reference to Figure 8.

[0100] In operation S410, host 200 may send a read request for target data and embedding vector to storage controller 110. Here, the read request for target data and embedding vector may include information about the logical address of the target data, and the read request from host 200 may be a request for storage device 100 to read the target data corresponding to the logical address and generate the embedding vector corresponding to the target data.

[0101] In other words, host 200 can request storage device 100 to read target data corresponding to the logical address included in the read request for target data and embedding vector, and generate an embedding vector for the target data.

[0102] In operation S420, the storage controller 110 can check whether the model data of the embedding model 1 has been loaded into the embedding model buffer 113-1 based on the read request for the target data and the embedding vector.

[0103] In operation S420-1, if the check result indicates that the model data of embedding model 1 has not yet been loaded into the embedding model buffer 113-1, the storage controller 110 can send an IO failure response to the host 200.

[0104] In operation S420-2, if the check result indicates that the model data of embedding model 1 has been loaded into the embedding model buffer 113-1, the storage controller 110 can send a read command for the target data to the non-volatile memory device 120 based on the logical address included in the read request for the target data and the embedding vector.

[0105] In operation S430, the non-volatile memory device 120 can perform a read operation on the target data in response to a read command for the target data.

[0106] In operation S440, the non-volatile memory device 120 can send the target data to the storage controller 110.

[0107] In operation S450, the storage controller 110 can convert the received target data into chunk data.

[0108] For example, the chunk parsing module 119 of the storage controller 110 can process target data read from the non-volatile memory device 120 to generate chunk data. Here, chunk data may refer to input data used for the embedding operation of the acceleration module 111. Furthermore, the target data read from the non-volatile memory device 120 is raw data, which may be text data in page units such as 4 KB or 8 KB, and may not be directly usable in the embedding operation.

[0109] In an embodiment, the chunk parsing module 119 can generate chunk data by dividing target data read from the non-volatile memory device 120 into meaningful units. For example, the chunk parsing module 119 can perform semantic analysis on the read target data and divide it into meaningful units such as words, sentences, and paragraphs as needed. That is, the chunk data can be text data in units of at least one of words, sentences, and paragraphs.

[0110] In other words, the chunk parsing module 119 can convert target data in page units into chunk data of meaningful units.
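The conversion from page-unit text to meaningful units can be illustrated with the sketch below. It is a deliberate simplification: the disclosure mentions semantic analysis, whereas this example only splits on whitespace, sentence-ending punctuation, and blank lines. The function name `parse_chunks` and its `unit` parameter are hypothetical.

```python
import re


def parse_chunks(page_text, unit="sentence"):
    """Split page-unit text into word, sentence, or paragraph chunks."""
    text = page_text.strip()
    if unit == "word":
        return text.split()
    if unit == "sentence":
        # Break after ., ! or ? when followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", text)
        return [part for part in parts if part]
    if unit == "paragraph":
        # Paragraphs are separated by at least one blank line.
        parts = re.split(r"\n\s*\n", text)
        return [part.strip() for part in parts if part.strip()]
    raise ValueError(f"unsupported unit: {unit}")


page = "Flash stores pages. Hosts read them!\n\nEmbeddings need chunks."
sentences = parse_chunks(page, "sentence")
paragraphs = parse_chunks(page, "paragraph")
words = parse_chunks(page, "word")
```

Each element of `sentences`, `paragraphs`, or `words` would then be a candidate input to the embedding operation.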

[0111] In operation S460, the storage controller 110 can generate an embedding vector by performing an embedding operation on the chunk data.

[0112] For example, the vector embedding module 112 of the storage controller 110 can control the acceleration module 111 of the storage controller 110 to perform embedding operations on the chunk data based on the embedding model 1.

[0113] In other words, the acceleration module 111 can generate the embedding vector of the chunk data based on the embedding model 1 loaded into the embedding model buffer 113-1, according to the control of the vector embedding module 112.

[0114] Additionally, the embedding buffer 113-2 of the storage controller 110 can temporarily store intermediate embedding vectors required to generate the final embedding vector. The acceleration module 111 may need intermediate embedding vectors as intermediate results of the embedding operation to generate the embedding vector, and the embedding buffer 113-2 can temporarily store the intermediate embedding vectors.
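The role of the embedding buffer as scratch space for intermediate results can be sketched as follows. The disclosure does not specify the embedding model, so this example assumes a toy model: a lookup table from token to vector, with mean pooling to form the final vector. The names `embed_chunk` and `model_table`, and the pooling choice, are assumptions for illustration only.

```python
def embed_chunk(chunk, model_table, dim, embedding_buffer):
    """Embed one chunk, keeping per-token vectors in embedding_buffer."""
    embedding_buffer.clear()
    for token in chunk.lower().split():
        # Intermediate embedding vector for this token (zeros if unknown).
        embedding_buffer.append(model_table.get(token, [0.0] * dim))
    if not embedding_buffer:
        return [0.0] * dim
    count = len(embedding_buffer)
    # Final embedding vector: element-wise mean of the intermediate vectors.
    return [sum(vec[i] for vec in embedding_buffer) / count for i in range(dim)]


model_table = {"flash": [1.0, 0.0], "memory": [0.0, 1.0]}  # toy model data
embedding_buffer = []  # stands in for embedding buffer 113-2
vector = embed_chunk("Flash memory", model_table, 2, embedding_buffer)
```

After the call, `embedding_buffer` still holds the two per-token vectors, while `vector` holds the pooled result.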

[0115] In operation S470, when the generation of the embedding vector is complete, the storage controller 110 can send a response indicating that the generation of the embedding vector is complete to the host 200.

[0116] In operation S480, host 200 may send a request to the storage controller 110 to obtain the embedding vector.

[0117] In operation S490, the storage controller 110 may send the target data and the embedding vector for the target data to the host 200 in response to (or based on) a request for the embedding vector.

[0118] Here, the request for the embedding vector from the host 200 can be a request for the storage device 100 to return the target data and the embedding vector for the target data.
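The two-request exchange of Figure 8 (operations S410 to S490) can be sketched as follows. This is a hedged illustration: the handler names, the status strings, keying the pending result by logical address, and the `toy_embed` function are assumptions and are not taken from the disclosure.

```python
class StorageController:
    def __init__(self, nvm, embed_fn):
        self._nvm = nvm                     # logical address -> target data
        self._embed_fn = embed_fn           # stands in for the accelerator
        self.embedding_model_buffer = None  # stands in for buffer 113-1
        self._pending = {}                  # results awaiting the second request

    def load_model(self, model_data):
        self.embedding_model_buffer = model_data

    def handle_read_with_embedding(self, lba):
        # S420 / S420-1: fail the IO when no model data is loaded.
        if self.embedding_model_buffer is None:
            return {"status": "IO_FAILURE"}
        target_data = self._nvm[lba]        # S420-2 to S440
        vector = self._embed_fn(target_data, self.embedding_model_buffer)  # S450, S460
        self._pending[lba] = (target_data, vector)
        return {"status": "EMBEDDING_COMPLETE"}  # S470

    def handle_get_embedding(self, lba):
        # S480 / S490: return the target data together with its vector.
        target_data, vector = self._pending.pop(lba)
        return {"target_data": target_data, "embedding": vector}


def toy_embed(text, model):
    # Placeholder embedding: one value derived from the text length.
    return [len(text) * model["scale"]]


controller = StorageController({7: "hello"}, toy_embed)
first_try = controller.handle_read_with_embedding(7)   # model not loaded yet
controller.load_model({"scale": 0.5})
second_try = controller.handle_read_with_embedding(7)
result = controller.handle_get_embedding(7)
```

The first attempt fails because the model buffer is empty; after loading, the read request only reports completion, and the data and vector cross the host interface once, on the second request.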

[0119] Figure 9 illustrates a method of operating a host, a storage controller, and a non-volatile memory device according to an embodiment.

[0120] Referring to Figure 9, the operating method according to this embodiment may be performed, for example, in the host 200, storage controller 110, and non-volatile memory device 120 shown in Figure 1. The operation of closing the model data of embedding model 1 (operation S140 of Figure 5) is described in detail with reference to Figure 9.

[0121] In operation S510, host 200 may send a model close request to storage controller 110. Here, the model close request from host 200 may be a request for storage device 100 to release the file descriptor of the model data.

[0122] In other words, by requesting storage device 100 to close the model data of embedding model 1, host 200 can terminate its reference to the model data.

[0123] In operation S520, the storage controller 110 can perform a close operation on the model data of the non-volatile memory device 120 based on the model close request.

[0124] For example, the storage controller 110 can control the non-volatile memory device 120 to release corresponding file descriptors. File descriptors can be released so that they are no longer used by the non-volatile memory device 120. That is, any connections referencing files, such as model data, can be terminated.

[0125] Furthermore, if model data has already been mapped to a specific address range in virtual memory, the storage controller 110 can unmap the model data. This allows the virtual address space to be reclaimed and used for other operations.

[0126] Additionally, storage controller 110 can update the metadata of the model data. For example, storage controller 110 can record in the metadata of the model data that a file has been closed. Storage controller 110 can update timestamps such as the closure time in the metadata of the model data. Storage controller 110 can reflect the resource management status by decrementing the reference count of the metadata of the model data or setting the count to 0.

[0127] Additionally, the storage controller 110 can control the non-volatile memory device 120 to release the cache or buffer that is being used for the corresponding model data.

[0128] In operation S530, host 200 may send a refresh request for model data to storage controller 110. Here, the refresh request from host 200 may be a request for storage device 100 to remove model data loaded into embedding model buffer 113-1.

[0129] In operation S540, the storage controller 110 can remove model data loaded into the embedding model buffer 113-1 in response to a refresh request.
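The separation between closing the model data (S510, S520) and removing it from the buffer (S530, S540) can be sketched as follows. The metadata fields `ref_count` and `closed_at`, the method names, and the timestamp value are hypothetical; the sketch only mirrors the bookkeeping the text describes.

```python
class StorageController:
    def __init__(self):
        self.open_files = {}                # fd -> metadata of open model data
        self.embedding_model_buffer = None  # stands in for buffer 113-1

    def close_model(self, fd, now):
        # S520: release the descriptor and update the metadata.
        metadata = self.open_files.pop(fd)
        metadata["ref_count"] = max(0, metadata["ref_count"] - 1)
        metadata["closed_at"] = now
        return metadata

    def refresh_model(self):
        # S540: remove the model data loaded into the buffer.
        self.embedding_model_buffer = None


controller = StorageController()
controller.open_files[3] = {"ref_count": 1, "closed_at": None}
controller.embedding_model_buffer = b"model-weights"

metadata = controller.close_model(3, now=1700000000)
buffer_after_close = controller.embedding_model_buffer  # still loaded
controller.refresh_model()
```

In this sketch, closing ends the host's reference to the file, while the loaded model stays in the buffer until the separate refresh request arrives.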

[0130] As described above, according to the embodiments, the embedding vector generation operation described with reference to Figures 1 to 9 can be performed "on-device" in storage device 100.

[0131] According to this disclosure, the embedding vector is generated (or provided) as part of executing a request from host 200 (e.g., the read request for target data and an embedding vector in Figure 8, or the request to obtain the embedding vector in Figure 8), which reduces the number of I/O paths used to generate (or provide) the embedding vector. Redundant access to separate I/O paths can therefore be minimized, improving the efficiency of system resource utilization.

[0132] Figure 10 shows a system 2000 having a storage device according to an embodiment.

[0133] The system 2000 shown in Figure 10 can be a mobile system such as a mobile phone, smartphone, tablet computer, wearable device, healthcare device, or Internet of Things (IoT) device. However, the system 2000 shown in Figure 10 is not necessarily limited to a mobile system, and can be a personal computer, laptop computer, server, media player, or automotive device such as a navigation system. Referring to Figure 10, the system 2000 may include a main processor 2100, memories 2200a and 2200b, and storage devices 2300a and 2300b, and may additionally include one or more of the following: an image capture device 2410, a user input device 2420, a sensor 2430, a communication device 2440, a display 2450, a speaker 2460, a power supply device 2470, and a connection interface 2480.

[0134] The main processor 2100 can control the overall operation of the system 2000, and more specifically, control the operation of other components constituting the system 2000. Such a main processor 2100 can be implemented as a general-purpose processor, a dedicated processor, or an application processor. The main processor 2100 may include one or more CPU cores 2110, and may also include a controller 2120 for controlling memories 2200a and 2200b and / or storage devices 2300a and 2300b. According to embodiments, the main processor 2100 may also include an accelerator 2130 as dedicated circuitry for high-speed data operations such as artificial intelligence (AI) data operations. Such an accelerator 2130 may include a GPU, NPU, and / or DPU, and may be implemented as a separate chip physically independent of other components of the main processor 2100.

[0135] Memories 2200a and 2200b can be used as the main memory devices of system 2000 and can include one or more volatile memories such as SRAM and/or DRAM, but can also include one or more non-volatile memories such as flash memory, PRAM and/or RRAM. Memories 2200a and 2200b can also be implemented in the same package as main processor 2100.

[0136] Storage devices 2300a and 2300b can be used as non-volatile storage devices that store data regardless of whether power is supplied to them, and can have a relatively large storage capacity compared to memories 2200a and 2200b. Storage devices 2300a and 2300b may include storage controllers 2310a and 2310b and non-volatile memories 2320a and 2320b that store data under the control of storage controllers 2310a and 2310b. Non-volatile memories 2320a and 2320b may include flash memory with a 2D NAND structure or a 3D V-NAND structure, but may also include other types of non-volatile memories such as PRAM and / or RRAM.

[0137] Storage devices 2300a and 2300b can be included in system 2000 in a physically separate state from main processor 2100, or they can be implemented within the same package as main processor 2100. Furthermore, storage devices 2300a and 2300b, in the form of SSDs or memory cards, can be detachably connected to other components of system 2000 via interfaces such as connection interface 2480, which will be described later. Storage devices 2300a and 2300b can be devices applying standard specifications such as UFS, eMMC, or NVMe, but are not limited to the examples described above. The embodiments described above with reference to Figures 1 to 9 can be implemented in storage devices 2300a and 2300b.

[0138] Image capture device 2410 can record still or moving images and can be a camera, video camera, and / or webcam. User input device 2420 can receive various types of data input from the user of system 2000 and can be a touchpad, keypad, keyboard, mouse, and / or microphone. Sensor 2430 can detect various types of physical quantities that can be obtained from outside system 2000 and convert the detected physical quantities into electrical signals. Such sensor 2430 can be a temperature sensor, pressure sensor, light sensor, position sensor, accelerometer, biosensor, and / or gyroscope sensor.

[0139] Communication device 2440 can send and receive signals between other devices outside system 2000 according to various communication protocols. Such communication device 2440 can be implemented in a configuration including an antenna, transceiver, and / or modem. Display 2450 and speaker 2460 can be used as output devices to output visual and auditory information to the user of system 2000, respectively. Power supply device 2470 can appropriately convert power supplied from a battery built into system 2000 and / or an external power source, and supply it to each component of system 2000. Connection interface 2480 can provide connectivity between system 2000 and external devices connected to system 2000 and capable of exchanging data with system 2000.

[0140] Although this disclosure has been specifically shown and described with reference to embodiments thereof, it should be understood that various changes in form and detail may be made therein without departing from the spirit and scope of the appended claims.

[0141] The terms “send,” “receive,” and “communicate,” and their derivatives, include both direct and indirect communication. The terms “include,” “comprise,” and their derivatives, mean including but not limited to. The term “or” is an inclusive term meaning “and/or.” The phrase “associated with,” and its derivatives, mean including, being included in, interconnected with, containing, being contained in, connected to or with, coupled to or with, able to communicate with, cooperating with, interleaved with, juxtaposed with, proximate to, bound to or with, having, possessing the properties of, related to, or having a relationship with. The term “controller” (e.g., storage controller 110) means any device, system, or part thereof that controls at least one operation. The functionality associated with any specified controller can be centralized or distributed, whether local or remote. The phrase “at least one of…” when used with a list of items means that different combinations of one or more of the listed items may be used, and only one item from the list may be required. For example, "at least one of A, B, and C" includes any of the following combinations: A, B, C, A and B, A and C, B and C, and A and B and C, and any variations thereof. As an additional example, the expression "at least one of a, b, or c" can indicate only a, only b, only c, both a and b, both a and c, both b and c, all of a, b, and c, or variations thereof. Similarly, the term "set" means one or more. Thus, a set of items can be a single item or a collection of two or more items. Furthermore, the various functions described herein can be implemented or supported by one or more computer programs, each of which is formed from computer-readable program code and embodied in a computer-readable medium.
The terms "application" and "program" refer to one or more computer programs, software components, instruction sets, processes, functions, objects, classes, instances, associated data, or portions thereof that are implemented in suitable computer-readable program code. The phrase "computer-readable program code" includes any type of computer code containing source code, object code, and executable code. The phrase "computer-readable media" includes any type of media that can be accessed by a computer, such as read-only memory (ROM), random access memory (RAM), hard disk drive, compact disc (CD), digital video disc (DVD), or any other type of storage. "Non-transitory" computer-readable media does not include wired, wireless, optical, or other communication links that transmit transient electrical or other signals. Non-transitory computer-readable media includes media capable of permanently storing data and media capable of storing and later rewriting data, such as rewritable optical discs or erasable memory devices.

Claims

1. A storage device comprising: a storage controller including an embedding model buffer and an accelerator; and a non-volatile memory operatively connected to the storage controller, wherein the non-volatile memory is configured to store target data and model data of an embedding model, and wherein the storage controller is configured to: based on a first request from a host, send a read command for the target data to the non-volatile memory; receive the target data from the non-volatile memory, and based on the received target data and the model data loaded into the embedding model buffer, generate an embedding vector using the accelerator; and based on a second request from the host, send the target data and the generated embedding vector to the host.

2. The storage device according to claim 1, wherein the storage controller is further configured to: based on the first request, check whether the model data has been loaded into the embedding model buffer; based on a first check result that the model data has not been loaded into the embedding model buffer, send an input/output (IO) failure response to the host; and based on a second check result that the model data has been loaded into the embedding model buffer, send the read command for the target data to the non-volatile memory.

3. The storage device according to claim 1, wherein the storage controller is further configured to: convert the received target data into chunk data; and generate the embedding vector by performing an embedding operation on the chunk data using the accelerator.

4. The storage device according to claim 3, wherein the target data includes text data in page units, and the chunk data includes text data in units of at least one of words, sentences, or paragraphs.

5. The storage device according to claim 1, wherein the storage controller is further configured to: receive a model open request from the host; based on the model open request, send a read command for metadata of the model data to the non-volatile memory; receive the metadata from the non-volatile memory; generate a file descriptor based on the metadata; and send the generated file descriptor to the host.

6. The storage device according to claim 1, wherein the storage controller is further configured to: receive a model read request from the host; based on the model read request, send a read command for the model data to the non-volatile memory; receive the model data from the non-volatile memory; and load the model data into the embedding model buffer.

7. The storage device according to claim 1, wherein the storage controller is further configured to: receive a model close request from the host, and perform a close operation on the model data in the non-volatile memory based on the model close request; and receive a model data refresh request from the host, and remove the model data loaded into the embedding model buffer based on the model data refresh request.

8. A storage controller configured to control a non-volatile memory storing target data and model data of an embedding model, the storage controller comprising: an embedding model buffer; and an accelerator, wherein the storage controller is configured to: based on a first request from a host, send a read command for the target data to the non-volatile memory; receive the target data from the non-volatile memory, and based on the received target data and the model data loaded into the embedding model buffer, generate an embedding vector using the accelerator; and based on a second request from the host, send the target data and the generated embedding vector to the host.

9. The storage controller according to claim 8, wherein the storage controller is further configured to: based on the first request, check whether the model data has been loaded into the embedding model buffer; based on a first check result that the model data has not been loaded into the embedding model buffer, send an input/output (IO) failure response to the host; and based on a second check result that the model data has been loaded into the embedding model buffer, send the read command for the target data to the non-volatile memory.

10. The storage controller according to claim 8, wherein the storage controller is further configured to: convert the received target data into chunk data; and generate the embedding vector by performing an embedding operation on the chunk data using the accelerator.

11. The storage controller according to claim 10, wherein the target data includes text data in page units, and the chunk data includes text data in units of at least one of words, sentences, or paragraphs.

12. The storage controller according to claim 8, wherein the storage controller is further configured to: receive a model open request from the host; based on the model open request, send a read command for metadata of the model data to the non-volatile memory; receive the metadata from the non-volatile memory; generate a file descriptor based on the metadata; and send the generated file descriptor to the host.

13. The storage controller according to claim 8, wherein the storage controller is further configured to: receive a model read request from the host; based on the model read request, send a read command for the model data to the non-volatile memory; receive the model data from the non-volatile memory; and load the model data into the embedding model buffer.

14. The storage controller according to claim 8, wherein the storage controller is further configured to: receive a model close request from the host, and perform a close operation on the model data in the non-volatile memory based on the model close request; and receive a model data refresh request from the host, and remove the model data loaded into the embedding model buffer based on the model data refresh request.

15. A method of operating a storage controller, the storage controller including an embedding model buffer and an accelerator and controlling a non-volatile memory, the method comprising: sending a read command for the target data to the non-volatile memory based on a first request from a host; receiving the target data from the non-volatile memory, and generating an embedding vector using the accelerator based on the received target data and the model data loaded into the embedding model buffer; and sending the target data and the generated embedding vector to the host based on a second request from the host.

16. The operating method according to claim 15, wherein sending the read command for the target data to the non-volatile memory further includes: based on the first request, checking whether the model data of the embedding model has been loaded into the embedding model buffer; based on a first check result that the model data of the embedding model has not been loaded into the embedding model buffer, sending an input/output (IO) failure response to the host; and based on a second check result that the model data of the embedding model has been loaded into the embedding model buffer, sending the read command for the target data to the non-volatile memory.

17. The operating method according to claim 15, wherein generating the embedding vector includes: converting the received target data into chunk data; and generating the embedding vector by performing an embedding operation on the chunk data using the accelerator.

18. The operating method according to claim 17, wherein the target data includes text data in page units, and the chunk data includes text data in units of at least one of words, sentences, or paragraphs.

19. The operating method according to claim 15, further comprising: receiving a model open request from the host, and based on the model open request, sending a read command for metadata of the model data stored in the non-volatile memory to the non-volatile memory; and receiving the metadata from the non-volatile memory, generating a file descriptor based on the metadata, and sending the generated file descriptor to the host.

20. The operating method according to claim 15, further comprising: receiving a model read request from the host, and based on the model read request, sending a read command for the model data stored in the non-volatile memory to the non-volatile memory; and receiving the model data from the non-volatile memory and loading the model data into the embedding model buffer.