Storage devices and storage systems
By generating and managing vector data items in storage devices, the problem of increased storage space and CPU computation in vector databases is solved, achieving more efficient data storage and search performance.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SAMSUNG ELECTRONICS CO LTD
- Filing Date
- 2025-08-21
- Publication Date
- 2026-06-19
AI Technical Summary
In existing technologies, using a vector database increases the storage space required on storage devices and the amount of CPU computation, and the large amount of input/output data degrades performance.
Vector data items are generated in the storage device and stored separately from the original data in different memory groups. A storage controller manages the memory groups, defines logical address spaces, and uses mapping tables for data mapping, which reduces the CPU computational burden and optimizes data search.
It effectively reduces the amount of input/output data to storage devices, improves data search performance, reduces CPU load, and enhances storage and search efficiency.
Figure CN122240010A_ABST
Abstract
Description
[0001] Cross-references to related applications
[0002] This application claims the benefit of priority to Korean Patent Application No. 10-2024-0188170, filed on December 17, 2024, with the Korean Intellectual Property Office, the disclosure of which is incorporated herein by reference in its entirety.
Technical Field
[0003] This disclosure relates to storage devices and storage systems.
Background Technology
[0004] With the development of large language model (LLM) techniques that use retrieval-augmented generation (RAG), vector database technology, potentially a core technology of RAG, has garnered attention. Vector databases can be designed to store and search vector data, which can be high-dimensional mathematical representations of the original data. For example, when original data items have similar meanings, the distance between their vectors can be short. Using a vector database, data similar to a given query can be easily found.
[0005] Storing vector data alongside raw data in a database system may increase the required storage space and may necessitate reading large amounts of vector data to find data similar to a query. Consequently, the computational load on the database system's central processing unit (CPU) and the amount of data input / output between the CPU and storage devices may increase.
Summary of the Invention
[0006] A storage device is provided that offloads the CPU computation of a database system, reduces the amount of input / output data, and further improves data search performance.
[0007] According to one aspect of this disclosure, a storage device includes: a plurality of memory groups, each including a plurality of non-volatile memories; a storage controller configured to control the plurality of memory groups; and a plurality of channels respectively connected to the storage controller and the plurality of memory groups, wherein the storage controller is configured to: generate one or more vector data items by performing vector embedding on original data items, and store the original data items and the one or more vector data items respectively in different memory groups of the plurality of memory groups.
[0008] According to one aspect of this disclosure, a method performed by a storage device includes: generating one or more vector data items by performing vector embedding on original data items; and storing the original data items and the one or more vector data items in different memory groups in a plurality of memory groups, each including a plurality of non-volatile memories.
[0009] The method further includes: defining a logical address space for at least one of a plurality of memory groups, and assigning the same logical address to the original data item and one or more vector data items associated with the original data item.
[0010] The method further includes: storing a first mapping table that maps the multiple memory groups to multiple data types, wherein the multiple data types include an original data type and one or more vector data types.
[0011] The method also includes storing a second mapping table for mapping logical and physical addresses of multiple memory groups.
[0012] The one or more vector data items include at least one of a vector, a link to an associated vector of the vector, a distance between the vector and the associated vector, and a gradient between the vector and the associated vector.
[0013] The one or more vector data items include multiple vector data items with multiple different vector data types.
[0014] The original data items include data with different attributes, and the plurality of vector data items include vectors for the data with different attributes.
[0015] The method also includes controlling multiple memory groups to simultaneously write raw data items and one or more vector data items.
[0016] The method further includes, in response to a query from the host, simultaneously reading raw data items and one or more vector data items by controlling multiple memory groups.
[0017] The method further includes storing the raw data items and the one or more vector data items obtained from the multiple memory groups in a buffer memory, and outputting the raw data items stored in the buffer memory to the host when the one or more vector data items correspond to a neighboring vector adjacent to the query vector corresponding to the query.
[0018] According to one aspect of this disclosure, a storage device includes: a plurality of memory groups, each including a plurality of non-volatile memories; a storage controller configured to control the plurality of memory groups; and a plurality of channels respectively connected to the storage controller and the plurality of memory groups, wherein the storage controller is configured to: generate vectors by performing vector embedding on original data items, generate a plurality of vector data items connecting the vectors and different vectors based on different vector indexing algorithms, and store the original data items and the plurality of vector data items respectively in different memory groups of the plurality of memory groups.
[0019] According to one aspect of this disclosure, a method performed by a storage device includes: performing vector embedding on an original data item to generate a vector, generating a plurality of vector data items that connect the vector and the different vectors based on different vector indexing algorithms, and storing the original data item and the plurality of vector data items in different memory groups in a plurality of memory groups that each includes a plurality of non-volatile memories.
[0020] According to one aspect of this disclosure, a storage system includes: at least one processor configured to perform vector embedding on raw data items and generate one or more vector data items; a storage device including a plurality of memory groups, each of the plurality of memory groups including a plurality of non-volatile memories; a storage controller configured to control the plurality of memory groups; and a plurality of channels respectively connected to the storage controller and the plurality of memory groups, wherein the storage controller is further configured to: receive raw data items and one or more vector data items from the at least one processor, and store the raw data items and one or more vector data items respectively in different memory groups of the plurality of memory groups.
[0021] According to one aspect of this disclosure, a method performed by a storage system includes: performing vector embedding on an original data item and generating one or more vector data items; receiving the original data item and one or more vector data items, and storing the original data item and one or more vector data items in different memory groups among a plurality of memory groups including a plurality of non-volatile memories.
Attached Figure Description
[0022] The above and other aspects, features and advantages of this disclosure will become clearer from the following detailed description taken in conjunction with the accompanying drawings, in which:
[0023] Figure 1 shows a large language model (LLM) system according to an embodiment;
[0024] Figure 2 shows a storage device according to an embodiment;
[0025] Figure 3 shows an example of original data;
[0026] Figures 4A and 4B show examples of different types of vectors;
[0027] Figures 5A and 5B show examples of different types of vector spaces;
[0028] Figure 6 shows a first mapping table of a storage device according to an embodiment;
[0029] Figures 7A and 7B show a second mapping table of a storage device according to an embodiment;
[0030] Figure 8 illustrates a data storage operation of a storage device according to an embodiment;
[0031] Figure 9 illustrates a data search operation of a storage device according to an embodiment;
[0032] Figure 10 shows an example of vector data according to an embodiment;
[0033] Figure 11 illustrates a data search process of a storage device according to an embodiment;
[0034] Figures 12A to 12C show an example of a vector database;
[0035] Figure 13 illustrates a data search method of a storage device according to an embodiment;
[0036] Figure 14 shows a vector database system according to an embodiment; and
[0037] Figure 15 shows a storage system according to an embodiment.
Detailed Implementation
[0038] The embodiments described herein are non-limiting exemplary embodiments, and therefore, this disclosure is not limited thereto, and may be implemented in various other forms. As used herein, the expression "at least one of..." following the list of elements modifies the entire list of elements and does not modify any individual element in the list. For example, the expression "at least one of a, b, and c" or "at least one of a, b, or c" should be understood to include only a, only b, only c, both a and b, both a and c, both b and c, or all of a, b, and c.
[0039] Figure 1 shows a large language model (LLM) system according to an embodiment.
[0040] An LLM may refer to an artificial intelligence (AI) model that is trained on large text datasets and designed to understand and generate natural language. An LLM can use deep learning to analyze patterns, context, and relationships in text to perform tasks such as translation, summarization, question answering, and content generation.
[0041] While the development of LLMs has improved natural language processing capabilities, problems can arise, such as hallucination, in which the model generates unfounded information. Hallucination may stem from the tendency of an LLM to generate the most plausible response based on probability rather than on precise information. Retrieval-augmented generation (RAG) combines knowledge retrieval with an LLM to generate answers based on pre-prepared data. Thus, an LLM with RAG can reduce hallucination and improve the accuracy of responses.
[0042] Vector database technology (which can be a core technology of RAG) has attracted attention in the field of artificial intelligence. Vector databases can be designed to store and retrieve vectors, which can be high-dimensional mathematical representations of the original data. For example, when original data items have similar meanings, the distances between the vectors of those original data items can be shorter. Using vector database technology, it is easy to find data similar to a given query.
[0043] Referring to Figure 1, the LLM system 10 may include a vector database system 20 and a user system 30. When the vector database system 20 receives a query from the user system 30, it can search for the data that most closely matches the query based on the "query vector" corresponding to the query. The vector database system 20 can generate a response based on the searched data and can provide the response to the user system 30, as shown in Figure 1.
[0044] In Figure 1, the vector database system 20 may include a host 100 and multiple storage devices 201 to 203.
[0045] Host 100 may include at least one core (or processor) for processing commands. For example, host 100 may include at least one of an application processor, microprocessor, central processing unit (CPU), processor core, multi-core processor, multiprocessor, application-specific integrated circuit (ASIC), and field-programmable gate array (FPGA).
[0046] Each or at least one of storage devices 201 to 203 may include a storage medium for storing data in response to (or in connection with) a request from host 100. For example, at least one of storage devices 201 to 203 may include at least one of a solid-state drive (SSD), embedded memory, or removable external memory. When each or at least one of storage devices 201 to 203 is an SSD, embedded memory, or external memory, each or at least one of storage devices 201 to 203 may also include a non-volatile memory device.
[0047] When storage devices 201 to 203 are SSDs, they may conform to the Non-Volatile Memory Express (NVMe) standard. When storage devices 201 to 203 are embedded or external memory, they may conform to the Universal Flash Storage (UFS) standard or the embedded MultiMediaCard (eMMC) standard. Each or at least one of the host 100 and storage devices 201 to 203 may generate packets according to the adopted standard protocol and may send such packets.
[0048] The vector database system 20 can analyze raw data to divide it into multiple raw data items, and can generate vector data items based on the raw data items. The raw data items and vector data items can be stored in at least one of multiple storage devices 201 to 203.
[0049] When a query is received, the vector database system 20 can analyze the query to generate at least one query vector, search for the most similar vector from multiple storage devices 201 to 203, generate a response by referring to the data corresponding to the searched vector, and output the response.
[0050] The vector database system 20 can store vector data corresponding to the original data together with the original data, and therefore can process a large amount of data compared to the amount of the original data. For example, when the vector database system 20 generates and stores several types of vector data corresponding to the original data, the amount of data stored in multiple storage devices 201 to 203 can increase by about ten (10) times compared to the amount of the original data.
[0051] In related technologies, when host 100 stores raw data and vector data in multiple storage devices 201 to 203 in a distributed manner and searches all multiple storage devices 201 to 203 to find the data corresponding to the query, the computational burden of host 100 may increase, and the amount of data input / output between host 100 and multiple storage devices 201 to 203 may increase.
[0052] For example, when a vector data item (corresponding to an original data item stored in the first storage device 201) is stored in the second storage device 202, and when the original data item is updated, the host 100 can also update the vector data item. To update the vector data item, the host 100 can load the vector data item stored in the second storage device 202, modify the loaded vector data item, and then provide the modified vector data item to the second storage device 202.
[0053] Additionally, when host 100 searches the second storage device 202 and finds a vector data item having a vector similar to a query vector, host 100 can then obtain the original data item corresponding to the found vector data item from the first storage device 201.
[0054] As in the example above, when raw data and vector data are distributed and stored in multiple storage devices 201 to 203, the amount of data input / output between host 100 and the multiple storage devices 201 to 203 may increase.
[0055] According to an embodiment, each or at least one of the storage devices 201 to 203 can store raw data and vector data corresponding to the raw data, and the vector data can be used to search for the raw data. According to an embodiment, the computational burden of the host 100 can be offloaded, and the amount of data input / output between the host 100 and the plurality of storage devices 201 to 203 can be reduced.
[0056] Additionally, the embodiments described below can improve the performance of storing raw data and vector data, as well as the performance of searching for similar data based on queries.
[0057] Figure 2 shows a storage device according to an embodiment.
[0058] The storage device 200 of Figure 2 may correspond to any one of the storage devices 201 to 203 described with reference to Figure 1. The storage device 200 may include a storage controller 210 and a memory device 220.
[0059] The storage controller 210 can control the overall operation of the storage device 200. For example, the storage controller 210 can store data in the memory device 220 in response to (or in connection with) a request from the host 100 described with reference to Figure 1. The data stored in the memory device 220 can be provided to the host 100 in response to a request from the host 100.
[0060] When the memory device 220 includes flash memory, the flash memory may include a 2D NAND memory array or a 3D (or vertical) NAND (VNAND) memory array. As another example, the memory device 220 may include various other types of non-volatile memory. For example, various types of memory, such as magnetic RAM (MRAM), spin-transfer torque MRAM, conductive bridging RAM (CBRAM), ferroelectric RAM (FeRAM), phase-change RAM (PRAM), resistive RAM, etc., may be used in the memory device 220.
[0061] The storage controller 210 may include a processor 211 and a buffer memory 212. The processor 211 may execute firmware. The buffer memory 212 may temporarily store data provided from the host 100 or data output from the memory device 220.
[0062] Firmware can refer to the software that controls storage device 200. For example, the firmware of storage device 200 may include a flash translation layer (FTL). For example, the FTL can translate logical addresses used in host 100 into physical addresses in memory device 220 and can perform management operations such as garbage collection or wear leveling.
[0063] The memory device 220 may include a plurality of non-volatile memories NVM11 to NVM44. At least one of the non-volatile memories NVM11 to NVM44 may be connected to one of a plurality of channels CH1 to CH4 via corresponding paths. For example, non-volatile memories NVM11 to NVM14 may be connected to the first channel CH1 via paths W11 to W14, and non-volatile memories NVM21 to NVM24 may be connected to the second channel CH2 via paths W21 to W24. For example, non-volatile memories NVM31 to NVM34 may be connected to the third channel CH3 via paths W31 to W34, and non-volatile memories NVM41 to NVM44 may be connected to the fourth channel CH4 via paths W41 to W44.
[0064] In the example embodiment, each or at least one of the non-volatile memories NVM11 to NVM44 can be implemented as any memory unit that can operate according to individual commands from the storage controller 210. For example, each or at least one of the non-volatile memories NVM11 to NVM44 can be implemented as a chip or a die.
[0065] The storage controller 210 can send signals to and receive signals from the memory device 220 through the multiple channels CH1 to CH4. For example, the storage controller 210 can send commands, addresses, and data to the memory device 220, or receive data from the memory device 220, through channels CH1 to CH4.
[0066] Through each channel, the storage controller 210 can select one of the non-volatile memories connected to that channel, and can send signals to and receive signals from the selected non-volatile memory. The storage controller 210 can send commands, addresses, and data to the selected non-volatile memory through the channel, or receive data from the selected non-volatile memory through the channel.
[0067] The storage controller 210 can send signals to and receive signals from the memory device 220 in parallel through different channels. For example, the storage controller 210 can send a command to the memory device 220 through the first channel CH1 while simultaneously sending a different command to the memory device 220 through the second channel CH2. Furthermore, the storage controller 210 can receive data from the memory device 220 through the first channel CH1 while simultaneously receiving different data from the memory device 220 through the second channel CH2.
[0068] Each or at least two of the non-volatile memories connected to the storage controller 210 via the same channel can perform internal operations in parallel. For example, the storage controller 210 can sequentially send commands and addresses to the non-volatile memories NVM11 to NVM14 via the first channel CH1. Once the commands and addresses have been sent to the non-volatile memories NVM11 to NVM14, each or at least two of the non-volatile memories NVM11 to NVM14 can perform operations in parallel according to the commands.
[0069] Figure 2 shows the memory device 220 communicating with the storage controller 210 via four channels, with four non-volatile memories connected to each channel. However, this disclosure is not limited to the example embodiment described above, and the number of channels and the number of non-volatile memories connected to one channel can be varied.
[0070] According to an embodiment, non-volatile memories connected to the same channel can be grouped into a memory group. In the example of Figure 2, the memory device 220 may include four memory groups corresponding to the first channel CH1, the second channel CH2, the third channel CH3, and the fourth channel CH4, respectively.
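The grouping of non-volatile memories by channel can be sketched as follows. This is a minimal illustration, assuming the NVM<channel><index> naming of Figure 2; the function name group_by_channel and the use of a dictionary are assumptions made for illustration, not part of the disclosure.

```python
from collections import defaultdict

# Names follow the Figure 2 convention NVM<channel><index>:
# four memories on each of four channels.
nvm_names = [f"NVM{ch}{idx}" for ch in range(1, 5) for idx in range(1, 5)]


def group_by_channel(names):
    """Group memories that share a channel into one memory group (hypothetical helper)."""
    groups = defaultdict(list)
    for name in names:
        channel = f"CH{name[3]}"  # the first digit after "NVM" is the channel number
        groups[channel].append(name)
    return dict(groups)


memory_groups = group_by_channel(nvm_names)
for channel, members in memory_groups.items():
    print(channel, members)
```

With the sixteen memories of Figure 2, this yields four memory groups of four memories each.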
[0071] According to an embodiment, the storage controller 210 can store the raw data and the vector data corresponding to the raw data in different memory groups.
[0072] For example, when raw data RAWD is received externally, the raw data RAWD can be buffered in buffer memory 212. Processor 211 can generate different types of vector data VECD1, VECD2, and VECD3 based on the raw data RAWD buffered in buffer memory 212. The generated vector data VECD1, VECD2, and VECD3 can be buffered in buffer memory 212. Processor 211 can control memory device 220 so that the raw data RAWD and vector data VECD1, VECD2, and VECD3 buffered in buffer memory 212 can be stored in different memory groups.
[0073] According to the embodiment, raw data RAWD and vector data VECD1, VECD2, and VECD3 can be provided to the memory groups in parallel through different channels CH1 to CH4, and can be written to the memory groups simultaneously. Therefore, the storage performance (throughput) of raw data RAWD and vector data VECD1, VECD2, and VECD3 can be improved.
[0074] Furthermore, the storage controller 210 can control the memory device 220 to simultaneously read the raw data RAWD and vector data VECD1, VECD2, and VECD3 stored in multiple memory groups. Simultaneous reading of the raw data RAWD and vector data VECD1, VECD2, and VECD3 can shorten the time required to search for vectors close to the query vector and output the raw data corresponding to the vector data.
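The idea of issuing writes to different memory groups at the same time can be modeled in software as below. This is only a sketch: Python threads stand in for independent hardware channels, the dictionaries stand in for memory groups, and the function name write and logical address 0 are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

# One dictionary per memory group, keyed by logical address (illustrative model).
memory_groups = {"CH1": {}, "CH2": {}, "CH3": {}, "CH4": {}}


def write(channel, lba, payload):
    """Write one buffered item to the memory group on the given channel."""
    memory_groups[channel][lba] = payload
    return channel


# Items buffered in the controller, one per channel as in the RAWD / VECD1..3 example.
buffered = {"CH1": "RAWD", "CH2": "VECD1", "CH3": "VECD2", "CH4": "VECD3"}

# Submit one write per channel so the four writes can proceed concurrently.
with ThreadPoolExecutor(max_workers=len(buffered)) as pool:
    futures = [pool.submit(write, ch, 0, data) for ch, data in buffered.items()]
    completed = sorted(f.result() for f in futures)

print(completed)
```

Because each write targets a different group, no write has to wait for another to finish, which is the property the embodiment relies on for throughput.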
[0075] Hereinafter, example raw data and example vector data are described with reference to Figures 3 to 5B.
[0076] Figure 3 shows an example of original data.
[0077] Raw data can be unprocessed information. For example, raw data can include plain text. In the example of Figure 3, the raw data may include user information for multiple users. User information may include text data with different attributes, such as the names, ages, addresses, interests, and followings of multiple users.
[0078] Raw data can be divided into multiple raw data items. For example, a raw data item may include information such as a user's name, age, address, interests, and followings, as shown in Figure 3.
[0079] In one embodiment, vector data associated with the original data can be generated to efficiently search for users who meet the criteria among multiple users.
[0080] Figures 4A and 4B show examples of different types of vectors (vector data).
[0081] Vector data refers to data in which information is expressed in numerical form. For example, vector data can include vector data items. Vector data items can include multidimensional vectors representing the attributes or relationships of the original data items. For instance, vector embeddings can be used to transform words in text into vectors that capture semantic meaning. Vector data items make it easier to compare, analyze, and compute the similarity between original data items.
[0082] Figure 4A shows a case in which the interest data included in the original data items of Figure 3 is converted into an interest vector, "Interest_vector". An algorithm or model designed to capture semantic meaning can be used to encode the text items "football", "basketball", "guitar" and "game" of Figure 3 as vectors such as {10,0,1}, {9,1,2}, {2,10,2} and {3,0,9}.
[0083] Texts with similar meanings can be mapped to vectors that are closer in space. For example, the distance between the vectors of "football" and "basketball" can be closer than the distance between the vectors of "football" and "guitar" or the distance between the vectors of "football" and "game".
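The relationship can be checked numerically with the example interest vectors. The sketch below assumes Euclidean distance as the similarity measure; the disclosure does not name a specific metric, so this choice is an assumption.

```python
import math

# Example interest vectors as listed for Figure 4A.
interest_vectors = {
    "football": (10, 0, 1),
    "basketball": (9, 1, 2),
    "guitar": (2, 10, 2),
    "game": (3, 0, 9),
}

# Euclidean distance from "football" to every other interest (assumed metric).
distances_from_football = {
    name: math.dist(interest_vectors["football"], vec)
    for name, vec in interest_vectors.items()
    if name != "football"
}

for name, dist in sorted(distances_from_football.items(), key=lambda kv: kv[1]):
    print(f"football -> {name}: {dist:.2f}")
```

Under this metric, "basketball" is by far the closest vector to "football", while "game" and "guitar" are much farther away.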
[0084] Figure 4B shows a case in which the age data included in the original data items of Figure 3 is converted into an age vector, "Age_vector". For example, the text items "1988.02", "1992.10", "1998.02", and "1998.04" of Figure 3 can be encoded into vectors such as {2,2,1}, {4,5,5}, {2,10,9}, and {1,9,10}. In the example of Figure 4B, the distance between vectors for closer ages can be shorter than the distance between vectors for more distant ages.
[0085] In an embodiment, a vector data item may include a vector corresponding to the original data item. In an embodiment, a vector data item may also include at least one link. The at least one link may include the location of an associated vector, the distance to the associated vector, the gradient with respect to the associated vector, and so on. For example, a link may connect the vector to the different vectors that are closest to it.
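A vector data item that carries links to its nearest vectors could be modeled as in the following sketch. The class names Link and VectorDataItem, the helper build_items, and the use of Euclidean distance are illustrative assumptions; the gradient field mentioned above is omitted because the text does not define how it is computed.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Link:
    target: str      # key of the associated vector (stands in for its location)
    distance: float  # distance between the vector and the associated vector


@dataclass
class VectorDataItem:
    key: str
    vector: tuple
    links: list = field(default_factory=list)


def build_items(vectors, num_links=1):
    """Attach links to the num_links closest other vectors (hypothetical helper)."""
    items = {}
    for key, vec in vectors.items():
        candidates = sorted(
            (Link(other, math.dist(vec, other_vec))
             for other, other_vec in vectors.items() if other != key),
            key=lambda link: link.distance,
        )
        items[key] = VectorDataItem(key, vec, candidates[:num_links])
    return items


items = build_items({
    "football": (10, 0, 1),
    "basketball": (9, 1, 2),
    "guitar": (2, 10, 2),
    "game": (3, 0, 9),
})
print(items["football"].links[0].target)
```

Storing the links with each item lets a search step from one vector to its neighbors without scanning every vector.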
[0086] When a query is received externally, the storage device can convert the query into a query vector and retrieve the vector that is closest to the query vector from the vectors included in the vector data items.
[0087] Figures 5A and 5B show examples of different types of vector spaces.
[0088] Vectors of the same type can be represented in the same vector space. In the example vector spaces of Figures 5A and 5B, the closer two vectors are, the higher their similarity.
[0089] Figure 5A shows a case in which the interest vectors of Figure 4A are represented in a vector space. In the example of Figure 5A, the distance between the vectors of "football" and "basketball" can be shorter than the distance between the vectors of "football" and "guitar" or the distance between the vectors of "football" and "game".
[0090] In this embodiment, a query can be converted into an interest vector. For example, the storage device 200 described with reference to Figure 2 can receive a query that includes a user's raw data, and the query can be converted into an interest vector in order to recommend other users whose interests are similar to that user's. Figure 5A shows the query vector "Query" in the vector space.
[0091] Storage device 200 can search a vector space containing multiple interest vectors and find the interest vector that is closest to the query vector, such as "basketball". Storage device 200 can output the interest vector that is closest to the query vector to the outside.
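A brute-force version of this search can be sketched as follows. The query vector value (8, 2, 2) is hypothetical, since the coordinates of "Query" in Figure 5A are not given in the text, and Euclidean distance is again an assumed metric.

```python
import math

# Example interest vectors as listed for Figure 4A.
interest_vectors = {
    "football": (10, 0, 1),
    "basketball": (9, 1, 2),
    "guitar": (2, 10, 2),
    "game": (3, 0, 9),
}


def nearest(query_vector, vectors):
    """Return the key of the stored vector closest to the query vector."""
    return min(vectors, key=lambda name: math.dist(query_vector, vectors[name]))


query_vector = (8, 2, 2)  # hypothetical query vector, chosen for illustration only
closest = nearest(query_vector, interest_vectors)
print(closest)
```

A real device would likely use an index rather than a full scan, but the result of the search is the same: the stored vector with the smallest distance to the query.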
[0092] Figure 5B shows a case in which the age vectors of Figure 4B are represented in a vector space. Vectors for closer ages can be closer together than vectors for more distant ages.
[0093] Similar to the interest vector, the query can be transformed into an age vector. Storage device 200 can search a vector space containing multiple age vectors and find the age vector closest to the query vector, such as "1998.04". Storage device 200 can then output the age vector closest to the query vector to the outside.
[0094] Figures 3 to 5B illustrate the original data and vector data using structured original data and the corresponding vector data. However, this disclosure is not limited to the example embodiments described above. For example, the original data may include unstructured data such as image data or document data, original data items may be generated by partitioning the data into pieces of a predetermined size, and vector data may be generated based on the original data items.
[0095] According to the embodiment, the original data items (described with reference to Figure 3), the interest vector data items (described with reference to Figure 4A), and the age vector data items (described with reference to Figure 4B) can be stored in different memory groups of the storage device 200 described with reference to Figure 2. For example, an original data item and the vector data items that are associated with it and have different attributes can be stored in different memory groups. Furthermore, the original data item and the vector data items with different attributes can be mapped to the same logical address.
[0096] According to an embodiment, when the original data item and the vector data item (associated with the original data item) are mapped to the same logical address, the storage device 200 can quickly search for data. Furthermore, the capacity of the mapping data required in the storage device 200 to associate the original data item and the vector data item may not increase significantly.
[0097] Hereinafter, the address mapping of the storage device according to this disclosure is described with reference to Figures 6 to 7B.
[0098] Figure 6 shows a first mapping table of a storage device according to an embodiment.
[0099] According to an embodiment, the first mapping table can represent the relationship between channels CH1, CH2, CH3, and CH4 and data types. For example, the memory group corresponding to the first channel CH1 can be mapped to the original data. In addition, the memory groups corresponding to the second channel CH2, the third channel CH3, and the fourth channel CH4 can be mapped to vector data of the first to the third types, respectively.
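The first mapping table can be pictured as a small lookup structure, as in the sketch below. The type labels ("RAW", "VECTOR_TYPE_1", and so on) and the helper channel_for are illustrative names, not identifiers from the disclosure.

```python
# First mapping table: one entry per memory group (channel), naming its data type.
FIRST_MAPPING_TABLE = {
    "CH1": "RAW",            # original data
    "CH2": "VECTOR_TYPE_1",  # first type of vector data
    "CH3": "VECTOR_TYPE_2",  # second type of vector data
    "CH4": "VECTOR_TYPE_3",  # third type of vector data
}


def channel_for(data_type):
    """Find the channel whose memory group stores the given data type."""
    for channel, mapped_type in FIRST_MAPPING_TABLE.items():
        if mapped_type == data_type:
            return channel
    raise KeyError(f"no memory group is mapped to {data_type!r}")


print(channel_for("RAW"))
print(channel_for("VECTOR_TYPE_2"))
```

The table has only as many entries as there are memory groups, so a lookup in either direction is inexpensive.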
[0100] Figures 7A and 7B show a second mapping table of a storage device according to an embodiment.
[0101] The second mapping table can map the logical addresses used in the host 100 to the physical addresses used in the memory device 220. In an embodiment, the storage controller 210 can define logical address spaces and physical address spaces corresponding to the multiple memory groups connected to different channels, respectively. Furthermore, the second mapping table can map the logical block address (LBA) and the physical page number (PPN) for each channel. The mapping between logical addresses and physical addresses can be referred to as "address mapping".
[0102] Figure 7A shows a second mapping table associated with the first memory group of the first channel CH1. Figure 7B shows a second mapping table associated with the second memory group of the second channel CH2.
[0103] According to an embodiment, the same logical address in the first memory group and the second memory group can be assigned to data items that are associated with each other. Data items that are associated with each other do not necessarily need to be assigned to the same physical address. In the example of Figures 7A and 7B, the address mapping can be different for each memory group.
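The per-group address mapping can be sketched as one LBA-to-PPN table per memory group. The PPN values below are hypothetical placeholders, not values taken from Figures 7A and 7B, and the helper lookup is an illustrative name.

```python
# Second mapping tables: one LBA -> PPN table per memory group.
# All PPN values are hypothetical and chosen only to show that they may differ.
second_mapping_tables = {
    "CH1": {0: 12, 1: 7, 2: 30},  # group holding original data items
    "CH2": {0: 3, 1: 18, 2: 5},   # group holding one type of vector data items
}


def lookup(lba):
    """Resolve one logical address to a physical page number in every group."""
    return {channel: table[lba] for channel, table in second_mapping_tables.items()}


# The same logical address locates the associated items in both groups,
# even though their physical page numbers are unrelated.
print(lookup(1))
```

Because association is expressed through the shared logical address, no extra per-item pointer between an original data item and its vector data items is needed.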
[0104] First mapping table and second mapping table (reference) Figure 6 , Figure 7A and Figure 7B (Description) can be stored in the buffer memory 212 of the storage device 200, as per reference. Figure 2 Described.
[0105] According to an embodiment, the amount of additional mapping data required when mapping original data items and multiple vector data items associated with the original data items to the same logical address may be small. Specifically, the number of memory groups included in storage device 200 may be significantly smaller than the number of logical addresses. Therefore, the first mapping table used to associate original data items and vector data items can occupy a much smaller data capacity than the second mapping table used for address mapping.
[0106] According to an embodiment, by mapping the original data item and multiple vector data items associated with the original data item to the same logical address, the capacity burden of the buffer memory 212 can be reduced, and fast data search can be performed.
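The relationship between the two tables can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the dictionary layout, the type names (`raw`, `vec_type1`, and so on), and all PPN values are assumptions chosen only for illustration. The sketch shows one logical address resolving to a different physical page in each memory group, while the first mapping table needs only one entry per channel.

```python
# Hypothetical sketch of the two mapping tables; every value is illustrative.

# First mapping table: channel (memory group) -> data type.
FIRST_MAP = {"CH1": "raw", "CH2": "vec_type1", "CH3": "vec_type2", "CH4": "vec_type3"}

# Second mapping table: per channel, logical block address (LBA) -> physical page number (PPN).
SECOND_MAP = {
    "CH1": {1: 40, 2: 17, 3: 8},
    "CH2": {1: 5, 2: 63, 3: 21},
    "CH3": {1: 12, 2: 2, 3: 30},
    "CH4": {1: 9, 2: 44, 3: 1},
}


def channel_for(data_type):
    """Return the channel whose memory group holds the given data type."""
    for channel, mapped_type in FIRST_MAP.items():
        if mapped_type == data_type:
            return channel
    raise KeyError(data_type)


def translate(lba):
    """Translate one LBA into a (channel, PPN) pair for every data type."""
    return {dtype: (ch, SECOND_MAP[ch][lba]) for ch, dtype in FIRST_MAP.items()}


locations = translate(2)
```

In this sketch, `translate(2)` returns one physical location per data type for the single logical address 2, and the first mapping table holds 4 entries against 12 entries in the second mapping table.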
[0107] Hereinafter, the data storage and search operations of the storage device according to the embodiment are described in detail with reference to Figures 8 and 9.
[0108] Figure 8 illustrates the data storage operation of a storage device according to an embodiment.
[0109] The storage device 200 of Figure 8 can correspond to the storage device 200 described with reference to Figure 2. The storage device 200 may include a storage controller 210 and a memory device 220. Figure 8 shows the logical storage space of the multiple memory groups 221, 222, 223, and 224 that can be included in the memory device 220.
[0110] The storage controller 210 can receive the raw data item RAWDk corresponding to the kth logical address LBAk from the host 100 described with reference to Figure 1. In one embodiment, the host 100 can generate multiple raw data items by dividing the raw data.
[0111] The storage controller 210 can perform vector embedding on the original data item RAWDk to generate vector data items VECD1k, VECD2k, and VECD3k of different types. For example, the different types of vector data items VECD1k, VECD2k, and VECD3k may include vectors with different attributes. When the original data item RAWDk includes the original data described with reference to Figure 3, the vector data items VECD1k, VECD2k, and VECD3k can be different ones of an age vector, an address vector, an interest vector, and an attention vector. The storage controller 210 may include an accelerator configured to perform vector embedding.
[0112] The storage controller 210 can map at least one of the generated vector data items VECD1k, VECD2k, and VECD3k to the same logical address (LBAk) as the original data item RAWDk. Then, the storage controller 210 can determine the memory groups in which to store the original data item RAWDk and the vector data items VECD1k, VECD2k, and VECD3k based on the data types of the original data item RAWDk and at least one of the vector data items VECD1k, VECD2k, and VECD3k. For example, to determine the memory groups, the storage controller 210 can refer to the first mapping table described with reference to Figure 6.
[0113] The storage controller 210 can control the memory device 220 so that the raw data item RAWDk and the vector data items VECD1k, VECD2k, and VECD3k are stored in different memory groups 221, 222, 223, and 224. The raw data item RAWDk and the vector data items VECD1k, VECD2k, and VECD3k can be provided to the memory groups 221, 222, 223, and 224 in parallel through different channels CH1, CH2, CH3, and CH4, and can be written to the memory groups 221, 222, 223, and 224 simultaneously.
[0114] According to the embodiments, since the operation of generating various types of vector data items based on original data items can be offloaded to the storage device in the vector database system, the amount of data input / output between the CPU and the storage device can be reduced. Furthermore, since the storage device can simultaneously write original data items and vector data items associated with the original data items, the storage performance of the vector database can be improved.
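The storage operation of Figure 8 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the three `embed_*` functions are hypothetical stand-ins for the accelerator's embedding models, each memory group is modeled as a dictionary keyed by logical address, and a thread pool stands in for parallel writes over channels CH1 to CH4.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for the accelerator's embedding models.
def embed_type1(raw):
    return [float(len(raw))]


def embed_type2(raw):
    return [float(sum(raw) % 251)]


def embed_type3(raw):
    return [float(raw[0]) if raw else 0.0]


# First mapping table, here indexed by data type for convenient lookup.
FIRST_MAP = {"raw": "CH1", "vec_type1": "CH2", "vec_type2": "CH3", "vec_type3": "CH4"}

# Each memory group is modeled as a dict keyed by logical address.
memory_groups = {channel: {} for channel in FIRST_MAP.values()}


def write(channel, lba, item):
    """Write one item to the memory group behind the given channel."""
    memory_groups[channel][lba] = item
    return channel


def store(lba, raw):
    """Embed a raw item and write raw + vector items under the same LBA."""
    items = {
        "raw": raw,
        "vec_type1": embed_type1(raw),
        "vec_type2": embed_type2(raw),
        "vec_type3": embed_type3(raw),
    }
    # One worker per channel models the simultaneous writes.
    with ThreadPoolExecutor(max_workers=len(items)) as pool:
        futures = [pool.submit(write, FIRST_MAP[t], lba, item) for t, item in items.items()]
        return sorted(f.result() for f in futures)


written = store(7, b"hello")
```

After `store(7, b"hello")`, every modeled memory group holds an entry at logical address 7, and only the raw item crossed the host interface.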
[0115] Figure 8 shows an embodiment in which one memory group is allocated to each of the raw data and the different types of vector data. However, this disclosure is not limited to the above example embodiment, and the number or ratio of memory groups allocated to the raw data and to the different types of vector data can be varied.
[0116] For example, in four memory groups 221, 222, 223, and 224, storage device 200 can allocate two memory groups to the raw data and one memory group to each of two different types of vector data. The number or ratio of memory groups allocated to the raw data and the different types of vector data can vary depending on the importance of the data or the user's choice. For example, the more important the data, the fewer memory groups can be allocated to the raw data, allowing for the storage of various types of vector data.
[0117] Figure 9 illustrates a data search operation of a storage device according to an embodiment.
[0118] The storage device 200 of Figure 9 can correspond to the storage device 200 described with reference to Figure 8.
[0119] When query data is received from host 100, storage controller 210 can use vector data stored in memory device 220 to search for raw data associated with the query data, and can output the searched raw data to host 100.
[0120] The storage controller 210 can perform vector embedding on the query data to generate a query vector and can find the nearest neighbor vectors to the query vector. To find the neighbor vectors of the query vector, the storage controller 210 can read vector data from the memory device 220.
[0121] According to an embodiment, the storage controller 210 can perform address translation on each or at least one of the memory groups 221, 222, 223, and 224 based on a logical address (LBAk) to find the physical addresses of the regions storing the original data item RAWDk and the vector data items VECD1k, VECD2k, and VECD3k associated with the original data item RAWDk. Additionally, the storage controller 210 can provide read requests to each or at least one of the memory groups 221, 222, 223, and 224 in parallel, and can simultaneously retrieve the original data item RAWDk and the associated vector data items VECD1k, VECD2k, and VECD3k from each or at least one of the memory groups 221, 222, 223, and 224.
[0122] According to an embodiment, when the storage controller 210 finds a neighboring vector of the query vector in the vector data items VECD1k, VECD2k, and VECD3k obtained from the memory device 220, the storage controller 210 can output the original data item RAWDk obtained from the memory device 220 to the host 100.
[0123] For example, depending on the type of the query vector, neighboring vectors of the query vector can be found based on the vector data items stored in the third memory group 223. For example, when the third memory group 223 includes the interest vectors described with reference to Figure 4A and the query vector is an interest vector, neighboring vectors can be searched for in the third memory group 223. When the second vector data item VECD2k among the vector data items stored in the third memory group 223 includes a neighboring vector of the query vector, the original data item RAWDk associated with the second vector data item VECD2k can be output.
[0124] According to an embodiment, when the storage controller 210 retrieves vector data items from the memory device 220 to find neighboring vectors of a query vector, the storage controller 210 can simultaneously retrieve the original data associated with the vector data items. The storage controller 210 can output the retrieved original data after finding neighboring vectors of the query vector, thus saving the time spent reading the original data. Therefore, data search performance can be improved.
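The search operation of Figure 9 can be sketched as follows. This is a simplified Python illustration, assuming an exhaustive scan, Euclidean distance, and made-up contents for two memory groups; none of these choices are stated in the disclosure. The point it shows is that each read at a logical address returns the vector item together with its raw item, so no separate raw-data read is needed once the neighbor is known.

```python
import math

# Hypothetical contents: raw items in one memory group, interest vectors in
# another, stored under the same logical addresses.
raw_group = {1: "doc-A", 2: "doc-B", 3: "doc-C"}
vec_group = {1: [0.0, 1.0], 2: [1.0, 1.0], 3: [5.0, 5.0]}


def read_pair(lba):
    """Model one parallel read: the vector item and the raw item at the same LBA."""
    return vec_group[lba], raw_group[lba]


def search(query_vec):
    """Return the raw item whose vector is nearest to the query vector."""
    best = None  # (distance, raw item already held by the controller)
    for lba in vec_group:
        vec, raw = read_pair(lba)
        d = math.dist(query_vec, vec)
        if best is None or d < best[0]:
            best = (d, raw)
    return best[1]


result = search([0.9, 1.2])
```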
[0125] In the example embodiments described above with reference to Figures 6 to 9, the logical addresses of the raw data items and the vector data items that are associated with each other are matched in order to associate the raw data items with the vector data items. However, this disclosure is not limited to the example embodiments described above.
[0126] For example, the storage device may not match the logical addresses of associated raw data items and vector data items, and may insert the logical address of the associated raw data item into each or at least one of the vector data items. When a neighboring vector of a query vector is found, the storage device can find the logical address of the associated raw data item in the vector data items that include the neighboring vector, and can retrieve the raw data item from the memory device using the logical address and a first mapping table.
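This alternative can be sketched in Python as follows. The field name `raw_lba`, the two-channel layout, and the stored values are assumptions for illustration only. In the sketch, vector data items live at their own logical addresses and each carries the logical address of its raw data item, so the raw item is fetched in a second step after the neighboring vector is found.

```python
import math

# First mapping table, indexed by data type (illustrative layout).
FIRST_MAP = {"raw": "CH1", "vec_type1": "CH2"}

memory = {
    "CH1": {10: "raw-A", 11: "raw-B"},
    # Vector items use their own LBAs and embed the LBA of the raw item.
    "CH2": {
        1: {"vector": [0.0, 0.0], "raw_lba": 11},
        2: {"vector": [3.0, 4.0], "raw_lba": 10},
    },
}


def search(query_vec):
    """Find the nearest vector item, then follow its embedded raw LBA."""
    vec_items = memory[FIRST_MAP["vec_type1"]]
    hit = min(vec_items.values(), key=lambda item: math.dist(query_vec, item["vector"]))
    return memory[FIRST_MAP["raw"]][hit["raw_lba"]]


result = search([2.5, 4.0])
```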
[0127] When the storage controller 210 reads all vector data items stored in the memory device 220 to find neighboring vectors of the query vector and compares the distances between all vectors in the vector space and the query vector, it may consume excessive resources. To enable the storage controller 210 to consume fewer resources and find neighboring vectors faster, the vector data items can further store link information along with the vectors.
[0128] Figure 10 shows an example of vector data according to an embodiment.
[0129] Referring to Figure 10, the vector data can include the interest vectors described with reference to Figure 4A. Specifically, the vector data can include multiple vector data items, each of which can include fields such as an interest vector, a link (LBA_link), a distance, and a gradient.
[0130] At least one of the vector data items can be mapped to logical addresses LBA1, LBA2, LBA3, and LBA4. A link can include information for connecting the vector included in a vector data item to an associated vector. For example, a link can include the logical address of the vector data item that includes the associated vector. Furthermore, at least one of the vector data items can also include the distance and the gradient between the vector and its associated vector.
[0131] The storage controller 210 can refer to vector data items stored in the memory device 220 to search for neighboring vectors of the query vector.
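A vector data item of the kind shown in Figure 10 can be modeled with a small Python data class. This is a sketch under assumptions: the field names are illustrative, the link is chosen here as the nearest other vector by Euclidean distance, and the gradient is left unset because the source does not define how it is computed.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class VectorDataItem:
    vector: List[float]
    lba_link: int                      # LBA of the item holding the associated vector
    distance: float                    # distance between the vector and the associated vector
    gradient: Optional[float] = None   # representation not specified in the source


def build_items(vectors: Dict[int, List[float]]) -> Dict[int, VectorDataItem]:
    """Link every vector to its nearest other vector (an assumed linking rule)."""
    items = {}
    for lba, vec in vectors.items():
        candidates = [(math.dist(vec, v), other) for other, v in vectors.items() if other != lba]
        dist, link = min(candidates)
        items[lba] = VectorDataItem(vector=vec, lba_link=link, distance=dist)
    return items


items = build_items({1: [0.0, 0.0], 2: [1.0, 0.0], 3: [4.0, 0.0], 4: [4.0, 1.0]})
```

In practice the links would be produced by a vector indexing algorithm rather than by the all-pairs comparison used in this sketch.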
[0132] Figure 11 illustrates the data search process of a storage device according to an embodiment.
[0133] Figure 11 shows the logical address space of the first memory group associated with the first channel CH1 and the logical address space of the second memory group associated with the second channel CH2. The raw data RAWD can be stored in the first memory group, and the vector data VECD1 (associated with the raw data RAWD) can be stored in the second memory group.
[0134] Multiple logical addresses LBA1 to LBAn can be assigned to the logical address spaces of the first memory group and the second memory group. The same logical address can be assigned to both a raw data item and a vector data item associated with the raw data item. Furthermore, raw data items and vector data items assigned the same logical address can be read simultaneously.
[0135] In this embodiment, to find the nearest neighbor vector to the query vector, the storage controller 210 can search the logical address space from the first logical address LBA1. The storage controller 210 may not completely search the logical address space to find the neighbor vectors of the query vector.
[0136] According to an embodiment, the storage controller 210 can obtain a link from the vector data item read at a given logical address, and can determine the logical address to be read next by referring to the link. When the storage controller 210 determines the next logical address by referring to the link, read operations for some logical addresses can be skipped. The storage controller 210 can therefore search for neighboring vectors of a query vector without reading all vector data items. In the example of Figure 11, the regions that are read to search for neighboring vectors in the logical address space are shaded.
[0137] According to an embodiment, the storage controller 210 can control the memory device 220 to simultaneously read the regions corresponding to the same logical address in both the first and second memory groups. Therefore, the vector data VECD1 and the associated raw data RAWD can be acquired simultaneously by the storage controller 210. By the time the storage controller 210 determines a neighboring vector, the raw data corresponding to that neighboring vector can already be loaded in the storage controller 210. Upon determining the neighboring vector, the storage controller 210 can output the corresponding raw data as a response to the query.
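The link-guided search of Figure 11 can be sketched in Python as follows. The sketch makes several assumptions not stated in the source: each vector data item may hold more than one link, the controller moves greedily to whichever linked vector is closer to the query, and distances are Euclidean. It records which logical addresses are read so that the skipped addresses are visible.

```python
import math

# Hypothetical vector data (second memory group) and raw data (first memory
# group) sharing logical addresses 1 to 6.
VECD1 = {
    1: {"vector": [0.0, 0.0], "links": [2, 4]},
    2: {"vector": [1.0, 0.0], "links": [1, 3]},
    3: {"vector": [2.0, 0.0], "links": [2, 6]},
    4: {"vector": [0.0, 3.0], "links": [1, 5]},
    5: {"vector": [0.0, 5.0], "links": [4]},
    6: {"vector": [3.0, 0.5], "links": [3]},
}
RAWD = {lba: f"raw-{lba}" for lba in VECD1}


def search(query, start_lba=1):
    """Greedy link-following search; returns (raw item, list of LBAs read)."""
    reads = []   # logical addresses actually read
    cache = {}   # lba -> (vector item, raw item) held by the controller

    def read(lba):
        if lba not in cache:
            reads.append(lba)
            cache[lba] = (VECD1[lba], RAWD[lba])  # same LBA, both memory groups
        return cache[lba]

    current = start_lba
    current_dist = math.dist(query, read(current)[0]["vector"])
    while True:
        best, best_dist = current, current_dist
        for nxt in read(current)[0]["links"]:
            d = math.dist(query, read(nxt)[0]["vector"])
            if d < best_dist:
                best, best_dist = nxt, d
        if best == current:
            # The raw item was loaded together with the vector item.
            return cache[current][1], reads
        current, current_dist = best, best_dist


raw, reads = search([2.9, 0.4])
```

For this query the sketch reads five of the six logical addresses and never touches address 5. A greedy walk of this kind can stop at a local optimum, which is one reason practical indexing algorithms add further structure.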
[0138] Vector indexing algorithms can be used to determine whether vectors within a vector space are connected to each other. For example, vector indexing algorithms can be used to determine associated vectors and links included in vector data items.
[0139] Figures 12A to 12C show examples of vector indexing algorithms.
[0140] Vector indexing algorithms can be used to efficiently perform nearest neighbor searches in vector databases.
[0141] Figure 12A shows the Locality-Sensitive Hashing (LSH) algorithm.
[0142] Referring to Figure 12A, multiple hash functions can be used to map multiple vectors to multiple hash buckets. The hash functions can be designed so that vectors at similar positions are mapped to the same hash bucket with high probability. To search for the nearest neighbor of a query vector, the hash functions can be used to determine which hash bucket the query vector is mapped to. Then, only the vectors included in that hash bucket are compared with the query vector to find the nearest neighbor.
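A minimal Python sketch of this bucketing idea follows. It assumes one common hash family, the sign of the dot product with fixed hyperplanes; the source does not say which hash functions are used, and the hyperplanes and vectors here are illustrative.

```python
import math
from collections import defaultdict

# Assumed hash family: one bit per hyperplane, set when the dot product is non-negative.
HYPERPLANES = [[1.0, 0.0], [0.0, 1.0]]


def bucket_of(vec):
    """Map a vector to its hash bucket (a tuple of sign bits)."""
    return tuple(int(sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0) for h in HYPERPLANES)


def build_index(vectors):
    """Group logical addresses by the bucket of their vector."""
    buckets = defaultdict(list)
    for lba, vec in vectors.items():
        buckets[bucket_of(vec)].append(lba)
    return buckets


def query(vectors, buckets, q):
    """Compare the query only against vectors in its own bucket."""
    candidates = buckets.get(bucket_of(q), [])
    if not candidates:
        return None
    return min(candidates, key=lambda lba: math.dist(q, vectors[lba]))


vectors = {1: [1.0, 2.0], 2: [2.0, 1.5], 3: [-1.0, 2.0], 4: [-2.0, -1.0], 5: [3.0, -1.0]}
buckets = build_index(vectors)
nearest = query(vectors, buckets, [1.8, 1.0])
```

Because only one bucket is scanned, the result is approximate: a true nearest neighbor that falls just across a hyperplane would be missed.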
[0143] Figure 12B illustrates the Hierarchical Navigable Small World (HNSW) algorithm.
[0144] Referring to Figure 12B, in the HNSW algorithm, vectors can be indexed in a graph-based structure in which nodes represent vectors and edges connect adjacent vectors based on proximity.
[0145] The graph can include vectors and edges connecting the vectors across multiple layers L1 to L3. The search for the nearest neighbor of the query vector can start from the top layer L3. Once the vector nearest to the query vector is found in the current layer, the search can be repeated in the next lower layer, starting from that vector. Finally, the nearest neighbor vector can be found in the lowest layer.
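The layered descent can be sketched in Python as follows. This is a simplified illustration with a hand-built three-layer graph, not a full HNSW implementation: graph construction, randomized layer assignment, and candidate-list (beam) search are omitted, and the vectors and edges are assumptions.

```python
import math

VECTORS = {
    1: [0.0, 0.0], 2: [4.0, 0.0], 3: [8.0, 0.0],
    4: [2.0, 1.0], 5: [6.0, 1.0], 6: [7.0, 2.0],
}

# Adjacency per layer; L3 is the sparsest top layer, L1 contains every vector.
LAYERS = {
    "L3": {1: [3], 3: [1]},
    "L2": {1: [2], 2: [1, 3], 3: [2]},
    "L1": {1: [4], 4: [1, 2], 2: [4, 5], 5: [2, 3, 6], 3: [5, 6], 6: [3, 5]},
}


def greedy(layer, entry, q):
    """Within one layer, move to the closest neighbor until no neighbor is closer."""
    current = entry
    while True:
        best = min([current] + layer[current], key=lambda n: math.dist(q, VECTORS[n]))
        if best == current:
            return current
        current = best


def search(q, entry=1):
    """Descend from the top layer, reusing each layer's result as the next entry point."""
    node = entry
    for name in ("L3", "L2", "L1"):
        node = greedy(LAYERS[name], node, q)
    return node


nearest = search([6.8, 1.7])
```

The upper layers let the search cross the vector space in a few long hops, and the dense bottom layer refines the result locally.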
[0146] Figure 12C shows the Inverted File Index (IVF) algorithm.
[0147] Referring to Figure 12C, in the IVF algorithm, similar vectors can be grouped into clusters, cluster centers can be determined, and vectors can be assigned to cluster centers, thus allowing the vectors to be indexed. The query vector can be compared with the cluster center vectors to find the cluster to which the query belongs. Then, among the vectors included in that cluster, the nearest neighbor of the query can be searched for.
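The two-stage IVF lookup can be sketched in Python as follows. The sketch assumes the cluster centers are already known (in practice they are typically obtained by a clustering step such as k-means) and probes only the single nearest cluster; the centroid values and vectors are illustrative.

```python
import math
from collections import defaultdict

# Assumed, precomputed cluster centers.
CENTROIDS = {"c0": [0.0, 0.0], "c1": [10.0, 10.0]}


def nearest_centroid(vec):
    """Return the name of the cluster center closest to the vector."""
    return min(CENTROIDS, key=lambda c: math.dist(vec, CENTROIDS[c]))


def build_inverted_lists(vectors):
    """Assign each logical address to the list of its nearest cluster center."""
    lists = defaultdict(list)
    for lba, vec in vectors.items():
        lists[nearest_centroid(vec)].append(lba)
    return lists


def query(vectors, lists, q):
    """Pick the query's cluster first, then scan only that cluster's members."""
    members = lists.get(nearest_centroid(q), [])
    if not members:
        return None
    return min(members, key=lambda lba: math.dist(q, vectors[lba]))


vectors = {1: [1.0, 1.0], 2: [-1.0, 0.5], 3: [9.0, 11.0], 4: [12.0, 9.0]}
lists = build_inverted_lists(vectors)
nearest = query(vectors, lists, [10.0, 10.5])
```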
[0148] According to an embodiment, storage device 200 can use different vector indexing algorithms to index vectors with the same attributes to generate vector data with different types. Storage device 200 can distinguish between vector data with different types and store them in different memory groups. Storage device 200 can search for neighboring vectors of query vectors with the same attributes more quickly by using vector data with different types of the same attributes.
[0149] Figure 13 illustrates a data search method for a storage device according to an embodiment.
[0150] Figure 13 shows the data that is primarily referenced over time when the first to fourth memory regions, associated with the first channel CH1, the second channel CH2, the third channel CH3, and the fourth channel CH4, are searched in parallel. Specifically, in Figure 13, the vector data used to determine the logical address to be read next and the raw data to be finally output are shaded.
[0151] The first memory region can store the original data RAWD, and the second to fourth memory regions can respectively store different types of vector data VECD1 to VECD3, generated from vectors with the same attributes using different vector indexing algorithms. For example, the first vector data VECD1 can be vector data based on the LSH algorithm, the second vector data VECD2 can be vector data based on the HNSW algorithm, and the third vector data VECD3 can be vector data based on the IVF algorithm.
[0152] The storage controller 210 can simultaneously acquire vector data items of different types corresponding to the same logical address, determine the vector data item with the vector closest to the query vector among the vector data items, and determine the logical address to be read in the next order by referring to the links included in the vector data item.
[0153] In the example of Figure 13, the first to third vector data items, respectively included in the first vector data VECD1, the second vector data VECD2, and the third vector data VECD3, can be retrieved from the first logical address. The logical address to be read next can be determined based on these first to third vector data items. When the third vector data item included in the third vector data VECD3 has the vector closest to the query vector, the logical address to be read next can be determined by referring to the link included in the third vector data item. In other words, the logical address to be read next can be determined based on the IVF algorithm.
[0154] Even when neighboring vectors are searched for based on the IVF algorithm, the original data RAWD and the first vector data VECD1, second vector data VECD2, and third vector data VECD3 can be read in parallel. Furthermore, among the first vector data VECD1, second vector data VECD2, and third vector data VECD3, the vector data that holds the vector closest to the query vector can change as the search proceeds. In the example of Figure 13, the storage controller 210 can search the vector data based on the IVF algorithm, and then search the vector data based on the HNSW algorithm. Subsequently, the storage controller 210 can search the vector data based on the LSH algorithm, and can ultimately find the neighboring vector as a result of searching the vector data based on the HNSW algorithm.
[0155] When a neighboring vector is found, the storage controller 210 can output the original data item that was loaded into the storage controller 210 at the same time as the neighboring vector, as a response to the query.
[0156] According to the embodiment, the indexing algorithm among multiple vector indexing algorithms that can search for the neighboring vectors of the query vector fastest can be selected and changed in real time. Therefore, the speed of searching for neighboring vectors can be improved.
[0157] With reference to Figures 1 to 13, embodiments have been described in which a storage device generates vector data items based on raw data items, distinguishes between the raw data items and the vector data items, and stores them in separate memory groups. However, this disclosure is not limited thereto. For example, the storage device may receive raw data items and vector data items generated externally, and may distinguish between the raw data items and the vector data items and store them in separate memory groups.
[0158] Hereinafter, a storage system according to an embodiment is described with reference to Figures 14 and 15.
[0159] Figure 14 shows a vector database system according to an embodiment.
[0160] The vector database system 20 of Figure 14 may include a host 100 and a storage device 200. The storage device 200 may include a storage controller 210 and a memory device 220. The memory device 220 may include memory groups 221, 222, 223, and 224 that perform data input/output with the storage controller 210 through multiple channels CH1 to CH4.
[0161] Host 100 can generate a raw data item RAWDk based on the raw data, and can perform vector embedding on the raw data item RAWDk to generate vector data items VECD1k, VECD2k, and VECD3k of different types. In an embodiment, in order to generate the raw data item RAWDk and the vector data items VECD1k, VECD2k, and VECD3k, host 100 may include accelerator circuitry separate from the CPU.
[0162] The storage controller 210 can receive raw data item RAWDk and vector data items VECD1k, VECD2k, and VECD3k from the host 100, and can distinguish between the raw data item RAWDk and the vector data items VECD1k, VECD2k, and VECD3k and store them in different memory groups 221, 222, 223, and 224. The storage controller 210 can assign the same logical address to the raw data item RAWDk and the vector data items VECD1k, VECD2k, and VECD3k.
[0163] Storage controller 210 can receive a query vector from host 100 and can search for neighboring vectors of the query vector. To search for neighboring vectors, storage controller 210 can simultaneously acquire the raw data item RAWDk and vector data items VECD1k, VECD2k, and VECD3k through different channels CH1 to CH4. When a neighboring vector is found among the vectors included in vector data items VECD1k, VECD2k, and VECD3k, the acquired raw data item RAWDk can be output externally.
[0164] Figure 15 shows a storage system according to an embodiment.
[0165] Storage system 300 may include CPU 310 and multiple storage devices 321, 322, 323, and 324. In an embodiment, storage system 300 may be a petabyte (PB) SSD, and the multiple storage devices 321, 322, 323, and 324 may be SSDs. Specifically, storage system 300 may include multiple storage devices 321, 322, 323, and 324 to provide high-capacity storage space and high-bandwidth input / output at the petabyte level, and may include CPU 310 configured to control communication between the multiple storage devices 321, 322, 323, and 324 and a host.
[0166] According to an embodiment, CPU 310 can generate a raw data item RAWDk based on the raw data, and can perform vector embedding on the raw data item RAWDk to generate vector data items VECD1k, VECD2k, and VECD3k of different types.
[0167] CPU 310 can provide raw data item RAWDk and vector data items VECD1k, VECD2k, VECD3k to a single storage device, such that each or at least one of storage devices 321, 322, 323, 324 can store raw data RAWD and vector data VECD1 to VECD3 associated with the raw data.
[0168] Each or at least one of storage devices 321, 322, 323, and 324 can distinguish between raw data RAWD and vector data VECD1 to VECD3 and store them in memory groups that perform data input / output through different channels. For example, a storage device that receives raw data item RAWDk and vector data items VECD1k, VECD2k, and VECD3k from CPU 310 can assign the same logical address to raw data item RAWDk and vector data items VECD1k, VECD2k, and VECD3k, and can store raw data item RAWDk and vector data items VECD1k, VECD2k, and VECD3k in different memory groups.
[0169] The storage device according to the embodiment may include memory groups capable of independently performing data input/output, and raw data and vector data may be stored in separate memory groups to improve the performance (throughput) of write and read operations on the raw data and vector data.
[0170] The storage device according to the embodiment can assign the same logical address to the original data and the vector data associated with the original data to quickly and easily search the data.
[0171] The problems to be solved by this disclosure are not limited to those mentioned above, and other problems not mentioned will be clearly understood by those skilled in the art from the following description.
[0172] While exemplary embodiments have been shown and described above, modifications and variations can be made by those skilled in the art without departing from the scope of this disclosure as defined by the appended claims.
Claims
1. A storage device comprising: a plurality of memory groups, each comprising multiple non-volatile memories; a storage controller configured to control the plurality of memory groups; and multiple channels respectively connecting the storage controller to the plurality of memory groups, wherein the storage controller is configured to: generate one or more vector data items by performing vector embedding on an original data item, and store the original data item and the one or more vector data items in different memory groups among the plurality of memory groups.
2. The storage device according to claim 1, wherein the storage controller is further configured to: define a logical address space for at least one of the plurality of memory groups, and assign the same logical address to the original data item and the one or more vector data items associated with the original data item.
3. The storage device according to claim 2, wherein the storage controller is further configured to store a first mapping table, the first mapping table including mappings between the plurality of memory groups and a plurality of data types, and the plurality of data types include an original data type and one or more vector data types.
4. The storage device according to claim 3, wherein the storage controller is further configured to store a second mapping table for mapping logical addresses to physical addresses of at least one of the plurality of memory groups.
5. The storage device according to claim 2, wherein the one or more vector data items include at least one of a vector, a link to an associated vector of the vector, a distance between the vector and the associated vector, and a gradient between the vector and the associated vector.
6. The storage device according to claim 1, wherein the one or more vector data items comprise multiple vector data items with multiple different vector data types.
7. The storage device according to claim 6, wherein the original data item contains data with different attributes, and the multiple vector data items comprise vectors for the data with different attributes.
8. The storage device according to claim 1, wherein the storage controller is further configured to simultaneously write the original data item and the one or more vector data items by controlling the plurality of memory groups.
9. The storage device according to claim 1, wherein the storage controller is further configured to, in response to a query from a host, simultaneously read the original data item and the one or more vector data items by controlling the plurality of memory groups.
10. The storage device according to claim 9, wherein the storage controller includes a buffer memory, and the storage controller is further configured to: store, in the buffer memory, the original data item and the one or more vector data items obtained from the plurality of memory groups, and output the original data item stored in the buffer memory to the host when the one or more vector data items correspond to a neighboring vector, the neighboring vector being adjacent to a query vector corresponding to the query.
11. The storage device according to claim 1, wherein the storage controller includes an accelerator configured to perform the vector embedding.
12. A storage device comprising: a plurality of memory groups, each comprising multiple non-volatile memories; a storage controller configured to control the plurality of memory groups; and multiple channels respectively connecting the storage controller to the plurality of memory groups, wherein the storage controller is configured to: generate a vector by performing vector embedding on an original data item, generate a plurality of vector data items that link the vector to different vectors based on different vector indexing algorithms, and store the original data item and the plurality of vector data items in different memory groups among the plurality of memory groups.
13. The storage device according to claim 12, wherein the storage controller is further configured to: define a logical address space for at least one of the plurality of memory groups, and assign the same logical address to the original data item and the plurality of vector data items associated with the original data item.
14. The storage device according to claim 13, wherein the storage controller is further configured to: perform vector embedding on a query from a host to generate a query vector, simultaneously read the original data item and the plurality of vector data items using a single logical address by controlling the plurality of memory groups, and determine the logical address to be read next based on the vector data item, among the plurality of vector data items, that includes the vector closest to the query vector.
15. The storage device according to claim 13, wherein the plurality of vector data items include at least one of a vector, a logical address of an associated vector determined based on a vector indexing algorithm, a distance between the vector and the associated vector, and a gradient between the vector and the associated vector.
16. The storage device according to claim 15, wherein the vector indexing algorithm is at least one of Locality-Sensitive Hashing (LSH), Hierarchical Navigable Small World (HNSW), or Inverted File Index (IVF).
17. A storage system comprising: at least one processor configured to perform vector embedding on an original data item to generate one or more vector data items; a storage device including a plurality of memory groups, each of which includes multiple non-volatile memories; a storage controller configured to control the plurality of memory groups; and multiple channels respectively connecting the storage controller to the plurality of memory groups, wherein the storage controller is further configured to: receive the original data item and the one or more vector data items from the at least one processor, and store the original data item and the one or more vector data items in different memory groups among the plurality of memory groups.
18. The storage system according to claim 17, wherein the storage controller is further configured to: define a logical address space for at least one of the plurality of memory groups, and assign the same logical address to the original data item and the one or more vector data items associated with the original data item.
19. The storage system according to claim 17, wherein the one or more vector data items comprise multiple vector data items generated by linking a vector, generated through the vector embedding, to different vectors based on different vector indexing algorithms, and the multiple vector data items contain links between the vector and the different vectors.
20. The storage system according to claim 17, wherein the storage system is a petabyte solid-state drive (PB-SSD).