Standalone storage engine and data processing method and apparatus

By dividing the raw disk into data blocks and building a trie in memory, the performance bottleneck caused by key-value databases and file systems in single-machine storage engines is solved, achieving efficient data writing and reading.

WO2026123831A1 · PCT designated stage · Publication Date: 2026-06-18 · CHINA TELECOM ARTIFICIAL INTELLIGENCE TECHNOLOGY (BEIJING) CO LTD

Patent Information

Authority / Receiving Office: WO
Patent Type: Applications
Current Assignee / Owner: CHINA TELECOM ARTIFICIAL INTELLIGENCE TECHNOLOGY (BEIJING) CO LTD
Filing Date: 2025-09-09
Publication Date: 2026-06-18

AI Technical Summary

Technical Problem

In a single-machine storage engine, the key-value database becomes a bottleneck, resulting in wasted performance when high-performance storage is required. Furthermore, writing metadata and WAL operations to the local file system leads to multiple disk I/O operations, resulting in poor performance.

Method used

A raw disk space management and allocation engine divides the raw disk into multiple data blocks and records the storage space of each block with a bitmap, and a metadata engine builds a trie in memory for querying and reading. This bypasses the file system and achieves full in-memory indexing.

Benefits of technology

It improves data writing efficiency and read performance, enhances query and traversal capabilities, simplifies system architecture, and reduces disk I/O operations.


Abstract

Provided in the embodiments of the present application are a standalone storage engine and a data processing method and apparatus. The processing method is applied to a standalone storage engine, and comprises: dividing a bare disk into a plurality of chunks on the basis of a preset number of bytes; for each chunk, allocating a storage space on the basis of object data to be stored, and writing each piece of object data into a corresponding storage space, wherein a bitmap is used to record the storage space for each chunk; and on the basis of name information of each piece of object data and position information corresponding to a chunk bitmap, constructing a dictionary tree, so as to query and / or read the object data on the basis of the dictionary tree. By means of the embodiments of the present application, data is written into a bare disk, thereby improving the data writing efficiency; and in-memory indexing is achieved, such that the query and traversal capabilities can be enhanced, thereby improving the data reading efficiency.

Description

A standalone storage engine and a data processing method and apparatus

[0001] This application claims priority to Chinese Patent Application No. 2024118055483, filed on December 9, 2024, entitled "A Standalone Storage Engine and a Data Processing Method and Apparatus", the entire contents of which are incorporated herein by reference.

Technical Field

[0002] This application relates to the field of data storage technology, and in particular to a stand-alone storage engine and a data processing method and apparatus.

Background Technology

[0003] In current single-machine storage engines, metadata is typically stored in a key-value (KV) database, while data is generally stored on the local file system. When facing high-performance storage requirements, the KV database can become the bottleneck of the entire engine, and the local file system also incurs multiple disk I/O (input/output) operations due to writing metadata and the WAL (write-ahead log), resulting in wasted performance.

Summary of the Invention

[0004] In view of the above problems, a stand-alone storage engine and a data processing method and apparatus are proposed to overcome or at least partially solve the above problems, comprising:

[0005] A single-machine storage engine, comprising a raw disk space management and allocation engine and a metadata engine, wherein:

[0006] The raw disk space management and allocation engine is configured to divide the raw disk into multiple data blocks according to a preset number of bytes, allocate storage space for each data block according to the object data to be stored, and write each object data into the corresponding storage space. Each data block uses a bitmap to record the storage space.

[0007] The metadata engine is configured to construct a trie in memory based on the name information of each object data and the location information corresponding to the data block bitmap, so as to query and / or read the object data according to the trie.

[0008] In some embodiments, when the raw disk space management and allocation engine is configured to allocate storage space for each data block according to the object data to be stored, it is configured as follows:

[0009] Each data block is divided into multiple write units, and multiple cursors are used to allocate storage space for the object data to be stored on the bitmap of the data block. Each cursor allocates a preset number of write units according to the amount of the object data to be stored.

[0010] In some embodiments, when the metadata engine is configured to construct a trie in memory based on the name information of each object data and the location information corresponding to the data block bitmap, it is configured as follows:

[0011] The metadata engine is configured to split the name information of each object data according to characters, construct a trie in sequence according to the split characters, store the corresponding character values in the non-leaf nodes of the trie, and store the corresponding character values and the position information in the leaf nodes of the trie.

[0012] In some embodiments, the location information includes a data block identifier, an object data length, and an offset of the object data within the data block.

[0013] In some embodiments, the metadata engine is configured to obtain target name information of the object data to be read, and to traverse and query the trie according to the target name information to determine the target location information corresponding to the target name information;

[0014] The raw disk space management and allocation engine is configured to determine the target data block among multiple data blocks based on the data block identifier in the target location information, determine the starting read position of the target object data based on the offset in the target location information, and read the target object data from the starting read position according to the object data length in the target location information.

[0015] In some embodiments, the single-machine storage engine further includes a snapshot module, which is configured to store snapshot data generated based on the current trie and bitmap on the target disk after snapshotting is enabled, and record the snapshot time.

[0016] In some embodiments, the snapshot module is configured to generate operation logs based on operations during data storage and trie construction, and delete operation logs prior to the snapshot time after the snapshot is completed.

[0017] In some embodiments, when the single-machine storage engine restarts, the snapshot module loads the snapshot data in the target disk into memory and replays the operation log.

[0018] In some embodiments, the raw disk space management and allocation engine is configured to divide each data block into multiple write units and use at least two cursors to simultaneously allocate storage space for the object data to be stored from the beginning and end of the data block, respectively.

[0019] In some embodiments, when constructing the trie, if the name information of the new object data to be inserted overlaps with the character portion of an original node in the trie, the metadata engine splits the original node based on the overlapping characters before inserting the new object data.

[0020] A data processing method applied to a single-machine storage engine, the method comprising:

[0021] The raw disk is divided into multiple data blocks according to a preset number of bytes;

[0022] For each data block, storage space is allocated according to the object data to be stored, and each object data is written to the corresponding storage space. Each data block uses a bitmap to record the storage space.

[0023] A trie is constructed in memory based on the name information of each object data and the location information corresponding to the data block bitmap, so as to query and / or read the object data according to the trie.

[0024] A data processing apparatus for use in a single-machine storage engine, the apparatus comprising:

[0025] The hard disk partitioning module is configured to divide the raw disk into multiple data blocks according to a preset number of bytes.

[0026] The object data writing module is configured to allocate storage space for each data block according to the object data to be stored, and write each object data into the corresponding storage space. Each data block uses a bitmap to record the storage space.

[0027] The trie construction module is configured to use the metadata engine to construct a trie in memory based on the name information of each object data and the location information corresponding to the data block bitmap, so as to query and / or read the object data according to the trie.

[0028] The embodiments of this application have the following advantages:

[0029] In this embodiment, the single-machine storage engine can divide the raw disk into multiple data blocks according to a preset number of bytes, and then allocate storage space for each data block according to the object data to be stored, and write each object data into the corresponding storage space. This allows data to be written directly to the raw disk without going through the file system, improving data writing efficiency. At the same time, each data block can use a bitmap to record the storage space. Furthermore, a trie can be constructed in memory according to the name information of each object data and the location information corresponding to the data block bitmap, so as to query and/or read the object data according to the trie, realizing full memory indexing, thereby enhancing query and traversal capabilities and improving data reading efficiency.

Attached Figure Description

[0030] To more clearly illustrate the technical solution of this application, the drawings used in the description of this application will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0031] Figure 1a is a schematic diagram of the structure of a single-machine storage engine provided in an embodiment of this application;

[0032] Figure 1b is a schematic diagram of bare disk partitioning provided in an embodiment of this application;

[0033] Figure 1c is a schematic diagram of a trie provided in an embodiment of this application;

[0034] Figure 1d is a bitmap diagram of a data block provided in an embodiment of this application;

[0035] Figure 1e is a schematic diagram of dual-cursor storage space allocation provided in an embodiment of this application;

[0036] Figure 1f is a schematic diagram of a trie construction provided in an embodiment of this application;

[0037] Figure 1g is a schematic diagram of the storage structure of a leaf node provided in an embodiment of this application;

[0038] Figure 2a is a schematic diagram of another single-machine storage engine provided in an embodiment of this application;

[0039] Figure 2b is a schematic diagram of an operation log provided in an embodiment of this application;

[0040] Figure 3 is a flowchart of a data processing method provided in an embodiment of this application;

[0041] Figure 4 is a schematic diagram of the structure of a data processing apparatus provided in an embodiment of this application.

Detailed Implementation

[0042] To make the above-mentioned objectives, features, and advantages of this application more apparent and understandable, the application will be further described in detail below with reference to the accompanying drawings and specific embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those skilled in the art based on the embodiments of this application without inventive effort are within the scope of protection of this application.

[0043] Referring to Figure 1a, a schematic diagram of a standalone storage engine according to an embodiment of this application is shown. The standalone storage engine 100 includes a raw disk space management and allocation engine 101 and a metadata engine 102. The standalone storage engine 100 is a storage system configured to receive, save, retrieve, and read data. The standalone storage engine in this embodiment can be an object storage engine for object data. The object storage engine can use a key-value store to organize data, where each object has a unique key, and the corresponding object data (value) can be quickly retrieved through this key.

[0044] In this embodiment, the raw disk space management and allocation engine 101 can be configured to divide the raw disk into multiple data blocks according to a preset number of bytes, allocate storage space for each data block according to the object data to be stored, and write each object data into the corresponding storage space. Each data block uses a bitmap to record the storage space. The bitmap can be used to record, manage, and allocate a segment of byte data in the space. Each bit represents a segment of space. The preset number of bytes can be set according to the actual scenario requirements. In this embodiment, no restrictions are placed on this.

[0045] A raw disk refers to a physical disk or logical volume that has not been formatted with a file system. Raw disks directly expose the underlying storage device and do not contain a file system or other high-level abstraction layers. Raw disks can be used in applications that require direct access to and control of the underlying storage, such as databases, virtualization, containerization, and high-performance storage systems. Raw disks offer the following advantages:

[0046] (1) Direct access: Raw disks allow applications to directly access and manipulate the underlying storage device, bypassing the file system layer. This direct access method can improve performance, especially in scenarios requiring high-performance I / O operations.

[0047] (2) No file system overhead: Since there is no file system, raw disks can avoid the overhead of file systems, such as metadata management, file system checks and repairs. This makes raw disks more advantageous in high-performance applications.

[0048] (3) Flexibility: Raw disks can be used with various advanced storage management technologies, such as LVM (Logical Volume Management), RAID (Disk Array), and storage virtualization. Applications can flexibly manage and organize storage space as needed.

[0049] (4) Security: Raw disks can be used for encrypted storage, protecting data security through hardware or software encryption technologies. Because they lack a file system, raw disks avoid the security risks associated with file system vulnerabilities.

[0050] In this embodiment, the raw disk space management and allocation engine 101 can store object data directly into the raw disk without going through the file system. This writing process only needs to be written once. Compared with the file system, which requires multiple disk I / O operations, the data writing path is shorter and the writing process is more convenient, thereby improving disk I / O performance.

[0051] Referring to Figure 1b, a schematic diagram of a raw disk partitioning embodiment of this application is shown. Hard disks A, B, C, and D are partitioned according to a preset number of bytes of 64 GiB (GiB is a unit of measurement for computer storage capacity, representing 2 to the power of 30 bytes). Hard disks A, B, C, and D are raw disks. After partitioning, hard disk A is divided into multiple data blocks: chunk0, chunk1, chunk2, ... Each data block is 64 GiB in size.
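
The partitioning described above can be sketched as follows. This is a minimal illustration, not the engine's actual code: the function name `chunk_ranges`, the 200 GiB disk size, and the choice to leave a trailing partial chunk unused are all assumptions.

```python
GIB = 2**30
CHUNK_SIZE = 64 * GIB  # preset number of bytes from the example above


def chunk_ranges(disk_size: int, chunk_size: int = CHUNK_SIZE) -> list:
    """Return (chunk_id, start_byte, end_byte) for every whole chunk on the disk."""
    # Assumption: space left after the last whole chunk is simply not used.
    count = disk_size // chunk_size
    return [(i, i * chunk_size, (i + 1) * chunk_size) for i in range(count)]


# Hypothetical 200 GiB raw disk: three whole 64 GiB chunks, 8 GiB left over.
layout = chunk_ranges(200 * GIB)
for chunk_id, start, end in layout:
    print(f"chunk{chunk_id}: bytes [{start}, {end})")
```

Because every chunk has the same size, the byte range of a chunk follows from its identifier alone, so no per-chunk table is needed to locate it.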

[0052] In one embodiment of this application, when writing data to a data block, the remaining storage space of each data block can be determined, and then the data block with the larger remaining storage space can be selected for object data storage.
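
One possible reading of this selection rule, as a sketch: pick the data block with the most remaining space and reject the request if even that block cannot hold the object. The function name `pick_chunk` and the free-space table are hypothetical.

```python
from typing import Optional


def pick_chunk(free_bytes: dict, needed: int) -> Optional[int]:
    """Return the id of the chunk with the most free space, or None if the object does not fit."""
    if not free_bytes:
        return None
    chunk_id = max(free_bytes, key=free_bytes.get)
    return chunk_id if free_bytes[chunk_id] >= needed else None


MIB = 2**20
free = {0: 10 * MIB, 1: 48 * MIB, 2: 3 * MIB}  # hypothetical remaining space per chunk
chosen = pick_chunk(free, 4 * MIB)
print(chosen)
```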

[0053] In this embodiment, the metadata engine 102 can be configured to construct a trie in memory based on the name information and the location information corresponding to the data block bitmap of each object data, so as to query and / or read the object data according to the trie. The trie can store the correspondence between the name information and location information of the object data; that is, a trie is used to manage metadata. Each node of the trie can store node content (the node content can be generated based on the name information and location information) and pointers to child nodes. In practical applications, the path formed from the root node to each leaf node in the trie can be used to represent the correspondence between the name information and location information of the object data. Therefore, after the trie is constructed, when a user searches for or reads a certain object data, they can traverse and query the constructed trie based on the name information of the object data to determine the location information corresponding to the data object.

[0054] The name information can consist of multiple characters and serves as a unique identifier for the object data. The location information indicates the location where the object data is stored. Specifically, the location information may include the data block identifier, the object data length, and the offset of the object data within the data block.

[0055] The data block identifier is a unique identifier used to represent a data block. The object data length is the number of bytes or bits occupied by the object in memory. This length can be used to measure the size of the object, and by determining the object's memory footprint, memory management and optimization can be performed. The offset of the object data within the data block refers to the distance between the starting position of the object data within the data block and the starting position of the data block. The offset can be calculated in bytes.

[0056] Referring to Figure 1c, a schematic diagram of a trie according to an embodiment of this application is shown. Any path in the trie represents the correspondence between the name information and location information of an object data. This trie is used to store the location information of six object data with the names abceg, abdfh, abdfi, tuvwy, tuvwz, and tuvxz.

[0057] In this embodiment, the trie is stored in memory (i.e., metadata is stored in memory), achieving full-memory indexing. Full-memory indexing is a technique that stores index data entirely in memory, offering the following advantages compared to existing disk-based key-value (KV) database storage:

[0058] (1) High performance:

[0059] Low latency: Memory access speeds are far faster than disk access speeds, and full-memory indexes can significantly reduce data access latency. Memory accesses typically occur in nanoseconds, while disk accesses typically occur in milliseconds.

[0060] High throughput: Full-memory indexes can handle higher concurrent requests because memory bandwidth and processing power are far greater than those of disk.

[0061] (2) Real-time performance:

[0062] Real-time data access: Full-memory indexes can provide real-time data access and query capabilities, making them suitable for application scenarios that require fast response, such as real-time analytics, real-time monitoring, and real-time trading systems.

[0063] Real-time updates: Data in memory can be updated quickly, making it suitable for application scenarios that require frequent updates, such as real-time recommendation systems and real-time advertising.

[0064] (3) Simplify architecture

[0065] Reduced disk I / O: Full memory indexing avoids disk I / O operations, simplifies the system architecture, and reduces the complexity and potential points of failure caused by disk read and write.

[0066] Simplified data management: Data management in memory is relatively simple and does not require complex file systems and disk management mechanisms.

[0067] (4) Flexibility:

[0068] Dynamic adjustment: In-memory indexes can be dynamically adjusted and optimized to adapt to different query patterns and load requirements.

[0069] Rapid iteration: Data in memory can be iterated and updated quickly, making it suitable for application scenarios that require rapid development and deployment.

[0070] (5) High availability:

[0071] Memory redundancy: Through memory redundancy and replication techniques, full-memory indexes can provide high availability and fault tolerance, ensuring that data remains available even when some nodes fail.

[0072] Fast recovery: Data in memory can be recovered quickly, reducing system failure recovery time.

[0073] (6) Supports complex queries:

[0074] Complex index structures: Full-memory indexes can support complex index structures, such as B+ trees, hash tables, skip lists, etc., which are suitable for complex query and analysis needs.

[0075] Multidimensional indexes: In-memory indexes can support multidimensional indexes, which are suitable for multidimensional data analysis and querying.

[0076] (7) Applicable to specific scenarios:

[0077] Caching systems: Full-memory indexes are suitable for caching systems such as Redis and Memcached, providing high-performance caching services.

[0078] Real-time analytics: Full-in-memory indexes are suitable for real-time analytics systems such as Apache Druid and ClickHouse, providing real-time data analysis and query capabilities.

[0079] In summary, the full memory index in this application embodiment can improve disk I / O performance.

[0080] In one embodiment of this application, when the raw disk space management and allocation engine is configured to allocate storage space for each data block according to the object data to be stored, it is configured to divide each data block into multiple write units, where the write unit is the smallest data storage unit in the data block. After the write units are divided, multiple cursors can be used to allocate storage space for the object data to be stored on the bitmap of the data block. Each cursor allocates a preset number of write units according to the amount of object data to be stored. The preset number of write units allocated by each cursor in a single operation can be the same or different.

[0081] When the difference in data amount among the objects to be stored is less than a preset value (e.g., the size of a write unit), the cursors allocate the same preset number of write units in a single operation. When the difference is greater than or equal to the preset value, the objects to be stored can be classified by data amount, such that the difference between any two objects in the same class is less than the preset value and the difference between any two objects in different classes is greater than or equal to the preset value. After classification, the preset number of write units for each class is set based on the data amount of the objects in that class.

[0082] In the embodiments of this application, multiple cursors can be used in the storage space allocation process. Multiple cursors can improve the allocation efficiency of storage space on the one hand, and allocate storage space at different scales on the other hand. In addition, allocating object data according to data volume can make object data with similar data volume in the data block clustered and allocated in one area as much as possible, which facilitates the management of storage space.

[0083] In this embodiment of the application, two cursors can be selected to simultaneously allocate data from the beginning and end of the data block.

[0084] Figure 1d shows a bitmap diagram of a data block in an embodiment of this application. In this data block, the write unit is the smallest unit of data storage, and each write unit is 4k. The bitmap shows 0 and 1 to indicate the allocation status of the write unit, with 0 indicating that the write unit is not allocated and 1 indicating that the write unit has been allocated.
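
A minimal sketch of such a bitmap, assuming one bit per 4 KiB write unit stored in a byte array. The class and method names are hypothetical, and the demo uses a tiny chunk so the state is easy to inspect.

```python
WRITE_UNIT = 4096  # assumption: the "4k" write unit of Figure 1d is 4096 bytes


class ChunkBitmap:
    """One bit per write unit: 0 = not allocated, 1 = allocated."""

    def __init__(self, chunk_size: int, unit: int = WRITE_UNIT) -> None:
        self.units = chunk_size // unit
        self.bits = bytearray((self.units + 7) // 8)

    def is_allocated(self, index: int) -> bool:
        return bool(self.bits[index // 8] & (1 << (index % 8)))

    def set_range(self, start: int, count: int, allocated: bool = True) -> None:
        """Mark `count` consecutive write units, starting at `start`, as allocated or free."""
        for i in range(start, start + count):
            if allocated:
                self.bits[i // 8] |= 1 << (i % 8)
            else:
                self.bits[i // 8] &= ~(1 << (i % 8)) & 0xFF

    def free_units(self) -> int:
        return self.units - sum(bin(b).count("1") for b in self.bits)


bitmap = ChunkBitmap(chunk_size=64 * WRITE_UNIT)  # tiny 256 KiB chunk for illustration
bitmap.set_range(0, 3)                            # an object occupying 3 write units
print(bitmap.free_units())
```

Under the same assumption, a 64 GiB data block holds 2^24 write units, so its bitmap occupies 2^24 bits, or 2 MiB.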

[0085] Figure 1e shows a schematic diagram of dual-cursor storage space allocation in an embodiment of this application. Allocation pointer 1 and allocation pointer 2 are the allocation pointers of the two cursors. Allocation pointer 1 allocates storage space starting from the beginning of the data block, 3 write units at a time. Allocation pointer 2 allocates storage space starting from the end of the data block, also 3 write units at a time.
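
The two-cursor scheme can be sketched as follows, assuming simple bump allocation with no freeing or reuse of write units. The class `DualCursorAllocator` and its method names are hypothetical, and a plain list of 0/1 values stands in for the bitmap.

```python
from typing import Optional


class DualCursorAllocator:
    """Two cursors over one data block: one advances from the start, one retreats from the end."""

    def __init__(self, total_units: int) -> None:
        self.bitmap = [0] * total_units  # 0 = free, 1 = allocated
        self.front = 0                   # next free unit for cursor 1
        self.back = total_units          # one past the last free unit for cursor 2

    def _mark(self, start: int, count: int) -> None:
        self.bitmap[start:start + count] = [1] * count

    def alloc_front(self, count: int) -> Optional[int]:
        """Allocate `count` units at the front cursor; return the first unit index or None."""
        if self.front + count > self.back:
            return None  # the two cursors would cross
        start = self.front
        self.front += count
        self._mark(start, count)
        return start

    def alloc_back(self, count: int) -> Optional[int]:
        """Allocate `count` units at the back cursor; return the first unit index or None."""
        if self.back - count < self.front:
            return None  # the two cursors would cross
        self.back -= count
        self._mark(self.back, count)
        return self.back


alloc = DualCursorAllocator(total_units=12)
first = alloc.alloc_front(3)   # units 0..2
last = alloc.alloc_back(3)     # units 9..11
print(first, last, alloc.bitmap)
```

Keeping the two cursors at opposite ends means they never contend for the same region until the data block is nearly full, at which point both calls return None.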

[0086] In one embodiment of this application, when the metadata engine 102 is configured to construct a trie in memory based on the name information of each object data and the location information corresponding to the data block bitmap, it is configured as follows:

[0087] Metadata engine 102 splits the name information of each object data according to characters. For example, when the name of object data 1 is abcd, it can be split into four characters: a, b, c, and d.

[0088] After completing the character splitting, the metadata engine 102 can construct a trie in sequence according to the split characters, such as the root node → the node corresponding to character 'a' → the node corresponding to character 'b' → the node corresponding to character 'c' → the node corresponding to character 'd'.
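
The character-by-character construction can be sketched with nested dictionaries, one level per character. The `"$loc"` key that marks a leaf and the `(chunk id, offset, length)` tuples are hypothetical choices for illustration.

```python
LOC = "$loc"  # hypothetical key under which a leaf keeps its location information


def insert(root: dict, name: str, location: tuple) -> None:
    """Walk or create one node per character, then store the location at the final node."""
    node = root
    for ch in name:
        node = node.setdefault(ch, {})
    node[LOC] = location


root = {}
insert(root, "abcd", (0, 0, 4096))     # root -> a -> b -> c -> d
insert(root, "abce", (0, 4096, 8192))  # shares a -> b -> c, adds e
print(sorted(root["a"]["b"]["c"]))
```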

[0089] In this embodiment of the application, when a character corresponding to new object data is inserted into the trie, the metadata engine 102 can merge or split the character value to update the trie.

[0090] Specifically, characters with the same content can be merged to obtain new character nodes, thus obtaining the target trie.

[0091] For example, object data 1 is named "abcd" and object data 2 is named "abce". The initial trie of object data 1 is: root node → node corresponding to character 'a' → node corresponding to character 'b' → node corresponding to character 'c' → node corresponding to character 'd'. The initial trie of object data 2 is: root node → node corresponding to character 'a' → node corresponding to character 'b' → node corresponding to character 'c' → node corresponding to character 'e'. In object data 1 and object data 2, the nodes corresponding to characters 'a', 'b', and 'c' are the same. Therefore, in the two initial tries, the nodes corresponding to characters 'a', 'b', and 'c' are merged into a single node corresponding to the character string 'abc'. The resulting target trie contains two paths: root node → node corresponding to 'abc' → node corresponding to character 'd' (this path represents object data 1) and root node → node corresponding to 'abc' → node corresponding to character 'e' (this path represents object data 2).

[0092] In this embodiment of the application, when the name of newly inserted object data is split and matched against the characters in the existing trie, if the split characters partially overlap with an original node, the overlapping characters are determined, and the original node is split based on the overlapping characters before the new object data is inserted.

[0093] Referring to Figure 1f, a schematic diagram of a trie construction in an embodiment of this application is shown. The trie on the left contains the character node "tuv". After adding the object data tutwg, the character node "tuv" can be re-split into the character node "tu" and the child node "v", and then the new object data can be inserted based on the updated character node.
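
The merge and split behaviour described above resembles insertion into a compressed (radix) trie. The sketch below is one way to realise it; the `Node` class, `insert` function, and sample locations are hypothetical, and the case where one name is a prefix of another is handled by storing the location on the interior node, which the text does not specify.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    label: str = ""                               # run of characters held by this node
    location: Optional[tuple] = None              # set only where a name ends
    children: dict = field(default_factory=dict)  # first character of child label -> Node


def common_prefix_len(a: str, b: str) -> int:
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def insert(root: Node, name: str, location: tuple) -> None:
    node, rest = root, name
    while rest:
        child = node.children.get(rest[0])
        if child is None:
            node.children[rest[0]] = Node(rest, location)  # no overlap: new leaf
            return
        k = common_prefix_len(child.label, rest)
        if k < len(child.label):
            # Partial overlap: split the original node at the shared characters.
            upper = Node(child.label[:k], None, {child.label[k]: child})
            child.label = child.label[k:]
            node.children[rest[0]] = upper
            child = upper
        node, rest = child, rest[k:]
    node.location = location  # assumption: a name that ends on an existing node


root = Node()
insert(root, "tuvwy", (0, 0, 100))
insert(root, "tuvxz", (0, 4096, 100))  # splits "tuvwy" into "tuv" -> "wy", adds "xz"
insert(root, "tutwg", (0, 8192, 100))  # splits "tuv" into "tu" -> "v", adds "twg"
print(root.children["t"].label, sorted(root.children["t"].children))
```

After the third insert the node "tuv" has become "tu" with a child "v", matching the split shown in Figure 1f.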

[0094] In this embodiment, the metadata engine 102 can store data for each node in the trie after setting up each node. In practical applications, the corresponding character value (i.e., key value) can be stored in the non-leaf nodes of the trie, and the corresponding character value (key value) and position information can be stored in the leaf nodes of the trie. This allows the metadata of each node to be managed with minimal resource consumption.

[0095] Referring to Figure 1g, a schematic diagram of the storage structure of a leaf node is shown. The leaf node stores its internal information in 8 bytes: the key information, the chunk (data block) it belongs to, the offset within the chunk, and the object data length. The key information (i.e., the character value) occupies 6 bits, giving 64 possible values, so it can represent an integer between 0 and 63. The chunk identifier occupies 12 bits, giving 4096 possible values, so it can represent an integer between 0 and 4095. The object data length is 12 bits, with a maximum value of 419304, meaning the object data length can represent an integer between 0 and 419304. The offset within the chunk occupies 24 bits, with a maximum offset of 64 GiB, so it can address positions from 0 up to 64 GiB within the chunk.
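
Packing these fields into one 8-byte word can be sketched as below, under three loudly stated assumptions. First, the text gives 12 bits for the object data length, but the stated widths then total only 54 of the 64 bits and 12 bits cannot reach the stated maximum; the sketch assumes a 22-bit length field (2^22 = 4,194,304), which would fill the 8 bytes exactly. Second, it assumes the 24-bit offset counts 4 KiB write units, since 2^24 × 4 KiB = 64 GiB. Third, the field order and the function names are hypothetical.

```python
KEY_BITS = 6
CHUNK_BITS = 12
LENGTH_BITS = 22   # assumption: see the note above; the text states 12
OFFSET_BITS = 24   # assumption: offset counted in 4 KiB write units
WRITE_UNIT = 4096


def pack_leaf(key: int, chunk_id: int, length: int, offset_units: int) -> bytes:
    """Pack the four fields into 8 bytes (hypothetical order: key, chunk, length, offset)."""
    fields = ((key, KEY_BITS), (chunk_id, CHUNK_BITS),
              (length, LENGTH_BITS), (offset_units, OFFSET_BITS))
    word = 0
    for value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"value {value} does not fit in {bits} bits")
        word = (word << bits) | value
    return word.to_bytes(8, "big")


def unpack_leaf(raw: bytes) -> tuple:
    """Inverse of pack_leaf: return (key, chunk_id, length, offset_units)."""
    word = int.from_bytes(raw, "big")
    out = []
    for bits in (OFFSET_BITS, LENGTH_BITS, CHUNK_BITS, KEY_BITS):
        out.append(word & ((1 << bits) - 1))
        word >>= bits
    offset_units, length, chunk_id, key = out
    return key, chunk_id, length, offset_units


raw = pack_leaf(key=5, chunk_id=7, length=12288, offset_units=3)
print(len(raw), unpack_leaf(raw))
```

With these widths the byte offset of an object inside its chunk is `offset_units * WRITE_UNIT`.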

[0096] In one embodiment of this application, the metadata engine 102 can be configured to obtain the target name information of the object data to be read, and perform a traversal query in the trie based on the target name information to determine the target location information corresponding to the target name information.

[0097] In practical applications, users can input the target name information of the object data to be read through a single-machine storage engine. In the single-machine storage engine, the metadata engine can obtain the target name information and then traverse and query according to the trie. Specifically, it can start traversing from the first child node pointed to by the root node of the trie, and determine whether the character value stored in the first child node is a character in the target name information (judging from the first character of the target name information). When the first child node of the target name information stores the first character value of the target name information, it determines one or more candidate second child nodes connected to the first child node of the target name information, and determines whether the candidate second child node stores the character value connected to the first character value of the target name information. When the second child node of the target name information stores the second character value of the target name information (the second character value is the character connected to the first character value of the target name information), it determines one or more candidate third child nodes connected to the second child node of the target name information, and traverses and queries according to the above nodes until the last character of the target name information is traversed. If the node corresponding to the last character is a non-leaf node, it is determined that the metadata engine does not store the target location information corresponding to the target data node. If the node corresponding to the last character is a leaf node, the target location information corresponding to the target name information is obtained from the leaf node.

[0098] It should be noted that if, before the last character is reached, no matching non-leaf node is found in the trie for the target name information, or the node found is already a leaf node, then the metadata engine does not store target location information corresponding to the target name information.
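
The traversal described above can be illustrated with a minimal sketch. It is not the implementation of this application: the names `TrieNode`, `trie_insert`, and `trie_lookup` are hypothetical, and the location is assumed to be a `(chunk_id, offset, length)` tuple. The sketch stores the location on the node of the last character and treats a node without a location as "not stored", which matches the leaf / non-leaf rule above as long as no name is a prefix of another name.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# (chunk_id, offset, length): the field order is an assumption of this sketch.
Location = Tuple[int, int, int]


@dataclass
class TrieNode:
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    location: Optional[Location] = None  # set only on the node of a name's last character


def trie_insert(root: TrieNode, name: str, location: Location) -> None:
    """Split the name into characters and create one node per character."""
    node = root
    for ch in name:
        node = node.children.setdefault(ch, TrieNode())
    node.location = location


def trie_lookup(root: TrieNode, name: str) -> Optional[Location]:
    """Walk the trie character by character; return None when the name is not indexed."""
    node = root
    for ch in name:
        node = node.children.get(ch)
        if node is None:  # no child stores this character: traversal ends early
            return None
    return node.location  # None when the last node carries no location


root = TrieNode()
trie_insert(root, "img/cat.png", (0, 4096, 1500))
trie_insert(root, "img/dog.png", (2, 0, 800))
```

Here `trie_lookup(root, "img/cat.png")` returns the stored tuple, while a lookup of `"img/"` (which ends on a non-leaf node) or of `"img/cow.png"` (which has no matching child) returns `None`.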

[0099] The raw disk space management and allocation engine 101 can be configured to determine the target data block among multiple data blocks based on the data block identifier in the target location information, determine the starting read position of the target object data based on the offset in the target location information, and read the target object data from the starting read position according to the object data length in the target location information.

[0100] In practical applications, the raw disk space management and allocation engine 101 can read data from the raw disk based on the target location information obtained by the metadata engine 102. Specifically, it locates the target data block according to the data block identifier, locates the starting read position in the target data block according to the offset in the target location information, and then reads the corresponding length of data according to the object data length to realize the target object data reading.
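
As a hedged illustration of this read path, the following sketch computes the absolute byte position from the data block identifier and the offset, and then reads the stated length. The function name `read_object` and the value of `CHUNK_SIZE` are hypothetical, an in-memory buffer stands in for the raw disk, and it is assumed that an object does not span two data blocks.

```python
import io

CHUNK_SIZE = 1024  # hypothetical preset number of bytes per data block


def read_object(disk, chunk_id: int, offset: int, length: int) -> bytes:
    """Read `length` bytes starting at `offset` inside data block `chunk_id`."""
    if offset < 0 or length < 0 or offset + length > CHUNK_SIZE:
        raise ValueError("location does not fit inside one data block")
    disk.seek(chunk_id * CHUNK_SIZE + offset)  # absolute byte position on the disk
    data = disk.read(length)
    if len(data) != length:
        raise IOError("short read: location points past the end of the disk")
    return data


# An in-memory buffer stands in for the raw disk (4 data blocks of zeros).
disk = io.BytesIO(bytes(4 * CHUNK_SIZE))
disk.seek(2 * CHUNK_SIZE + 100)
disk.write(b"hello object")
payload = read_object(disk, chunk_id=2, offset=100, length=12)
```

On a real raw device the same arithmetic would apply to a block device file opened in binary mode; alignment requirements of direct I/O are outside the scope of this sketch.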

[0101] In this embodiment of the application, a single-machine storage engine is provided. The single-machine storage engine may include a raw disk space management and allocation engine and a metadata engine. Through the raw disk space management and allocation engine and the metadata engine, object data is stored on the raw disk and metadata is stored in memory. The single-machine storage engine has an extremely short IO read and write path, which improves read and write performance. At the same time, it implements full memory indexing, which enhances query and traversal capabilities.

[0102] Referring to Figure 2a, a schematic diagram of a single-machine storage engine provided in an embodiment of this application is shown. The single-machine storage engine 100 includes a raw disk space management and allocation engine 101, a metadata engine 102, and a snapshot module 103. The snapshot module 103 can be used to store snapshot data generated based on the current trie and bitmap in the target disk after a snapshot is enabled, and record the snapshot time.

[0103] In computer science and data storage, a snapshot is a complete record or copy of the state of data at a specific point in time. Snapshots can be used for backup, recovery, data consistency checks, and other scenarios.

[0104] In one embodiment of this application, the snapshot module is configured to generate operation logs based on the operations during data storage and trie construction, and delete the operation logs prior to the snapshot time after the snapshot is completed.

[0105] In practical applications, each operation performed by a user in a single-machine storage engine can generate an operation log. The operation log is used to record the user's operations and can consist of operation time information and the corresponding operation information.

[0106] When large amounts of data need to be read and written, the corresponding operation log data volume is enormous. Furthermore, as the single-machine storage engine is used over time, a large amount of operation log data accumulates, consuming significant memory and potentially impacting read / write efficiency. Each time a snapshot is generated, a trie and bitmap are recorded at the snapshot time. This allows for the reduction of operation log storage by deleting operation logs from before the snapshot time. Moreover, restoring the trie and bitmap based on snapshots and operation logs is more convenient than restoring data entirely from the operation logs.

[0107] Referring to Figure 2b, a schematic diagram of an operation log in an embodiment of this application is shown, which may specifically include the following operation records:

[0108] "time1 insert key1:chunk,offset,length

[0109] time2 insert key2:chunk,offset,length

[0110] time3 insert key3:chunk,offset,length

[0111] time4 insert key4:chunk,offset,length

[0112] time5 insert key5:chunk,offset,length

[0113] time6 insert key6:chunk,offset,length

[0114] time7 insert key7:chunk,offset,length

[0115] time8 delete key1:chunk,offset,length

[0116] time9 delete key3:chunk,offset,length"

[0117] That is, at times 1 to 7, key1 to key7 are inserted in turn, and each record stores the data block, offset, and length of the corresponding object data; at time 8, the data block, offset, and length of key1 are deleted; at time 9, the data block, offset, and length of key3 are deleted.

[0118] If the snapshot start time is time5, then the operation records corresponding to time1 to time4 can be deleted.
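
The log truncation in this example can be sketched as follows. The record layout mirrors the operation records above, but the type `LogRecord`, the function `truncate_log`, the integer timestamps, and the concrete chunk, offset, and length values are assumptions made only for illustration.

```python
from typing import List, NamedTuple


class LogRecord(NamedTuple):
    time: int    # operation time (integer timestamps are an assumption)
    op: str      # "insert" or "delete"
    key: str
    chunk: int
    offset: int
    length: int


def truncate_log(log: List[LogRecord], snapshot_time: int) -> List[LogRecord]:
    """Keep only the records at or after the snapshot time."""
    return [rec for rec in log if rec.time >= snapshot_time]


# Rebuild the nine records of the example with made-up locations.
log = [LogRecord(t, "insert", f"key{t}", 0, (t - 1) * 64, 64) for t in range(1, 8)]
log.append(LogRecord(8, "delete", "key1", 0, 0, 64))
log.append(LogRecord(9, "delete", "key3", 0, 128, 64))

remaining = truncate_log(log, snapshot_time=5)
```

With a snapshot time of 5, `remaining` holds the records of time5 to time9, and the records of time1 to time4 are dropped.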

[0119] In one embodiment of this application, when the single-machine storage engine restarts, the snapshot module loads the snapshot data in the target disk into memory and replays the operation log.

[0120] In practical applications, when a single-machine storage engine restarts, it first loads the snapshot data from the target disk into memory, and then replays the log records. Since the log records are all object operations, a given object key has only two states: existence or non-existence. This ensures idempotency. Therefore, even if a key already exists when a record is replayed after a restart, executing the record again does not change the result.
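
A minimal sketch of this recovery step is given below. It assumes that a plain dictionary stands in for the trie restored from the snapshot, and that the snapshot already contains key5 even though the record of time5 is still in the log; the function name `replay` and all values are hypothetical. The point shown is only that replaying the same records more than once yields the same state.

```python
from typing import Dict, List, Tuple

Location = Tuple[int, int, int]          # (chunk, offset, length)
Record = Tuple[int, str, str, Location]  # (time, op, key, location)


def replay(index: Dict[str, Location], log: List[Record]) -> Dict[str, Location]:
    """Apply the log records, in time order, to a copy of the snapshot index."""
    restored = dict(index)
    for _time, op, key, location in sorted(log, key=lambda r: r[0]):
        if op == "insert":
            restored[key] = location  # re-inserting an existing key is harmless
        elif op == "delete":
            restored.pop(key, None)   # deleting a missing key is harmless
    return restored


snapshot_index = {"key1": (0, 0, 64), "key2": (0, 64, 64), "key3": (0, 128, 64),
                  "key4": (0, 192, 64), "key5": (0, 256, 64)}
log_after_snapshot = [
    (5, "insert", "key5", (0, 256, 64)),
    (6, "insert", "key6", (0, 320, 64)),
    (7, "insert", "key7", (0, 384, 64)),
    (8, "delete", "key1", (0, 0, 64)),
    (9, "delete", "key3", (0, 128, 64)),
]
once = replay(snapshot_index, log_after_snapshot)
twice = replay(once, log_after_snapshot)  # a second replay changes nothing
```

After the replay, the index contains key2, key4, key5, key6, and key7, and `twice` equals `once`.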

[0121] In this embodiment of the application, a single-machine storage engine is provided. This single-machine storage engine can not only improve data read and write performance through the raw disk space management and allocation engine and the metadata engine, but also prevent data loss through the snapshot module provided in the single-machine storage engine, thereby enhancing the stability of the single-machine storage engine.

[0122] Referring to Figure 3, a flowchart of a data processing method according to an embodiment of this application is shown, which is applied to a single-machine storage engine and may specifically include the following steps:

[0123] Step S301: Divide the raw disk into multiple data blocks according to a preset number of bytes;

[0124] In practical applications, a single-machine storage engine can manage raw disk data. In this embodiment, to improve data writing and reading performance, object data can be stored directly on the raw disk. Before data is written, the raw disk can be segmented into multiple data blocks, each of the preset number of bytes. The preset number of bytes can be set according to the actual needs of the scenario and is not limited in this embodiment.

[0125] A single-machine storage engine may include a raw disk space management and allocation engine, which can be used to allocate raw disk space. This engine can thus divide the raw disk into multiple data blocks according to the preset number of bytes.
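
The division in step S301 amounts to simple arithmetic on the disk size, as the following sketch suggests. The function name `divide_disk` and the sizes used are hypothetical, and leaving a trailing partial block unused is an assumption of the sketch rather than something stated in this application.

```python
from typing import List, Tuple


def divide_disk(disk_size: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return (chunk_id, start_byte) for every whole data block on the disk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    count = disk_size // chunk_size  # a trailing partial block is left unused (assumption)
    return [(chunk_id, chunk_id * chunk_size) for chunk_id in range(count)]


# Hypothetical sizes: a 1 GiB device split into 64 MiB data blocks.
chunks = divide_disk(disk_size=1 << 30, chunk_size=64 << 20)
```

With these sizes the device yields 16 data blocks, and the identifier of a data block multiplied by the preset number of bytes gives its starting byte on the raw disk.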

[0126] Step S302: Allocate storage space for each data block according to the object data to be stored, and write each object data into the corresponding storage space. Each data block uses a bitmap to record the storage space.

[0127] After the raw disk is partitioned into data blocks, a raw disk space management and allocation engine can be used to allocate storage space on the data blocks according to the object data to be stored, and then each object data can be written to its corresponding storage space. The storage space can be recorded in the data block using a bitmap.

[0128] Step S303: Construct a trie in memory according to the name information of each object data and the location information corresponding to the data block bitmap, so as to query and / or read the object data according to the trie.

[0129] In practical applications, each object data has a unique name, and after the object data is stored in a data block, its location on the data block bitmap can be determined. Furthermore, a trie can be constructed based on the name and location information. The trie stores the correspondence between the name and location information of each data object. Thus, the trie acts as an index, allowing for quick lookup of the location information of object data stored on the hard drive, and subsequently, the object data can be read from the raw disk based on that location information.

[0130] A metadata engine can also be included in a single-machine storage engine, which can then be used to construct a trie.

[0131] In this embodiment, object data is directly stored on the raw disk without going through the file system. This write process only requires one operation, which, compared to a file system that requires multiple disk I / O operations, results in a shorter data write path and a more convenient write process, thereby improving data write performance. Furthermore, metadata is stored in memory, enabling full-memory indexing and improving disk read performance.

[0132] In one embodiment of this application, the process of allocating storage space for each data block according to the object data to be stored in step S302 is as follows: each data block is divided into multiple write units, and multiple cursors are used to allocate storage space for the object data to be stored on the bitmap of the data block, wherein each cursor allocates a preset number of write units according to the amount of object data to be stored.
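
One possible reading of this allocation scheme is sketched below, assuming two cursors that move toward each other from the beginning and the end of the data block, one bit per write unit, and no reclamation of freed units. The class `ChunkBitmap`, its method `allocate`, and the unit size are hypothetical and serve only as an illustration.

```python
import math
from typing import List, Optional


class ChunkBitmap:
    """One bit per write unit of a data block; True means the unit is in use."""

    def __init__(self, chunk_size: int, unit_size: int) -> None:
        self.unit_size = unit_size
        self.bits: List[bool] = [False] * (chunk_size // unit_size)
        self.head = 0               # cursor moving forward from the first unit
        self.tail = len(self.bits)  # cursor moving backward from past the last unit

    def allocate(self, nbytes: int, from_tail: bool = False) -> Optional[int]:
        """Reserve enough contiguous units for nbytes; return the byte offset or None."""
        units = max(1, math.ceil(nbytes / self.unit_size))
        if self.tail - self.head < units:  # the two cursors would cross
            return None
        if from_tail:
            self.tail -= units
            start = self.tail
        else:
            start = self.head
            self.head += units
        for i in range(start, start + units):
            self.bits[i] = True  # mark the units as used in the bitmap
        return start * self.unit_size


bitmap = ChunkBitmap(chunk_size=4096, unit_size=512)  # 8 write units
a = bitmap.allocate(1000)                 # 2 units from the front
b = bitmap.allocate(600, from_tail=True)  # 2 units from the back
c = bitmap.allocate(3000)                 # needs 6 units, only 4 remain
```

In this run, `a` is byte offset 0, `b` is byte offset 3072, and `c` is `None` because the remaining four write units cannot hold the request.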

[0133] In one embodiment of this application, step S303, which involves constructing a trie in memory based on the name information of each object data and the location information corresponding to the data block bitmap, is as follows:

[0134] The name information of each object data is split into characters, and a trie is constructed according to the characters obtained from the split. The corresponding character value is stored in the non-leaf nodes of the trie, and the corresponding character value and position information are stored in the leaf nodes of the trie.

[0135] The location information may include the data block identifier, the object data length, and the offset of the object data within the data block.

[0136] Furthermore, in this embodiment, the single-machine storage engine can read data. Specifically, the user can input the name of the target object data to be read, and the metadata engine in the single-machine storage engine can obtain the target name information of the object data to be read. Based on the target name information, it can traverse and query in the trie to determine the target location information corresponding to the target name information. Then, it can use the location information to determine the data block identifier, and determine the target data block among multiple data blocks. Based on the offset in the target location information, it can determine the starting reading position of the target object data, and read the target object data from the starting reading position according to the object data length in the target location information.

[0137] In one embodiment of this application, the single-machine storage engine further includes a snapshot module, and the data processing method further includes storing the snapshot data generated based on the current trie and bitmap on the target disk after a snapshot is enabled, and recording the snapshot time.

[0138] In one embodiment of this application, the snapshot module in the single-machine storage engine can generate operation logs based on the operations during data storage and trie construction, and delete the operation logs prior to the snapshot time after the snapshot is completed.

[0139] In one embodiment of this application, when the single-machine storage engine restarts, the snapshot data in the target disk is loaded into memory, and the operation log is replayed.

[0140] In this embodiment, the single-machine storage engine can divide the raw disk into multiple data blocks according to a preset number of bytes, and then allocate storage space for each data block according to the object data to be stored, and write each object data into the corresponding storage space. This allows data to be written directly to the raw disk without going through the file system, improving data writing efficiency. At the same time, each data block can use a bitmap to record the storage space. Furthermore, a trie can be constructed in memory according to the name information of each object data and the location information corresponding to the data block bitmap, so as to query and / or read the object data according to the trie, realizing full memory indexing, thereby enhancing query and traversal capabilities and improving data reading efficiency.

[0141] It should be noted that, for the sake of simplicity, the method embodiments are described as a series of actions. However, those skilled in the art should understand that the embodiments of this application are not limited to the described order of actions, because according to the embodiments of this application, some steps can be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of this application.

[0142] Referring to Figure 4, a schematic diagram of a data processing device according to an embodiment of this application is shown. This device is applied to a single-machine storage engine and may specifically include the following modules:

[0143] The hard disk partitioning module 401 is configured to divide the raw disk into multiple data blocks according to a preset number of bytes.

[0144] The object data writing module 402 is configured to allocate storage space for each data block according to the object data to be stored, and write each object data into the corresponding storage space. Each data block uses a bitmap to record the storage space.

[0145] The trie construction module 403 is configured to use the metadata engine to construct a trie in memory based on the name information of each object data and the location information corresponding to the data block bitmap, so as to query and / or read the object data according to the trie.

[0146] In one embodiment of this application, the object data writing module 402 may include the following sub-modules:

[0147] Write unit division submodule, configured to divide each data block into multiple write units;

[0148] The cursor allocation submodule is configured to use multiple cursors to allocate storage space for the object data to be stored on the bitmap of the data block, wherein each cursor allocates a preset number of write units according to the amount of data of the object data to be stored.

[0149] In one embodiment of this application, the trie construction module 403 may include the following sub-modules:

[0150] The character splitting submodule is configured to split the name information of each object data according to characters;

[0151] The trie creation submodule is configured to build the trie sequentially based on the characters obtained from the splitting process.

[0152] The node data storage submodule is configured to store the corresponding character value in the non-leaf nodes of the trie, and to store the corresponding character value and the position information in the leaf nodes of the trie.

[0153] In one embodiment of this application, the location information includes a data block identifier, object data length, and the offset of the object data within the data block.

[0154] In one embodiment of this application, the apparatus further includes:

[0155] The target name information acquisition module is configured to acquire the target name information of the object data to be read.

[0156] The trie query module is configured to traverse and query the trie based on the target name information to determine the target location information corresponding to the target name information.

[0157] The target data block determination module is configured to determine the target data block from multiple data blocks based on the data block identifier in the target location information.

[0158] The starting reading position determination module is configured to determine the starting reading position of the target object data based on the offset in the target position information;

[0159] The target object data reading module is configured to read target object data from the starting reading position according to the object data length in the target position information.

[0160] In one embodiment of this application, the apparatus further includes:

[0161] The snapshot data storage module is configured to store the snapshot data generated based on the current trie and bitmap on the target disk after snapshotting is enabled, and record the snapshot time.

[0162] In one embodiment of this application, the apparatus further includes:

[0163] The operation log deletion module is configured to generate operation logs based on operations during data storage and trie construction, and delete operation logs prior to the snapshot time after the snapshot is completed.

[0164] In one embodiment of this application, the apparatus further includes:

[0165] The data recovery module is configured to load the snapshot data in the target disk into memory and replay the operation log when the single-machine storage engine restarts.

[0166] In this embodiment, the single-machine storage engine can divide the raw disk into multiple data blocks according to a preset number of bytes, and then allocate storage space for each data block according to the object data to be stored, and write each object data into the corresponding storage space. This allows data to be written directly to the raw disk without going through a file system, improving data writing efficiency. At the same time, each data block can use a bitmap to record the storage space. Furthermore, a trie can be constructed in memory according to the name information of each object data and the location information corresponding to the bitmap of the data block, so as to query and / or read the object data according to the trie, realizing full memory indexing, thereby enhancing query and traversal capabilities and improving data reading efficiency.

[0167] An embodiment of this application also provides an electronic device, which may include a processor, a memory, and a computer program stored in the memory and capable of running on the processor. When the computer program is executed by the processor, it implements the above-described data processing method.

[0168] One embodiment of this application also provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, it implements the above-described data processing method.

[0169] As the device embodiment is basically similar to the method embodiment, the description is relatively simple, and relevant parts can be found in the description of the method embodiment.

[0170] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on the differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other.

[0171] Those skilled in the art will understand that embodiments of this application can be provided as methods, apparatus, or computer program products. Therefore, embodiments of this application can take the form of entirely hardware embodiments, entirely software embodiments, or embodiments combining software and hardware aspects. Furthermore, embodiments of this application can take the form of computer program products implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0172] This application describes embodiments with reference to flowchart illustrations and / or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of this application. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in one or more blocks of the flowchart illustrations and / or one or more blocks of the block diagrams.

[0173] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing terminal device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flowcharts and / or one or more block diagrams.

[0174] These computer program instructions may also be loaded onto a computer or other programmable data processing terminal equipment to cause a series of operational steps to be performed on the computer or other programmable terminal equipment to produce a computer-implemented process, such that the instructions, which execute on the computer or other programmable terminal equipment, provide steps for implementing the functions specified in one or more flowcharts and / or one or more block diagrams.

[0175] Although preferred embodiments of the present application have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of the embodiments of the present application.

[0176] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or terminal device that includes said element.

[0177] The above provides a detailed description of a stand-alone storage engine and a data processing method and apparatus. Specific examples have been used to illustrate the principles and implementation methods of this application. The descriptions of the above embodiments are only for the purpose of helping to understand the method and core ideas of this application. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this application. Therefore, the content of this specification should not be construed as a limitation of this application.

Claims

1. A single-machine storage engine, comprising a raw disk space management and allocation engine and a metadata engine, wherein: The raw disk space management and allocation engine is configured to divide the raw disk into multiple data blocks according to a preset number of bytes, allocate storage space for each data block according to the object data to be stored, and write each object data into the corresponding storage space. Each data block uses a bitmap to record the storage space. The metadata engine is configured to construct a trie in memory based on the name information of each object data and the location information corresponding to the data block bitmap, so as to query and / or read the object data according to the trie.

2. The single-machine storage engine according to claim 1, wherein, When the raw disk space management and allocation engine is configured to allocate storage space for each data block according to the object data to be stored, it is configured as follows: Each data block is divided into multiple write units, and multiple cursors are used to allocate storage space for the object data to be stored on the bitmap of the data block. Each cursor allocates a preset number of write units according to the amount of the object data to be stored.

3. The single-machine storage engine according to claim 1, wherein, when the metadata engine is configured to construct a trie in memory based on the name information of each object data and the location information corresponding to the data block bitmap, the configuration is as follows: the metadata engine is configured to split the name information of each object data according to characters, construct a trie in sequence according to the split characters, store the corresponding character values in the non-leaf nodes of the trie, and store the corresponding character values and the position information in the leaf nodes of the trie.

4. The single-machine storage engine according to claim 3, wherein, The location information includes the data block identifier, the object data length, and the offset of the object data within the data block.

5. The single-machine storage engine according to claim 4, wherein, The metadata engine is configured to obtain the target name information of the object data to be read, and to traverse and query the trie according to the target name information to determine the target location information corresponding to the target name information. The raw disk space management and allocation engine is configured to determine the target data block among multiple data blocks based on the data block identifier in the target location information, determine the starting read position of the target object data based on the offset in the target location information, and read the target object data from the starting read position according to the object data length in the target location information.

6. The single-machine storage engine according to claim 1, further comprising a snapshot module, wherein the snapshot module is configured to, after a snapshot is enabled, store the snapshot data generated based on the current trie and bitmap on the target disk and record the snapshot time.

7. The single-machine storage engine according to claim 6, wherein, The snapshot module is configured to generate operation logs based on the operations during data storage and trie construction, and delete the operation logs prior to the snapshot time after the snapshot is completed.

8. The single-machine storage engine according to claim 7, wherein, When the single-machine storage engine restarts, the snapshot module loads the snapshot data in the target disk into memory and replays the operation log.

9. The single-machine storage engine according to claim 1, wherein, The raw disk space management and allocation engine is configured to divide each data block into multiple write units and use at least two cursors to simultaneously allocate storage space for the object data to be stored from the beginning and end of the data block.

10. The single-machine storage engine according to claim 3, wherein, When constructing the trie, if the name information of the new object data to be inserted overlaps with the character portion of an original node in the trie, the metadata engine will split the original node based on the overlapping characters before inserting the new object data.

11. A data processing method applied to a single-machine storage engine, the method comprising: The raw disk is divided into multiple data blocks according to a preset number of bytes; For each data block, storage space is allocated according to the object data to be stored, and each object data is written to the corresponding storage space. Each data block uses a bitmap to record the storage space. A trie is constructed in memory based on the name information of each object data and the location information corresponding to the data block bitmap, so as to query and / or read the object data according to the trie.

12. A data processing apparatus applied to a single-machine storage engine, the apparatus comprising: The hard disk partitioning module is configured to divide the raw disk into multiple data blocks according to a preset number of bytes. The object data writing module is configured to allocate storage space for each data block according to the object data to be stored, and write each object data into the corresponding storage space. Each data block uses a bitmap to record the storage space. The trie construction module is configured to use the metadata engine to construct a trie in memory based on the name information of each object data and the location information corresponding to the data block bitmap, so as to query and / or read the object data according to the trie.