Distributed reading method, device and server of data
By generating multiple shards for parallel processes in Elasticsearch and allocating files based on the file set and the number of parallel processes, the problems of low data reading efficiency and inability to read the latest data in real time are solved, achieving efficient data reading and system scalability.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: WEBANK (CHINA)
- Filing Date: 2022-10-14
- Publication Date: 2026-06-30
AI Technical Summary
In Elasticsearch, the number of shards is determined when the index is created. This causes the amount of data carried in a data block to increase as the amount of data increases, resulting in low data reading efficiency and the inability to read the latest data in real time.
By generating several shards for parallel processes, and allocating files to shards based on the file set and the number of parallel processes, distributed data reading is achieved. This avoids relying on the number of Elasticsearch shards for sharding and uses the InputFormat mechanism to dynamically adjust the number of shards, reducing the dependence on the Scroll snapshot mechanism.
It improves data reading efficiency, enables real-time reading of the latest data, enhances the scalability and flexibility of the distributed system, and avoids the problem of data updates failing to be read.
Figure CN115617749B_ABST
Description
Technical Field
[0001] This application relates to the field of data processing, and more particularly to a distributed data reading method, apparatus, and server.
Background Technology
[0002] With the development of technology, more and more services in the financial sector are shifting to electronic processing, and the traditional financial industry is gradually transforming into Fintech. The increase in electronic services has led to a rapid growth in the volume of electronic data, exhibiting a trend towards massive scale. Elasticsearch is a search engine widely used for handling massive amounts of data, enabling rapid data storage, searching, and analysis.
[0003] In existing technologies, Elasticsearch can include multiple indexes. Each index can include multiple data shards. Each data shard stores multiple data records. Reading each data shard corresponds to a task. The server can use a distributed approach to achieve parallel execution of multiple tasks, thereby enabling the scanning and reading of data in each data shard. Specifically, the server can use the Elasticsearch Scroll mechanism to implement the scanning of data in each task.
[0004] However, the number of shards in an Elasticsearch index can only be determined when the index is created. Therefore, as the amount of data increases, the amount of data a single shard carries also increases. The time it takes for the server to scan each shard also increases, leading to low data read efficiency.
Summary of the Invention
[0005] This application provides a distributed data reading method, apparatus, and server to solve the problem of low data reading efficiency in the prior art.
[0006] In a first aspect, this application provides a distributed data reading method, including:
[0007] obtaining multiple file sets, each file set including at least one file, and each file including the metadata of at least one document;
[0008] generating, based on the multiple file sets and the number of parallel processes, several shards for the parallel processes, each shard including at least one file;
[0009] allocating one shard to each parallel process, so that the parallel process reads documents from the database based on the document metadata in its shard.
[0010] In a second aspect, this application provides a distributed data reading device, comprising:
[0011] The acquisition module is used to acquire multiple file sets, each file set may include at least one file, and each file includes the metadata of at least one document;
[0012] The reading module is configured to generate several shards for parallel processes based on the multiple file sets and the number of parallel processes, each shard including at least one file; and to allocate one shard to each parallel process so that the parallel process reads the document from the database based on the metadata of the document in the shard.
[0013] In a third aspect, this application provides a server, including: a memory and a processor;
[0014] The memory is used to store computer programs; the processor is used to execute, according to the computer programs stored in the memory, the distributed data reading method of the first aspect and any possible design of the first aspect.
[0015] In a fourth aspect, this application provides a computer-readable storage medium storing a computer program, wherein when at least one processor of a server executes the computer program, the server executes the distributed data reading method of the first aspect and any possible design of the first aspect.
[0016] In a fifth aspect, this application provides a computer program product comprising a computer program that, when executed by at least one processor of a server, enables the server to perform the distributed data reading method of the first aspect and any possible design of the first aspect.
[0017] The distributed data reading method, apparatus, and server provided in this application improve data reading efficiency by acquiring multiple file sets from disk, each file set including at least one file, and each file including at least one document's metadata; determining the number of files allocated in each shard based on the quotient of the total number of files in each file set and the number of parallel processes; allocating the files in the file sets to several shards of parallel processes, with each shard including at least several allocated files; and allocating one shard to each parallel process after determining the number of shards, allowing the parallel process to read the document corresponding to the metadata in that shard after acquiring it. Furthermore, this application eliminates reliance on the number of Elasticsearch shards for sharding, allowing the number of shards to be flexibly adjusted according to the number of parallel processes, improving the scalability of the distributed system. Simultaneously, this application eliminates reliance on the scroll snapshot mechanism for data scanning, avoiding the problem of data updates not being readable after snapshot generation.
Attached Figure Description
[0018] To more clearly illustrate the technical solutions in this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0019] Figure 1 is a schematic diagram of a distributed data framework structure provided in an embodiment of this application;
[0020] Figure 2 is a flowchart of a distributed data reading method provided in one embodiment of this application;
[0021] Figure 3 is a schematic diagram of a segmentation method provided in an embodiment of this application;
[0022] Figure 4 is a schematic diagram of distributed data reading provided in one embodiment of this application;
[0023] Figure 5 is a flowchart of the calculation of an input format provided in one embodiment of this application;
[0024] Figure 6 is a flowchart of a distributed data reading method provided in one embodiment of this application;
[0025] Figure 7 is a schematic diagram of memory region partitioning provided in an embodiment of this application;
[0026] Figure 8 is a schematic diagram of the data result of a memory sub-region provided in an embodiment of this application;
[0027] Figure 9 is an example diagram of a memory sub-region provided in an embodiment of this application;
[0028] Figure 10 is a schematic diagram of a data writing process provided in one embodiment of this application;
[0029] Figure 11 is an example diagram of data writing provided in one embodiment of this application;
[0030] Figure 12 is an example diagram of a memory sub-region being written, provided in one embodiment of this application;
[0031] Figure 13 is a schematic diagram of a memory sub-region flushing process provided in an embodiment of this application;
[0032] Figure 14 is a schematic diagram of the structure of a distributed data reading device provided in an embodiment of this application;
[0033] Figure 15 is a schematic diagram of the hardware structure of a server provided in one embodiment of this application.
Detailed Implementation
[0034] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0035] The terms "first," "second," "third," "fourth," etc., used in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be used interchangeably where appropriate. For example, without departing from the scope of this document, first information can also be referred to as second information, and similarly, second information can also be referred to as first information.
[0036] Depending on the context, the word "if" as used here can be interpreted as "when," "upon," or "in response to determining."
[0037] Furthermore, as used herein, the singular forms “a,” “one,” and “the” are intended to also include the plural forms, unless the context indicates otherwise.
[0038] It should be further understood that the terms “comprising” or “including” indicate the presence of features, steps, operations, elements, components, items, kinds, and / or groups, but do not exclude the presence, occurrence, or addition of one or more other features, steps, operations, elements, components, items, kinds, and / or groups.
[0039] The terms “or” and “and / or” as used herein are interpreted as inclusive, or mean any one or any combination thereof. Therefore, “A, B, or C” or “A, B, and / or C” means “any one of the following: A; B; C; A and B; A and C; B and C; A, B, and C”. Exceptions to this definition occur only when combinations of elements, functions, steps, or operations are inherently mutually exclusive in some way.
[0040] With the development of technology, more and more services in the financial sector are shifting to electronic processing, and the traditional financial industry is gradually transforming into Fintech. The increase in electronic services has led to a rapid growth in the volume of electronic data, exhibiting a trend towards massive expansion. With the arrival of the big data era, corporate business strategies and operational methods have gradually shifted from traditional experience-based decision-making to data-driven decision-making. In the daily data processing of electronic services, servers need to store the collected data in backend storage components. Servers can then perform multi-dimensional analysis of the data in the storage components, thereby providing data support for upper-level decision-making.
[0041] Elasticsearch is a search engine widely used for handling massive amounts of data, enabling fast storage, searching, and analysis of such data. During Elasticsearch usage, the server writes data to Elasticsearch's storage device. Later, the server can use a distributed computing engine to read and analyze the data in Elasticsearch. Currently, the industry generally adopts a multi-task parallel execution approach, using a distributed computing engine to scan the index data in Elasticsearch. Specifically, the server can use the Elasticsearch Scroll mechanism to scan the data in each task, thereby achieving data reading. Elasticsearch can include multiple indexes. Each index is equivalent to a database. Each index can include multiple data blocks (shards). Before multiple parallel distributed tasks are executed, the server needs to split the data in Elasticsearch into shards. Each distributed task corresponds to reading data from one shard.
[0042] In existing technologies, sharding logic is mainly implemented in two ways. First, the number of distributed tasks depends on the number of data shards specified when the Elasticsearch index is created. Data in an Elasticsearch index can be stored in multiple data shards corresponding to that index. Each data shard can store multiple data records. The server can implement distributed reading of Elasticsearch index data by following the sharding logic of reading one data shard per distributed task. Second, the server can utilize Elasticsearch's Slice feature to specify the maximum number of documents each distributed task can read before the distributed program executes. The server can determine the number of distributed tasks in the distributed computing program based on the quotient of the total number of documents and the maximum number of documents read per distributed task.
[0043] However, the number of shards in a single Elasticsearch index can only be determined when the index is created. If the number of shards needs to be adjusted later, the entire index must be rebuilt, which is very costly. Rebuilding the entire index is equivalent to reading all previously stored data and redistributing it to newly created shards. A fixed number of shards in the index leads to a situation where the amount of data in each shard increases as the index stores more data. This method of splitting distributed tasks based on the number of shards causes data read efficiency to decrease as the amount of data in each shard increases, resulting in slow data read speeds and low distributed read efficiency. Furthermore, Elasticsearch's Slice feature relies on the Scroll feature, which uses an aspect-based snapshot mechanism. Therefore, Elasticsearch distributed reads based on Slices begin with a scan and incur additional performance overhead from the pagination mechanism itself, which reduces distributed read efficiency. In addition, because the Scroll feature uses a snapshot mechanism, once Elasticsearch initiates a scan and creates a snapshot, the distributed task uses that snapshot to complete the distributed data read. If the data in the Elasticsearch index changes after the snapshot is created, the distributed task cannot read the latest changed data.
[0044] To address the aforementioned issues, this application proposes a distributed data reading method. In this application, the server designs an input format adapted to Elasticsearch. This input format can be referred to as InputFormat. The server can flexibly configure the amount of data read by each distributed task through InputFormat, thereby achieving better reading performance. InputFormat can contain multiple InputSplits. Each InputSplit can correspond to a distributed task. The server can determine the number of distributed tasks executed in parallel within InputFormat based on the configuration. Furthermore, the server can dynamically adjust the number of InputSplits based on this configuration, thereby supporting dynamic tuning of the distributed computing engine. This configuration can include the number of task execution nodes that can execute tasks in parallel within the server's cluster. Each InputSplit is a shard. The shard setting does not depend on Elasticsearch's data block or slice characteristics, thus enabling more flexible adjustment of the amount of data that each distributed task needs to scan. Simultaneously, during data scanning, the server can obtain the Elasticsearch index document ID based on the data content in the InputSplit. Then, the server can directly locate the corresponding data content based on this index document ID. This method allows servers to perform point-to-point data lookups without relying on snapshots from the Elasticsearch Scroll mechanism. This point-to-point lookup implementation enables the server to read the latest version of the data in real time, rather than the snapshot data at the time of the scan, thus improving data real-time performance.
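The relationship between the input format and its splits can be illustrated with a minimal Python sketch. The class names mirror the InputFormat and InputSplit terms used above, but the fields, the `create` method, and the round-robin assignment are assumptions made only for illustration; the allocation procedure actually proposed in this application is described in the later steps.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InputSplit:
    """One shard: the metadata files a single distributed task will read."""
    split_id: int
    files: List[str] = field(default_factory=list)


@dataclass
class InputFormat:
    """Holds the splits of one distributed read, one split per parallel task."""
    splits: List[InputSplit]

    @classmethod
    def create(cls, files: List[str], parallelism: int) -> "InputFormat":
        # The split count follows the configured parallelism, not the
        # number of Elasticsearch shards fixed at index creation.
        splits = [InputSplit(i) for i in range(parallelism)]
        # Placeholder round-robin assignment (an assumption for this sketch).
        for pos, name in enumerate(files):
            splits[pos % parallelism].files.append(name)
        return cls(splits)


two_way = InputFormat.create(["f1", "f2", "f3", "f4", "f5"], parallelism=2)
three_way = InputFormat.create(["f1", "f2", "f3", "f4", "f5"], parallelism=3)
```

Changing only the `parallelism` argument changes the number of splits, which is the kind of dynamic adjustment the paragraph above describes.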
[0045] In summary, this application addresses the problems of low data reading efficiency, poor scalability, and inability to read the latest data in indexed documents by modifying a commonly used distributed computing program for scanning Elasticsearch index data. Furthermore, by introducing the InputFormat mechanism, this application increases the flexibility of future cluster architecture expansion and avoids the performance degradation of Elasticsearch data reading as data is written, a problem present in existing technologies.
[0046] The technical solutions of this application will be described in detail below with specific embodiments. The following specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments.
[0047] Figure 1 is a schematic diagram of a distributed data framework structure according to an embodiment of this application. As shown in Figure 1, the server can receive write requests sent by the client. These write requests can include a document to be written and its metadata. The metadata describes the document to be written. The server can maintain a copy of the metadata in memory while simultaneously writing the document to the storage device.
[0048] The server can write this metadata into memory. This memory can include multiple memory sub-regions (MemoryRange). The server can write metadata into one of these memory sub-regions. The server can flush the data in a memory sub-region to disk, achieving data persistence. Multiple file collections (FileBuckets) can be set up on the disk. Each file collection can correspond to a memory sub-region. For example, data in memory sub-region 1 will be flushed to file collection 1, and data in memory sub-region 2 will be flushed to file collection 2. A single flush of a memory sub-region generates a file (file.dat) containing all the data in that sub-region. For example, the first flush of memory sub-region 1 can generate file 1 in file collection 1. The second flush of memory sub-region 1 can generate file 2 in file collection 1.
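The mapping from memory sub-regions to file collections can be sketched as follows. This is a simplified in-memory model, not the implementation of this application: the `MemoryRange` class, the `entries` list, and the dictionary standing in for the on-disk file collections are all hypothetical.

```python
from typing import Dict, List


class MemoryRange:
    """In-memory buffer of document metadata for one memory sub-region."""

    def __init__(self, range_id: int) -> None:
        self.range_id = range_id
        self.entries: List[dict] = []

    def flush(self, buckets: Dict[int, List[List[dict]]]) -> None:
        # Each flush produces one new file in the file collection that has
        # the same id as this sub-region, then clears the in-memory data.
        buckets.setdefault(self.range_id, []).append(list(self.entries))
        self.entries.clear()


buckets: Dict[int, List[List[dict]]] = {}  # stand-in for file collections on disk
region_1 = MemoryRange(1)
region_1.entries.append({"doc_id": "a"})
region_1.flush(buckets)  # first flush: file 1 in file collection 1
region_1.entries.append({"doc_id": "b"})
region_1.flush(buckets)  # second flush: file 2 in file collection 1
```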
[0049] This storage device is for Elasticsearch. It can be a dedicated server for data storage, or it can be multiple data storage nodes in a cluster. The storage device can include multiple indexes (nodes). Each index is equivalent to a database. An index can include multiple data shards. Each data shard can store multiple data entries. Within a data shard, a document represents one data entry.
[0050] The server can generate multiple input splits based on multiple file sets on disk through direct mapping and merging. For example, multiple files in file set 1 can be directly mapped to split 1. Some files in file set 2 can be directly mapped to split 2. The remaining files after mapping in file sets 2 and 3 can be merged into split 4. These multiple splits can form the input format for this distributed read.
[0051] The server can allocate multiple shards from the input format to a distributed engine. This distributed engine can include multiple distributed nodes. Each distributed node can execute a distributed task. Each distributed task can be allocated a shard. The server determines the number of shards based on the number of distributed nodes in the distributed engine to ensure that each distributed node is allocated a shard, thus enabling distributed data reading. A distributed node can include three functions: reading shards, reading data, and executing business logic. Reading shards involves each distributed node retrieving its corresponding shard information from the input format after the server allocates the shards. This shard information may include the file corresponding to the data to be read. Reading data involves the distributed node reading the shard and determining the metadata in each file within that shard. The server can then determine the address of the corresponding document on the storage device based on the metadata. The distributed node can then read the document from the storage device based on this address. Business logic can be used to process data according to data requests.
[0052] The server can receive data requests sent by external devices. These data requests can be used to perform data queries, analysis, calculations, and other processing. After receiving a data request, the server can initialize the input format. Once the server has completed the initialization of the input format, the distributed engine can perform distributed data reading based on this format. Distributed nodes can also receive data requests and determine the corresponding business logic based on them. Distributed nodes can then process the read data according to this business logic. The distributed nodes can also send the processing results back to the external devices.
[0053] In this application, a server is used as the execution entity to perform a distributed data reading method according to the following embodiments. Specifically, the execution entity can be a server hardware device, a software application implementing the following embodiments on the server, a computer-readable storage medium on which the software application implementing the following embodiments is installed, or code implementing the software application according to the following embodiments.
[0054] Figure 2 is a flowchart of a distributed data reading method according to an embodiment of this application. Based on the embodiment shown in Figure 1, and as shown in Figure 2, with the server as the execution entity, the method in this embodiment may include the following steps:
[0055] S101. Obtain multiple file sets, each file set may include at least one file, and each file includes the metadata of at least one document.
[0056] In this embodiment, the server can retrieve multiple file sets from the disk. The number of file sets can be determined by the number of memory sub-regions within the memory region. For example, as shown in Figure 1, when a memory region includes three memory sub-regions, the disk can include three file sets. Each file set can include at least one file. Each file consists of the data flushed to the disk in a single flush from the memory sub-region corresponding to that file set. The memory sub-region holds the metadata of the documents to be written in write requests. When a memory sub-region is full, the server can flush the data in that memory sub-region to the disk and generate a file. And/or, when the time elapsed since the last flush of a memory sub-region reaches a first preset duration, the server can flush the data in that memory sub-region to the disk and generate a file. After the flush completes, the data in that memory sub-region is cleared. For example, as shown in Figure 3(a), the server can obtain three file sets: file set 1 contains four files, file set 2 contains two files, and file set 3 contains one file. Each file is no larger than 2MB, and the size of each file is determined by the data size of its memory sub-region.
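The two flush triggers can be expressed as a small predicate. The sketch below assumes that either condition alone is sufficient; the function name, its parameters, and the 60-second interval used in the example are hypothetical, while the 2MB capacity follows the file size limit mentioned above.

```python
def should_flush(used_bytes: int, capacity_bytes: int,
                 last_flush_ts: float, now_ts: float,
                 max_interval_s: float) -> bool:
    """Return True when a memory sub-region should be flushed to disk."""
    # Trigger 1: the sub-region is full.
    is_full = used_bytes >= capacity_bytes
    # Trigger 2: the first preset duration has elapsed since the last flush.
    is_stale = (now_ts - last_flush_ts) >= max_interval_s
    return is_full or is_stale


CAPACITY = 2 * 1024 * 1024  # 2MB sub-region
full_case = should_flush(CAPACITY, CAPACITY, last_flush_ts=0, now_ts=1, max_interval_s=60)
timeout_case = should_flush(100, CAPACITY, last_flush_ts=0, now_ts=60, max_interval_s=60)
idle_case = should_flush(100, CAPACITY, last_flush_ts=0, now_ts=10, max_interval_s=60)
```

Because the time-based trigger can fire before the sub-region is full, files produced this way can be smaller than the capacity, which is why file sizes within a file set may differ.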
[0057] S102. Based on the multiple file sets and the number of parallel processes, generate several shards for the parallel processes, each shard containing at least one file.
[0058] In this embodiment, the server can determine the number of shards based on the number of parallel processes. Each parallel process can correspond to a distributed node. These parallel processes can run on a single server. Alternatively, each parallel process can correspond to a task server for performing read tasks. This task server can be any server in the cluster other than the server corresponding to the master node. The server can distribute files from multiple file sets into several shards. Each shard can include at least one file. To distribute data read tasks more evenly, the number of files allocated to each shard should be kept as consistent as possible to improve the overall efficiency of distributed data read.
[0059] In one example, the sharding process may specifically include the following steps:
[0060] Step 1: Determine the number of allocations based on the total number of files in multiple file sets and the number of parallel processes.
[0061] In this step, the server can sum the number of files in each file set to obtain the total number of files. To maintain a relatively even distribution of tasks, the server needs to keep the number of files in each shard roughly similar. The server can determine the approximate number of files that each shard needs to process on average based on the quotient of the total number of files and the number of parallel processes. Since the quotient may not be an integer, the server can round down to obtain this allocation number. This allocation number indicates the actual number of files allocated to each shard. Setting the number of files in each shard prevents a single parallel process from becoming a bottleneck due to data skew during distributed data reading. The formula for calculating the allocation number `single_parallelism_file_num` is as follows:
[0062] $$\text{avg\_parallelism\_file\_num} = \frac{\sum_{k=1}^{N} \text{file\_num}_k}{\text{total\_parallelism}}$$
[0063] $$\text{single\_parallelism\_file\_num} = \left\lfloor \text{avg\_parallelism\_file\_num} \right\rfloor$$
[0064] Here, `avg_parallelism_file_num` indicates the approximate number of files that need to be processed per shard on average. `file_num_k` indicates the number of files in the k-th file set, and there are N file sets in total. `total_parallelism` indicates the number of parallel processes. `single_parallelism_file_num` indicates the value obtained by rounding down `avg_parallelism_file_num`.
[0065] For example, for the file sets shown in Figure 3(a), when the number of parallel processes is 3, the allocation number single_parallelism_file_num is FLOOR((4+2+1) / 3) = 2. That is, each shard is allocated 2 files.
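The calculation in Step 1 can be checked with a short Python sketch. The function name and its argument names are illustrative; only the floor-of-the-quotient rule comes from the description above.

```python
from typing import List


def allocation_number(file_counts: List[int], total_parallelism: int) -> int:
    """Floor of (total number of files / number of parallel processes)."""
    if total_parallelism <= 0:
        raise ValueError("total_parallelism must be positive")
    # Integer division rounds down for non-negative operands.
    return sum(file_counts) // total_parallelism


# File sets of Figure 3(a): 4, 2 and 1 files, read by 3 parallel processes.
figure_3a = allocation_number([4, 2, 1], 3)
```

For the Figure 3(a) input the result is 2, leaving 7 - 3 × 2 = 1 file to be placed by the later allocation steps.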
[0066] Step 2: Sort the files in each file set according to their file size.
[0067] In this step, since files can be generated when the memory sub-region is full, or when the time interval since the last write reaches a first preset duration, the file sizes of the individual files in the file set may vary. The server can sort the files in the file set according to their file sizes. Specifically, this sorting can be done in descending order of file size. For example, when the memory sub-region is full, the generated file size can be 2MB. In this case, the size of each file in the file set will be less than or equal to 2MB.
[0068] In one example, a file may include metadata for multiple documents. This metadata may include the document size. The server can obtain the document size corresponding to each piece of metadata for each file. The server can then sum these document sizes to obtain the total file size. The server can then sort the files in each file set based on this file size.
[0069] For example, as shown in Figure 3(b), in file set 1, file 1 is 2MB in size, file 2 is 1MB in size, file 3 is 1MB in size, and file 4 is 2MB in size. These four files, sorted from largest to smallest, are: file 1, file 4, file 2, file 3. In file set 2, file 5 is 1MB in size, and file 6 is 2MB in size. These two files, sorted from largest to smallest, are: file 6, file 5.
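Step 2 can be sketched in Python as follows. The data layout (a dictionary mapping each file name to the list of document sizes recorded in its metadata, in MB) and the function names are assumptions for illustration. How ties between equally sized files are broken is not stated in the description; this sketch keeps their original order, which matches the Figure 3(b) result.

```python
from typing import Dict, List


def file_size(doc_sizes: List[int]) -> int:
    """File size taken as the sum of the document sizes in its metadata."""
    return sum(doc_sizes)


def sort_files_desc(files: Dict[str, List[int]]) -> List[str]:
    """Return file names ordered from largest to smallest file size."""
    # sorted() is stable even with reverse=True, so files of equal size
    # keep their original relative order.
    return sorted(files, key=lambda name: file_size(files[name]), reverse=True)


# File sets 1 and 2 of Figure 3(b); document sizes are in MB.
file_set_1 = {"file1": [1, 1], "file2": [1], "file3": [1], "file4": [2]}
file_set_2 = {"file5": [1], "file6": [2]}
order_1 = sort_files_desc(file_set_1)
order_2 = sort_files_desc(file_set_2)
```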
[0070] Step 3: Based on the sorted file sets and the allocation number, allocate the files in the file sets to the several shards for the parallel processes.
[0071] In this step, the server can sequentially allocate files from the sorted file sets to various shards. To ensure even computation across multiple distributed tasks running in parallel, the proper splitting of files within the file sets is crucial. During document writing, files from the same file set are always written to disk from the same memory sub-region. Therefore, when allocating files to shards, priority is given to allocating files from the same file set. This allows a single parallel process to connect to as few cluster nodes as possible during data reading, thereby reducing connection overhead during the data reading process.
[0072] Specifically, the allocation process may include the following three steps:
[0073] Step 31: Based on the sorted file sets, allocate the files in the file sets to the shards in order. Each shard includes the allocation number of files, and the files in each shard come from the same file set.
[0074] In this step, the server can sequentially retrieve the allocation number of files from each of the sorted file sets. The server can then allocate these files to a shard. At this point, each shard contains the allocation number of files, and the files in each shard originate from the same file set. For example, as shown in Figure 3(c), the allocation number is 2. The server can allocate files 1 and 4 from file set 1 to shard 1 based on their order. The server can also allocate files 6 and 5 from file set 2 to shard 2. Since file set 3 does not contain enough files to fill a shard on its own, the server does not process the files in file set 3 in this step.
[0075] Step 32: When there are unallocated files in the file sets, allocate the unallocated files to the remaining shards according to the allocation number. Each shard includes the allocation number of files, and the files in each shard come from multiple file sets.
[0076] In this step, the server can retrieve unallocated files from the entire file set and reassemble them. The server can then allocate the reassembled files to the remaining shards. Each shard can contain several files. The files in each shard can come from multiple file sets.
[0077] In one example, the server can reorder the reassembled files based on their file size. This sorting can be from largest to smallest. The server can then distribute the sorted files across the remaining shards.
[0078] In another example, the server can sequentially retrieve unallocated files from various file sets. The server can then allocate these files into shards based on the order in which they were retrieved.
[0079] For example, as shown in Figure 3(d), the allocation number is 2, and the number of parallel processes is 3. The server can determine that files 2 and 3 remain in file set 1 and that file 7 remains in file set 3. One shard remains unfilled. The server can allocate files 2 and 7 to shard 3.
[0080] Step 33: When there are still unallocated files in the file sets, allocate each unallocated file according to the file set it belongs to and the file sets of the files already in each shard.
[0081] In this step, because the floor function is used when calculating the allocation number, several files may still be unallocated after the files in the file sets are allocated according to steps 31 and 32 above. In this case, the server can retrieve each unallocated file and its corresponding file set, referred to as the second file set. The server can also count the number of files from the second file set in each shard. The server can then allocate the unallocated file to the shard containing the largest number of files from the second file set. For example, as shown in Figure 3(e), the unallocated file is file 3, and the corresponding second file set is file set 1. Based on the sharding shown in Figure 3(d), shard 1 includes 2 files from file set 1, shard 2 includes 0 files from file set 1, and shard 3 includes 1 file from file set 1. Therefore, as shown in Figure 3(e), the server allocates file 3 to shard 1.
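Steps 31 to 33 can be combined into one Python sketch that reproduces the Figure 3 example. This is one possible reading of the description, not a definitive implementation. It assumes that in step 31 each file set fills at most one shard, that step 32 takes one unallocated file from each file set in turn (the second variant described above), and that ties in step 33 go to the lowest-numbered shard. The `allocate` function and the `FileRef` type are hypothetical names.

```python
from typing import Dict, List, Tuple

FileRef = Tuple[int, str]  # (file set id, file name)


def allocate(sorted_sets: Dict[int, List[str]], parallelism: int) -> List[List[FileRef]]:
    total = sum(len(files) for files in sorted_sets.values())
    n = total // parallelism  # allocation number (floor of the average)
    if n == 0:
        raise ValueError("fewer files than parallel processes")
    remaining = {sid: list(files) for sid, files in sorted_sets.items()}
    shards: List[List[FileRef]] = []

    # Step 31: a file set holding at least n files fills one shard on its own.
    for sid, files in remaining.items():
        if len(shards) < parallelism and len(files) >= n:
            shards.append([(sid, name) for name in files[:n]])
            del files[:n]

    # Step 32: fill each remaining shard with n files, taking one
    # unallocated file from each file set in turn.
    while len(shards) < parallelism:
        shard: List[FileRef] = []
        while len(shard) < n:
            for sid, files in remaining.items():
                if files and len(shard) < n:
                    shard.append((sid, files.pop(0)))
        shards.append(shard)

    # Step 33: each leftover file joins the shard that already holds the
    # most files from the same file set.
    for sid, files in remaining.items():
        for name in files:
            best = max(shards, key=lambda s: sum(1 for fs, _ in s if fs == sid))
            best.append((sid, name))
    return shards


# Sorted file sets of Figure 3(b), plus file set 3 with its single file.
figure_3 = {1: ["file1", "file4", "file2", "file3"], 2: ["file6", "file5"], 3: ["file7"]}
result = allocate(figure_3, parallelism=3)
```

With this input, shard 1 receives files 1, 4 and 3, shard 2 receives files 6 and 5, and shard 3 receives files 2 and 7, matching Figure 3(e).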
[0082] S103. Allocate a shard for each parallel process so that the parallel process can read documents from the database based on the metadata of the documents in the shard.
[0083] In this embodiment, after determining the number of shards for the parallel processes, the server can allocate one shard to each parallel process. The parallel process can then read the files within that shard after acquiring it. Each file can include metadata for multiple documents. This metadata contains data describing the document to be written.
[0084] In one example, the metadata may include the document number and the document address. The document address is the storage address of the document in the database. The server can directly retrieve the document from the database based on the document address. The document number is used to uniquely identify a document.
[0085] Once a parallel process acquires its shard, it can read the files within that shard and obtain multiple document addresses from those files. The parallel process can then use these document addresses, in reading order, to locate the corresponding documents in the database and retrieve them. During this process, the parallel process reads only one document at a time, achieving point-to-point data lookup. Therefore, if document A is updated before the parallel process starts reading it, the parallel process reads the updated document A. In other words, the parallel process obtains the version of each document that is current at the moment it is read, rather than the version that existed when the read task was determined.
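The effect of reading one document at a time by address can be sketched in Python as follows. The dictionary `database` stands in for the real database, and the names `read_shard`, `doc_no`, and `address` are assumptions made for this sketch.

```python
def read_shard(shard_files, database):
    """Yield (document number, document) pairs one at a time.

    shard_files: list of files, each a list of metadata dicts with
    "doc_no" and "address" keys. database: mapping address -> document.
    """
    for file_metadata in shard_files:
        for meta in file_metadata:
            # Point lookup at read time, so the current version is returned.
            yield meta["doc_no"], database[meta["address"]]


database = {"addr0": "A-v1", "addr1": "B-v1"}
shard = [[{"doc_no": "0000", "address": "addr0"},
          {"doc_no": "1000", "address": "addr1"}]]

reader = read_shard(shard, database)
first = next(reader)         # ("0000", "A-v1")
database["addr1"] = "B-v2"   # document updated before it is read
second = next(reader)        # ("1000", "B-v2"): the update is visible
```

Because each lookup happens only when the document is actually read, no snapshot of the data is fixed in advance.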
[0086] In one example, based on the execution principles of currently available distributed computing engines (Spark / Flink), the process of a server performing distributed reading and processing of data in Elasticsearch can be roughly divided into the following six steps:
[0087] Step 1: Initialize the distributed execution framework. This distributed framework may include a master control node (Controller) and multiple task execution nodes (TaskExecutors). The master control node can be the execution main server as described in the above embodiments. The task execution nodes can correspond to the parallel processes described in the above embodiments.
[0088] Step 2: The master node generates multiple shards based on the data to be read. Each shard is used to read a portion of the data to be read.
[0089] Step 3: The master node distributes the shards to the various task execution nodes. Each task execution node is responsible for executing one shard.
[0090] Step 4: After receiving the shard, the task execution node reads the data according to the information in the shard and executes the corresponding business logic calculation.
[0091] Step 5: After completing the task, each task execution node returns the execution result to the master node.
[0092] Step 6: After receiving the execution results returned by all task execution nodes, the master node completes the data processing operation.
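The six steps above can be sketched in Python as follows. This is only an illustration: a thread pool stands in for the task execution nodes, and the names `run_job` and `read_and_compute` are assumptions rather than part of Spark, Flink, or this application.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(shards, read_and_compute):
    """Controller side: hand one shard to each worker and collect the results.

    shards must be non-empty; read_and_compute is the per-shard business logic.
    """
    # Steps 1 and 3: start the workers and distribute one shard to each.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        # Steps 4 and 5: each worker processes its shard and returns a result.
        results = list(pool.map(read_and_compute, shards))
    # Step 6: all results have been received by the controller.
    return results


# Step 2 is represented here by three precomputed shards of numbers.
totals = run_job([[1, 2], [3], [4, 5]], sum)
# totals == [3, 3, 9]
```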
[0093] As can be seen from the above steps, the master node performs the shard calculation in step 2. The master node then allocates the shards to the task execution nodes, and the task execution nodes read and process the data according to their shards. In this application, the generation of the input format essentially replaces step 2 with a dynamic shard calculation method. The process of implementing the above steps in this application can be as shown in Figure 4. When the distributed program begins scanning data in Elasticsearch, the distributed controller can initialize the distributed runtime framework. The controller can retrieve the files (file.dat) that have been flushed to disk from the various file buckets (files). Each file can be up to 2MB in size. Based on the metadata in these file buckets, the controller initializes the input format, resulting in multiple input splits, that is, shards. Split 1 can include file 1 (file1.dat) and file 2 (file2.dat). Split 2 can include file 4 (file4.dat) and file 5 (file5.dat). Split 3 can include file 3 (file3.dat) and file 8 (file8.dat). Split 4 can include file 6 (file6.dat) and file 7 (file7.dat). The controller can then allocate these splits to the distributed task executors. The task execution nodes can acquire shard 1, shard 2, shard 3, and shard 4, respectively. Each task execution node can then retrieve the metadata stored within its shard and, based on this metadata, read the corresponding documents from the Elasticsearch database. The Elasticsearch database can contain multiple indexes, each representing a separate database. Each index can contain at least one data block. After reading the documents, the task execution nodes can perform the corresponding calculations according to predefined business logic.
[0094] According to the method of this embodiment, the specific process of step 2 above can be as shown in Figure 5. After the server begins constructing the input format, it can obtain the number of parallel processes. The server can obtain multiple file sets and sort the files in each file set in descending order of file size. The server can calculate the number of files to be allocated to each shard (the allocation number) based on the total number of files in the file sets and the number of parallel processes. The server determines whether the number of files in a single file set is greater than or equal to the allocation number. If so, the server extracts the allocation number of files from that file set to form a shard. Otherwise, the server can obtain files from other file sets to supplement that file set, thus forming a shard. The server can construct the shards for the parallel processes in this way. Each shard includes the allocation number of files. The server can then determine whether any unallocated files remain in the file sets. If so, the server selects the shard containing the largest number of files from the file set to which the remaining unallocated file belongs and allocates the file to that shard. The server then completes the construction of the input format.
[0095] The distributed data reading method provided in this application allows the server to retrieve multiple file sets from disk. Each file set may include at least one file. Each file includes metadata for at least one document. The server determines the allocation number, that is, the number of files allocated to each shard, based on the quotient of the total number of files in the file sets and the number of parallel processes. The server can distribute the files in the file sets to the shards of the parallel processes. Each shard includes at least the allocation number of files. After determining the shards for the parallel processes, the server can allocate one shard to each parallel process. After acquiring its shard, the parallel process can read the documents corresponding to the metadata in that shard. In this application, by retrieving the metadata of each document and storing the metadata on disk to generate file sets, the server can allocate shards based on these file sets, thereby balancing the workload of the parallel processes and improving data reading efficiency. Furthermore, since this application no longer relies on the number of Elasticsearch shards for sharding, the number of shards can be flexibly adjusted according to the number of parallel processes, improving the scalability of the distributed system. In addition, this application no longer relies on the Scroll snapshot mechanism for data scanning, avoiding the problem that data updated after snapshot generation cannot be read.
[0096] Figure 6 shows a flowchart of a distributed data reading method according to an embodiment of this application. When the server needs to read the database, it can implement distributed data reading as in the embodiments shown in Figures 2 to 5. Before executing the embodiments shown in Figures 2 to 5, as shown in Figure 6, with the server as the execution entity, the server can generate multiple file sets through the following steps. The method in this embodiment may include the following steps:
[0097] S201. Obtain the document to be written and its document number.
[0098] In this embodiment, the server can obtain write requests sent by external devices. The write request may include the document to be written and its metadata. The metadata includes at least the document number of the document to be written.
[0099] S202. Write the document number into the first memory sub-region according to the document number of the document to be written.
[0100] In this embodiment, the server may include a memory region. This memory region may include multiple memory sub-regions. For example, as shown in Figure 7, when the document number is an 8-digit hexadecimal value, this memory region can be used to store metadata corresponding to document numbers between 0x00000000 and 0xFFFFFFFF. The memory region can be further divided into 16 sub-regions based on the document number, with different sub-regions corresponding to different document number ranges. For example, the 16 memory sub-regions in Figure 7 can correspond to the 16 intervals [0x00000000, 0x10000000), [0x10000000, 0x20000000), ..., [0xF0000000, 0xFFFFFFFF]. The server can determine the interval corresponding to the document number of the document to be written, thereby determining the first memory sub-region corresponding to that document.
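For the 16 equal intervals listed above, each interval spans 0x10000000 (2^28) document numbers, so the sub-region index is simply the top hexadecimal digit of the document number. A minimal Python sketch, in which the function name `sub_region_for` and the zero-based index are assumptions:

```python
def sub_region_for(doc_no: int) -> int:
    """Map an 8-digit hexadecimal document number to one of 16 sub-regions.

    Sub-region i (zero-based) covers [i * 0x10000000, (i + 1) * 0x10000000).
    """
    if not 0 <= doc_no <= 0xFFFFFFFF:
        raise ValueError("document number must fit in 8 hexadecimal digits")
    return doc_no >> 28  # keep only the most significant hexadecimal digit


# sub_region_for(0x0FFFFFFF) == 0
# sub_region_for(0x10000000) == 1
# sub_region_for(0xFFFFFFFF) == 15
```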
[0101] However, directly writing data based on document numbers can easily lead to uneven data distribution. For example, the range [0x00000000, 0x10000000) may contain a large amount of data, while the range [0xF0000000, 0xFFFFFFFF] contains very little. To address this issue, this application can hash the document number after acquiring it, converting it into a hash value. The server can then use this hash value to write the metadata of the document to be written to the memory region. With this hash algorithm, the metadata is distributed more evenly across the memory sub-regions, which helps keep the data size of each shard similar in the later stages.
[0102] In one example, the server can be configured with a preset hash algorithm. This preset hash algorithm converts document numbers into a 32-bit hash value. When the number of documents reaches a certain scale, the metadata of the documents will be evenly distributed between 0x00000000 and 0xFFFFFFFF. The server can then divide the numerical range corresponding to this 32-bit hash value into the 16 memory sub-regions shown in Figure 7. Each sub-region corresponds to a range of hash values. Alternatively, the preset hash algorithm can convert document numbers into hash values from 1 to 32. Each of the 16 memory sub-regions shown in Figure 7 corresponds to a hash value. The process by which the server determines the first memory sub-region based on the hash value of the document number may specifically include:
[0103] Step 1: The server uses a preset hash algorithm to calculate the document hash value of the document number.
[0104] Step 2: The server determines, based on the document hash value, the memory sub-region corresponding to that hash value as the first memory sub-region.
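The two steps above can be sketched in Python as follows. The application does not name the preset hash algorithm; CRC-32 from the standard `zlib` module is used here purely as a stand-in that yields a 32-bit value, and the function name `first_sub_region` is likewise an assumption.

```python
import zlib


def first_sub_region(doc_no: str, sub_regions: int = 16) -> int:
    """Return the zero-based index of the sub-region for a document number."""
    # Step 1: 32-bit hash of the document number (CRC-32 is a stand-in).
    doc_hash = zlib.crc32(doc_no.encode("utf-8"))
    # Step 2: split the 32-bit hash range into equal intervals, one per sub-region.
    return (doc_hash * sub_regions) >> 32


index = first_sub_region("0000")  # always in range(16), stable across calls
```

With 16 sub-regions this is equivalent to taking the top 4 bits of the hash, so consecutive document numbers no longer land in the same sub-region by construction.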
[0105] In one example, the memory sub-region is essentially a byte array. The default size of this byte array is 2MB. Alternatively, the user can configure the size of this byte array through the server. Each byte array can store multiple data objects (KeyData). Each data object represents a single piece of written metadata. Before being written to the byte array of the memory sub-region, the metadata is split into bytes according to the structure of the data object. Specifically, the data object can include five fields: offset, length, keyLength, keyData, and oprType. The offset field records the offset of the data object within the byte array and occupies 4 bytes. Setting this offset facilitates subsequent parsing of the data content. The length field records the total length of the data object after it has been split into bytes and occupies 2 bytes. Setting this length facilitates subsequent parsing of the disk metadata file content during scanning. The keyLength field indicates the byte length of the specific metadata content and occupies 2 bytes. The keyData field stores the specific content of the metadata. The keyData content can be 32 bytes long, which is used to store 32-bit file hash values. To improve the scalability of this data object structure, the length of the metadata content can also be specified through the `keyLength` field. The server can determine the actual read range of the `keyData` data based on `keyLength`. This allows the server to dynamically adjust the `keyData` content, avoiding problems such as insufficient or wasted storage space caused by fixed-length storage. The oprType field records the operation type, which takes one of two values: 0 represents inserting or updating Elasticsearch data, and 1 represents deleting Elasticsearch data. This field occupies 1 byte. Using this field can prevent invalid data access requests during subsequent scanning.
The specific structure of this memory sub-region is as shown in Figure 8. For example, as shown in Figure 9, this memory region can include three memory sub-regions. Each memory sub-region can include a byte array. Specifically, the byte array in memory sub-region 1 can include one data object. This data object has an offset of 0, a length of 13, a metadata length of 4, metadata content of "0000", and an operation type of 1. The byte array in memory sub-region 3 can include two data objects. The first data object has an offset of 0, a length of 13, a metadata length of 4, metadata content of "1000", and an operation type of 1. The second data object has an offset of 13, a length of 13, a metadata length of 4, metadata content of "2001", and an operation type of 1.
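The data object layout can be sketched with Python's `struct` module. The sketch assumes that the 2-byte total-length field is the one named `length` and that fields are stored big-endian; the byte order and the function names are assumptions. Under this layout a 4-byte `keyData` gives 4 + 2 + 2 + 4 + 1 = 13 bytes, which matches the lengths in the Figure 9 example.

```python
import struct

# offset (4 bytes), length (2 bytes), keyLength (2 bytes); big-endian is assumed.
HEADER = struct.Struct(">IHH")


def encode_key_data(offset: int, key_data: bytes, opr_type: int) -> bytes:
    """Serialize one data object: header, keyData, then 1-byte oprType."""
    length = HEADER.size + len(key_data) + 1
    return HEADER.pack(offset, length, len(key_data)) + key_data + bytes([opr_type])


def decode_all(buffer: bytes):
    """Parse every data object in a byte array, using length to step forward."""
    objects, pos = [], 0
    while pos < len(buffer):
        offset, length, key_length = HEADER.unpack_from(buffer, pos)
        start = pos + HEADER.size
        key_data = buffer[start:start + key_length]
        opr_type = buffer[start + key_length]
        objects.append((offset, length, key_data, opr_type))
        pos += length
    return objects


# Memory sub-region 3 of Figure 9: two data objects, at offsets 0 and 13.
array = encode_key_data(0, b"1000", 1) + encode_key_data(13, b"2001", 1)
parsed = decode_all(array)
# parsed == [(0, 13, b"1000", 1), (13, 13, b"2001", 1)]
```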
[0106] In one example, the server needs to simultaneously record the metadata corresponding to the document to be written into the memory area and write the document to the Elasticsearch database. To ensure data consistency, the writing of metadata and the writing of the document to be written must both succeed or fail simultaneously. The server can accomplish this task by performing these two operations within the same transaction.
[0107] The specific steps for writing the document to be written to the database may include:
[0108] Step 1: Write the document to be written into the data block of the database.
[0109] In this step, the server can write the document to be written into a data block of the database according to the database's write order. Alternatively, the server can select a database and / or a data block based on information such as the document ID and document hash value of the document to be written. The server can then write the document to be written into the data block of that database.
[0110] Step 2: Obtain the document address of the document to be written in the database.
[0111] In this step, the server can obtain the address of the document to be written.
[0112] Step 3: Write the document address into the metadata corresponding to the document to be written.
[0113] In this step, the server can write the document address to the metadata. The server can write the document address to the metadata before writing it to the memory area. Alternatively, the server can write the document address to the corresponding metadata content field in the memory area after the metadata has been written to the memory area.
[0114] To ensure that the writing of the metadata and the writing of the document to be written both succeed or both fail, the execution steps of this data writing process can be as shown in Figure 10. The server can obtain a write request, which includes the document to be written and its metadata. The server can write this metadata to the memory area. If this write fails, the server can determine whether to retry. If a retry is required, the server can rewrite the metadata by restarting the write transaction. Otherwise, the server can determine that the write has failed and terminate the write transaction. If the write succeeds, the server determines whether the memory area needs to be persisted, that is, whether the metadata in the memory area needs to be flushed to disk. When the metadata needs to be persisted, the server can create a snapshot of the memory sub-region containing the metadata and generate a flush flag. When the metadata does not need to be persisted, or after the snapshot of the memory sub-region has been created, the server can write the document to be written to the database and determine whether that write succeeded. If the document is written successfully, the server can determine whether the flush flag exists. If the flush flag exists, the server can flush the metadata to disk based on the snapshot. If the document fails to be written, the server determines whether to rewrite it. If so, the server rewrites the document and again checks whether the write succeeded. Otherwise, the server can delete the current write transaction. After the server flushes the snapshot to disk, the server can check whether the flush was successful.
If the flush was successful, the write transaction ends. Otherwise, the server re-flushes the snapshot to disk until the flush succeeds. When the server determines that the write transaction needs to be deleted, the server checks whether the deletion was successful. If the deletion was successful, the write is deemed to have failed and the write transaction ends. Otherwise, the server continues to perform the deletion until the write transaction is deleted.
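A much simplified Python sketch of the "both succeed or both fail" behaviour is given below. It omits the snapshot and flush-flag handling, models the memory area as a dictionary, and treats an `OSError` from the database write as a failed write. The names `write_with_metadata`, `database_write`, and `max_attempts` are all assumptions made for this sketch.

```python
def write_with_metadata(memory_area, database_write, doc_no, document, max_attempts=3):
    """Record metadata and write the document so that both succeed or both fail.

    database_write(doc_no, document) returns the document address or raises OSError.
    Returns True on success and False after the write transaction is deleted.
    """
    # Write the metadata to the memory area first (address not yet known).
    memory_area[doc_no] = {"doc_no": doc_no, "address": None}
    for _ in range(max_attempts):
        try:
            address = database_write(doc_no, document)
        except OSError:
            continue  # decide to rewrite the document
        # Document written: store its address in the metadata.
        memory_area[doc_no]["address"] = address
        return True
    # Giving up: delete the transaction so no orphaned metadata remains.
    del memory_area[doc_no]
    return False
```

For example, a `database_write` that always raises `OSError` leaves `memory_area` unchanged and returns `False`, so the metadata never points at a document that was not stored.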
[0115] For example, as shown in Figure 11, the system can include four documents to be written. The document numbers of these four documents can be 0000, 1000, 1001, and 2000, respectively. The server can write the metadata of these four documents to a memory area. The server can also write these four documents to a storage area. The server can write the metadata of the memory area to disk. Based on the document numbers of the four documents, the metadata can be written to three files. File 1 includes the document with document number 0000. File 2 includes the documents with document numbers 1000 and 1001. File 3 includes the document with document number 2000. The server can allocate these three files to three shards. These three shards constitute the input format. The server can allocate these three shards to three parallel processes. These three parallel processes can read the corresponding documents from the storage area according to the shards.
[0116] S203. When the first memory sub-region meets the write conditions, generate the first file according to all the document numbers in the first memory sub-region. The first file is stored in the first file set on the disk.
[0117] In this embodiment, when the first memory sub-region meets the write conditions, the server triggers the write operation to disk for that sub-region. When the write operation is triggered, the server can create a new byte array in that memory sub-region. When new metadata needs to be written to that memory sub-region, the server can write that metadata to the new byte array. Simultaneously, the server can also generate a snapshot of that byte array in that memory sub-region. Based on this snapshot, the server writes the byte array to disk, generating a file. After the write operation is complete, the server can delete the byte snapshot from the existing sub-region. The server can repeatedly write in this way to write all the metadata within the range corresponding to that memory sub-region to disk, generating multiple files. These files are then aggregated to form a file set corresponding to that memory sub-region.
[0118] In one example, as metadata is written, the byte array of a certain memory sub-region will become full. When the byte array of a certain memory sub-region is full, the server can determine that the flush condition for that memory sub-region is met. At this time, the server can flush the byte array in that memory sub-region to disk. This flush operation ensures that there is enough remaining space in that memory sub-region to write metadata.
[0119] In another example, the server starts timing after completing a flush of a memory sub-region. When the timer reaches a first preset duration, the server can determine that the time interval between the previous flush and the current moment has reached the first preset duration. At this point, the server can flush the byte array in that memory sub-region to disk. This flush operation prevents metadata from staying in memory for a long time when the memory sub-region takes a long time to fill.
[0120] In one example, the server can generate a first file based on all document numbers in a first memory sub-region and write the first file to disk. For example, the process of the server flushing the byte array in the memory sub-region to disk can be as shown in Figure 12. Figure 12(a) shows three memory sub-regions. Memory sub-region 1 contains document numbers in the range [0000, 1000). The byte array of memory sub-region 1 stores one data object with document number 0000. Memory sub-region 2 contains document numbers in the range [1000, 2000). The byte array of memory sub-region 2 stores two data objects with document numbers 1000 and 1001 respectively. Memory sub-region 3 contains document numbers in the range [2000, 3000). The byte array of memory sub-region 3 stores one data object with document number 2000. Assume that, for each of these three memory sub-regions, the time interval between the current moment and the last write operation reaches the first preset duration. At this time, the server simultaneously performs a write operation on all three memory sub-regions. The specific steps of the server's write operation may include:
[0121] Step 1: Generate a memory snapshot and a new first memory sub-region based on all document numbers in the first memory sub-region.
[0122] In this step, the server can generate a memory snapshot of all document numbers in the first memory sub-region. After generating the memory snapshot, the server can generate a new byte array in that memory sub-region. Subsequent data stored in this first memory sub-region will be stored in this new byte array. Generating the memory snapshot effectively avoids data conflicts caused by the need to store data during the flush process. For example, as shown in Figure 12(b), the server can generate a new byte array in each of the three memory sub-regions. This new byte array is empty. When a new document to be written is received, the server can write the metadata corresponding to that document into this new byte array.
[0123] Step 2: Generate the filename of the first file according to the preset naming rules and the current timestamp.
[0124] In this step, since memory snapshots of the same memory sub-region are all written to the same file set, the server can generate filenames based on preset naming rules and timestamps when creating files. Within each file set, the timestamp of each file is unique. Therefore, the server can use this naming method to avoid duplicate filenames within a file set, thus preventing file content from being overwritten due to identical filenames. At the same time, this filename setting also facilitates technical personnel in finding and retrieving the corresponding files.
[0125] The preset naming rules can be set in a configuration file or configuration information. Alternatively, the preset naming rules can also be set directly in the code.
[0126] This preset naming rule can be used to instruct the server to obtain fixed strings such as the memory address corresponding to the first memory sub-region, the identifier string corresponding to the first memory sub-region, and a preset string. Alternatively, this preset naming rule can also be used to instruct the server to generate a random string within a preset range. The server can also determine the concatenation method between the string and the timestamp based on this preset naming rule. For example, as shown in Figure 12(c), the string and the timestamp can be connected by a "-".
[0127] Step 3: Generate the first file based on the memory snapshot and write the first file into the first file set.
[0128] In this step, the server can generate the first file based on the memory snapshot and store it in the storage space corresponding to the first file set on the disk. After the memory snapshot is flushed, the server can delete the memory snapshot from the memory sub-region. For example, as shown in Figure 12(c), during the flushing phase, the server can flush memory snapshots to the corresponding file sets based on the memory sub-regions. A memory snapshot can be flushed to one file. Specifically, the memory snapshot of memory sub-region 1 can be flushed to file set 1, generating file 1. The memory snapshot of memory sub-region 2 can be flushed to file set 2, generating file 2. The memory snapshot of memory sub-region 3 can be flushed to file set 3, generating file 3.
[0129] Specifically, the flushing process for this memory sub-region is as shown in Figure 13. The server writes metadata to a memory sub-region. The server can generate a data object corresponding to this metadata. The server can determine the memory space required to write the data object. The server can obtain the remaining capacity of the byte array. The server can compare the remaining capacity with the memory space required by the data object. If the remaining capacity is sufficient, the server can write the data object to the byte array. Otherwise, the byte array is full, and the server can generate a snapshot of the current byte array and record a flush flag. After the snapshot is created, the server can create a new byte array and write the data object to the new byte array. Alternatively, the server can trigger a flush operation periodically according to a scheduled task. When the flush operation is triggered, the server can generate a snapshot based on the byte array in the memory sub-region and record a flush flag. The server can determine whether a flush flag exists. If a flush flag exists, the server can generate a file based on the memory snapshot. The server can write this file to a file set on disk.
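The capacity check, snapshot, and file naming described above can be sketched in Python as follows. The class name `MemorySubRegion`, the `.dat` suffix combined with a "name-timestamp" filename, and the injectable `clock` parameter are assumptions made for this sketch. The flush flag is folded into a direct call to `flush`, and a periodic task would simply call `flush` on a timer.

```python
import os
import time


class MemorySubRegion:
    """One memory sub-region backed by a byte array (default capacity 2MB)."""

    def __init__(self, name, file_set_dir, capacity=2 * 1024 * 1024, clock=time.time_ns):
        self.name = name
        self.file_set_dir = file_set_dir  # directory holding this file set
        self.capacity = capacity
        self.clock = clock                # supplies the filename timestamp
        self.buffer = bytearray()
        self.flushed_files = []

    def write(self, data_object: bytes) -> None:
        # Flush first when the remaining capacity is insufficient.
        if len(self.buffer) + len(data_object) > self.capacity:
            self.flush()
        self.buffer += data_object

    def flush(self):
        """Snapshot the byte array, start a new one, write the snapshot to disk."""
        if not self.buffer:
            return None
        snapshot, self.buffer = bytes(self.buffer), bytearray()
        path = os.path.join(self.file_set_dir, f"{self.name}-{self.clock()}.dat")
        with open(path, "wb") as handle:
            handle.write(snapshot)
        self.flushed_files.append(path)
        return path
```

Swapping in the new byte array before the file is written means that metadata arriving during the flush goes into the new array and does not conflict with the snapshot.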
[0130] The distributed data reading method provided in this application allows the server to receive write requests sent by external devices. The server may include a memory region, which may include multiple memory sub-regions. Different memory sub-regions may correspond to different document number ranges. The server can determine the range corresponding to the document number of the document to be written, thereby determining the first memory sub-region corresponding to the document to be written. The server can write the metadata of the document to be written into the first memory sub-region. When the first memory sub-region meets the flushing conditions, the server triggers the operation of flushing the memory sub-region to disk. The server can generate a first file based on all document numbers in the first memory sub-region. This first file can be stored in a first file set on the disk. In this application, generating multiple file sets on the disk organizes the documents to be written, allowing the server to divide the data into shards based on these file sets and improving the flexibility of subsequent shard division. Furthermore, in this application, the file sets include only the metadata of each document, resulting in a smaller data volume and effectively improving shard generation efficiency. In addition, the metadata may also include the document address of each document. After sharding is completed, the server can read each document by point-to-point query based on the metadata in the shard, improving the real-time performance of document reading.
[0131] Figure 14 shows a schematic diagram of the structure of a distributed data reading device according to an embodiment of the present application. As shown in Figure 14, the distributed data reading device 10 of this embodiment is used to implement the operations corresponding to the server in any of the above method embodiments. The distributed data reading device 10 of this embodiment includes:
[0132] The acquisition module 11 is used to acquire multiple file sets, each file set may include at least one file, and each file includes the metadata of at least one document;
[0133] The reading module 12 is used to generate several shards for parallel processes based on multiple file sets and the number of parallel processes, with each shard containing at least one file; and to allocate one shard to each parallel process so that the parallel process can read documents from the database based on the metadata of the documents in the shard.
[0134] In one example, module 12 is used specifically for:
[0135] The allocation number is determined based on the total number of files in multiple file sets and the number of parallel processes;
[0136] Sort the files in each file set according to their file size;
[0137] Based on the sorted file sets and the allocation number, the files in the file sets are allocated to several shards for the parallel processes.
[0138] In one example, module 12 is used specifically for:
[0139] Based on the sorted file sets, the files in the file sets are sequentially allocated to the shards. Each shard contains several allocated files, and the files in each shard come from the same file set.
[0140] When there are unallocated files in the file sets, the unallocated files are allocated to the remaining shards according to the allocation number. Each shard includes the allocation number of files, and the files in each shard come from multiple file sets.
[0141] When there are still unallocated files in the file sets, each unallocated file is allocated according to the file set it belongs to and the file sets that the files in each shard come from.
[0142] In one example, module 11 is used specifically for:
[0143] Retrieve the document to be written and its metadata, which must include at least the document number.
[0144] Write the document number into the first memory sub-region based on the document number of the document to be written;
[0145] When the first memory sub-region meets the write conditions, the first file is generated based on all the document numbers in the first memory sub-region, and the first file is stored in the first file set on the disk.
[0146] In one example, the memory region includes multiple memory sub-regions, and module 11 is specifically used for:
[0147] Use a preset hash algorithm to calculate the document hash value of the document number;
[0148] The first memory sub-region is determined based on the document hash value.
[0149] In one example, the write conditions include the first memory sub-region being full and / or the time interval between the last write and the current time reaching a first preset duration.
[0150] In one example, module 11 is also used for:
[0151] Write the document to be written into the data block of the database;
[0152] Retrieve the document address of the document to be written in the database;
[0153] Write the document address into the metadata corresponding to the document to be written.
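The three steps above can be sketched with a toy in-memory store. `InMemoryDatabase`, the `(block index, offset)` address format, and the `doc_address` metadata key are all hypothetical stand-ins; a real deployment would use the address form of the underlying database.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryDatabase:
    """Toy database that stores documents in fixed-capacity data blocks."""

    block_capacity: int = 2
    blocks: list = field(default_factory=lambda: [[]])

    def write(self, document):
        """Append the document to the current block and return its address."""
        if len(self.blocks[-1]) >= self.block_capacity:
            self.blocks.append([])  # current block is full, open a new one
        block = self.blocks[-1]
        block.append(document)
        return (len(self.blocks) - 1, len(block) - 1)

    def read(self, address):
        block_index, offset = address
        return self.blocks[block_index][offset]


def store_document(database, document, metadata):
    """Write the document, then record its address in the metadata."""
    metadata["doc_address"] = database.write(document)
    return metadata
```

Once the address is recorded in the metadata, a reader that holds only the metadata file can fetch the document directly with `database.read(metadata["doc_address"])`, without scanning the data blocks.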
[0154] The distributed data reading device 10 provided in this application embodiment can execute the above method embodiment. Its specific implementation principle and technical effect can be found in the above method embodiment, and will not be repeated here.
[0155] Figure 15 shows a schematic diagram of the hardware structure of a server provided in an embodiment of this application. As shown in Figure 15, the server 20 is used to implement the operations corresponding to the server in any of the above method embodiments. The server 20 in this embodiment may include: a memory 21, a processor 22, and a communication interface 24.
[0156] The server 20 can be the master node in a distributed cluster. The server 20 can control the task execution nodes (other servers) in the distributed cluster to achieve distributed data reading, distributed data processing, and other operations. The data in this application is also stored in a distributed database. Each database can correspond to an index in the server 20. Each database can include multiple data blocks. In this distributed cluster, the data storage node used to store data and the task execution node used to perform operations can correspond to different servers. Alternatively, there can be some overlap between the data storage node used to store data and the task execution node used to perform operations. That is, a single server can serve as both a data storage node and a task execution node.
[0157] Each node in this distributed cluster includes a memory. This memory may include high-speed random access memory (RAM) and may also include non-volatile memory (NVM), such as at least one disk storage device; it may also be a USB flash drive, an external hard drive, a read-only memory, a magnetic disk, or an optical disc. The memory 21 in server 20 can also be used to store computer programs.
[0158] Each node in this distributed cluster includes a processor. This processor can be a Central Processing Unit (CPU) or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), etc. The general-purpose processor can be a microprocessor or any conventional processor. The processor 22 in server 20 executes the computer program stored in the memory 21 to implement the distributed data reading method described in the above embodiments. The steps of the method disclosed in this application can be executed directly by a hardware processor, or by a combination of hardware and software modules within the processor.
[0159] Optionally, the memory in each node can be either independent or integrated with the processor. When the memory is a device independent of the processor, the node may also include a bus. This bus is used to connect the memory and the processor. This bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, etc. Buses can be categorized as address buses, data buses, control buses, etc. For ease of illustration, the buses shown in the accompanying drawings are not limited to a single bus or a single type of bus.
[0160] Each node in this distributed cluster includes a communication interface. This communication interface is used to enable communication between the nodes in the distributed cluster. Specifically, communication interface 24 in server 20 is used to send shard information to each task execution node, so that each task execution node can read the corresponding document from the data storage node based on the shard information. Communication interface 24 in server 20 can also be used to connect to external devices, obtain query requests sent by external devices, and return query results to external devices.
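As a rough, single-machine illustration of this dispatch, the sketch below hands one shard to each worker and has the worker resolve documents by address. Threads stand in for the task execution nodes and plain dictionaries stand in for the metadata files and the data storage node; these substitutions, and the names `read_shard` and `dispatch`, are assumptions made for the example rather than part of the described server.

```python
from concurrent.futures import ThreadPoolExecutor


def read_shard(shard, metadata_files, database):
    """Read every document referenced by the metadata files of one shard."""
    documents = []
    for file_name in shard:
        for metadata in metadata_files[file_name]:
            # The document address stored in the metadata locates the document.
            documents.append(database[metadata["doc_address"]])
    return documents


def dispatch(shards, metadata_files, database):
    """Send one shard to each worker and collect results in shard order."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        return list(pool.map(lambda s: read_shard(s, metadata_files, database), shards))
```

In a real cluster the call to `pool.map` would be replaced by sending the shard information over the communication interface, and each task execution node would run the equivalent of `read_shard` against the data storage node.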
[0161] The server provided in this embodiment can be used to execute the above-described distributed data reading method. Its implementation and technical effects are similar, and will not be described again here.
[0162] This application also provides a computer-readable storage medium storing a computer program, which, when executed by a processor, is used to implement the methods provided in the various embodiments described above.
[0163] The computer-readable storage medium can be a computer storage medium or a communication medium. A communication medium includes any medium that facilitates the transfer of a computer program from one location to another. A computer storage medium can be any available medium accessible to a general-purpose or special-purpose computer. For example, a computer-readable storage medium is coupled to a processor, enabling the processor to read information from and write information to the computer-readable storage medium. Of course, the computer-readable storage medium can also be a component of the processor. The processor and the computer-readable storage medium can reside in an Application Specific Integrated Circuit (ASIC). Alternatively, the ASIC can reside in a user equipment. Of course, the processor and the computer-readable storage medium can also exist as discrete components in a communication device.
[0164] Specifically, the computer-readable storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as Static Random-Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk. The storage medium can be any available medium accessible to general-purpose or special-purpose computers.
[0165] This application also provides a computer program product comprising a computer program stored in a computer-readable storage medium. At least one processor of the device can read the computer program from the computer-readable storage medium, and the at least one processor executes the computer program to cause the device to implement the methods provided in the various embodiments described above.
[0166] This application also provides a chip including a memory and a processor. The memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that a device with the chip installed performs the methods described in the various possible implementations above.
[0167] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For instance, the division of modules is only a logical functional division, and other divisions are possible in actual implementation: multiple modules may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or modules, and may be electrical, mechanical, or of another form.
[0168] The modules can be physically separate, for example, installed in different locations within a single device, installed on different devices, distributed across multiple network units, or distributed across multiple processors. Alternatively, the modules can be integrated, for example, installed in the same device, or integrated into a single codebase. The modules can exist in hardware form, software form, or a combination of both. Some or all of the modules can be selected according to actual needs to achieve the objectives of this embodiment.
[0169] When the various modules are implemented as integrated software functional modules, they can be stored in a computer-readable storage medium. The aforementioned software functional modules, stored in a storage medium, include several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor to execute some steps of the methods of the various embodiments of this application.
[0170] It should be understood that although the steps in the flowcharts of the above embodiments are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some of the steps in the figures may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times, and their execution order is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the sub-steps or stages of other steps.
[0171] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application, and not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features. These modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of this application.
Claims
1. A distributed data reading method, characterized in that the method includes: obtaining multiple file sets, each file set including at least one metadata file, each metadata file including the metadata of at least one document, and the metadata including a document address; determining an allocation number based on the total number of files in the multiple file sets and the number of parallel processes; sorting the metadata files in each file set according to file size; allocating the metadata files in the file sets to several shards of the parallel processes according to the sorted file sets and the allocation number, as follows: based on the sorted file sets, sequentially allocating the metadata files in the file sets to the shards, each shard including several allocated metadata files, and the metadata files in each shard coming from the same file set; when there are unallocated metadata files in the file sets, allocating the unallocated metadata files to the remaining shards according to the allocation number, each shard including several allocated metadata files, and the metadata files in each shard coming from multiple file sets; when there are unallocated metadata files in the file sets, allocating, based on the file set corresponding to the unallocated metadata files and the file set corresponding to the metadata files in each shard, the unallocated metadata files to the shard containing the largest number of metadata files from the file set corresponding to the unallocated metadata files, each shard including at least one metadata file; and allocating one shard to each parallel process so that the parallel process reads the shard, obtains the metadata of the documents in each metadata file from the shard, reads the documents from the database according to the document addresses in the metadata, and executes business logic according to the documents.
2. The method according to claim 1, characterized in that obtaining the multiple file sets specifically includes: obtaining a document to be written and its metadata, wherein the metadata includes at least a document number; writing the document number into a first memory sub-region according to the document number of the document to be written; and when the first memory sub-region meets the write conditions, generating a first metadata file based on all document numbers in the first memory sub-region and storing the first metadata file in a first file set on the disk.
3. The method according to claim 2, characterized in that the memory region includes multiple memory sub-regions, and writing the document number into the first memory sub-region according to the document number of the document to be written specifically includes: calculating a document hash value of the document number using a preset hash algorithm; and determining the first memory sub-region based on the document hash value and writing the document number into the first memory sub-region.
4. The method according to claim 2, characterized in that generating the first metadata file based on all document numbers in the first memory sub-region and storing the first metadata file in the first file set on the disk specifically includes: generating a memory snapshot and a new first memory sub-region based on all document numbers in the first memory sub-region; generating the filename of the first metadata file according to preset naming rules and the current timestamp; and generating the first metadata file based on the memory snapshot and writing it into the first file set.
5. The method according to claim 2, characterized in that the method further includes: writing the document to be written into a data block in the database; obtaining the document address of the document to be written in the database; and writing the document address into the metadata corresponding to the document to be written.
6. A distributed data reading device, characterized in that the device includes: an acquisition module, used to acquire multiple file sets, each file set including at least one metadata file, each metadata file including the metadata of at least one document, and the metadata including a document address; and a reading module, configured to: determine the allocation number based on the total number of files in the multiple file sets and the number of parallel processes; sort the metadata files in each file set according to their file size; allocate the metadata files in each file set to several shards of the parallel processes according to the sorted file sets and the allocation number, with each shard including at least one metadata file; and allocate one shard to each parallel process so that the parallel process can read the shard, obtain the metadata of the documents in each metadata file from the shard, and read the documents from the database according to the document addresses in the metadata; wherein the reading module is specifically configured to: sequentially allocate the metadata files in each of the sorted file sets to the shards, wherein each shard includes several allocated metadata files, and the metadata files in each shard come from the same file set; when there are unallocated metadata files in the file sets, allocate the unallocated metadata files to the remaining shards according to the allocation number, wherein each shard includes several allocated metadata files, and the metadata files in each shard come from multiple file sets; and when there are unallocated metadata files in the file sets, allocate the unallocated metadata files to the shard containing the largest number of metadata files from the file set corresponding to the unallocated metadata files, according to the file set corresponding to the unallocated metadata files and the file sets corresponding to the metadata files in each shard; and wherein the device is also used to execute business logic based on the documents.
7. A server, characterized in that the server includes a memory and a processor; the memory is used to store a computer program; and the processor is used to implement the distributed data reading method as described in any one of claims 1 to 5 according to the computer program stored in the memory.
8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the distributed data reading method as described in any one of claims 1 to 5.