Data processing method and device, electronic equipment and storage medium
By optimizing the data access process using hash operations and memory pools on GPU servers, the problem of long processing time for sample data access requests in embodied intelligence models is solved, achieving more efficient data processing.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- JD DIGITS HAIYI INFORMATION TECHNOLOGY CO LTD
- Filing Date
- 2026-02-14
- Publication Date
- 2026-06-30
AI Technical Summary
When an embodied intelligence model is deployed in a thousand-GPU (kilo-card) cluster, each sample data access request takes a long time to process, so requests queue up and data processing efficiency drops.
By using the CPU in the GPU server, a hash operation strategy is employed to select the target reading process from multiple candidate reading processes, and sample data access requests are processed in parallel. Combined with a memory pool and a prefetching mechanism, the data access process is optimized.
The method improves the processing speed of sample data access requests, shortens their queuing time, and improves data processing efficiency.
Figure CN122309508A_ABST
Description
Technical Field
[0001] This application relates to the field of data processing technology, and in particular to data processing methods, apparatus, electronic devices and storage media.
Background Technology
[0002] Currently, embodied intelligence models are deployed in thousand-GPU-scale (kilo-card) clusters and involve hundreds of millions of sample data points. The sample data is stored and managed using the Hugging Face LeRobot v2.1 format. This format follows a "single episode, single file" design, where each episode has a small data volume and is stored independently.
[0003] Training embodied intelligence models requires a large amount of sample data, which in turn requires the data reading process to handle a large number of data access requests. The processing time for each individual data access request is relatively long, leading to queuing of data access requests and reducing data processing efficiency.
Summary of the Invention
[0004] This application discloses a data processing method, apparatus, electronic device, and storage medium.
[0005] One embodiment of this application proposes a data processing method applied to a central processing unit (CPU) in a GPU server. The method includes: acquiring a sample data access request, the sample data access request including a target data index; selecting a target reading process from multiple candidate reading processes according to the target data index and a hash operation strategy; reading target sample data corresponding to the target data index from memory through the target reading process; and sending the target sample data to the requester of the sample data access request.
[0006] The data processing method of this application embodiment involves a central processing unit (CPU) in a GPU server acquiring a sample data access request, which includes a target data index. Based on the target data index and a hash operation strategy, a target reading process is selected from multiple candidate reading processes. The target reading process reads the target sample data corresponding to the target data index from memory. The target sample data is then sent to the requester of the sample data access request. The method utilizes multiple candidate reading processes to process different sample data access requests in parallel, which improves the processing speed of sample data access requests, shortens the queuing time of sample data access requests, and enhances data processing efficiency.
[0007] Another embodiment of this application proposes a data processing apparatus applied to a central processing unit (CPU) in a GPU server. The apparatus includes: a first acquisition module for acquiring a sample data access request, the sample data access request including a target data index; a selection module for selecting a target reading process from multiple candidate reading processes based on the target data index and a hash operation strategy; a first reading module for reading target sample data corresponding to the target data index from memory through the target reading process; and a sending module for sending the target sample data to the requester of the sample data access request.
[0008] The data processing apparatus of this application embodiment includes a central processing unit (CPU) in a GPU server. The CPU acquires a sample data access request, which includes a target data index. Based on the target data index and a hash operation strategy, it selects a target reading process from multiple candidate reading processes. The target reading process then reads the target sample data corresponding to the target data index from memory. Finally, it sends the target sample data to the requester of the sample data access request. The use of multiple candidate reading processes to process different sample data access requests in parallel can improve the processing speed of sample data access requests, shorten the queuing time of sample data access requests, and improve data processing efficiency.
[0009] Another embodiment of this application proposes an electronic device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the data processing method of the embodiments of this application.
[0010] Another embodiment of this application proposes a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the data processing method disclosed in the embodiments of this application.
[0011] Another embodiment of this application proposes a computer program product, wherein instructions in the computer program product, when executed by a processor, implement the data processing method of the embodiments of this application.
[0012] Other effects of the above-mentioned alternative methods will be described below in conjunction with specific embodiments.
Attached Figure Description
[0013] The accompanying drawings are provided for a better understanding of this solution and do not constitute a limitation of this application. Wherein:
[0014] Figure 1 is a schematic flowchart of a data processing method according to an embodiment of this application.
[0015] Figure 2 is a schematic flowchart of a data processing method according to another embodiment of this application.
[0016] Figure 3 is a schematic flowchart of a data processing method according to another embodiment of this application.
[0017] Figure 4 is a schematic diagram illustrating the process by which a candidate reading process reads sample data.
[0018] Figure 5 is a schematic diagram of the data processing framework.
[0019] Figure 6 is a schematic diagram of the structure of a data processing apparatus according to an embodiment of this application.
[0020] Figure 7 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Implementation
[0021] Embodiments of the present invention are described in detail below. Examples of these embodiments are illustrated in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and intended to explain this application, and should not be construed as limiting this application.
[0022] It should be noted that all information (including but not limited to user device information, user personal information, etc.), data (including but not limited to data used for analysis, stored data, displayed data, etc.), and signals involved in this application are authorized by the user or fully authorized by all parties, and the collection, use, and processing of related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions. The acquisition, transmission, storage, use, and processing of data in the technical solution of this application all comply with the relevant provisions of national laws and regulations.
[0023] It should be noted that in the embodiments of this application, certain software, components, models and other existing solutions in the industry may be mentioned. These should be regarded as exemplary and are only intended to illustrate the feasibility of implementing the technical solution of this application. However, it does not mean that the applicant has used or necessarily used the solution.
[0024] The data processing method, apparatus, electronic device, and storage medium of this application are described below with reference to the accompanying drawings.
[0025] Figure 1 is a schematic flowchart of a data processing method according to an embodiment of this application. It should be noted that the execution entity of the data processing method provided in this embodiment is a data processing device, which can be implemented by software and/or hardware. In this embodiment, the data processing device can be an electronic device, or can be configured in an electronic device to enable the electronic device to have data processing capabilities.
[0026] In this embodiment, the electronic device may include, but is not limited to, terminal devices, servers, server clusters, platforms, etc., and this embodiment does not specifically limit the electronic device. In the following embodiments, the data processing device is described using the Central Processing Unit (CPU) in a Graphics Processing Unit (GPU) server as an example.
[0027] As shown in Figure 1, the data processing method may include the following steps. Step 101: Obtain the sample data access request, which includes the target data index.
[0028] In this embodiment of the application, taking an embodied intelligent robot as an example in the field of robotics, the embodied intelligent robot is equipped with an embodied intelligent model. The sample data involved in training the embodied intelligent model may include at least one of the following: sensor data, language command data, and robot motion data. The sensor data may include at least one of the following: robot body state data and environmental state data; the environmental state data may include environmental image data.
[0029] The sensors used to collect sensor data can include internal and external sensors. Internal sensors can be used to collect robot body state data, while external sensors can be used to collect environmental state data. Robot body state data includes, for example, joint angle data and end-effector coordinate data.
[0030] In this embodiment, the sample dataset can be stored in the memory of the GPU server. The memory can be, for example, a disk. The sample dataset can be stored in the memory on a task-by-task basis. For example, sample data from one task execution can be stored in a file; or, sample data from multiple task executions can be stored in a single file. The number of sample data points for one task execution can be multiple.
[0031] For example, consider storing sample data from multiple task executions in a single file. The file can include a data directory (e.g., the `data` directory), an index directory (e.g., the `meta` directory), and a vision directory (e.g., the `videos` directory). The `meta` directory stores the indexes of the sample data for each task execution. For instance, it shows the location and range of the sample data within the `data` directory. Searching the `data` directory based on this location and range allows you to find robot body state data, robot motion data, etc., at multiple time steps within that location and range. Similarly, searching the `videos` directory based on the same location and range allows you to find environmental image data at each time step within that location and range. Each time step corresponds to one row. The robot body state data, robot motion data, etc., at each time step correspond to the same row index.
[0032] In this embodiment of the application, the target data index may refer to at least one of the following: a file index, a sample data index, etc. For example, the sample data index may be the index of the file to which the sample data belongs plus the row index of the sample data within that file.
[0033] In this embodiment, there can be multiple sample data access requests. The CPU can concurrently acquire multiple sample data access requests.
[0034] Step 102: Select the target reading process from multiple candidate reading processes based on the target data index and hash operation strategy.
[0035] In this embodiment of the application, the process of the CPU executing step 102 may be, for example, performing a hash operation on the target data index according to the hash operation strategy to obtain a hash value; determining the first candidate reading process corresponding to the hash value among multiple candidate reading processes; and determining the first candidate reading process as the target reading process.
[0036] The hash operation strategy is applied to data indexes, mapping them to a finite number of hash values. The number of hash values equals the number of candidate reading processes, and hash values and candidate reading processes correspond one to one.
[0037] By setting a hash operation strategy, when the CPU receives multiple sample data access requests, it can distribute these requests to different candidate reading processes to achieve parallel processing of multiple sample data access requests, thereby improving the processing speed of multiple sample data access requests.
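The selection in step 102 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the number of reading processes, the `file_index:row_index` key layout, and the use of SHA-256 followed by a modulo are all assumptions chosen only to show how a data index can be mapped to exactly one of a fixed set of candidate reading processes.

```python
import hashlib

NUM_READERS = 4  # assumed number of candidate reading processes


def select_reader(file_index: int, row_index: int, num_readers: int = NUM_READERS) -> int:
    """Map a data index to the slot of one candidate reading process."""
    # Assumed key layout: "<file index>:<row index>".
    key = f"{file_index}:{row_index}".encode("utf-8")
    # A stable digest keeps the mapping identical across runs and workers;
    # Python's built-in hash() is salted per process for str and bytes.
    digest = hashlib.sha256(key).digest()
    # The modulo yields exactly num_readers distinct hash values,
    # one per candidate reading process.
    return int.from_bytes(digest[:8], "big") % num_readers


# Ten requests for consecutive rows of one file are spread over the readers.
assignments = [select_reader(42, row) for row in range(1250, 1260)]
```

Because the mapping depends only on the index, repeated requests for the same sample always reach the same reading process, while different indexes can be served in parallel.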
[0038] Step 103: Read the target sample data corresponding to the target data index from the memory through the target reading process.
[0039] In this embodiment, the number of input/output operations per second (IOPS) that the memory in the GPU server can sustain is limited. To avoid exceeding the memory's IOPS limit, the number of memory accesses can be reduced by using an Object File System (OFS). Correspondingly, the CPU may execute step 103 as follows: send the target data index to the target reading process via the OFS, instructing the target reading process to read the target sample data from the memory according to the target data index and store it in the OFS; then query the OFS according to the target data index to obtain the target sample data.
[0040] Step 104: Send the target sample data to the requester of the sample data access request.
[0041] The requester for sample data access can be, for example, a Data Loader Worker process.
[0042] The data processing method of this application embodiment involves a central processing unit (CPU) in a GPU server acquiring a sample data access request, which includes a target data index. Based on the target data index and a hash operation strategy, a target reading process is selected from multiple candidate reading processes. The target reading process reads the target sample data corresponding to the target data index from memory. The target sample data is then sent to the requester of the sample data access request. The method utilizes multiple candidate reading processes to process different sample data access requests in parallel, which improves the processing speed of sample data access requests, shortens the queuing time of sample data access requests, and enhances data processing efficiency.
[0043] Building on the embodiment shown in Figure 1, in order to further improve the processing efficiency of sample data access requests and further shorten their queuing time, a memory pool can be set up. The memory pool can pre-store multiple candidate sample data and the candidate data index corresponding to each candidate sample data. For a sample data access request, the memory pool can be queried first to obtain the target sample data. Figure 2 is a schematic flowchart of a data processing method according to another embodiment of this application. It should be noted that this embodiment is a further refinement or optimization of the embodiment shown in Figure 1.
[0044] As shown in Figure 2, the data processing method may include the following steps. Step 201: Obtain the sample data access request, which includes the target data index.
[0045] Step 202: Query the memory pool based on the target data index to determine whether the target data index exists in the memory pool.
[0046] In this embodiment, the CPU includes a memory pool for storing multiple candidate sample data and their corresponding candidate data indices. The memory pool can store and process multiple candidate sample data in units of time steps. For example, a single data entry may include candidate sample data from one time step. The candidate data index corresponding to a candidate sample data entry at one time step can be a file index plus a row index of the sample data within the file.
[0047] In one embodiment of this application, the multiple candidate sample data stored in the memory pool can be candidate sample data with high reading frequency. For example, candidate sample data with a reading frequency greater than or equal to a first frequency threshold. In another embodiment, the multiple candidate sample data stored in the memory pool can be batch candidate sample data with high reading frequency. For example, batch candidate sample data with a corresponding reading frequency greater than or equal to a second frequency threshold. The reading frequency for batch candidate sample data can be determined based on the reading records of each sample data within the batch candidate sample data.
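As an illustration of the frequency-based selection described above, the following Python sketch picks the indexes that qualify for the memory pool from a read log. The read log, the use of a raw read count as the "reading frequency", and the use of the mean member count as the batch frequency are assumptions; the application does not specify how the frequencies are computed.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

DataIndex = Tuple[int, int]  # (file index, row index)


def hot_indexes(read_log: Iterable[DataIndex], first_threshold: int) -> Set[DataIndex]:
    """Indexes whose read count meets the first frequency threshold."""
    counts = Counter(read_log)
    return {index for index, count in counts.items() if count >= first_threshold}


def batch_frequency(batch: List[DataIndex], read_log: Iterable[DataIndex]) -> float:
    """ASSUMPTION: a batch's frequency is the mean read count of its members."""
    counts = Counter(read_log)
    return sum(counts[index] for index in batch) / len(batch)


# Hypothetical read log: (42, 1250) was read three times.
log = [(42, 1250), (42, 1250), (42, 1251), (42, 1250), (7, 3)]
hot = hot_indexes(log, first_threshold=3)
```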
[0048] In this embodiment, to improve the query efficiency of the memory pool and reduce the number of candidate sample data that need to be traversed in the memory pool, multiple memory regions can be set in the memory pool; there can be a correspondence between the multiple memory regions and the multiple candidate reading processes. This correspondence can be, for example, a one-to-one correspondence, a one-to-many correspondence, etc. Correspondingly, the process by which the CPU queries the memory pool based on the target data index can be, for example, as follows: determine the hash value corresponding to the target data index; determine the first memory region corresponding to the hash value; query the first memory region in the memory pool based on the target data index; determine whether the target data index exists in the first memory region; if the target data index exists in the first memory region, determine that the target data index exists in the memory pool; if the target data index does not exist in the first memory region, determine that the target data index does not exist in the memory pool.
[0049] Step 203: If the target data index exists in the memory pool, read the target sample data corresponding to the target data index from the memory pool.
[0050] In this embodiment of the application, when multiple memory regions are set in the memory pool, the target sample data corresponding to the target data index can be read from the memory region corresponding to the target data index in the memory pool.
[0051] The memory region corresponding to the target data index is the memory region corresponding to the hash value determined based on the target data index.
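The region-based lookup can be sketched as follows. The class name `RegionedPool`, the arithmetic hash, and the use of one dictionary per memory region are hypothetical; the sketch only shows that a query inspects the single region selected by the hash value instead of traversing the whole pool.

```python
from typing import Any, Dict, List, Optional, Tuple

DataIndex = Tuple[int, int]  # (file index, row index)


class RegionedPool:
    """Memory pool split into regions; each lookup touches one region only."""

    def __init__(self, num_regions: int) -> None:
        self.regions: List[Dict[DataIndex, Any]] = [{} for _ in range(num_regions)]

    def _region_for(self, index: DataIndex) -> Dict[DataIndex, Any]:
        # Assumed hash: a simple deterministic mix of file and row index.
        file_index, row_index = index
        slot = (file_index * 1_000_003 + row_index) % len(self.regions)
        return self.regions[slot]

    def put(self, index: DataIndex, sample: Any) -> None:
        self._region_for(index)[index] = sample

    def get(self, index: DataIndex) -> Optional[Any]:
        # A miss in the selected region means a miss in the whole pool.
        return self._region_for(index).get(index)


pool = RegionedPool(num_regions=4)
pool.put((42, 1250), "state+action@1250")
```

With a one-to-one correspondence between regions and candidate reading processes, the same hash value can select both the region to query and, on a miss, the reading process to invoke.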
[0052] Step 204: If the target data index does not exist in the memory pool, select the target reading process from multiple candidate reading processes based on the target data index and the hash operation strategy.
[0053] Step 205: Read the target sample data corresponding to the target data index from the memory through the target reading process.
[0054] Step 206: Send the target sample data to the requester of the sample data access request.
[0055] For detailed explanations of steps 201, 205, and 206, refer to steps 101, 103, and 104 of the embodiment shown in Figure 1; they are not repeated here.
[0056] The data processing method of this application embodiment involves a central processing unit (CPU) in a GPU server. The CPU acquires a sample data access request, which includes a target data index. It then queries a memory pool based on the target data index to determine if the index exists. If the target data index exists in the memory pool, it reads the target sample data corresponding to the index from the memory pool. If the target data index does not exist in the memory pool, it selects a target reading process from multiple candidate reading processes based on the index and a hash operation strategy. The target reading process then reads the target sample data corresponding to the index from the memory. Finally, it sends the target sample data to the requester of the sample data access request. The method utilizes a memory pool that can pre-store multiple candidate sample data and their corresponding candidate data indices. By querying the memory pool first to obtain the target sample data for the sample data access request, the processing efficiency of the request can be further improved, and the queuing time for the request can be further shortened.
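The flow of steps 202 to 205 can be sketched in Python as a pool lookup with a fallback to a hashed reading process. The `STORAGE` dictionary standing in for on-disk sample data, the reader callables, the `served` log, and the arithmetic hash are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

DataIndex = Tuple[int, int]  # (file index, row index)

# Stand-in for sample data on disk (hypothetical contents).
STORAGE: Dict[DataIndex, str] = {(42, row): f"sample-42-{row}" for row in range(1250, 1254)}
served: List[Tuple[int, DataIndex]] = []  # (reader id, index) log, for illustration


def make_reader(reader_id: int) -> Callable[[DataIndex], str]:
    """Build a stand-in for one candidate reading process."""
    def read(index: DataIndex) -> str:
        served.append((reader_id, index))
        return STORAGE[index]
    return read


def handle_request(
    index: DataIndex,
    pool: Dict[DataIndex, str],
    readers: List[Callable[[DataIndex], str]],
) -> Tuple[str, str]:
    # Steps 202-203: serve from the memory pool on a hit.
    if index in pool:
        return pool[index], "pool"
    # Step 204: on a miss, hash the index to choose the target reading process.
    slot = (index[0] * 1_000_003 + index[1]) % len(readers)
    # Step 205: the chosen reader fetches the sample from storage.
    return readers[slot](index), "storage"


readers = [make_reader(i) for i in range(4)]
pool = {(42, 1250): STORAGE[(42, 1250)]}
hit = handle_request((42, 1250), pool, readers)
miss = handle_request((42, 1251), pool, readers)
```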
[0057] Building on the embodiment shown in Figure 2, in order to further improve the processing efficiency of sample data access requests, candidate sample data can be prefetched and stored when the prefetching conditions are met, so as to update the candidate sample data in the memory pool. Figure 3 is a schematic flowchart of a data processing method according to another embodiment of this application. It should be noted that this embodiment is a further refinement or optimization of the embodiment shown in Figure 1.
[0058] As shown in Figure 3, the data processing method may include the following steps. Step 301: If the prefetching conditions are met, determine at least one candidate data index to be prefetched.
[0059] In this application embodiment, the prefetching conditions include at least one of the following: being in the model training initialization phase; the number of candidate data indices in the memory pool being less than or equal to a first quantity threshold; and the continuous duration during which candidate sample data in the memory pool has not been read being greater than or equal to a first duration.
[0060] Specifically, the number of candidate data indices in the memory pool gradually decreases as candidate sample data is read from the memory pool. If the number of candidate data indices in the memory pool is less than or equal to the first quantity threshold, it indicates that the number of candidate sample data in the memory pool is insufficient and needs to be replenished. Therefore, candidate sample data corresponding to at least one candidate data index to be prefetched can be obtained and stored in the memory pool.
[0061] If the continuous duration during which candidate sample data in the memory pool remains unread is greater than or equal to a first duration, it indicates that some candidate sample data in the memory pool is not needed by the CPU and needs to be updated. Therefore, candidate sample data corresponding to at least one candidate data index to be prefetched can be obtained; and based on the obtained candidate sample data, the candidate sample data in the memory pool that has not been read for a longer period of time can be updated and overwritten.
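The replacement of long-unread entries can be sketched as follows. The class `TimedPool`, the explicit `now` arguments (used instead of a wall clock so the example is deterministic), and the decision to evict every stale entry before inserting the fresh data are assumptions; the application only states that long-unread candidate sample data is updated and overwritten.

```python
from typing import Any, Dict, List, Tuple

DataIndex = Tuple[int, int]  # (file index, row index)


class TimedPool:
    """Memory pool that remembers when each entry was last read."""

    def __init__(self) -> None:
        self.samples: Dict[DataIndex, Any] = {}
        self.last_read: Dict[DataIndex, float] = {}

    def put(self, index: DataIndex, sample: Any, now: float) -> None:
        self.samples[index] = sample
        self.last_read[index] = now  # the unread duration starts when stored

    def get(self, index: DataIndex, now: float) -> Any:
        sample = self.samples[index]
        self.last_read[index] = now
        return sample

    def stale(self, now: float, first_duration: float) -> List[DataIndex]:
        """Indexes that have stayed unread for at least first_duration."""
        return [i for i, t in self.last_read.items() if now - t >= first_duration]

    def replace_stale(
        self, fresh: Dict[DataIndex, Any], now: float, first_duration: float
    ) -> List[DataIndex]:
        evicted = self.stale(now, first_duration)
        for index in evicted:
            del self.samples[index]
            del self.last_read[index]
        for index, sample in fresh.items():
            self.put(index, sample, now)
        return evicted


pool = TimedPool()
pool.put((42, 1250), "a", now=0.0)
pool.put((42, 1251), "b", now=0.0)
pool.get((42, 1251), now=50.0)  # only this entry is read again
evicted = pool.replace_stale({(42, 1252): "c"}, now=60.0, first_duration=30.0)
```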
[0062] In this embodiment of the application, the CPU can determine at least one candidate data index to be prefetched as follows: determine whether a historical sample data access request has been obtained; if no historical sample data access request has been obtained, randomly select at least one candidate data index to be prefetched from the data indexes in the memory; if a historical sample data access request has been obtained, predict at least one candidate data index to be prefetched based on the historical data index in the historical sample data access request.
[0063] In one example, the prefetching condition is that the model is in the initial training phase. During this phase, there are no historical sample data access requests, and at least one candidate data index to be prefetched can be randomly selected from the various data indices in the memory.
[0064] In another example, the prefetching condition can be any of the following: the number of candidate data indices in the memory pool is less than or equal to a first quantity threshold; the continuous duration during which candidate sample data in the memory pool has not been read is greater than or equal to a first duration. Correspondingly, the CPU can predict at least one candidate data index to be prefetched based on the historical data indexes in the historical data access requests.
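The prefetching conditions and the two ways of choosing candidate data indexes can be sketched as follows. The parameter names are hypothetical, and the prediction rule (prefetch the rows that follow the most recent request in the same file) is purely an assumption, since the application does not state how the prediction is made.

```python
import random
from typing import List, Optional, Sequence, Tuple

DataIndex = Tuple[int, int]  # (file index, row index)


def should_prefetch(
    initializing: bool,
    pool_size: int,
    first_quantity_threshold: int,
    unread_seconds: float,
    first_duration: float,
) -> bool:
    """True if at least one of the three prefetching conditions holds."""
    return (
        initializing
        or pool_size <= first_quantity_threshold
        or unread_seconds >= first_duration
    )


def choose_candidates(
    all_indexes: Sequence[DataIndex],
    history: Sequence[DataIndex],
    count: int,
    rng: Optional[random.Random] = None,
) -> List[DataIndex]:
    if not history:
        # No historical requests yet: pick candidates at random.
        rng = rng or random.Random()
        return rng.sample(list(all_indexes), min(count, len(all_indexes)))
    # ASSUMPTION: predict the rows that follow the latest request in its file.
    file_index, row_index = history[-1]
    return [(file_index, row_index + step) for step in range(1, count + 1)]


all_idx = [(42, row) for row in range(1250, 1260)]
cold_start = choose_candidates(all_idx, [], 2, random.Random(0))
predicted = choose_candidates(all_idx, [(42, 1250)], 3)
```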
[0065] Step 302: Obtain candidate sample data corresponding to at least one candidate data index.
[0066] In this embodiment of the application, the process of the CPU executing step 302 may be as follows: for at least one candidate data index, determine a first number of prefetch threads to be invoked and a second number of candidate data indices to be processed by each prefetch thread; invoke the first number of prefetch threads, and allocate at least one candidate data index to the first number of prefetch threads according to the second number, so as to obtain candidate sample data corresponding to at least one candidate data index.
[0067] Specifically, the CPU can determine the first number of prefetch threads to be invoked based on the maximum number of candidate data indices that each prefetch thread can process and the number of candidate data indices to be prefetched; based on the first number of prefetch threads to be invoked and the number of candidate data indices to be prefetched, it can determine the second number of candidate data indices that each prefetch thread needs to process; then, by combining the states of multiple prefetch threads, it can select the prefetch thread to be invoked from multiple prefetch threads; and then perform the prefetch thread invocation process.
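The calculation of the first number and the second number can be illustrated with a short Python sketch. Rounding up in both divisions and delegating the choice of idle threads to `ThreadPoolExecutor` are assumptions; the `fetch` callable is a stand-in for the per-index prefetch work.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def plan_prefetch(num_indexes: int, max_per_thread: int) -> Tuple[int, int]:
    """Return (first number of threads, second number of indexes per thread)."""
    if num_indexes == 0:
        return 0, 0
    first_number = math.ceil(num_indexes / max_per_thread)
    second_number = math.ceil(num_indexes / first_number)
    return first_number, second_number


def prefetch(indexes: List[int], max_per_thread: int, fetch: Callable[[int], int]) -> Dict[int, int]:
    first_number, second_number = plan_prefetch(len(indexes), max_per_thread)
    if first_number == 0:
        return {}
    # Split the indexes into chunks of at most second_number entries.
    chunks = [indexes[i:i + second_number] for i in range(0, len(indexes), second_number)]
    # The executor hands each chunk to an available worker thread.
    with ThreadPoolExecutor(max_workers=first_number) as executor:
        results = list(executor.map(lambda chunk: [(i, fetch(i)) for i in chunk], chunks))
    return {index: sample for chunk in results for index, sample in chunk}


plan = plan_prefetch(num_indexes=10, max_per_thread=4)
fetched = prefetch(list(range(10)), max_per_thread=4, fetch=lambda i: i * i)
```

For 10 candidate data indexes and at most 4 indexes per thread, this plan uses 3 threads that handle 4, 4, and 2 indexes respectively.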
[0068] In this embodiment, the prefetch thread's processing method for the allocated candidate data index includes: selecting a first candidate reading process from multiple candidate reading processes according to a hash operation strategy for the candidate data index; sending the candidate data index to the first candidate reading process through the OFS system to instruct the first candidate reading process to read candidate sample data from the memory according to the candidate data index and store it in the OFS system; and querying the OFS system according to the candidate data index to obtain candidate sample data in the OFS system.
[0069] Having the prefetch thread and the candidate reading process interact through the OFS reduces direct interaction between them, reduces the number of memory operations, and prevents the number of I/O operations from exceeding the memory's IOPS limit.
[0070] Step 303: Store at least one candidate data index and the candidate sample data corresponding to the candidate data index into the memory pool.
[0071] Step 304: Obtain the sample data access request, which includes the target data index.
[0072] Step 305: Query the memory pool based on the target data index to determine whether the target data index exists in the memory pool.
[0073] Step 306: If the target data index exists in the memory pool, read the target sample data corresponding to the target data index from the memory pool.
[0074] Step 307: If the target data index does not exist in the memory pool, select the target reading process from multiple candidate reading processes based on the target data index and the hash operation strategy.
[0075] Step 308: Read the target sample data corresponding to the target data index from the memory through the target reading process.
[0076] Step 309: Send the target sample data to the requester of the sample data access request.
[0077] For detailed explanations of steps 304 to 309, refer to steps 201 to 206 of the embodiment shown in Figure 2; they are not repeated here.
[0078] In the data processing method of this application embodiment, the central processing unit (CPU) in the GPU server determines at least one candidate data index to be prefetched when the prefetching conditions are met; obtains candidate sample data corresponding to the at least one candidate data index; stores the at least one candidate data index and the corresponding candidate sample data in a memory pool; obtains a sample data access request, the sample data access request including a target data index; queries the memory pool according to the target data index to determine whether the target data index exists in the memory pool; if the target data index exists in the memory pool, reads the target sample data corresponding to the target data index from the memory pool; if the target data index does not exist in the memory pool, selects a target reading process from multiple candidate reading processes according to the target data index and a hash operation strategy, and reads the target sample data corresponding to the target data index from the memory through the target reading process; and sends the target sample data to the requester of the sample data access request. Because candidate sample data is prefetched and stored when the prefetching conditions are met, the candidate sample data in the memory pool is kept up to date, which can further improve the processing efficiency of sample data access requests.
[0079] To ensure a clear understanding of this application, the method of this embodiment is described below by way of example with reference to Figures 4 and 5.
[0080] Figure 4 is a schematic diagram illustrating the process by which a candidate reading process reads sample data. The process in Figure 4 may include the following steps. Step 401: Load the dataset and read meta/episodes.jsonl to obtain the "directory".
[0081] The meta / episodes.jsonl file is read from the index directory (e.g., the meta directory) based on the candidate data index.
[0082] Step 402: Select an episode (e.g., index #42).
[0083] Index #42 can be an index for a batch of sample data in a data directory (e.g., the data directory).
[0084] Step 403: Extract key location information: data line range + video file path.
[0085] Step 404: Load and associate in parallel.
[0086] Specifically, the sequence of state and action values is read from rows 1250-1450 of the .parquet file under data/, and the video stream is read from videos/.../episode_000042.
[0087] Here, "data" refers to the `data` directory, i.e., the data directory, and "videos" refers to the `videos` directory, i.e., the vision directory.
[0088] Step 405: At each time step, for example, time step t=1250, combine the state data observation.state from the 1250th row of the data table with the image observation.image from the 1250th frame of the video to form a complete observation.
[0089] Step 406: Output the data to the model for training or inference.
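Steps 401 to 405 can be sketched in Python with in-memory stand-ins. The field names `row_start` and `row_end`, the contents of the catalog, and the dictionaries that replace the .parquet table and the decoded video frames are hypothetical; a real LeRobot dataset would be read from files, and the two loads in step 404 would run in parallel rather than sequentially as here.

```python
import json
from typing import Any, Dict, List

# Step 401: stand-in for meta/episodes.jsonl (hypothetical field names).
EPISODES_JSONL = "\n".join([
    json.dumps({"episode_index": 41, "row_start": 1000, "row_end": 1249}),
    json.dumps({"episode_index": 42, "row_start": 1250, "row_end": 1252}),
])

# Stand-ins for the data table rows and the decoded video frames.
TABLE: Dict[int, Dict[str, Any]] = {
    t: {"state": [0.1 * t], "action": [0.2 * t]} for t in range(1250, 1253)
}
FRAMES: Dict[int, str] = {t: f"frame-{t}" for t in range(1250, 1253)}


def load_catalog(jsonl_text: str) -> Dict[int, Dict[str, Any]]:
    """Parse the episode catalog into a dict keyed by episode index."""
    records = [json.loads(line) for line in jsonl_text.splitlines()]
    return {record["episode_index"]: record for record in records}


def assemble_episode(entry: Dict[str, Any]) -> List[Dict[str, Any]]:
    observations = []
    # Steps 403-405: walk the row range and pair each row with its frame.
    for t in range(entry["row_start"], entry["row_end"] + 1):
        observations.append({
            "t": t,
            "observation.state": TABLE[t]["state"],
            "action": TABLE[t]["action"],
            "observation.image": FRAMES[t],
        })
    return observations


catalog = load_catalog(EPISODES_JSONL)
episode = assemble_episode(catalog[42])  # Step 402: select episode 42
```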
[0090] Figure 5 is a schematic diagram of the data processing framework. In Figure 5, the stat/open requests (i.e., sample data access requests) of the DataLoader workers in the thousand-GPU-scale cluster are submitted to the memory pool. On a memory pool hit, the pool returns batch metadata (i.e., sample data). On a miss, the stat/open request is forwarded through OFS unified storage to the MDS-Shard (i.e., the target reading process) in the distributed MetaServer cluster, which queries the storage to obtain the sample data. The DataLoader worker provides the obtained sample data to the GPU in its server for model training.
[0091] In the memory pool, the metadata prefetching thread (i.e., the prefetching thread) interacts with the MDS-Shard in the distributed MetaServer cluster through OFS unified storage to prefetch sample data and store it in the memory pool.
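The hit and miss paths of Figure 5 can be illustrated with the following minimal Python sketch. The class `MemoryPool`, the functions `handle_request` and `prefetch`, and the callable `fetch_from_shard` are hypothetical names introduced here. The sketch also assumes that only the prefetching thread writes to the memory pool, since the description does not state whether a miss populates it.

```python
class MemoryPool:
    """In-process cache mapping a data index to its sample data."""

    def __init__(self):
        self._entries = {}

    def get(self, index):
        return self._entries.get(index)

    def put(self, index, sample):
        self._entries[index] = sample


def handle_request(index, pool, fetch_from_shard):
    """Serve a request from the pool on a hit, otherwise forward it to a shard."""
    sample = pool.get(index)
    if sample is not None:
        return sample, "hit"
    return fetch_from_shard(index), "miss"


def prefetch(indexes, pool, fetch_from_shard):
    """Work done by a prefetching thread: fetch ahead of time and fill the pool."""
    for index in indexes:
        pool.put(index, fetch_from_shard(index))


# Hypothetical backing storage reached through the reading process.
STORAGE = {"episode_000042": "sample-42"}
shard_calls = []


def fetch_from_shard(index):
    shard_calls.append(index)
    return STORAGE[index]


pool = MemoryPool()
first = handle_request("episode_000042", pool, fetch_from_shard)   # miss
prefetch(["episode_000042"], pool, fetch_from_shard)
second = handle_request("episode_000042", pool, fetch_from_shard)  # hit
```

In this sketch the second request is answered from the pool without contacting the shard, which is the effect the prefetching mechanism is intended to achieve.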
[0092] Corresponding to the data processing methods provided in the above embodiments, one embodiment of this application also provides a data processing apparatus. Since the data processing apparatus provided in this application corresponds to the data processing methods provided in the above embodiments, the implementation methods of the data processing methods are also applicable to the data processing apparatus provided in this embodiment, and will not be described in detail in this embodiment.
[0093] Figure 6 is a schematic diagram of a data processing apparatus according to an embodiment of this application. It should be noted that the data processing apparatus can be implemented in software and/or hardware. In this embodiment, the data processing apparatus can be an electronic device, or can be configured within an electronic device. The electronic device in this embodiment may include, but is not limited to, terminal devices and servers, and this embodiment does not specifically limit the type of electronic device.
[0094] As shown in Figure 6, the data processing apparatus 600 includes: a first acquisition module 601, a selection module 602, a first reading module 603, and a sending module 604.
[0095] The first acquisition module 601 is configured to acquire a sample data access request, the sample data access request including a target data index; the selection module 602 is configured to select a target reading process from multiple candidate reading processes based on the target data index and a hash operation strategy; the first reading module 603 is configured to read the target sample data corresponding to the target data index from the memory through the target reading process; and the sending module 604 is configured to send the target sample data to the requester of the sample data access request.
[0096] In one embodiment of this application, the selection module 602 is specifically used to: perform a hash operation on the target data index according to the hash operation strategy to obtain a hash value; determine a first candidate reading process corresponding to the hash value among the plurality of candidate reading processes; and determine the first candidate reading process as the target reading process.
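One possible form of the hash-based selection is sketched below in Python. The application does not specify the hash function or how a hash value is mapped to a candidate reading process; the use of SHA-256, the modulo mapping, and the shard names are assumptions of this sketch. `hashlib` is used instead of the built-in `hash()` because the latter is randomized per process for strings, so different workers could otherwise map the same index to different reading processes.

```python
import hashlib


def select_reading_process(target_index, candidates):
    """Map a data index to one candidate reading process via a stable hash."""
    digest = hashlib.sha256(str(target_index).encode("utf-8")).digest()
    hash_value = int.from_bytes(digest[:8], "big")    # hash value of the index
    return candidates[hash_value % len(candidates)]   # first candidate for that value


# Hypothetical names for the candidate reading processes.
SHARDS = ["mds-shard-0", "mds-shard-1", "mds-shard-2", "mds-shard-3"]
target = select_reading_process("episode_000042", SHARDS)
```

Because the mapping depends only on the index, requests for different indexes tend to spread across the candidate reading processes, while repeated requests for the same index always reach the same one.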
[0097] In one embodiment of this application, the first reading module 603 is specifically used to send the target data index to the target reading process through the Object File System (OFS) to instruct the target reading process to read the target sample data from the memory according to the target data index and store it in the OFS system; and to query the OFS system according to the target data index to obtain the target sample data in the OFS system.
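The exchange through the OFS system described above can be illustrated, under strong simplifications, by the following Python sketch. `OfsStub`, `reading_process` and `read_via_ofs` are hypothetical names; the OFS system is replaced by an in-memory object, and the reading process is modeled as a thread rather than a separate process.

```python
import queue
import threading


class OfsStub:
    """Hypothetical in-memory stand-in for the shared OFS layer."""

    def __init__(self):
        self.requests = queue.Queue()      # indexes sent to the reading process
        self._results = {}                 # sample data written back, keyed by index
        self._changed = threading.Condition()

    def store(self, index, sample):
        with self._changed:
            self._results[index] = sample
            self._changed.notify_all()

    def query(self, index, timeout=1.0):
        """Wait until the sample for this index is stored, or return None on timeout."""
        with self._changed:
            ready = self._changed.wait_for(lambda: index in self._results, timeout)
            return self._results[index] if ready else None


def reading_process(ofs, storage):
    """Read requested samples from storage and store them in OFS until stopped."""
    while True:
        index = ofs.requests.get()
        if index is None:                  # sentinel used by this sketch to stop
            break
        ofs.store(index, storage[index])


def read_via_ofs(ofs, index):
    ofs.requests.put(index)                # send the index through OFS
    return ofs.query(index)                # then query OFS for the stored sample


ofs = OfsStub()
worker = threading.Thread(target=reading_process, args=(ofs, {"episode_000042": "sample-42"}))
worker.start()
sample = read_via_ofs(ofs, "episode_000042")
ofs.requests.put(None)
worker.join()
```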
[0098] In one embodiment of this application, the CPU is provided with a memory pool for storing multiple candidate sample data and candidate data indexes corresponding to the candidate sample data; the device further includes: a first determining module and a second reading module; the first determining module is used to query the memory pool according to the target data index to determine whether the target data index exists in the memory pool; the second reading module is used to read the target sample data corresponding to the target data index from the memory pool if the target data index exists in the memory pool.
[0099] In one embodiment of this application, the apparatus further includes: a second determining module, a second acquiring module, and a storage module; the second determining module is configured to determine at least one candidate data index to be prefetched when the prefetching conditions are met; the second acquiring module is configured to acquire candidate sample data corresponding to the at least one candidate data index; and the storage module is configured to store the at least one candidate data index and the candidate sample data corresponding to the candidate data index into the memory pool.
[0100] In one embodiment of this application, the prefetching conditions include at least one of the following: being in the model training initialization phase; the number of candidate data indices in the memory pool being less than or equal to a first quantity threshold; and the continuous duration during which candidate sample data in the memory pool has not been read being greater than or equal to a first duration.
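The prefetching conditions can be expressed as a simple predicate, sketched below in Python. The names `PoolStatus`, `should_prefetch`, `min_indexes` and `max_idle_seconds` are hypothetical, and the sketch assumes that all three conditions are configured and that satisfying any one of them is sufficient to trigger prefetching.

```python
from dataclasses import dataclass


@dataclass
class PoolStatus:
    initializing: bool         # model training is in its initialization phase
    cached_index_count: int    # number of candidate data indexes in the memory pool
    idle_seconds: float        # how long pooled sample data has gone unread


def should_prefetch(status, min_indexes, max_idle_seconds):
    """Return True if any of the three prefetching conditions holds."""
    return (
        status.initializing
        or status.cached_index_count <= min_indexes      # first quantity threshold
        or status.idle_seconds >= max_idle_seconds       # first duration
    )


warm = PoolStatus(initializing=False, cached_index_count=500, idle_seconds=2.0)
drained = PoolStatus(initializing=False, cached_index_count=10, idle_seconds=2.0)
```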
[0101] In one embodiment of this application, the second determining module is specifically used to: determine whether a historical sample data access request has been obtained; if no historical sample data access request has been obtained, randomly select at least one candidate data index to be prefetched from the data indexes in the memory; if a historical sample data access request has been obtained, predict the at least one candidate data index to be prefetched based on the historical data index in the historical sample data access request.
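A minimal Python sketch of this two-branch selection is given below. The application does not disclose the prediction method; the sequential predictor used here (prefetch the indexes that follow the most recently requested one) is purely an illustrative placeholder, and `choose_prefetch_indexes` and its parameters are hypothetical names.

```python
import random


def choose_prefetch_indexes(all_indexes, history, count, rng=None):
    """Pick indexes to prefetch: random without history, predicted with history."""
    rng = rng or random.Random()
    if not history:
        # No historical request yet: sample at random from the known indexes.
        return rng.sample(list(all_indexes), min(count, len(all_indexes)))
    # Placeholder predictor (assumption): access continues sequentially
    # after the most recently requested index.
    position = {index: i for i, index in enumerate(all_indexes)}
    start = position[history[-1]] + 1
    return list(all_indexes[start:start + count])


ALL_INDEXES = list(range(10))
cold = choose_prefetch_indexes(ALL_INDEXES, history=[], count=3, rng=random.Random(0))
warm = choose_prefetch_indexes(ALL_INDEXES, history=[1, 2, 3], count=3)
```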
[0102] In one embodiment of this application, the second acquisition module is specifically configured to: determine a first number of prefetch threads to be invoked and a second number of candidate data indices to be processed by each prefetch thread for the at least one candidate data index; invoke the first number of prefetch threads; and allocate the at least one candidate data index to the first number of prefetch threads according to the second number, so as to obtain candidate sample data corresponding to the at least one candidate data index.
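The determination of the first number and second number, and the allocation of candidate data indexes to prefetch threads, can be sketched in Python as follows. The sizing rule in `plan_prefetch` (a preferred batch size capped by a maximum thread count) is an assumption of this sketch, as are all function and parameter names; the application does not state how the two numbers are chosen.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def plan_prefetch(index_count, per_thread, max_threads):
    """Return (first number of threads, second number of indexes per thread)."""
    threads = min(max_threads, math.ceil(index_count / per_thread))
    return threads, math.ceil(index_count / threads)


def prefetch_parallel(indexes, fetch, per_thread, max_threads):
    """Split the indexes into per-thread batches and fetch the batches concurrently."""
    threads, chunk = plan_prefetch(len(indexes), per_thread, max_threads)
    batches = [indexes[i:i + chunk] for i in range(0, len(indexes), chunk)]

    def run(batch):
        return {index: fetch(index) for index in batch}

    results = {}
    with ThreadPoolExecutor(max_workers=threads) as executor:
        for partial in executor.map(run, batches):
            results.update(partial)
    return results


# Hypothetical fetch function standing in for the per-index read through OFS.
samples = prefetch_parallel(list(range(10)), lambda i: i * i, per_thread=3, max_threads=8)
```

The returned mapping of candidate data index to candidate sample data is what would then be stored in the memory pool.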
[0103] In one embodiment of this application, the prefetch thread's processing method for the allocated candidate data index includes: selecting a first candidate reading process from multiple candidate reading processes according to a hash operation strategy for the candidate data index; sending the candidate data index to the first candidate reading process through the OFS system to instruct the first candidate reading process to read the candidate sample data from the memory according to the candidate data index and store it in the OFS system; and querying the OFS system according to the candidate data index to obtain the candidate sample data in the OFS system.
[0104] In one embodiment of this application, the target sample data includes at least one of the following: sensor data, language command data, and robot motion data; the sensor data includes at least one of the following: robot body state data and environmental state data; the environmental state data includes environmental image data.
[0105] The data processing apparatus of this application embodiment includes a central processing unit (CPU) in a GPU server. The CPU acquires a sample data access request, which includes a target data index. Based on the target data index and a hash operation strategy, it selects a target reading process from multiple candidate reading processes. The target reading process then reads the target sample data corresponding to the target data index from memory. Finally, it sends the target sample data to the requester of the sample data access request. The use of multiple candidate reading processes to process different sample data access requests in parallel can improve the processing speed of sample data access requests, shorten the queuing time of sample data access requests, and improve data processing efficiency.
[0106] According to embodiments of this application, this application also provides an electronic device and a readable storage medium.
[0107] Figure 7 is a block diagram of an electronic device according to an embodiment of the present application.
[0108] As shown in Figure 7, the electronic device includes: a memory 701, a processor 702, and computer instructions stored in the memory 701 and executable on the processor 702.
[0109] When processor 702 executes instructions, it implements the data processing method provided in the above embodiments.
[0110] Furthermore, the electronic device also includes a communication interface 703, which is used for communication between the memory 701 and the processor 702.
[0111] Memory 701 is used to store computer instructions that can be executed on processor 702.
[0112] The memory 701 may include high-speed RAM memory, and may also include non-volatile memory, such as at least one disk storage device.
[0113] The processor 702 is used to implement the data processing method of the above embodiments when executing a program.
[0114] If the memory 701, processor 702, and communication interface 703 are implemented independently, then the communication interface 703, memory 701, and processor 702 can be interconnected via a bus to communicate with one another. The bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, etc. Buses can be categorized as address buses, data buses, control buses, etc. For ease of representation, the bus is shown in Figure 7 as a single thick line, but this does not mean that there is only one bus or one type of bus.
[0115] Optionally, in a specific implementation, if the memory 701, processor 702, and communication interface 703 are integrated on a single chip, then the memory 701, processor 702, and communication interface 703 can communicate with each other through an internal interface.
[0116] The processor 702 may be a central processing unit (CPU), an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of this application.
[0117] This application also proposes a computer program product, wherein the data processing method of the embodiments of this application is implemented when instructions in the computer program product are executed by a processor.
[0118] In the description of this specification, the references to terms such as "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this application. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples. Moreover, without contradiction, those skilled in the art can combine and integrate the different embodiments or examples described in this specification, as well as the features of different embodiments or examples.
[0119] Furthermore, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one of that feature. In the description of this application, "multiple" means at least two, such as two, three, etc., unless otherwise explicitly specified.
[0120] Any process or method description in the flowchart or otherwise herein can be understood as representing a module, segment, or portion of code comprising one or more executable instructions for implementing custom logic functions or processes, and the scope of the preferred embodiments of this application includes additional implementations in which functions may be performed not in the order shown or discussed, including substantially simultaneously or in reverse order depending on the functions involved, as should be understood by those skilled in the art to which embodiments of this application pertain.
[0121] The logic and/or steps represented in the flowchart or otherwise described herein can, for example, be considered a sequenced list of executable instructions for implementing logical functions, and can be embodied in any computer-readable medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transmit programs for use by, or in conjunction with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection having one or more wires (electronic device), a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), fiber optic devices, and portable compact disc read-only memory (CD-ROM). Furthermore, the computer-readable medium can even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example, by optically scanning the paper or other medium, followed by editing, interpreting, or otherwise processing it as necessary, and then stored in computer memory.
[0122] It should be understood that various parts of this application can be implemented using hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented using software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware as in another embodiment, it can be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logical functions on data signals, application-specific integrated circuits (ASICs) having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), etc.
[0123] Those skilled in the art will understand that all or part of the steps of the methods described in the above embodiments can be implemented by a program instructing related hardware. The program can be stored in a computer-readable storage medium, and when executed, the program includes one or a combination of the steps of the method embodiments.
[0124] Furthermore, the functional units in the various embodiments of this application can be integrated into a processing module, or each unit can exist physically separately, or two or more units can be integrated into a module. The integrated module can be implemented in hardware or as a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it can also be stored in a computer-readable storage medium.
[0125] The storage medium mentioned above can be a read-only memory, a disk, or an optical disk, etc. Although embodiments of this application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting this application. Those skilled in the art can make changes, modifications, substitutions, and variations to the above embodiments within the scope of this application.
Claims
1. A data processing method, characterized in that the method is applied to a central processing unit (CPU) in a GPU server and includes: obtaining a sample data access request, wherein the sample data access request includes a target data index; selecting a target reading process from multiple candidate reading processes based on the target data index and a hash operation strategy; reading the target sample data corresponding to the target data index from the memory through the target reading process; and sending the target sample data to the requester of the sample data access request.
2. The method according to claim 1, characterized in that the step of selecting a target reading process from multiple candidate reading processes based on the target data index and the hash operation strategy includes: performing a hash operation on the target data index according to the hash operation strategy to obtain a hash value; determining, among the multiple candidate reading processes, a first candidate reading process corresponding to the hash value; and determining the first candidate reading process as the target reading process.
3. The method according to claim 1, characterized in that the step of reading the target sample data corresponding to the target data index from the memory through the target reading process includes: sending the target data index to the target reading process via the Object File System (OFS) to instruct the target reading process to read the target sample data from the memory according to the target data index and store it in the OFS system; and querying the OFS system according to the target data index to obtain the target sample data in the OFS system.
4. The method according to claim 1, characterized in that the CPU includes a memory pool for storing multiple candidate sample data and corresponding candidate data indexes; and before selecting a target reading process from multiple candidate reading processes based on the target data index and the hash operation strategy, the method further includes: querying the memory pool based on the target data index to determine whether the target data index exists in the memory pool; and if the target data index exists in the memory pool, reading the target sample data corresponding to the target data index from the memory pool.
5. The method according to claim 4, characterized in that the method further includes: if the prefetching conditions are met, determining at least one candidate data index to be prefetched; obtaining candidate sample data corresponding to the at least one candidate data index; and storing the at least one candidate data index and the candidate sample data corresponding to the candidate data index in the memory pool.
6. The method according to claim 5, characterized in that the prefetching conditions include at least one of the following: the model is in the initialization phase of training; the number of candidate data indexes in the memory pool is less than or equal to a first quantity threshold; and the continuous duration during which candidate sample data in the memory pool has not been read is greater than or equal to a first duration.
7. The method according to claim 5, characterized in that the step of determining at least one candidate data index to be prefetched includes: determining whether a historical sample data access request has been obtained; if no historical sample data access request has been obtained, randomly selecting the at least one candidate data index to be prefetched from the data indexes in the memory; and if a historical sample data access request has been obtained, predicting the at least one candidate data index to be prefetched based on the historical data index in the historical sample data access request.
8. The method according to claim 5, characterized in that the step of obtaining candidate sample data corresponding to the at least one candidate data index includes: for the at least one candidate data index, determining a first number of prefetch threads to be invoked and a second number of candidate data indexes to be processed by each prefetch thread; and invoking the first number of prefetch threads and allocating the at least one candidate data index to the first number of prefetch threads according to the second number, so as to obtain the candidate sample data corresponding to the at least one candidate data index.
9. The method according to claim 8, characterized in that the prefetch thread's processing of an allocated candidate data index includes: for the candidate data index, selecting a first candidate reading process from multiple candidate reading processes based on the hash operation strategy; sending the candidate data index to the first candidate reading process through the OFS system to instruct the first candidate reading process to read the candidate sample data from the memory according to the candidate data index and store it in the OFS system; and querying the OFS system according to the candidate data index to obtain the candidate sample data in the OFS system.
10. The method according to claim 1, characterized in that the target sample data includes at least one of the following: sensor data, language command data, and robot motion data; the sensor data includes at least one of the following: robot body state data and environmental state data; and the environmental state data includes environmental image data.
11. A data processing apparatus, characterized in that the apparatus is applied to a central processing unit (CPU) in a GPU server and includes: a first acquisition module, used to acquire a sample data access request, wherein the sample data access request includes a target data index; a selection module, used to select a target reading process from multiple candidate reading processes based on the target data index and a hash operation strategy; a first reading module, used to read the target sample data corresponding to the target data index from the memory through the target reading process; and a sending module, used to send the target sample data to the requester of the sample data access request.
12. An electronic device, characterized in that it includes: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the data processing method according to any one of claims 1-10.
13. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the data processing method according to any one of claims 1-10.
14. A computer program product, characterized in that it includes a computer program that, when executed by a processor, implements the data processing method according to any one of claims 1-10.