Data import method and system for a distributed sequence table
A data import technology for distributed sequence tables, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses the problems of long data import time and low efficiency, and achieves the effect of improving import speed and saving positioning time.
Examples
Embodiment 1
[0029] Figure 1 is a flowchart of the data import method for the distributed sequence table according to the first embodiment of the present invention. As shown in Figure 1, the data import method for the distributed sequence table includes:
[0030] Step S101: Use the Map function to convert the data to be imported into key-value pairs;
[0031] In each key-value pair, the key is the primary key of the distributed sequence table, and the value is the data content corresponding to that key. The data to be imported can take any form, such as a text string or a binary sequence. The Map function receives the data to be imported and converts it into a number of key-value pairs for output.
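As an illustration of step S101, the following is a minimal sketch of a Map-style function, not the patent's own implementation. It assumes the data to be imported is a sequence of tab-separated text lines whose first field is the primary key; the function name `map_records` and the input format are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def map_records(lines: Iterable[str], sep: str = "\t") -> Iterator[Tuple[str, str]]:
    """Hypothetical Map function: emit one (primary_key, content) pair per input line."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Assumption: everything before the first separator is the primary key,
        # everything after it is the data content for that key.
        key, _, value = line.partition(sep)
        yield key, value


raw = ["row3\tgamma", "row1\talpha", "row2\tbeta"]
pairs = list(map_records(raw))
print(pairs)
```

The pairs come out in input order; ordering by key is the job of the next step.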
[0032] Step S102: Sort the key-value pairs according to keys;
[0033] All the key-value pairs generated in step S101 are sorted according to the key, so as to ensure the glob...
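The sorting in step S102 can be sketched as follows, under the simplifying assumption that all key-value pairs fit in memory on one machine; a distributed job would sort across nodes instead. The helper name `sort_pairs` and the sample data are hypothetical.

```python
from operator import itemgetter
from typing import Iterable, List, Tuple


def sort_pairs(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the key-value pairs ordered by key (element 0 of each pair)."""
    # sorted() is stable, so pairs with equal keys keep their input order.
    return sorted(pairs, key=itemgetter(0))


unsorted_pairs = [("row3", "gamma"), ("row1", "alpha"), ("row2", "beta")]
sorted_pairs = sort_pairs(unsorted_pairs)
print(sorted_pairs)
```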
Embodiment 2
[0045] To further improve the data import speed, the first embodiment may be refined in two ways: sorting the input key-value pairs of each Reduce function and then performing a merge operation, and sampling and analyzing the original data. Figure 3 is a flowchart of the data import method for the distributed sequence table described in this embodiment. As shown in Figure 3, the further improved data import method for the distributed sequence table includes:
[0046] Step S301: Sample and analyze the data to be imported;
[0047] In order to partition the keys evenly in step S305, and to obtain a relatively balanced load among the data storage files that are finally written, the method may further include, before converting the data to be imported into key-value pairs, using a sampling function to sample and analyze the original data. This provides a reference for balanced split intervals in subsequent steps. For example, in step S305, the key-valu...
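One way to picture the sampling in step S301 is the sketch below. It is an assumption-laden illustration rather than the patent's sampling function: it draws a random sample of keys, sorts it, and takes evenly spaced sample values as split points, then assigns each key to a partition by binary search. The names `sample_split_points` and `partition_for`, the sample size, and the synthetic keys are all hypothetical.

```python
import bisect
import random
from collections import Counter
from typing import List, Sequence


def sample_split_points(keys: Sequence[str], num_partitions: int,
                        sample_size: int, seed: int = 0) -> List[str]:
    """Estimate num_partitions - 1 split points from a random sample of keys."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = sorted(rng.sample(list(keys), min(sample_size, len(keys))))
    if not sample:
        return []
    # Take evenly spaced positions in the sorted sample as partition boundaries.
    step = len(sample) / num_partitions
    return [sample[int(i * step)] for i in range(1, num_partitions)]


def partition_for(key: str, split_points: List[str]) -> int:
    """Index of the partition whose key range contains `key`."""
    return bisect.bisect_right(split_points, key)


keys = [f"k{i:04d}" for i in range(1000)]  # synthetic keys for illustration
splits = sample_split_points(keys, num_partitions=4, sample_size=200)
load = Counter(partition_for(k, splits) for k in keys)
print(splits, dict(load))
```

Because the split points come from a sample, the resulting partition sizes are only approximately equal; a larger sample generally gives a more even split at the cost of more sampling work.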
Embodiment 3
[0061] Figure 4 is a structural block diagram of the data import system for the distributed sequence table described in this embodiment. As shown in Figure 4, the data import system for the distributed sequence table described in this embodiment includes:
[0062] The key-value pair conversion module 401 is used to convert the data to be imported into key-value pairs using the Map function;
[0063] In each key-value pair, the key is the primary key of the distributed sequence table, and the value is the data content corresponding to that key. The data to be imported can take any form, such as a text string or a binary sequence. The Map function receives the data to be imported and converts it into a number of key-value pairs for output.
[0064] The sorting module 402 is configured to sort the key-value pairs generated by the key-value pair conversion mo...